This was my talk at Data Day 5.0 at Carleton University on 5/Jun/2018.
One of the applications of Big Data and E-commerce that we have been working on is a product recommender. While the Universal Recommender is independent of any e-commerce platform, we tested it with Oracle ATG Web Commerce and SAP Hybris Commerce.
The Universal Recommender models any number of product properties and user properties, as well as user actions in relation to products such as 'view-product', 'add-to-cart' and 'purchase'.
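To make the shape of this data concrete, here is a minimal sketch in Python. It is not the Universal Recommender's actual schema or API: the `ShopperEvent` class, the `group_by_action` helper and the sample identifiers are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ShopperEvent:
    """One user action on one product (hypothetical structure)."""
    user_id: str
    action: str       # e.g. "view-product", "add-to-cart", "purchase"
    product_id: str


def group_by_action(events):
    """Return {action: {user_id: set of product_ids}}."""
    grouped = defaultdict(lambda: defaultdict(set))
    for e in events:
        grouped[e.action][e.user_id].add(e.product_id)
    return grouped


# Illustrative sample data, not taken from any real catalogue.
events = [
    ShopperEvent("u1", "view-product", "shirt"),
    ShopperEvent("u1", "add-to-cart", "shirt"),
    ShopperEvent("u1", "purchase", "shirt"),
    ShopperEvent("u2", "view-product", "shirt"),
    ShopperEvent("u2", "view-product", "hat"),
]

# Product properties are kept separately from the actions.
product_properties = {
    "shirt": {"category": "apparel"},
    "hat": {"category": "accessories"},
}

by_action = group_by_action(events)
```

Keeping each action type in its own user-to-products mapping lets a recommender treat 'purchase' as the primary signal and the other actions as secondary ones.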
Often in e-commerce, only a portion of product catalogues change frequently, like price and stock, while new shopper events appear gradually. One of the things we would like to explore is reinforcement learning, trying to model deltas and events for better use of computation resources.
One problem in e-commerce that may be common to other domains in Machine Learning is that of data modeling. While the Universal Recommender leverages product and user features, the data representation cannot leverage the inherent hierarchical property of product categories, for instance.
Another challenge with data representation is that of base products versus variants. For example, a shirt can be a base product, but the buying unit, or SKU, may be a large green shirt. And products can also be configurable, like a hamburger.
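One way to picture the base product versus variant distinction is the following Python sketch. It assumes events are recorded per SKU and shows how they could be rolled up to the base product; the `Variant` class, the `roll_up_to_base` helper and the sample SKUs are hypothetical and do not come from any of the platforms mentioned above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A buyable unit (SKU) that belongs to a base product."""
    sku: str
    base_product_id: str
    attributes: tuple  # e.g. (("size", "L"), ("colour", "green"))


def roll_up_to_base(sku_events, catalogue):
    """Count events per base product, given events recorded per SKU."""
    counts = Counter()
    for sku in sku_events:
        counts[catalogue[sku].base_product_id] += 1
    return counts


# Illustrative catalogue keyed by SKU.
catalogue = {
    v.sku: v
    for v in [
        Variant("shirt-L-green", "shirt", (("size", "L"), ("colour", "green"))),
        Variant("shirt-M-blue", "shirt", (("size", "M"), ("colour", "blue"))),
        Variant("hat-one-size", "hat", (("size", "one-size"),)),
    ]
}

purchases = ["shirt-L-green", "shirt-M-blue", "hat-one-size", "shirt-L-green"]
base_counts = roll_up_to_base(purchases, catalogue)
```

Whether to recommend at the SKU level or the base product level is a modeling decision: rolling up gives denser signals per item, at the cost of losing variant preferences such as size or colour.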
I'd like to list other functionality where Big Data and Machine Learning can help e-commerce systems. Search results can be seen as the output of a recommender system. When APIs are exposed on the Internet, hackers can try to steal login information or reward points, or to abuse promotions. Machine Learning can be used to detect malicious requests. This problem is not specific to e-commerce: the approach can be used in micro-services in general. Machine Learning can also be used to better detect credit card fraud. Another challenge in e-commerce is the verification of product reviews. Detecting cart abandonment is another application of Machine Learning. In Product Information Management, classification algorithms and NLP can automate product categorization, especially for large catalogues and B2B sites integrating external vendor catalogues. Classification algorithms and NLP with sentiment analysis can help Customer Care with case tagging, case prioritization and dispute resolution.
A recent development in the area of e-commerce and legislation is the European Union General Data Protection Regulation (GDPR). It is designed to protect the data privacy of all European residents and requires website operators, among other requirements, to consider any applicable notice and consent requirements when collecting personal data from shoppers (for example, using cookies). The GDPR applies not only to organizations located within Europe but also to organizations located outside of it, if they offer goods or services to European shoppers. It came into effect on May 25th of this year and it defines fines for non-compliance. The GDPR, in my view, includes a bigger scope of rights and protections when compared with the Personal Information Protection and Electronic Documents Act that we currently have in Canada. Thanks to the GDPR, for example, we now know that PayPal shares personal data with more than 600 third-party organizations.
Of particular interest to Machine Learning are the rights in relation to automated decision-making and profiling. Quoting Mr. Andrew Burt: "The GDPR contains a blanket prohibition on the use of automated decision-making, so long as that decision-making occurs without human intervention and produces significant effects on data subjects." One exception is when the user consents explicitly. One issue is the interpretation of what can be called "rights to explainability", that is, the right to an explanation of the algorithm, its model and its reasoning, given that ML algorithms and models can be difficult to explain. Another challenge is the right to erasure. Should the ML model be retrained after the user asks to delete her or his data? The Working Party 29, an official European group involved in drafting and interpreting the GDPR, understands that all processing that occurred before the withdrawal remains legal. However, a more critical view could argue that the model is directly derived from the data and, in some cases of overfitting, can even reveal the data.
The first international beauty contest judged by AI accepted submissions from more than 6,000 people from more than 100 countries. Out of the 44 winners, nearly all were white women. In the United States, software to predict future criminal activity was found to be biased against African-Americans. In these cases, as in others, societal prejudices made their way into algorithms and models. Let's not forget that software systems are built in certain contexts and deployed into contexts. The international beauty contest reveals a naïveté: as if software could reveal a culturally neutral and racially neutral conception of beauty. A coalition of human rights and technology groups at the RightsCon Conference a few weeks ago drafted what is called "The Toronto Declaration". RightsCon covered a broad list of important subjects related to AI and society. The Toronto Declaration emphasizes the risk of Machine Learning systems to "intentionally or inadvertently discriminate against individuals or groups of people". It reads:
"Intentional and inadvertent discriminatory inputs throughout the design, development and use of machine learning systems create serious risks for human rights; systems are for the most part developed, applied and reviewed by actors which are largely based in particular countries and regions, with limited input from diverse groups in terms of race, culture, gender, and socio-economic backgrounds. This can produce discriminatory results."
Stanford University's One Hundred Year Study on Artificial Intelligence is a long-term investigation of AI and its influences on society. Section 3 of its latest report - "Prospects and Recommendations for Public Policy" - calls to "Increase public and private funding for interdisciplinary studies of the societal impacts of AI. As a society, we are underinvesting resources in research on the societal implications of AI technologies."
Thinking of societal issues raised by AI and automation, I chose to briefly mention the issue of employment. One good article by Dr. Ewan McGaughey titled "Will Robots Automate Your Job Away? Full Employment, Basic Income, and Economic Democracy" argues that robots and automation are not a primary cause of unemployment. He writes: "once people can see and understand the institutions that shape their lives, and vote in shaping them too, the robots will not automate your job away. There will be full employment, fair incomes, and a thriving economic democracy."
My best friend is a professor of Machine Learning. He's funny and at times even sarcastic. One day he described to me a hypothetical conversation with a Philosopher. "When I submit applications for grants, I ask for hardware and software for my lab. What do you ask for? A chair?" The Philosopher Thomas Kuhn could have replied: 'New science gets accepted, not because of the persuasive force of striking new evidence, but because old scientists die off and young ones replace them.' The endeavour of modeling and interpreting zeros and ones is epistemic and hermeneutic. The time is not for isolation, but for collaboration. More than ever we need to engage the Social Sciences and the Humanities to help us understand what we do, how we do it and why we do it.