Recommender engines based on machine-learning algorithms have become a mainstay of e-commerce. Twenty years after Netflix began using recommender systems, 80 percent of its users’ streaming time is driven by its ever-improving proprietary recommender algorithm (Chong, 2020). A similar recommender system has driven most of the video viewing on YouTube for more than 10 years (Davidson, 2010).
For most of the last 20 years, recommender systems have been based on numerical user-item ratings, which enabled matrix factorization (MF) and collaborative filtering (CF). To improve predictiveness, developers then began incorporating additional sources of information beyond frequency of use (Koren, 2010) (Bell, 2009) (Su, 2009). Today, these efforts to incorporate heterogeneous information about people’s digital habits form the vanguard of recommender systems, in an approach named “joint representation learning” (JRL).
The data inputs to a recommender system are:
- User preferences
- Item features
- Historic customer-product interactions
- Temporal-sequence-awareness
- Spatial or point-of-interest data (Zhang, 2018).
However, when first implementing a new recommender system, firms usually must limit their inputs to the data they have available. Limited financial or technical resources impose a similar constraint. Firms with those limitations often implement more basic recommenders until they have the resources to develop future iterations with increased predictiveness.
The five classes of recommender systems that every e-commerce company should know, in order of their complexity and evolution, are: (1) collaborative-filtering (CF) recommenders; (2) content-based (CB) recommenders; (3) hybrid (CF-CB) recommenders; (4) deep-learning recommenders, of which there are about a dozen major subtypes; and (5) ensemble-product recommenders, such as node-wise graph neural networks (NGNNs) (Luellen, 2020).
Collaborative-filtering (CF) Recommenders
CF recommenders rely on historic user-item interactions to try to ensure users see only the things that might interest them most, a solution to ‘over choice’ (Madhukar, 2014). These historic interactions can be explicit (e.g., how users rate items) or implicit (e.g., items a user has searched for) (Jannach, 2010). The workflow of a CF system typically consists of three steps: (1) a user expresses experiences or preferences via some type of rating system (e.g., stars), which the recommender system interprets as a quantification of the user’s interest in, or perceived utility of, the item; (2) the system matches that user to other users who have rated the same products similarly; and (3) the system compares the purchase histories of these similar users and recommends to each one the items that the others have purchased but that are missing from that shopper’s own history (Luellen, 2020).
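The three-step workflow above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the ratings dictionary, the user and item names, the assumed 1-to-5 star scale centered at 3, and the helper names (centered, similarity, recommend) are all hypothetical, and rated items stand in for purchased ones.

```python
from math import sqrt

MIDPOINT = 3  # assumption: ratings are 1-5 stars, so 3 is treated as neutral

# Step 1: users express preferences as explicit ratings (hypothetical data).
ratings = {
    "ana":  {"boots": 5, "scarf": 4, "jacket": 5},
    "ben":  {"boots": 5, "scarf": 4},
    "cara": {"boots": 1, "scarf": 2, "gloves": 5},
}

def centered(user_ratings):
    """Shift ratings so that liked items are positive and disliked ones negative."""
    return {item: stars - MIDPOINT for item, stars in user_ratings.items()}

def similarity(a, b):
    """Step 2: cosine similarity over the items both users have rated."""
    a, b = centered(a), centered(b)
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no overlap, or only neutral ratings: no evidence either way
    return dot / (norm_a * norm_b)

def recommend(user, ratings):
    """Step 3: suggest items that similar users liked and this user lacks."""
    own = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = similarity(own, theirs)
        if sim <= 0:
            continue  # ignore users with dissimilar taste
        for item, stars in theirs.items():
            if item not in own:
                scores[item] = scores.get(item, 0.0) + sim * (stars - MIDPOINT)
    liked = [item for item in scores if scores[item] > 0]
    return sorted(liked, key=lambda item: (-scores[item], item))

print(recommend("ben", ratings))   # ['jacket']
print(recommend("cara", ratings))  # []
```

In this toy data, ben and ana rate the shared items identically, so ana’s jacket is recommended to ben. Cara’s taste runs opposite to both, so she receives nothing from them.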
Content-based (CB) Recommenders
CB recommenders search through Internet-search histories, texts, emails, and website visits to identify products or items of interest and then recommend similar ones. These CB recommenders are likely the source of social media ads that appear in feeds after someone has searched for an item on Google or Amazon or exchanged emails or texts about a class of products or services (Jannach, 2010). CB recommenders often struggle with accuracy. For example, if one searches for a Jaguar XJ8 vehicle to buy, the CB recommender can infer interest in that product and serve ads for such vehicles, but it lacks details that matter to prospective buyers, such as budget, mileage, color, equipment, and location (Luellen, 2020).
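A content-based match can be sketched as a similarity score between the terms a user searched for and the descriptive tags of each item. The catalog, the tag sets, and the function names below are hypothetical. Jaccard overlap is just one simple similarity measure chosen for illustration.

```python
# Hypothetical catalog: each listing is described by a set of content tags.
catalog = {
    "listing_a": {"jaguar", "xj8", "sedan", "2004", "silver"},
    "listing_b": {"jaguar", "xj8", "sedan", "1999", "green"},
    "listing_c": {"jaguar", "f-type", "coupe", "2018"},
    "listing_d": {"toyota", "camry", "sedan", "2015"},
}

def jaccard(a, b):
    """Share of tags two sets have in common (0.0 = none, 1.0 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_by_content(query_terms, catalog, top_n=3):
    """Rank listings by tag overlap with the user's search terms."""
    scored = [(jaccard(query_terms, tags), name) for name, tags in catalog.items()]
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best first, ties by name
    return [name for _, name in scored[:top_n]]

search_terms = {"jaguar", "xj8"}  # all the recommender knows about the shopper
print(recommend_by_content(search_terms, catalog))
# ['listing_a', 'listing_b', 'listing_c']
```

The two XJ8 listings receive identical scores. The search terms carry nothing about year, color, or budget, so the sketch cannot tell which one the shopper would prefer, which is the accuracy limitation described above.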
Hybrid Recommenders
Technically, a hybrid recommender is any ensemble: an algorithm that combines other algorithms to improve predictiveness and diminish the inherent weaknesses of each category of recommender engine. Arguably, most recommender systems are now hybrids or ensembles. The most common combination is a CF algorithm supplemented by CB methods, which identifies products a user is likely to desire, then directs them to users who have expressed an interest in that genre of product (Madhukar, 2014).
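One simple way to combine the two families is a weighted blend of their scores. This is only one of several possible hybridization strategies, and the scores, item names, and the cf_weight parameter below are illustrative assumptions rather than values from any cited system.

```python
def blend_scores(cf_scores, cb_scores, cf_weight=0.7):
    """Rank items by a weighted sum of CF and CB scores (missing scores count as 0)."""
    items = set(cf_scores) | set(cb_scores)
    blended = {
        item: cf_weight * cf_scores.get(item, 0.0)
        + (1.0 - cf_weight) * cb_scores.get(item, 0.0)
        for item in items
    }
    return sorted(blended, key=lambda item: (-blended[item], item))

# Hypothetical per-item scores from each recommender, already scaled to 0-1.
cf_scores = {"jacket": 0.9, "gloves": 0.2}
cb_scores = {"gloves": 0.8, "scarf": 0.6}

print(blend_scores(cf_scores, cb_scores))                 # CF-led ranking
print(blend_scores(cf_scores, cb_scores, cf_weight=0.0))  # CB-only ranking
```

An item that only one recommender knows about (the scarf has no CF score here) can still surface. This is how a hybrid offsets the blind spots of its components.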
Deep-learning Recommenders
The central feature of the 11 sub-types of deep-learning recommenders is their use of multiple levels of abstraction of data to try to glean additional insights into user behaviors that will improve the productiveness, or follow-through, of what they recommend (Luellen, 2020). From a technical perspective, a deep-learning model is any model that optimizes a differentiable objective function using some variant of stochastic gradient descent (SGD) (Zhang, 2018).
At the highest level, deep-learning recommender systems fall into two classes: (1) those based on neural networks; and (2) those that are hybrids or ensembles. Recommender systems based on neural building blocks include multi-layer perceptrons (MLPs), auto-encoders (AEs), recurrent neural networks (RNNs), convolutional neural networks (CNNs), restricted Boltzmann machines (RBMs), neural autoregressive distribution estimators (NADEs), deep reinforcement learning (DRL), etc. Recommender systems based on ensembles include RNN+CNN, AE+CNN, RNN+AE, etc. (Zhang, 2018).
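To make “optimizing a differentiable objective with SGD” concrete, the sketch below fits a small matrix-factorization model to a handful of ratings by stochastic gradient descent on a regularized squared error. It is deliberately shallow, not a deep network. It is meant only to show the optimization loop that deep recommenders share. The data, the hyperparameters, and the function names are assumptions made for illustration.

```python
import random

def train_mf(ratings, n_users, n_items, n_factors=2,
             lr=0.05, reg=0.02, epochs=200, seed=0):
    """Learn user factors P and item factors Q so that dot(P[u], Q[i]) ~ rating."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)  # "stochastic": visit ratings in random order
        for u, i, r in samples:
            pred = sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            err = r - pred
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error plus an L2 penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def mse(ratings, P, Q):
    """Mean squared error of the model on the given ratings."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(pf * qf for pf, qf in zip(P[u], Q[i]))
        total += (r - pred) ** 2
    return total / len(ratings)

# Hypothetical (user index, item index, rating) triples.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]

P0, Q0 = train_mf(ratings, n_users=3, n_items=3, epochs=0)  # untrained baseline
P, Q = train_mf(ratings, n_users=3, n_items=3)
print(mse(ratings, P0, Q0), mse(ratings, P, Q))  # error before and after training
```

A deep-learning recommender replaces the dot product with a multi-layer network, but the pattern is the same: compute a prediction, measure a differentiable loss, and adjust the parameters in small steps.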
Deep-learning recommender systems have four primary strengths. One, they can model non-linear data (He, 2017). Two, they are capable of representation learning, or discovering underlying explanatory factors and representations from input data. This yields two advantages: (1) it automates the traditionally labor-intensive process of feature engineering; and (2) it expands the scope of allowable input data types, or heterogeneity (e.g., texts, images, audio, and video). Three, they enable sequential modeling, as used in machine translation and in natural language processing (NLP) for chatbots, especially via CNNs with time-sliding filters and RNNs with internal memory states. Four, they are highly flexible because they are built modularly. Most deep-learning platforms (e.g., Caffe, Keras, PyTorch, TensorFlow, and Theano) are likewise built modularly and have the added advantage of robust online crowdsourced help and user groups (Luellen, 2020) (Zhang, 2018).
Node-wise Graph Neural Networks (NGNNs)
One primary challenge with recommender systems is how to suggest a combination of products for a user to buy, such as an outfit of matching clothing articles or accessories. Historically, recommending combinations of items was attempted by mapping items into a vector space and estimating the distance between them. Images were translated into style or compatibility vectors by Siamese CNNs or low-rank Mahalanobis transformations (LMT) (Bala, 2015) (Veit, 2015). This evolved into recommending combinations by representing items to buy together (e.g., clothing articles or accessories in an outfit) as a sequence and applying RNNs (He, 2016) (Shih, 2018). Similarly, items that could be combined were represented as sequences with a specific order and modeled using the bidirectional long short-term memory (Bi-LSTM) algorithm (Han, 2017).
The primary weakness of the pair-representation approaches is that they fail to account for the complexity in the number of options when combining different items into one complementary basket of purchases. The primary weakness of the sequence approaches is that they fail to recognize the relationships between complementary items and the collection as a whole. Items in a collection are difficult to place in an accurate ordinal sequence because each item may have relationships with many other items (Luellen, 2020).
Node-wise graph neural networks (NGNNs) can be used to address the weaknesses of the pair-representation and sequence approaches. The NGNN algorithm works in three steps. First, it constructs a “fashion graph” wherein combinations of products are represented as subgraphs (Cui, 2019). Second, it models node interactions to learn the nodes’ representations. Third, it predicts a “compatibility score” via an attention layer, which is the output of the NGNN. NGNNs can accept inputs from images or text (Luellen, 2020). In the published study, a case study of apparel items combined into outfits, an image-based NGNN achieved an area under the receiver operating characteristic curve (AUROC) of 0.9600, a text-based NGNN scored 0.9716, and a multi-modal image-text NGNN scored 0.9722 (Cui, 2019).
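The pair-representation idea can be sketched as a nearest-neighbor lookup in a style space. The two-dimensional vectors below are hand-set, hypothetical stand-ins for the embeddings that a Siamese CNN or similar model would learn from images. Euclidean distance is used for simplicity.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical 2-D "style" vectors: (formal-ness, sporty-ness).
style_vectors = {
    "navy_blazer": (0.9, 0.1),
    "oxford_shirt": (0.8, 0.2),
    "gym_shorts": (0.1, 0.9),
}

def most_compatible(item, vectors):
    """Return the other item whose style vector lies closest to the given item's."""
    others = [name for name in vectors if name != item]
    return min(others, key=lambda name: dist(vectors[item], vectors[name]))

print(most_compatible("navy_blazer", style_vectors))  # oxford_shirt
```

The function only ever compares two items at a time. Scoring a whole outfit this way means checking every pair, and nothing in the sketch models how an item relates to the collection as a whole.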
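The three stages (graph, node interaction, attention-based score) can be illustrated with a toy, untrained sketch. It is not the model published by Cui et al. It uses a single averaging step in place of learned node-wise message passing, and the item features, edges, and weight vectors are hand-set, hypothetical values.

```python
from math import exp

def propagate(features, edges):
    """One round of message passing: average each node with its neighbors."""
    neighbors = {node: [] for node in features}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    updated = {}
    for node, vec in features.items():
        group = [vec] + [features[m] for m in neighbors[node]]
        updated[node] = [sum(column) / len(group) for column in zip(*group)]
    return updated

def compatibility_score(states, attn_w, out_w):
    """Attention-weighted sum of per-node scores, squashed into (0, 1)."""
    vectors = list(states.values())
    logits = [sum(w * x for w, x in zip(attn_w, vec)) for vec in vectors]
    peak = max(logits)                        # subtract max for numerical stability
    exps = [exp(value - peak) for value in logits]
    attention = [e / sum(exps) for e in exps]  # softmax over the nodes
    node_scores = [sum(w * x for w, x in zip(out_w, vec)) for vec in vectors]
    pooled = sum(a * s for a, s in zip(attention, node_scores))
    return 1.0 / (1.0 + exp(-pooled))          # sigmoid

# An outfit as a small subgraph: nodes are items, edges link items worn together.
outfit = {"shirt": [1.0, 0.0], "jeans": [0.0, 1.0], "sneakers": [1.0, 1.0]}
edges = [("shirt", "jeans"), ("jeans", "sneakers")]

states = propagate(outfit, edges)
score = compatibility_score(states, attn_w=[0.5, 0.5], out_w=[1.0, -0.5])
print(states["shirt"], score)
```

In a trained model, the propagation and attention weights would be learned from outfits labeled as compatible or not, and the resulting scores could then be evaluated with a metric such as AUROC.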
Joint Representation Learning (JRL)
The future of recommender systems is found in a further major type: joint representation learning (JRL). The goal is to combine users’ characteristics, demands, preferences, and auxiliary information that is heterogeneous in origin (e.g., location, historical behaviors, and Internet-of-Things data) across modalities to maximize predictive accuracy. In the JRL algorithm, every type of information is analyzed to learn corresponding user and item representations based on deep-representation learning architectures (Zhang, 2018).
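The core idea of a joint representation can be sketched as concatenating one vector per information source and scoring a user against an item in the combined space. In a real JRL system each per-source vector would be learned by its own deep architecture. Here the vectors, the source names, and the dot-product scoring are hand-set assumptions for illustration only.

```python
def joint_representation(views):
    """Concatenate per-source vectors in a fixed (alphabetical) source order."""
    joint = []
    for source in sorted(views):
        joint.extend(views[source])
    return joint

def affinity(user_views, item_views):
    """Dot product of the user's and the item's joint representations."""
    user_vec = joint_representation(user_views)
    item_vec = joint_representation(item_views)
    if len(user_vec) != len(item_vec):
        raise ValueError("user and item must share the same sources and sizes")
    return sum(u * v for u, v in zip(user_vec, item_vec))

# Hypothetical per-source representations for one user and one item.
user_views = {"ratings": [0.5, 0.5], "text": [1.0, 0.0], "image": [0.0, 1.0]}
item_views = {"ratings": [1.0, 1.0], "text": [1.0, 0.0], "image": [0.0, 0.0]}

print(affinity(user_views, item_views))  # 2.0
```

Each source contributes its own share of the score: ratings and text agree for this pair, while the image view contributes nothing. Adding a new source, such as location, only requires adding another entry to both dictionaries.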
References
Bala, K., Bell, S. (2015). Learning visual similarity for product design with convolutional neural networks. ACM Transactions on Graphics (TOG), 34(4): 98.
Bell, R., Koren, Y., Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8): 42-49.
Chong, D. (2020, April 30). Deep Dive into Netflix’s Recommender System. Retrieved from Towards Data Science: https://towardsdatascience.com/deep-dive-into-netflixs-recommender-system-341806ae3b48
Cui, Z., Li, K., Wu, S., Zhang, X., Wang, L. (2019). Dressing as a whole: Outfit compatibility learning based on a node-wise graph neural network. The World Wide Web Conference (WWW’19) (pp. 307-317). San Francisco: ACM.
Davidson, J., Liebald, B., Liu, J., Nandy, P., Van Vleet, T., Gargi, U., Gupta, S., et al. (2010). The YouTube video recommendation system. In the Proceedings of the Fourth ACM Conference on Recommender Systems (RecSys’10) (pp. 293-296). Barcelona, Spain: ACM.
Han, X., Wu, Z., Jiang, Y., Davis, L. (2017). Learning fashion compatibility with bidirectional LSTMs. In the Proceedings of the 25th ACM International Conference on Multimedia (MM’17) (pp. 1078-1086). Mountain View, CA: ACM.
He, R., Packer, C., McAuley, J. (2016). Learning compatibility across categories for heterogeneous item recommendation. In the IEEE 16th International Conference on Data Mining (ICDM) (pp. 937-942). Barcelona, Spain: IEEE Computer Society.
He, X., Liao, L., Zhang, H., Nie, L., Hu, X., Chua, T. (2017). Neural collaborative filtering. International World Wide Web Conference Committee (pp. 173-182). Perth, Australia: ACM.
Jannach, D., Zanker, M., Felfernig, A., Friedrich, G. (2010). Recommender systems: An introduction. Cambridge, UK: Cambridge University Press.
Koren, Y. (2010). Collaborative filtering with temporal dynamics. Communications of the Association of Computing Machinery, 53(4): 89-97.
Luellen, E. (2020). Recommender Systems. In E. Luellen, Beating Amazon: The machine learning guide to win the e-commerce war (pp. 91-122). Sheridan, Wyoming: Social Justice Press.
Madhukar, M. (2014). Challenges and limitations in recommender systems. International Journal of Latest Trends in Engineering and Technology, 4(3): 138-142.
Shih, Y., Chang, K., Lin, H., Sun, M. (2018). Compatibility family learning for item recommendation and generation. In the Proceedings of the 32nd AAAI Conference on Artificial Intelligence (pp. 2403-2410). New Orleans: AAAI.
Su, X., Khoshgoftaar, T. (2009). A survey of collaborative filtering techniques. Advances in Artificial Intelligence, Article ID 421425, 1-19.
Veit, A., Kovacs, B., Bell, S., McAuley, J., Bala, K., Belongie, S. (2015). Learning visual clothing style with heterogeneous dyadic co-occurrences. In the Proceedings of the IEEE International Conference on Computer Vision (ICCV) (pp. 4642-4650). Santiago, Chile: IEEE Computer Society.
Zhang, S., Yao, L., Sun, A., Tay, Y. (2018). Deep learning based recommender systems: A survey and new perspectives. ACM Computing Surveys, 1(1): 1-35.