The Future of Visual Recommender Systems: Four Practical State-Of-The-Art Techniques

Original article can be found here (source): Deep Learning on Medium

The authors “propose a novel neural network framework, neural outfit recommendation (NOR), that simultaneously provides outfit recommendations and generates abstractive comments. NOR consists of two parts: outfit matching and comment generation. For outfit matching, we propose a convolutional neural network with a mutual attention mechanism to extract visual features… For abstractive comment generation, we propose a gated recurrent neural network with a cross-modality attention mechanism to transform visual features into a concise sentence”.

RecSys based on CNNs is powerful, but its output can be hard to interpret. There have been separate attempts to visualize CNN gradients (by Utku Ozbulak) and to generate image comments (by Donahue, J. et al.). Still, it is not easy to combine both techniques and apply them in the context of RecSys. If we look at the resultant image above, we see a first step toward understanding the recommendations, with “great denim look” and “love the red and white” being good examples of explaining why the outfits are recommended. However, in the negative cases in the last row, we can see that the comment generation itself is not perfect; sometimes it describes something not found within the image or is completely out of context. Note that the authors build the comment-generating model using data from Polyvore, a community-powered social commerce website.

Nonetheless, explainable AI (XAI) is a critical piece in understanding, evaluating, and deploying deep learning solutions in production. For more on XAI, Feifeife has gathered an impressive collection of XAI materials.

Private Personalized RecSys (2020)

Increasingly, RecSys are being deployed in privacy-sensitive domains like healthcare, education, and finance. We want the benefits of personalized healthcare, education, and financial plans, but at the same time, the fear of giving up our data and then losing it to a hack is real. It seems oxymoronic: how can we build a personalized RecSys while maintaining user privacy?

Back in 2009, McSherry & Mironov from Microsoft Research explored this issue in their paper Differentially Private Recommender Systems with a simple idea. In essence, we can add noise to the item ratings and the item-item covariance matrix in line with ε-Differential Privacy (the actual mathematics behind this is non-trivial). In other words:

  • We mask the identifying traits of any particular user (user A buys pink shirts on the first Monday of every month).
  • We can still obtain general trends (segment X of users likes to buy pink shirts).
  • The privacy loss is mathematically proven to be bounded by the privacy parameter ε; a smaller ε means stronger privacy but noisier results.
  • While differential privacy is a useful metric for measuring risk internally when designing a RecSys, it is not intuitive to explain to users, nor does it guarantee that the data is secured.
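To make the "add noise to the ratings" idea concrete, here is a minimal sketch of the Laplace mechanism applied to per-item average ratings. It is not McSherry & Mironov's actual algorithm, which also handles the covariance matrix and is considerably more careful. The function name `private_item_means`, the rating range `[0, r_max]`, and the even split of ε between sums and counts are all assumptions made for illustration, and the guarantee sketched here is with respect to adding or removing a single rating, not a whole user.

```python
import numpy as np

def private_item_means(ratings, rated, epsilon, r_max=5.0, seed=None):
    """Release noisy per-item mean ratings (hypothetical helper).

    ratings: (n_users, n_items) array with values in [0, r_max].
    rated:   same shape, 1 where the user rated the item, else 0.
    epsilon: total privacy budget, split evenly between sums and counts.
    """
    rng = np.random.default_rng(seed)
    eps_half = epsilon / 2.0

    sums = (ratings * rated).sum(axis=0).astype(float)
    counts = rated.sum(axis=0).astype(float)

    # Adding or removing one rating changes one item's sum by at most r_max
    # and one item's count by exactly 1, so those are the sensitivities.
    noisy_sums = sums + rng.laplace(scale=r_max / eps_half, size=sums.shape)
    noisy_counts = counts + rng.laplace(scale=1.0 / eps_half, size=counts.shape)

    # Everything below is post-processing and costs no extra privacy budget.
    means = noisy_sums / np.maximum(noisy_counts, 1.0)
    return np.clip(means, 0.0, r_max)

ratings = np.array([[5, 0], [3, 4], [0, 2]])
rated = np.array([[1, 0], [1, 1], [0, 1]])
print(private_item_means(ratings, rated, epsilon=1.0, seed=0))
```

With only three users the noise dominates the signal, which illustrates the trade-off: general trends survive only when many users contribute to each aggregate.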

A new paper by Ribero et al. extends the idea in Federating Recommendations Using Differentially Private Prototypes. Federated learning is a modern approach to distributed machine learning:

  • Instead of training massive models on centralized servers, we send out small (megabyte sized) models to users’ devices.
  • The models are trained on the user’s device with their data during the device idle time.
  • We only send training results back to a centralized server.
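The three steps above can be sketched as a toy version of federated averaging. This is a generic illustration under simple assumptions (a linear model, plain gradient descent, and a weighted average on the server), not the method of Ribero et al. or any production system. The helper names `local_update` and `federated_round` are hypothetical.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Train on one device's private data; only the weights leave the device."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w, len(y)

def federated_round(global_weights, clients, lr=0.1, epochs=5):
    """Send the model out, collect the results, average them by data size."""
    updates = [local_update(global_weights, X, y, lr, epochs) for X, y in clients]
    total = sum(n for _, n in updates)
    return sum(w * (n / total) for w, n in updates)

# Simulate three devices whose data follows the same underlying model.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(20, 2))
    clients.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(50):
    w = federated_round(w, clients)
print(w)  # should end up close to true_w
```

Note that the raw `X` and `y` never reach `federated_round`'s caller in aggregate form; the server only ever sees trained weights and sample counts.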

Google’s comic strip illustrates this process. By combining differential privacy and federated learning, Ribero et al. propose a novel approach to tackling the issue of private, personalized RecSys.

“Most federated learning methods require multiple rounds of communication between entities and a central server, which poses a problem for differential privacy requirements. Specifically, we can think of each round of communication from the entities to the server as a query sent to the individual entities, which has potential to leak information…(hence) we constrain the communication to only two rounds, back and forth” (Ribero et al., 2020).
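A rough way to see why the number of rounds matters is basic sequential composition: if each round is ε-differentially private, then k rounds are at most kε-differentially private in total. The sketch below assumes this simple rule and the Laplace mechanism; tighter composition bounds exist, and the paper's own privacy accounting may differ. The function names and the example budget are hypothetical.

```python
def per_round_epsilon(total_epsilon, rounds):
    """Split a fixed privacy budget evenly across communication rounds."""
    return total_epsilon / rounds

def laplace_scale(sensitivity, epsilon):
    """Noise scale of the Laplace mechanism for a query at a given epsilon."""
    return sensitivity / epsilon

total_budget = 1.0  # assumed overall privacy budget
for rounds in (2, 10, 100):
    eps = per_round_epsilon(total_budget, rounds)
    scale = laplace_scale(sensitivity=1.0, epsilon=eps)
    print(f"{rounds} rounds: epsilon per round = {eps}, noise scale = {scale}")
```

Under these assumptions, the noise added per round grows linearly with the number of rounds, so keeping communication to two rounds leaves far more of the budget for each message.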

Of course, the problem is a challenging one. To cut the number of rounds down to only two, the team needs to come up with a novel way to compress the data and then save it in an accessible form. They call these data structures “prototypes”:

These prototypes are designed to: a) contain similar information as Xh, thus allowing construction of an accurate item representation; b) be of low dimension relative to Xh, hence minimizing communication load; and c) maintain differential privacy with respect to the individual users.

The paper is a challenging read on a rapidly evolving and important topic. If you are interested in learning more about federated learning and differential privacy, lee-man is collecting a list of readings, tools, and code on their GitHub.

Next steps…

Visual RecSys is an exciting field, and I hope you enjoyed the various techniques we discussed today. For more cutting edge stuff on RecSys, you can explore