Source: Deep Learning on Medium
In this article, I’m going to build a product recommendation service using C#, ML.NET, and .NET Core.
ML.NET is Microsoft’s new machine learning library. It can run linear regression, logistic regression, clustering, deep learning, and many other machine learning algorithms.
And .NET Core is Microsoft’s multi-platform .NET framework that runs on Windows, macOS, and Linux. It’s the future of cross-platform .NET development.
The first thing I need for my product recommendation app is a data file with hundreds of thousands of product purchases. I can use the SNAP Amazon Co-Purchasing Network which was created by crawling the Amazon website in 2003. The dataset is based on the ‘Customers Who Bought This Item Also Bought’ feature of the Amazon website and contains every product combination discovered by the crawler.
The dataset contains a list of 1,234,878 product combinations.
It’s a very simple TSV file with only two columns:
- The ID of a purchased product
- The ID of a second product that was often bought by customers who also bought the first product
I will build a machine learning model that reads in each set of Product IDs, and then predicts popular product combinations for every product in the dataset.
Let’s get started. Here’s how to set up a new console project in .NET Core:
$ dotnet new console -o Recommender2
$ cd Recommender2
Next, I need to install the ML.NET base package and the recommender extensions:
$ dotnet add package Microsoft.ML
$ dotnet add package Microsoft.ML.Recommender
Now I’m ready to add some classes. I’ll need one to hold a product combination record, and one to hold my model’s predictions.
I will modify the Program.cs file like this:
The ProductInfo class holds one single product combination. Note how each field is tagged with a Column attribute that tells the TSV data-loading code which column to import data from.
I’m also declaring a ProductPrediction class which will hold a single product combination prediction.
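To keep things concrete, here’s a minimal sketch of what those two classes could look like. I’m using ML.NET’s LoadColumn attribute for the column mapping; the class and property names follow the descriptions above:

```csharp
using Microsoft.ML.Data;

// one product combination from the TSV file
public class ProductInfo
{
    [LoadColumn(0)] public float ProductID { get; set; }
    [LoadColumn(1)] public float CombinedProductID { get; set; }
}

// one product combination prediction made by the model
public class ProductPrediction
{
    public float Score { get; set; }
}
```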
Now I’m going to load the training data in memory:
This code uses the LoadFromTextFile method to load the TSV data directly into memory. The class field annotations tell the method how to store the loaded data in the ProductInfo class.
Note that I only have a single data file, so I split the data into a training and a testing partition using the TrainTestSplit method. I use 80% of the data for training and 20% of the data for testing.
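A sketch of the loading code might look like this. The filename Amazon0302.txt is an assumption based on the SNAP dataset; adjust it to match your local copy:

```csharp
using Microsoft.ML;

// set up the ML.NET machine learning context
var context = new MLContext();

// load the TSV file into memory (the filename is an assumption)
var data = context.Data.LoadFromTextFile<ProductInfo>(
    "Amazon0302.txt",
    hasHeader: true,
    separatorChar: '\t');

// split into an 80% training and a 20% testing partition
var partitions = context.Data.TrainTestSplit(data, testFraction: 0.2);
```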
Now I’m ready to start building the machine learning model.
I am building a recommendation service so I’ll need to use the Matrix Factorization algorithm to generate my predictions. But my data records only contain two product IDs, and nothing else. How do I set up my algorithm?
Well, it depends on what’s in the dataset. Check out this mind-map:
If I have two IDs and a rating, then I should use Matrix Factorization. That’s not the case here because I don’t have any ratings in my dataset.
But do check out my other Medium article where I demonstrate how to build a movie recommendation service, with ratings!
If I have ratings and I want to include other fields too, I will need a Field-Aware Factorization Machine. That’s also not applicable here because I don’t have any extra fields either.
Which only leaves the final case: if my dataset only contains two IDs, then I should use One-Class Matrix Factorization.
The ML.NET machine learning library has support for all three algorithms. Here’s how to set up one-class matrix factorization:
The trick is to provide a LossFunction when setting up the matrix factorization. I use the SquareLossOneClass function which is designed for this scenario: a recommendation model that only uses two ID values.
Note that I also have to provide a Label column. The label is what I’m trying to predict with my model: the ‘other’ product ID. So I simply refer to the CombinedProductID column.
Also note that I’m tweaking the Alpha and Lambda hyper-parameters of the factorization algorithm to speed up training and boost the accuracy.
Machine learning models in ML.NET are built with pipelines, which are sequences of data-loading, transformation, and learning components.
My pipeline has the following components:
- MapValueToKey which reads the ProductID column and builds a dictionary of unique ID values. It then produces an output column called ProductIDEncoded containing an encoding for each ID. This step converts the IDs to numbers that the model can work with.
- Another MapValueToKey which reads the CombinedProductID column, encodes it, and stores the encodings in an output column called CombinedProductIDEncoded.
- A MatrixFactorization component that performs one-class matrix factorization on the encoded ID columns. This step calculates a score for every product combination in the dataset.
With the pipeline fully assembled, I can train the model on the training partition with a call to Fit(…).
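Putting it all together, a sketch of the pipeline and training code could look like this. The column and option names follow the steps above; the exact hyper-parameter values are illustrative, not the ones from my run:

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers;

// set up one-class matrix factorization on the encoded ID columns
var options = new MatrixFactorizationTrainer.Options
{
    MatrixColumnIndexColumnName = "ProductIDEncoded",
    MatrixRowIndexColumnName = "CombinedProductIDEncoded",
    LabelColumnName = "CombinedProductID",
    LossFunction = MatrixFactorizationTrainer.LossFunctionType.SquareLossOneClass,
    Alpha = 0.01,               // illustrative hyper-parameter values
    Lambda = 0.025,
    NumberOfIterations = 20,
    C = 0.00001
};

// build the pipeline: encode both ID columns, then factorize
var pipeline = context.Transforms.Conversion.MapValueToKey(
        outputColumnName: "ProductIDEncoded",
        inputColumnName: "ProductID")
    .Append(context.Transforms.Conversion.MapValueToKey(
        outputColumnName: "CombinedProductIDEncoded",
        inputColumnName: "CombinedProductID"))
    .Append(context.Recommendation().Trainers.MatrixFactorization(options));

// train the model on the 80% training partition
var model = pipeline.Fit(partitions.TrainSet);
```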
I now have a fully trained model. The next step is to grab the test partition, predict all product combinations, and calculate the accuracy metrics of my model:
This code uses the Transform(…) method to make predictions for every product combination in the test partition.
The Evaluate(…) method compares these predictions to the actual combinations and automatically calculates three metrics for me:
- Rms: this is the root mean square error or RMSE value. It’s the go-to metric in the field of machine learning to evaluate models and rate their accuracy. RMSE represents the length of a vector in n-dimensional space, made up of the error in each individual prediction.
- L1: this is the mean absolute prediction error.
- L2: this is the mean square prediction error, or MSE value. Note that RMSE and MSE are related: RMSE is just the square root of MSE.
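The evaluation code might look like this, using the test partition from the earlier split:

```csharp
// use the trained model to predict every combination in the test partition
var predictions = model.Transform(partitions.TestSet);

// compare the predictions to the actual combinations and compute metrics
var metrics = context.Recommendation().Evaluate(
    predictions,
    labelColumnName: "CombinedProductID",
    scoreColumnName: "Score");

System.Console.WriteLine($"  RMSE: {metrics.RootMeanSquaredError:#.##}");
System.Console.WriteLine($"  L1:   {metrics.MeanAbsoluteError:#.##}");
System.Console.WriteLine($"  L2:   {metrics.MeanSquaredError:#.##}");
```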
To wrap up, let’s use the model to make a prediction.
I’m going to focus on a specific product, let’s say product number 3, and check if it’s often bought together with another product, number 63.
Here’s how to make the prediction:
I use the CreatePredictionEngine method to set up a prediction engine. The two type arguments are the input data class and the class to hold the prediction. And once my prediction engine is set up, I can simply call Predict(…) to make a single prediction on a ProductInfo instance.
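A sketch of that prediction code, using the trained model from the previous step:

```csharp
// set up a prediction engine for making single predictions
var engine = context.Model.CreatePredictionEngine<ProductInfo, ProductPrediction>(model);

// how often are products 3 and 63 bought together?
var prediction = engine.Predict(new ProductInfo
{
    ProductID = 3,
    CombinedProductID = 63
});
System.Console.WriteLine($"Score for products 3 and 63: {prediction.Score}");
```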
Let’s do one more thing and predict the top-5 products often bought together with product number 3:
This code enumerates over every unique product ID (all 262,111 of them), creates a prediction of how well each product combines with product number 3, sorts the predictions by score in descending order, and takes the top 5 results.
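A sketch of that top-5 loop, assuming the product IDs run from 1 to 262,111 (the actual ID range in the file may differ), and reusing the prediction engine set up earlier:

```csharp
using System.Linq;

// score product 3 against every product ID and keep the best 5 matches
var top5 = Enumerable.Range(1, 262111)
    .Select(id => new
    {
        ProductID = id,
        Score = engine.Predict(new ProductInfo
        {
            ProductID = 3,
            CombinedProductID = id
        }).Score
    })
    .OrderByDescending(p => p.Score)
    .Take(5)
    .ToList();

foreach (var p in top5)
    System.Console.WriteLine($"  Product: {p.ProductID}, score: {p.Score}");
```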
With the code all done, it’s time to run the app and check the predictions.
After 20 epochs of training, my final RMSE on training is 128,509. That might seem a little high, but keep in mind that I used the CombinedProductID column as the label.
My model tries to predict product IDs and interprets the numerical difference between its predictions and the actual IDs as the error. So this error value has no meaning. It’s not a percentage or dollar amount or anything.
Looking at the results, we see that the model assigns a score of 0.37 to a combination of product 3 and product 63. So is this good or bad?
To find out, I have to check the top-5 products that go together with product 3. They are: 99, 8, 481, 18, and 303, with scores ranging from 0.51 to 0.63.
In this light, a score of 0.37 is not very good, so I can conclude that products 3 and 63 are not a good match together.
You can get the full source code for this article from GitHub.
So what do you think? Are you ready to start writing C# machine learning apps with ML.NET?