In the previous articles, we explored how recommender systems make recommendations to users using various methods.
Those recommendations were based on user ratings and similar products. Such systems focus on the similarity of products and on historical rating data, but not on an individual user’s preferences.
In this article, we discuss how to make recommendations based on a user’s previous preferences, using them to recommend something close to what the user wants.
Later on, we discuss the advantage of a hybrid recommender system, which makes recommendations based on both ratings and content features. This makes a recommender system more effective at recommending the best product to a user.
First, let’s have a look at personal recommender systems.
Personal recommender systems
These systems use additional information (content and context) to build more robust recommender systems.
There are two types of personal recommender systems:
A. Content-based filtering
Based on the user’s previous activities or detailed reviews, content-based filtering uses item features to suggest other products close to what the user wants. In other words, it makes recommendations based on the user’s previous preferences. The output of the system is a prediction of whether the user will like or dislike an item.
Implementing content-based filtering
We’ll look at content-based filtering for a movie recommender model in this post, using Python for the implementation. To find movie tags, the project uses the publicly available MovieLens dataset. This dataset contains a collection of tags for each movie, such as actors, genres, moods, events, and directors. The tags are created from user-contributed material, such as ratings and textual reviews. The dataset contains 178 tags for the film Toy Story, including the following:
- Pixar animation
- bullying
- fun
- unusual plot structure
- happy ending
- action
- space
- destiny
- 3d
- loneliness
We’ll use the term frequency-inverse document frequency (TF-IDF) encoding scheme as an example. It assigns a weight to a term based on its importance in a document. The more often a term appears in a document, the more weight it holds, but that weight is scaled down for terms that appear in many documents. As a result, TF-IDF highlights terms that are relatively uncommon across the dataset but important to the document at hand.
The formula used to calculate the TF-IDF weight of term i in document j is:
w[i,j] = tf[i,j] * log(N / df[i])
Here tf[i,j] is the frequency of term i in document j, df[i] is the number of documents that contain term i, and N is the total number of documents in the dataset.
array([ 1., 0.35036753, 0.15428608, ..., 0.39104403, 0.46214058, 0.33219143])
Each element in the vector is the TF-IDF weight associated with one term in a document.
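As a minimal sketch of the formula above, the following computes TF-IDF vectors for a few tag lists using only the standard library. The movie names and tag lists are hypothetical and are not taken from MovieLens; the helper name `tfidf_vectors` is also an assumption made for illustration. Raw counts are used for the term frequency and no normalization is applied, so the values will not match the array shown above.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Return (vocabulary, vectors) using w[i,j] = tf[i,j] * log(N / df[i])."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in documents for term in set(doc))
    vocabulary = sorted(df)
    vectors = []
    for doc in documents:
        tf = Counter(doc)  # raw term counts within this document
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocabulary])
    return vocabulary, vectors

# Hypothetical tag lists, for illustration only.
movie_tags = {
    "Movie A": ["pixar animation", "fun", "happy ending"],
    "Movie B": ["pixar animation", "fun", "space"],
    "Movie C": ["action", "space", "destiny"],
}
vocabulary, vectors = tfidf_vectors(list(movie_tags.values()))
```

In this toy data, "happy ending" appears in only one of the three movies, so it receives a higher weight for Movie A than "pixar animation", which appears in two.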
Recommending content
Recommending content involves predicting whether a user would like a piece of content, buy an item, or watch a movie.
There is a wealth of methods and literature available on recommender systems. Popular techniques include:
- Similarity-based methods
- One-class SVMs
- Matrix factorization
- Supervised learning
- Deep learning
Cosine similarity
We’ll use a simple similarity-based approach called cosine similarity since it’s simple to understand and demonstrates the basic principle of making recommendations well.
For the examples, I’ll use Python and the NumPy numerical library, where x and y are two feature vectors describing two documents, built as in the TF-IDF step above:
    import numpy as np
    x = np.array([2, 0, 1])
    y = np.array([2, 0, 1])
Vectors have a magnitude and a direction, so we can measure the angle between two vectors. The cosine of this angle, computed as follows, is a common similarity measure in data science:
cos(x, y) = dot(x, y) / (|x| * |y|)
When the vectors are parallel (pointing in the same direction), this value equals 1, and when the vectors are orthogonal, it equals 0. Vectors pointing in a similar direction are therefore considered more similar than orthogonal vectors.
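The cosine formula can be sketched in NumPy as follows. The function name and the handling of all-zero vectors (returning 0.0) are assumptions made for this example, not something the formula itself prescribes.

```python
import numpy as np

def cosine_similarity(x, y):
    """cos(x, y) = dot(x, y) / (|x| * |y|); returns 0.0 if either vector is all zeros."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    if denom == 0.0:
        return 0.0  # assumption: a zero vector is treated as similar to nothing
    return float(np.dot(x, y) / denom)

same_direction = cosine_similarity([2, 0, 1], [2, 0, 1])  # parallel vectors
orthogonal = cosine_similarity([2, 0, 1], [0, 3, 0])      # no shared features
```

Here `same_direction` comes out at 1 (up to floating-point rounding) and `orthogonal` at 0, matching the two boundary cases described above.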
We can now see how this is useful to us: Toy Story and Monsters, Inc. have a cosine similarity of 0.74, which matches our expectation of a high degree of similarity between these two films. Toy Story and Terminator 2 have a cosine similarity of 0.28, which is significantly lower, as expected. Using the cosine similarity, we can now recommend movies based on what a user has already watched or rated: we suggest the movies that are most similar to those the user has already rated highly.
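A rough sketch of this item-to-item step: given feature vectors for all movies, rank the other movies by cosine similarity to one that the user liked. The titles and three-dimensional vectors below are hypothetical stand-ins for real TF-IDF vectors, and `most_similar` is a name chosen for this example.

```python
import numpy as np

def most_similar(item_vectors, liked_title, top_n=2):
    """Rank all other items by cosine similarity to one liked item."""
    titles = list(item_vectors)
    matrix = np.array([item_vectors[t] for t in titles], dtype=float)
    # Normalize rows so that a dot product equals the cosine similarity.
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    unit = matrix / np.where(norms == 0, 1.0, norms)
    scores = unit @ unit[titles.index(liked_title)]
    ranked = sorted(zip(titles, scores), key=lambda pair: pair[1], reverse=True)
    return [(t, float(s)) for t, s in ranked if t != liked_title][:top_n]

# Hypothetical feature vectors, for illustration only.
toy_vectors = {
    "Animated comedy 1": [0.9, 0.1, 0.0],
    "Animated comedy 2": [0.8, 0.3, 0.1],
    "Sci-fi action": [0.1, 0.2, 0.9],
}
similar = most_similar(toy_vectors, "Animated comedy 1")
```

With these toy vectors, the second animated comedy ranks well above the sci-fi title.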
Generating user preference profiles
Instead of recommending movies based on particular movies that a user has already seen, we can try to build a profile of the user’s tastes.
This gives us a broad picture of the user’s tastes and lets us suggest content based on their behaviour over time, without outliers skewing the results.
Let’s look at the first user in the dataset. The films this user rated, on a scale of 1 (disliked) to 5 (liked), are:
# | title | rating |
0 | Braveheart (1995) | 1 |
1 | Basketball Diaries, The (1995) | 4.5 |
2 | Godfather, The (1972) | 5 |
3 | Godfather: Part II, The (1974) | 5 |
4 | Dead Poets Society (1989) | 5 |
5 | Breakfast Club, The (1985) | 4 |
6 | Sixth Sense, The (1999) | 4.5 |
7 | Ferris Bueller’s Day Off (1986) | 5 |
8 | Fight Club (1999) | 4 |
9 | Memento (2000) | 4 |
10 | Donnie Darko (2001) | 5 |
11 | Igby Goes Down (2002) | 5 |
12 | Batman Begins (2005) | 4 |
13 | Superbad (2007) | 3.5 |
14 | Dark Knight, The (2008) | 4 |
15 | Iron Man (2008) | 5 |
16 | Star Trek (2009) | 5 |
17 | Avengers (2010) | 4 |
18 | Sherlock Holmes (2009) | 5 |
19 | Focus (2010) | 3 |
20 | Real Steel (2012) | 5 |
How do we build a profile of this user’s preferences?
A preference profile can be generated in several ways. For simplicity, I’ll use the mean of the TF-IDF vectors of the movies the user rated, weighted by the user’s ratings. This weighted mean serves as the user’s preference profile.
To find similar content, all we have to do now is compute the cosine similarity between the user profile vector and each movie’s content vector. We can then suggest the movies that are most similar.
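The profile-based approach can be sketched as below: the profile is the rating-weighted mean of the vectors of rated movies, and unseen movies are ranked by cosine similarity to it. The titles, vectors, ratings, and function names are all hypothetical, chosen only to keep the example small; real input would be the TF-IDF vectors and the user’s actual ratings.

```python
import numpy as np

def build_profile(item_vectors, ratings):
    """Rating-weighted mean of the feature vectors of the movies a user rated."""
    titles = list(ratings)
    weights = np.array([ratings[t] for t in titles], dtype=float)
    vectors = np.array([item_vectors[t] for t in titles], dtype=float)
    return np.average(vectors, axis=0, weights=weights)

def recommend(item_vectors, ratings, top_n=3):
    """Rank unseen movies by cosine similarity to the user's profile."""
    profile = build_profile(item_vectors, ratings)
    scores = {}
    for title, vec in item_vectors.items():
        if title in ratings:
            continue  # skip movies the user has already rated
        vec = np.asarray(vec, dtype=float)
        denom = np.linalg.norm(profile) * np.linalg.norm(vec)
        scores[title] = float(profile @ vec / denom) if denom else 0.0
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical feature vectors and ratings, for illustration only.
item_vectors = {
    "Crime drama 1": [0.9, 0.1, 0.0],
    "Crime drama 2": [0.8, 0.2, 0.1],
    "Space opera": [0.0, 0.2, 0.9],
    "Crime drama 3": [0.7, 0.3, 0.0],
    "Space adventure": [0.1, 0.1, 0.9],
}
ratings = {"Crime drama 1": 5, "Crime drama 2": 4.5, "Space opera": 1}
recommendations = recommend(item_vectors, ratings)
```

Because the low-rated space opera contributes little weight to the profile, the unseen crime drama is ranked ahead of the unseen space adventure.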
User #1’s top recommendations are as follows:
- The Shawshank Redemption
- Logan
- Stand by Me
- American Beauty
- 11.22.63
- City of God
- The Usual Suspects
- Goodfellas
Based on the user’s previous movie ratings, these appear to be good suggestions for this user.
B. Context-aware systems
Context can be viewed in two ways: representational and interactional. Fully observable factors are easier to handle than those that are not, and static context is easier to manage than dynamic context. The representational view tends to correspond to static variables, and the interactional view to dynamic ones. A figure in the source paper illustrates three different ways of incorporating context into a conventional recommender system model.
Hybrid recommender systems
Hybrid recommender systems combine collaborative filtering and content-based models to create models that use both ratings and content features. Hybrid recommender systems tend to be more reliable than collaborative filtering or content-based models on their own. They can better handle the cold-start problem because they can use user or item metadata to predict ratings even when a user or item has few or no ratings. The tutorials that follow cover hybrid recommender systems.
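One simple way to combine the two signals is a weighted blend that falls back to the content-based score when no ratings exist. This is only a sketch of one possible hybrid, not the method used in the tutorials that follow; the function name, the assumption that both scores are on the same scale, and the smoothing constant `k` are all hypothetical.

```python
def hybrid_score(content_score, collab_score, n_ratings, k=5.0):
    """Blend two scores, trusting the collaborative one more as ratings accumulate."""
    if collab_score is None or n_ratings == 0:
        return content_score  # cold start: only content features are available
    alpha = n_ratings / (n_ratings + k)  # weight placed on the collaborative score
    return alpha * collab_score + (1.0 - alpha) * content_score

cold_start = hybrid_score(content_score=0.8, collab_score=None, n_ratings=0)
blended = hybrid_score(content_score=0.2, collab_score=0.9, n_ratings=5)
```

With `k = 5` and five ratings, both scores are weighted equally; with many more ratings, the result approaches the collaborative score.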
Conclusion
Here is a recap of what we have learned about recommender systems in this article:
- Based on the user’s previous activities or detailed reviews, content-based filtering uses item features to suggest other products.
- Context can be viewed in two ways: representational and interactional. Fully observable factors are easier to handle than those that are not, and static context is easier to manage than dynamic context. The representational view tends to correspond to static variables, and the interactional view to dynamic ones.
- Hybrid recommender systems combine collaborative filtering and content-based models to create models that use both ratings and content features.
- Hybrid recommender systems tend to be more reliable than collaborative filtering or content-based models on their own. They can use user or item metadata to predict user ratings.