An introduction to recommendation systems.
We're launching a four-part series on recommendation systems, starting with the basics, terms to expect, and examples of recommendation systems we use everyday.
Today’s newsletter is the first installment of a four part series covering recommendation systems. Over the course of the next month, we’re going to talk about some of the recommendation systems we interact with everyday, the technical aspect of building a recommendation system, and what we see in the future for recommendation systems.
what's a recommendation system?
A recommendation system (also called recommendation engine) is mathematical process that is analyzes data to make predictions about items for users. These systems are most simply thought of as “I am user U and the system recommends item I based on some criteria”. These systems are everywhere, embedded into tech users engage with everyday.
We’ll cover more in depth the technical aspects of what it takes to build a recommendation system in future articles, but let’s walk through some background to get started.
the basics of a recommendation system
Recommendation systems typically have users and items.
The two main approaches to building a recommendation model are content-based and collaborative filtering
Content-based algorithms use features based on the item to recommend other items similar to what the user likes.
Imagine you’re looking to buy a water bottle. Content based algorithms will use ratings or other feedback you’ve given to other water bottles or similar items to recommend which water bottle you may want to buy.
Collaborative filtering algorithms use information about a given user and information on similar users simultaneously with information on a given item to make recommendations
Imagine you’re looking to buy a water bottle. A collaborative filtering algorithm will take the approach of recommending water bottles that users similar to you have bought (based on similar past rating history or items viewed).
Some datasets have been popularized when building recommendation systems. You’ll see these everywhere.
Some recommendation systems are better than others at dealing with the cold start problem, which is how “good” predictions are when a new user enters the system (i.e. someone signs up for a new platform) or when a new item enters the system (i.e. a new product is launched on a website).
Interested in a technical walkthrough for beginners? Google Learning has a great course on recommendation systems here. Otherwise, the next few posts of Day to Data will cover technical elements as well. Stay tuned!
twitter’s recommendation system
In a blog post just two days go, Twitter released its recommendation system to the world, sharing the open-source code that selects which tweets show up on your For You timeline. The platform uses a variety of techniques, including clustering similar users and engagement from followers, to source the tweets you see on your page. The source code has been shared on Gitlab here and here.
amazon’s product recommendation
As one of the OG’s in the space, Amazon has been building out their product recommendation engine for their massive dataset (over 300 million active users and millions of products). In 2003, researchers wrote a paper discussing Amazon.com recommendations using item-to-item collaborative filtering that explores the foundations of their powerful recommender engines. As of 2019, over 35% of Amazon sales came from cross-sales, or some form of item-to-item recommendation. The value recommendations has brought to Amazon is incredible.
netflix’s top picks
An estimated 75% of what people watch on Netflix comes from some sort of recommendation. 75%!! That’s an incredible number. Back in 2006, Netflix stressed the importance of this problem with the announcement of the Netflix Prize, a competition for anyone to try to beat the accuracy of their Cinematch system by at least 10%.
A year into the competition, the Korbell team won the first Progress Prize with an 8.43% improvement. They reported more than 2000 hours of work in order to come up with the final combination of 107 algorithms that gave them this prize. And, they gave us the source code. — by Xavier Amatriain and Justin Basilico (Personalization Science and Engineering) written on Medium
Netflix’s personalization systems have evolved tremendously since 2006, but the Netflix Prize and winning group is a testament to the value that can come from a company acknowledging that maybe someone else out there has a solution that can trump their own.
spotify’s discover weekly
As I’ve mentioned in posts before, Spotify’s bread and butter over the years has been their ability to generate personalized playlists for users that contain music that fits our tastes so well. Recommendation at Spotify continues to boom with the recent launch of their AI DJ.
hinge’s most compatible
I’ll let the article below speak for yourself — it’s one I love (pun intended).
so what’s to come?
Over the next month, I’ll continue to walk through topics mentioned today. We’ll go over the technical elements of content based and collaborative filtering algorithms and continue to discuss societal implications of the heavy use of recommendation systems by millions of users every day.
I’m excited to continue unpacking this topic and eager to hear thoughts from other readers. Thanks for reading! See you next week.