An introduction to recommendation systems.

We're launching a four-part series on recommendation systems, starting with the basics, terms to expect, and examples of recommendation systems we use every day.

Apr 03, 2023

Today’s newsletter is the first installment of a four-part series covering recommendation systems. Over the course of the next month, we’re going to talk about some of the recommendation systems we interact with every day, the technical side of building a recommendation system, and what we see in the future for recommendation systems.

what's a recommendation system?

A recommendation system (also called a recommendation engine) is a mathematical process that analyzes data to predict which items a user will want. These systems are most simply thought of as “I am user U and the system recommends item I based on some criteria.” They are everywhere, embedded in the tech users engage with every day.

We’ll cover the technical aspects of building a recommendation system in more depth in future articles, but let’s walk through some background to get started.

the basics of a recommendation system

Recommendation systems typically have users and items.
The two main approaches to building a recommendation model are content-based filtering and collaborative filtering.
- Content-based algorithms use features based on the item to recommend other items similar to what the user likes.
  - Imagine you’re looking to buy a water bottle. A content-based algorithm will take the ratings or other feedback you’ve given to water bottles and similar items, look at the features of those items, and recommend the water bottle whose features best match what you liked.
- Collaborative filtering algorithms combine information about a given user, information about similar users, and information about a given item to make recommendations.
  - Imagine you’re looking to buy a water bottle. A collaborative filtering algorithm will take the approach of recommending water bottles that users similar to you have bought (based on similar past rating history or items viewed).
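To make the two approaches concrete, here is a minimal sketch of both on the water bottle example. Everything in it is an assumption made for illustration: the item names, the three features, the ratings, and the helper functions `content_based` and `collaborative` are all hypothetical, and real systems use far richer models than cosine similarity on a handful of numbers.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical item features: [insulated, steel, has_straw]
ITEM_FEATURES = {
    "steel_insulated": [1, 1, 0],
    "steel_straw": [1, 1, 1],
    "plastic_straw": [0, 0, 1],
    "plastic_basic": [0, 0, 0],
}

# Hypothetical 1-5 star ratings: user -> {item: rating}
RATINGS = {
    "you": {"steel_insulated": 5, "plastic_basic": 1},
    "ana": {"steel_insulated": 5, "plastic_basic": 1, "plastic_straw": 4},
    "ben": {"steel_insulated": 1, "plastic_basic": 5, "steel_straw": 5},
}

def centered(ratings, midpoint=3):
    """Shift ratings so likes are positive and dislikes are negative."""
    return {item: r - midpoint for item, r in ratings.items()}

def content_based(user, ratings, features):
    """Rank unrated items by how well their features match the user's taste."""
    prefs = centered(ratings[user])
    dims = len(next(iter(features.values())))
    profile = [0.0] * dims
    # The taste profile is a rating-weighted sum of the rated items' features.
    for item, weight in prefs.items():
        for i, value in enumerate(features[item]):
            profile[i] += weight * value
    scores = {
        item: cosine(profile, vec)
        for item, vec in features.items()
        if item not in prefs
    }
    return sorted(scores, key=scores.get, reverse=True)

def user_similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    shared = sorted(set(a) & set(b))
    if not shared:
        return 0.0
    return cosine([a[i] for i in shared], [b[i] for i in shared])

def collaborative(user, ratings):
    """Rank unrated items by what similar users thought of them."""
    mine = centered(ratings[user])
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        theirs = centered(other_ratings)
        sim = user_similarity(mine, theirs)
        for item, value in theirs.items():
            if item in mine:
                continue
            totals[item] = totals.get(item, 0.0) + sim * value
            weights[item] = weights.get(item, 0.0) + abs(sim)
    scores = {i: totals[i] / weights[i] for i in totals if weights[i] > 0}
    return sorted(scores, key=scores.get, reverse=True)

print(content_based("you", RATINGS, ITEM_FEATURES))  # ['steel_straw', 'plastic_straw']
print(collaborative("you", RATINGS))                 # ['plastic_straw', 'steel_straw']
```

In this toy data the two approaches disagree. The content-based ranking only looks at item features, so it picks the other steel bottle. The collaborative ranking never looks at features at all: it notices that ana rates things the way you do and ben rates them the opposite way, so it trusts ana's pick and discounts ben's.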
A few datasets have become popular for building recommendation systems. You’ll see these everywhere.
- The MovieLens dataset consists of millions of ratings and is a standard benchmark when building new recommendation models.
- The Netflix Prize dataset (read more below) consists of over 100 million ratings from 1998 to 2005.
- The Yahoo Music dataset consists of over 10 million artist ratings.
Some recommendation systems are better than others at dealing with the cold start problem: how “good” predictions are when a new user enters the system (e.g. someone signs up for a platform) or when a new item enters the system (e.g. a new product is launched on a website).
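One common way to soften the new-user side of cold start is to fall back to overall popularity until the system has some history to work with. The sketch below assumes a made-up ratings dictionary and a hypothetical `recommend` function; the "warm user" branch is only a placeholder for whatever personalized model a real system would call.

```python
from collections import Counter

# Hypothetical ratings: user -> {item: rating}
RATINGS = {
    "ana": {"bottle_a": 5, "bottle_b": 3},
    "ben": {"bottle_a": 4, "bottle_c": 2},
    "cy": {"bottle_a": 2, "bottle_b": 5},
}

def recommend(user, ratings, k=2):
    """Return up to k items, using popularity when the user has no history."""
    history = ratings.get(user, {})
    # Popularity here is simply how many users have rated each item.
    counts = Counter(item for rated in ratings.values() for item in rated)
    if not history:
        # Cold start: we know nothing about this user yet.
        return [item for item, _ in counts.most_common(k)]
    # Placeholder for a personalized model: popular items the user hasn't rated.
    unseen = [item for item, _ in counts.most_common() if item not in history]
    return unseen[:k]

print(recommend("brand_new_user", RATINGS))  # ['bottle_a', 'bottle_b']
print(recommend("ben", RATINGS))             # ['bottle_b']
```

Notice that this does nothing for the new-item side of the problem: a product nobody has rated never shows up in `counts`, so it can never be recommended here. That is one reason content-based features are often used for brand-new items.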

Interested in a technical walkthrough for beginners? Google Learning has a great course on recommendation systems here. Otherwise, the next few posts of Day to Data will cover technical elements as well. Stay tuned!

twitter’s recommendation system

In a blog post just two days ago, Twitter released its recommendation system to the world, sharing the open-source code that selects which tweets show up on your For You timeline. The platform uses a variety of techniques, including clustering similar users and engagement from followers, to source the tweets you see on your page. The source code has been shared on GitHub here and here.

Aakash Gupta 🚀 Product Growth Guy @aakashg0

Twitter revealed its algorithm to the world. But what does it mean for you? I spent the evening analyzing it. Here’s what you need to know:

amazon’s product recommendation

As one of the OGs in the space, Amazon has been building out their product recommendation engine for their massive dataset (over 300 million active users and millions of products). In 2003, researchers wrote a paper discussing Amazon.com recommendations using item-to-item collaborative filtering that explores the foundations of their powerful recommender engines. As of 2019, over 35% of Amazon sales came from cross-sales, or some form of item-to-item recommendation. The value recommendations have brought to Amazon is incredible.

An example of a recommendation system surfacing predictions on the front page of a user’s Amazon.com.

Recommendations surfaced when a user is on an item page and may be looking for similar items.
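For a feel of what item-to-item collaborative filtering means, here is a toy sketch. It is not Amazon's system: the purchase data and the function names `item_similarities` and `similar_items` are hypothetical. The idea it illustrates is that two items count as similar when the same customers tend to buy both, measured here with cosine similarity over each item's set of buyers.

```python
import math
from collections import defaultdict

# Hypothetical purchase history: user -> set of items bought
PURCHASES = {
    "u1": {"water_bottle", "bike_helmet", "bike_lock"},
    "u2": {"water_bottle", "bike_helmet"},
    "u3": {"water_bottle", "yoga_mat"},
    "u4": {"bike_helmet", "bike_lock"},
}

def item_similarities(purchases):
    """Build an item -> {other_item: similarity} table from co-purchases."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    sims = defaultdict(dict)
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            overlap = len(buyers[a] & buyers[b])
            if overlap:
                # Cosine similarity between the two items' binary buyer vectors.
                sims[a][b] = overlap / math.sqrt(len(buyers[a]) * len(buyers[b]))
    return sims

def similar_items(item, sims, k=2):
    """Return the k items most similar to the given one (ties broken by name)."""
    ranked = sorted(sims[item].items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:k]]

table = item_similarities(PURCHASES)
print(similar_items("bike_lock", table))     # ['bike_helmet', 'water_bottle']
print(similar_items("water_bottle", table))  # ['bike_helmet', 'yoga_mat']
```

The appeal of this design is that the similarity table can be computed ahead of time, so serving "customers who bought this also bought" on an item page is just a lookup.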

netflix’s top picks

An estimated 75% of what people watch on Netflix comes from some sort of recommendation. 75%!! That’s an incredible number. Back in 2006, Netflix stressed the importance of this problem with the announcement of the Netflix Prize, a competition for anyone to try to beat the accuracy of their Cinematch system by at least 10%.

A year into the competition, the Korbell team won the first Progress Prize with an 8.43% improvement. They reported more than 2000 hours of work in order to come up with the final combination of 107 algorithms that gave them this prize. And, they gave us the source code. — by Xavier Amatriain and Justin Basilico (Personalization Science and Engineering) written on Medium
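The Netflix Prize scored entries by root mean squared error (RMSE) on held-out ratings, so a "10% improvement" means an RMSE 10% lower than Cinematch's. Here is a small sketch of that arithmetic. The ratings and predictions are invented for illustration, and `rmse` and `improvement` are hypothetical helper names.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def improvement(baseline_rmse, new_rmse):
    """Percentage reduction in RMSE relative to the baseline."""
    return (baseline_rmse - new_rmse) / baseline_rmse * 100

# Invented example: four true ratings and two sets of predictions
actual = [4, 3, 5, 2]
baseline = [3, 3, 4, 3]        # stand-in for the existing system
candidate = [3.5, 3, 4.5, 2.5]  # stand-in for a challenger

baseline_error = rmse(baseline, actual)
candidate_error = rmse(candidate, actual)
print(round(baseline_error, 3))                                # 0.866
print(round(candidate_error, 3))                               # 0.433
print(round(improvement(baseline_error, candidate_error), 1))  # 50.0
```

A 50% jump only happens on toy numbers like these. On the real dataset, the quote above shows that even 8.43% took a year and a blend of 107 algorithms.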

Netflix’s personalization systems have evolved tremendously since 2006, but the Netflix Prize and its winning team are a testament to the value that can come from a company acknowledging that maybe someone else out there has a solution that can trump their own.

A classic example of collaborative filtering is Netflix’s Top Picks.

spotify’s discover weekly

As I’ve mentioned in posts before, Spotify’s bread and butter over the years has been their ability to generate personalized playlists for users that contain music that fits our tastes so well. Recommendation at Spotify continues to boom with the recent launch of their AI DJ.

day to data

In tune with the data - how Spotify uses data to know your music tastes better than you do.

I spend a significant amount of time on Spotify – curating playlists, discovering new music, and seeing what my friends are listening to. Spotify has done a great job of engaging their users by surfacing data in an interesting way, whether it be an ever-expanding set of personalized playlists at our fingertips or my favorite holiday of the year Spotify …

2 years ago · gaby lorenzi

hinge’s most compatible

I’ll let the article below speak for itself. It’s one I love (pun intended).

day to data

the algorithm for finding love.

I took my first algorithms class during my junior year and learned of an algorithm that has since become one of my favorite pieces of knowledge acquired during college. This algorithm has a beautiful story, including a Nobel Prize, medical residents, and the dating app, Hinge. This algorithm is known by the name of its two creators…

2 years ago · gaby lorenzi

so what’s to come?

Over the next month, I’ll continue to walk through the topics mentioned today. We’ll go over the technical elements of content-based and collaborative filtering algorithms and continue to discuss the societal implications of the heavy use of recommendation systems by millions of users every day.

I’m excited to continue unpacking this topic and eager to hear thoughts from other readers. Thanks for reading! See you next week.

Day to Data