AMT Lab @ CMU


Streaming Service Algorithms are Biased, Directly Affecting Content Development

Written by Sandra Martinez

Despite the Covid-19 pandemic, 2020 was a big year for the entertainment industry. Many entertainment corporations reevaluated their approach to content distribution and accessibility for viewers. However, this shift did not depend on an emerging technology: streaming had started a little over a decade earlier, when Netflix launched its streaming service in 2007. At its onset, the platform contained only 1,000 licensed titles. Now, the company is known for its wide range of original content and ever-growing library.

The Power of Streaming

Following Netflix’s early success, major networks realized the potential of this technology, and each attempted to create its own platform for distributing its content. This pivot to streaming became an advantage at the onset of the pandemic. While networks still faced the challenge of moving movie theater releases to streaming, those already accustomed to streaming distribution were the least affected. Overall, the pivot was largely beneficial for these organizations. The pandemic generated a notable increase in streaming platform subscriptions, as well as in the number of different services a household subscribed to.

A visualization of the growth in streaming service subscriptions by the millions. Source: Statista

Streaming services existed before the pandemic accelerated the shift to at-home digital entertainment. The prominent services with prior experience in the industry found that creating original content was the key tactic for increasing their subscriber base. Streaming giants Netflix and Amazon Prime Video exemplify this, spending $16 billion and $6 billion respectively on original content in 2020 alone. However, this spending is not based on haphazard guessing or on creating pilots on a whim, as networks used to do for primetime TV. These corporations use vast amounts of user data to determine what new content to make and who its target audience is.

Streaming Service Data Acquisition

Netflix prides itself on being a longstanding data-driven company. In 2006, it held a competition for developers to improve the accuracy of its recommendation algorithm, offering a prize of one million dollars as an incentive. This shows that Netflix recognizes data acquisition as imperative to industry success. A 2012 statement by Jonathan Friedland, Netflix’s former communications officer, reinforced the point. He noted that the organization’s data collection made it possible to determine the size of an audience from viewing habits with a high degree of accuracy, and that Netflix is continually improving at selecting content that garners high viewer engagement.

Netflix is open about its data acquisition, and a basic breakdown of how its algorithm works is available to the public on its website. First, user interaction is gathered in the form of viewing history and ratings. This information is compared with the interactions of other users who have similar interests, and with the relationships between the titles that are viewed. Next, the algorithm analyzes the day and time of activity, how long the activity lasts, and which device is used to access the platform. Once this information (and some additional data) is collected, it is synthesized into the list of recommendations, with titles ranked within each row, on the platform homepage.
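To make the "users with similar interests" step concrete, here is a minimal sketch of user-based collaborative filtering, one common textbook approach to this kind of recommendation. It is not Netflix's actual system: the users, titles, ratings, and function names are all invented for illustration, and signals such as time of day, viewing duration, and device are left out.

```python
from math import sqrt

# Hypothetical ratings (1-5), keyed by user and then by title. All names are invented.
ratings = {
    "ana":   {"Drama A": 5, "Thriller B": 4, "Comedy C": 1},
    "ben":   {"Drama A": 4, "Thriller B": 5, "Documentary D": 5},
    "chris": {"Comedy C": 5, "Documentary D": 2, "Drama A": 1},
}

def cosine_similarity(a, b):
    """Similarity of two users, computed over the titles both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(a[t] ** 2 for t in shared))
    norm_b = sqrt(sum(b[t] ** 2 for t in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings):
    """Rank titles the user has not rated by similarity-weighted average rating."""
    seen = ratings[user]
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(seen, other_ratings)
        if sim <= 0:
            continue
        for title, score in other_ratings.items():
            if title in seen:
                continue
            totals[title] = totals.get(title, 0.0) + sim * score
            weights[title] = weights.get(title, 0.0) + sim
    predicted = {title: totals[title] / weights[title] for title in totals}
    # Highest predicted rating first, like the left-most title in a homepage row.
    return sorted(predicted.items(), key=lambda item: item[1], reverse=True)

recommendations = recommend("ana", ratings)
```

In this toy data, "ana" rates titles much like "ben" does, so ben's high rating for the one title ana has not seen counts for more than chris's low rating of it.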

This algorithm helps explain why 80% of the TV shows watched on Netflix are discovered through the recommendations page. These results show how this machine learning technology has disrupted the industry, and how data science and analytics directly affect content consumption. Despite the positive impact the algorithm has had in the entertainment sphere, its limitations should be critiqued. Although Netflix and other major streaming services have not publicly addressed it, there is bias rooted in these algorithms.

Algorithm Bias

Even though Netflix is open about how it uses machine learning and algorithms to generate recommendations, those algorithms are not without fault. The nature of the algorithm that determines recommendations creates a feedback loop. For example, if a show is highly recommended to a certain user, that user is more likely to watch it and respond positively, which in turn might lead to it being recommended to broader audiences, influencing their content choices. In addition, Netflix is known for promoting its own content over indie or lesser-known movies and TV shows. This is why, unsurprisingly, Netflix’s daily top movies and shows so often bear the distinctive N in the top corner of the thumbnail.
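The feedback loop described above can be illustrated with a small, deliberately simplified simulation. This is a sketch under stated assumptions, not a model of any real platform: the titles, starting view counts, and the `boost` and `organic` parameters are all hypothetical.

```python
def simulate_feedback_loop(initial_views, rounds=10, slots=2, boost=100, organic=10):
    """Simulate a popularity-driven recommender.

    Each round, the `slots` most-viewed titles are recommended and gain `boost`
    views. Every title also gains `organic` views from people who find it on
    their own. All numbers are invented for illustration.
    """
    views = dict(initial_views)
    for _ in range(rounds):
        # Decide what to recommend from the view counts at the start of the round.
        recommended = sorted(views, key=views.get, reverse=True)[:slots]
        for title in views:
            views[title] += organic
            if title in recommended:
                views[title] += boost
    return views

# Three titles that start out almost equally popular.
start = {"Promoted Original": 120, "Studio Hit": 110, "Festival Winner": 100}
final = simulate_feedback_loop(start)
print(final)
```

Under these assumptions, a small initial lead decides which titles fill the recommendation slots, and the title that starts just behind never catches up, because being recommended is itself what generates most of the views.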

The way algorithm bias can deprive a film of recognition is exemplified by Chung Mong-hong's “A Sun,” a Taiwanese crime epic that was shown at the Toronto International Film Festival, won the most prestigious movie award in its country of origin, and received a simultaneous worldwide release, yet still managed to escape the attention of American critics. Perhaps surprisingly, this award-winning film can be found on Netflix. But despite its accolades, it is unlikely to be recommended to users because it lacked promotion and thus never gained the popularity needed to enter the algorithm-generated feedback loop. This raises a question: how much good content are audiences missing out on by relying on these AI-generated recommendations? David Ehrlich, an IndieWire journalist, writes, “Movies have never been more accessible, and they’ve never been harder to find.”

The problems with algorithm-based curation have already been raised within the entertainment industry. Distinguished director Martin Scorsese explains that content curation on streaming platforms is based on the viewing habits of an individual or a collective group of people. He also argues that when human curation is omitted from the recommendation system, movies become oversimplified through their AI-driven placement into a genre or broad category. This contrasts directly with how movies and shows were recommended before algorithms: a person recommending something they loved or that inspired them, that is, recommending it for the art behind it. Algorithm bias has made it more difficult for niche and subject-heavy shows to pass platform filters, which in turn has given rise to what is known as “Ambient TV.”

Ambient TV is content that aims to provoke few or no thoughts or feelings in the viewer, especially disruptive ones. It is also meant to serve as background noise: a viewer who becomes distracted for a period of time can keep watching without having missed anything, because the show is so predictable. The BBC discusses how Emily in Paris is a prime example of Ambient TV. Although the show was highly criticized, it stayed near the top of Netflix’s most-watched list for a few weeks after its debut. The Netflix algorithm therefore registers the show as a success, even though its popularity stems largely from its predictable storylines.

Another form of algorithm bias is reliance on the past. Algorithms rely on past trends to generate future predictions. Superhero movies are a good example. Today there is a wealth of information on which to base assumptions about how a new superhero film will perform, although those assumptions may not always hold. Before superhero movies rose in popularity in the 2000s, however, they were not considered guaranteed blockbusters. If content consumption had been controlled by algorithms at that time, the software would have relied heavily on this record of limited success and would likely not have recommended this style of movie. The public might then never have discovered the excitement it now feels about superhero movies. If content consumption relies solely on algorithmic suggestions, there is a chance that people will not discover the groundbreaking exploration of a new film genre.
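As a rough illustration of this reliance on the past, the sketch below predicts a new title's performance purely from the historical average of its genre. The genres, scores, and function names are invented, and real prediction systems are far more elaborate. The point is only that a genre with a weak track record is filtered out before audiences ever see it.

```python
from statistics import mean

# Hypothetical past performance scores (0-100) by genre. All values are invented.
history = {
    "romantic comedy": [72, 68, 75],
    "crime drama": [80, 77],
    "superhero": [35, 40],  # a weak early track record
}

def predict_score(genre, history, default=50.0):
    """Predict a new title's score as the average of its genre's past scores."""
    past = history.get(genre)
    return mean(past) if past else default

def pick_recommendations(candidates, history, slots=2):
    """Recommend the `slots` candidates with the highest predicted scores."""
    ranked = sorted(
        candidates,
        key=lambda candidate: predict_score(candidate[1], history),
        reverse=True,
    )
    return [title for title, _genre in ranked[:slots]]

candidates = [
    ("New Rom-Com", "romantic comedy"),
    ("New Crime Series", "crime drama"),
    ("New Superhero Film", "superhero"),
]
picks = pick_recommendations(candidates, history)
print(picks)
```

Because the prediction looks only backward, the new superhero film is left out of the recommendations regardless of its own merits, so it never gets the exposure that could change its genre's record.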

Impact Beyond Streaming

Although major studios do not publicly acknowledge it, their reliance on algorithms and data mining techniques to drive new content creation has increased over the last few years. Two of these data-driven tools are Cinelytic and ScriptBook. Cinelytic uses data analysis to determine the optimal outcome for a film by assessing the most appealing lead actor, the best studio for distribution, and the countries in which the film is likely to make the largest impression and generate the most revenue. ScriptBook analyzes a film's script and predicts how likeable the characters will be, the film's estimated revenue (with up to 86% accuracy), and how audiences will feel while viewing. This reported accuracy helps explain why studios rely on the determinations of such software. However, while this approach is stable and profit-generating for studios, it reinforces the problems of relying on machine learning to create new movies: a dominating concern with revenue generation pushes the art and sentiment of film to the wayside.

Another impact of popular new releases, whether their popularity is driven by data, an algorithm, or a breakthrough in AI, is cultural. When a new show or movie becomes popular, it affects the surrounding society by generating conversation about the topics it broaches. This is reflected in the increase in searches for “big cats” after Tiger King was released, and in the steadily growing interest in chess that accompanied the popularity of The Queen's Gambit.

Visualization of the growth in internet usage of the term “big cats” following the release of Tiger King. Source: Pulsar

Visualization of the growth of global interest in chess following the release of The Queen’s Gambit. Source: Pulsar

What Comes Next?

It is increasingly clear that streaming services will continue to exert evolutionary change on the entertainment industry, so what can be done to combat the flaws of popular streaming algorithms?

First, users can diversify their streaming platforms by adding services such as Mubi, which shows only 30 movies at a time and is curated solely by humans. The goal of the service is to let its subscribers discover unique, acclaimed movies. Mubi has a wide range of films, from old classics to new discoveries. It also houses movies from all over the world, allowing for a culturally rich and diverse experience.

Another example, at the more mainstream end of the spectrum, is HBO Max. Although it is part of a large TV network, HBO wants to avoid Netflix’s path in content curation. It tries to create a more accurate and personalized home page for its users by directly involving humans in the process. While HBO uses common data acquisition methods to target certain shows to specific audiences, its algorithm-generated playlists must first go through an approval process carried out by human staff.

Regardless of the questions and problems posed by the algorithms that streaming services use, Netflix and other dominant platforms will continue to use them, and will continue to work on optimizing them to eliminate at least the most egregious biases. However, the results will be as flawed as the humans writing the code, and the average user will remain mostly unaware of whether their data is being used accurately. It is therefore up to the industry to hold itself accountable and to ensure that the sanctity of the art of TV and film prevails over the algorithm-based entertainment world.


Resources

Amatriain, Xavier. “Medium.” The Netflix Tech Blog, netflixtechblog.com/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429.

Baldwin, Roberto. “Netflix Gambles on Big Data to Become the HBO of Streaming.” Wired, Conde Nast, 3 June 2017, www.wired.com/2012/11/netflix-data-gamble/.

Chayka, Kyle. "'Emily in Paris' and the Rise of Ambient TV." The New Yorker, Condé Nast. November 16, 2020, https://www.newyorker.com/culture/cultural-comment/emily-in-paris-and-the-rise-of-ambient-tv.

“Cómo Funciona El Sistema De Recomendaciones De Netflix.” Centro De Ayuda, help.netflix.com/es/node/100639.

Ehrlich, David. “Buried on Netflix, Taiwanese Crime Epic 'A Sun' Demands Serious Oscar Consideration.” IndieWire, IndieWire, 16 Dec. 2020, www.indiewire.com/2020/12/consider-this-a-sun-oscar-taiwan-netflix-1234604603/.

Goldberg, Matt. “Martin Scorsese Advocates for Human Curation over Algorithms.” Collider, 17 Feb. 2021, collider.com/martin-scorsese-why-streaming-services-need-human-curation/.

Hak, Andrea. "How Netflix Shapes Mainstream Culture, Explained by Data." The Next Web, The Next Web B.V. December 1, 2020, https://thenextweb.com/news/how-netflix-shapes-mainstream-culture-explained-by-data?fbclid=IwAR1Xt2iWI2JSOR28Vogk922kvfW8x9dGK6SXCdsaR6ReeJOfhA1-84Pnxyg.

“Hulu Revenue and Usage Statistics (2021).” Business of Apps, 29 Mar. 2021, www.businessofapps.com/data/hulu-statistics/.

Klebnikov, Sergei. “Streaming Wars Continue: Here's How Much Netflix, Amazon, Disney+ And Their Rivals Are Spending On New Content.” Forbes, Forbes Magazine, 22 May 2020, www.forbes.com/sites/sergeiklebnikov/2020/05/22/streaming-wars-continue-hereshow-much-netflix-amazon-disney-and-their-rivals-are-spending-on-new-content/?sh=1c0fee37623b.

McFadden, Christopher. “The Fascinating History of Netflix.” Interesting Engineering, Interesting Engineering, 4 July 2020, interestingengineering.com/the-fascinating-history-of-netflix.

“Netflix Revenue and Usage Statistics (2021).” Business of Apps, 9 Mar. 2021, www.businessofapps.com/data/netflix-statistics/.

Plummer, Libby. “This Is How Netflix's Top-Secret Recommendation System Works.” WIRED UK, WIRED UK, 21 Aug. 2017, www.wired.co.uk/article/how-do-netflixs-algorithms-work-machine-learning-helps-to-predict-what-viewers-will-like.

Sabanoglu, Tugba. “U.S. Amazon Prime Subscribers 2019.” Statista, 1 Dec. 2020, www.statista.com/statistics/546894/number-of-amazon-prime-paying-members/#:~:text=Amazon%20Prime%20is%20constantly%20growing,e%2Dretail%20platform%20per%20year.

Taylor, Alex. "Are Streaming Algorithms Really Damaging Film?" BBC News, BBC, February 20, 2021, https://www.bbc.com/news/entertainment-arts-56085924.

Vincent, James. "Hollywood Is Quietly Using AI to Help Decide Which Movies to Make." The Verge. Vox Media, LLC. May 28, 2019, https://www.theverge.com/2019/5/28/18637135/hollywood-ai-film-decision-script-analysis-data-machine-learning.