What Will You Love?

Michael Barnwell   March 13, 2009


Netflix origami: extending the experience beyond the flick.

Being able to predict human behavior is a real talent that deserves praise and rich rewards. Netflix agrees, and since 2006 it has been holding an ongoing competition to improve the accuracy of its movie recommendations to members, promising one million dollars to the team that can deliver a 10% increase in the accuracy of its “world-class movie recommendation system,” Cinematch. This, of course, will require a complex algorithm and an equally complex judging standard involving something called the RMSE (root mean squared error) of a data set. Put simply, competitors are vying to predict how likely Customer Sarah is to give “Sleepless in Seattle” a 5-star rating.
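For the curious, here is roughly what an RMSE score looks like in practice. This is a minimal sketch, not Netflix's actual judging code: the function names and the star ratings are made up for illustration, and it assumes that "improvement" means how much lower a challenger's RMSE is than the baseline's.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def percent_improvement(baseline_rmse, new_rmse):
    """How much lower new_rmse is than baseline_rmse, as a percentage."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# Made-up star ratings, purely for illustration.
actual = [5, 3, 4, 1]        # what members really gave
baseline = [4, 4, 4, 3]      # hypothetical existing system
challenger = [5, 4, 4, 2]    # hypothetical competing team

baseline_score = rmse(baseline, actual)
challenger_score = rmse(challenger, actual)
print(f"baseline RMSE:   {baseline_score:.4f}")
print(f"challenger RMSE: {challenger_score:.4f}")
print(f"improvement:     {percent_improvement(baseline_score, challenger_score):.2f}%")
```

A lower RMSE means the predicted stars sit closer to the stars members actually gave, so the score rewards being close rather than being exactly right.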

Setting aside the quibble about the notion of an increase in accuracy (nearly accurate isn’t actually accurate, is it?), competitors are closing in on the prize. As of the end of January, team “BellKor in BigChaos” is within reach of the seven-figure prize, posting a 9.63% improvement. To give Netflix credit, Cinematch isn’t bad. After I submitted ratings on a couple hundred movies, including a 5-star rating on the Coen brothers’ movie “Fargo,” it told me that I’d also love their movies “Barton Fink” and “The Big Lebowski.” Amazing! It also told me I’d love several dozen other movies, some of which I certainly would (who wouldn’t love “Duck Soup” and “Vertigo”?), others of which I certainly would not, in diminishing degrees. Netflix is confident enough about its predictive abilities that it offers a site area labeled “Movies You’ll Love.” Come-hither marketing language aside, I’m confident enough about my own predictive abilities to say that it would be 100% more accurate to label it “Movies You Might Love.”

I’m all for a service that could weed out the junk from a storehouse of 100,000 titles, but I wonder if so much effort and a complex RMSE calculation are really needed. Wouldn’t it be simpler to rely on the more transparent workhorse trinity of taxonomy, tracking, and tagging? In short, if absolute accuracy is an impossible dream and being mostly on target is good enough, wouldn’t it make sense to just opt for a lo-fi method? This would require just marshalling the metadata (some of it inherent and filter-like, such as title, director, and actors), tracking the traffic (what does my history say that I’ve rented with the same director or actors?), and by all means incorporating the tagging, mine and yours (not only did I rent it, I liked it, and several thousand other people liked it, too). All in all, it would be a recommendation with a bit of the machine and a bit of the all-important human. This would quickly get me a reduced set of choices. I can take it from there. By going the lo-fi route, Netflix could redirect future prize money (not to disparage the brilliance of the competing teams) toward increasing the number of movies it allows users to stream, which, by the way, greatly reduces the risk and aggravation of making the wrong choice of movie upfront. Don’t like it? Stream another.
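To make the lo-fi idea concrete, here is a rough sketch of how taxonomy, tracking, and tagging might combine. Everything in it is an assumption for illustration: the `Movie` fields, the `lofi_recommend` function, the scoring rule (one point for a shared director, one per shared actor, ties broken by how many members liked the title), the partial cast lists, and the like counts are all invented and have nothing to do with how Cinematch works.

```python
from dataclasses import dataclass


@dataclass
class Movie:
    title: str
    director: str      # taxonomy: inherent metadata
    actors: set        # taxonomy: inherent metadata
    likes: int = 0     # tagging: how many members marked it as liked


def lofi_recommend(catalog, liked_history, limit=10):
    """Rank movies the user has not rented by shared director and actors."""
    # Tracking: what the user's own history says they liked.
    seen_titles = {m.title for m in liked_history}
    liked_directors = {m.director for m in liked_history}
    liked_actors = set().union(*(m.actors for m in liked_history))

    scored = []
    for movie in catalog:
        if movie.title in seen_titles:
            continue
        overlap = (movie.director in liked_directors) + len(movie.actors & liked_actors)
        if overlap:
            # Sort key: metadata overlap first, then community likes.
            scored.append((overlap, movie.likes, movie.title))
    scored.sort(reverse=True)
    return [title for _, _, title in scored[:limit]]


# Partial cast lists and invented like counts, for illustration only.
fargo = Movie("Fargo", "Coen brothers", {"Frances McDormand", "Steve Buscemi"}, 9000)
catalog = [
    fargo,
    Movie("The Big Lebowski", "Coen brothers", {"Jeff Bridges", "John Goodman", "Steve Buscemi"}, 8000),
    Movie("Barton Fink", "Coen brothers", {"John Turturro", "John Goodman"}, 3000),
    Movie("Sleepless in Seattle", "Nora Ephron", {"Tom Hanks", "Meg Ryan"}, 7000),
]

recommendations = lofi_recommend(catalog, liked_history=[fargo])
print(recommendations)
```

The point of the sketch is the shape of the result: a short, explainable list rather than a predicted star rating. I can take it from there.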

Netflix isn’t the only one trying to hone its ability to predict human behavior. To mention just a couple, last.fm has its own recommendation engine and so does iTunes. Like Netflix, they’re both pretty good. At the moment, most of the public attention to recommendation engines is going to entertainment, but I’m wondering what Netflix’s ambitions have to say to the corporate model of document retrieval. If I’m faced with finding the gem in a pile of research and development documents, how can I find the equivalent of that perfect dark comedy with heaps of bloody carnage and wit, made by a Coen-esque team of brothers and starring John Goodman? By cutting the number of candidate documents by a factor of ten or even a hundred through taxonomy, tracking, and tagging, I’m very likely to get that small set of worthy choices from which I can pluck the prizewinner.

One Response

  1. i says:

    I’m sure the Coen brothers and John Goodman would completely agree with you. I fail to see how tracking what people rent is better than tracking how much they like what they rent. And I suspect that the algorithms are somewhat more complex than RMSE, which really isn’t complicated at all. At any rate, it strikes me that tracking ratings is a much better way to introduce one to new things than relying on commonalities in the metadata.

What is this site, exactly?

Scatter/Gather is a blog about the intersection of content strategy, pop culture and human behavior. Contributors are all practicing Content Strategists at the offices of Razorfish, an international digital design agency.

This blog reflects the views of the individual contributors and not necessarily the views of Razorfish.
