Welcome to The Valve


This work is licensed under a Creative Commons License.

 



Wednesday, August 01, 2007

A crowd’s job of work

Posted by John Holbo on 08/01/07 at 08:55 AM

This Whimsley post is pretty interesting.

Online DVD rental outfit Netflix caused a real buzz last October when it announced the competition. If anyone can come up with a recommender system for predicting customer DVD preferences that beats its own algorithm (Cinematch) by a certain amount, Netflix will hand over $1 million. The prize got a lot of attention because it exemplifies the idea of crowdsourcing. Not only does Netflix rely on crowdsourcing of DVD ratings (user ratings of DVD titles) but the competition itself is an attempt to use crowdsourcing to develop the algorithms to make the most of those ratings. Instead of doing the work itself, or hiring specialists, Netflix lets anyone enter their competition and pays the winner. The competition is still in progress: Netflix says it will run until at least 2011. So now the initial buzz has died down, what can we learn from the Netflix Prize?

It seems as though academic critics ought to have something to say about this - yes, even if they hate the term ‘crowdsourcing’. (Franco Moretti’s book could maybe have been Graphs, Maps, Trees, and Crowds.)

Also, this is awesome:

Customer 2270619 has rated 1975 titles. 1931 were given a 5, 31 were given a 4, 10 given a 3, 2 given a 2 (Grumpy Old Men and Sex In Chains) and a single title was given a 1. That title? Gandhi, which has an average rating of over 4 and which less than 2% of those who watch it give a 1.

It’s a curious feature of the contest that the challenge doesn’t involve figuring out how to throw out, or massage, obviously weird cases but, instead, to predict what weird people will say as well as what possibly sane people will say. It’s like that old Far Side cartoon with the lab coats looking through the glass: “Yes, of course they’re idiots, but what KIND of idiot?”
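For the numerically inclined, here is a rough sketch of why this particular oddball is, perversely, easy to predict. The Netflix Prize scored entries by root mean squared error (RMSE) between predicted and actual star ratings. The star counts below are the ones quoted above; the "always guess 5" rule is just my toy baseline for illustration, not anything Netflix or Tom Slee proposes.

```python
from math import sqrt

# Star counts for customer 2270619, as quoted above: {stars: number of titles}
counts = {5: 1931, 4: 31, 3: 10, 2: 2, 1: 1}

total = sum(counts.values())
mean_rating = sum(stars * n for stars, n in counts.items()) / total


def rmse_of_constant_guess(guess, counts):
    """RMSE if we predict the same number of stars for every title."""
    squared_error = sum(n * (stars - guess) ** 2 for stars, n in counts.items())
    return sqrt(squared_error / sum(counts.values()))


print(total)                                   # the counts add up to 1975 titles
print(round(mean_rating, 3))                   # average stars given by this customer
print(round(rmse_of_constant_guess(5, counts), 3))  # error of "always guess 5"
```

The constant guess misses on only 44 of 1975 titles, so its error is small. The one thing it cannot see coming is the 1 for Gandhi, which is exactly the kind of idiot-specific knowledge the contest is asking for.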

Another thing that’s strange - the author of the post, Tom Slee, notes as much - is that it seems rather obvious you could get significant improvement by working on the other end of the system: namely, the point where people click for the 1-5 stars. Build an even slightly more fine-grained system and you would surely see a hell of a lot better than 10% improvement in the final predictions (he says boldly). Nor is this something extra that you could always go and do later, after you’ve got your spiffy new algorithms. Because, as it stands, mathematicians are pulling their hair out, trying to wangle heuristic routes to answers that could be had just by asking the question. They are trying to clean one end of the pipe from the other end, with tools that can barely reach, if at all. Example (which Tom Slee himself mentions): accounts with multiple renters will give you bad data in the form of false positive linkages (he likes war movies; she likes “Sex and the City”; little Timmy only watches “Little Einsteins”). So ask how many users are on the account; then, when a rating is submitted, prompt for which user it belongs to.
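To make the multiple-renter point concrete, here is a minimal sketch with entirely made-up data. The profile names, genres, star values and the `average_by` helper are all my assumptions; nothing here reflects how Netflix actually stores ratings. It compares what you learn from one pooled account with what you learn once each rating carries a "which user?" answer.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings for one shared account: (profile, genre, stars).
# The profile field is the extra answer the "which user?" prompt would collect.
ratings = [
    ("dad", "war", 5), ("dad", "war", 4), ("dad", "romance", 1),
    ("mom", "romance", 5), ("mom", "romance", 4), ("mom", "war", 2),
    ("timmy", "kids", 5), ("timmy", "kids", 5),
]


def average_by(ratings, key):
    """Average stars per group, where key(profile, genre) names the group."""
    groups = defaultdict(list)
    for profile, genre, stars in ratings:
        groups[key(profile, genre)].append(stars)
    return {group: mean(stars) for group, stars in groups.items()}


# Without profiles, the account looks lukewarm about both war and romance.
pooled = average_by(ratings, lambda profile, genre: genre)

# With profiles, the lukewarm account turns out to be three people with sharp tastes.
split = average_by(ratings, lambda profile, genre: (profile, genre))

print(pooled)
print(split)
```

The pooled averages sit in the mushy middle for war and romance, while the split view shows one person who loves war movies and hates romance and another who is the reverse. The algorithm end of the pipe has to guess at that structure; the prompt end just gets told.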

On a more elevated note, you could ask people to register whether they are wearing their Anton Ego critic hat, as it were, or just their Popcorn Id hat. Most people sort of get the difference. Find some intuitive way to register it - hell, you could do it with ‘reflective critic’ vs. ‘regular guy yukking it up on the couch’ icons. Let people rate under one or the other heading, or both. It seems as though you could, by offering small incentives, get people to make the few extra clicks that would amount to some rather interesting data, even if the questions are still pretty crude.
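What would a two-hat rating look like as data? A minimal sketch, assuming a record with one optional star rating per hat. The `TwoHatRating` class, its field names and the `gap` measure are hypothetical, invented here for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TwoHatRating:
    """One customer's rating of one title under either hat, or both."""
    title: str
    critic: Optional[int] = None   # stars given with the Anton Ego hat on
    popcorn: Optional[int] = None  # stars given with the Popcorn Id hat on

    def __post_init__(self):
        for stars in (self.critic, self.popcorn):
            if stars is not None and not 1 <= stars <= 5:
                raise ValueError("stars must be between 1 and 5")
        if self.critic is None and self.popcorn is None:
            raise ValueError("rate under at least one heading")

    def gap(self):
        """Popcorn minus critic stars, or None if only one hat was used."""
        if self.critic is None or self.popcorn is None:
            return None
        return self.popcorn - self.critic


# A made-up guilty pleasure: bad film, great night on the couch.
guilty_pleasure = TwoHatRating("Some Action Sequel", critic=2, popcorn=5)
print(guilty_pleasure.gap())
```

The gap is the interesting part. A big positive gap marks a guilty pleasure, a big negative one marks a film admired more than enjoyed, and neither is recoverable from a single 1-5 click.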

It’s not that I think this project is vital to the strength of the republic of letters. But when I see someone slap down a cool million for an answer to what is basically a literary-critical question - and when the question is put so crudely, even as the answer promises to be so marvelously sophisticated - I can’t help thinking an opportunity is being lost. The company could do better, gathering data that might fuel any number of quite fascinating, Moretti-style studies. And it would actually make good business sense for them to do so (if this present exercise makes sense).

Oh, wise crowd of Valve readers: what slightly less painfully simple rating question(s) could Netflix ask customers, such that genuinely interesting (and predictive) data would potentially result?

I’m not a Netflix user. Sadly, that doesn’t work in Singapore.

