Netflix Announces Second Data Mining Contest
John Snodgrass writes "Neil Hunt, Chief Product Officer at Netflix, has announced on the Netflix Prize Forums that they are planning to hold a new data mining competition. The second competition will have some twists and is expected to be shorter in duration. It will feature two grand prizes, to be awarded in 6- and 18-month time frames. A previous competitor still active on the board has already dubbed it: 'The Sparse Matrix: Reordered' and 'The Sparse Matrix: Factorizations.'"
Re:Contests (Score:5, Insightful)
> Why do tens or possibly hundreds of thousands of dollars' worth of work just for the chance that you might get paid? It seems absurd.
Challenge and notoriety.
For that matter, just about everything you do has a chance of failure, so why do anything?
=Smidge=
Re:Contests (Score:5, Insightful)
Most of these are research groups that would publish their results and research anyway. This gives them a practical application and a chance for some fame and money -- the research still gets done and published.
Re:Contests (Score:5, Insightful)
I'll add one more thing. Netflix has done the community a favor by providing a large dataset for testing algorithms. Data mining requires data. It requires more than just raw data. It is really difficult to know how well your algorithm works without data that has known answers to compare to. A good test dataset lets you compare your results to other results.
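To make the "known answers" point concrete, here is a minimal sketch of scoring two algorithms against held-out ratings using root-mean-square error, the metric the first Netflix Prize was judged on. The ratings, the two "algorithms", and the function name `rmse` are all made up for illustration; this is not Netflix's scoring code.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and known ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical held-out ratings (1-5 stars) with known answers.
known = [4, 3, 5, 1, 2]

# Two made-up sets of predictions for the same five ratings.
algo_a = [3.8, 3.1, 4.5, 1.5, 2.2]  # tries to track each user
algo_b = [3.0, 3.0, 3.0, 3.0, 3.0]  # always guesses the middle

# Lower is better; the shared test set is what makes the numbers comparable.
print("algo_a RMSE:", rmse(algo_a, known))
print("algo_b RMSE:", rmse(algo_b, known))
```

Without the `known` list there is nothing to subtract from, which is the whole problem with raw, unlabeled data.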
Re:Usefullness? (Score:3, Insightful)
Apparently recommendations are important, otherwise they wouldn't put that much money towards it. There are tens of thousands of movies you have never heard of, but chances are you might like some of them.
Re:Usefullness? (Score:3, Insightful)
> I may pick Movie X to watch because the wife and I each had a hard week, but Movie X may be something that we'd never view under any other circumstance. A discrete system has a very hard time categorizing something as fluid as mood and could easily be led to make very inaccurate recommendations on the whole.
If it has a hard time categorizing it, it's because you gave it bad data with your ratings. If the movie's a one-off thing, either don't rate it, or rate it down.
That said, a sophisticated rating system should be able to recognize multi-modal distributions. I like some dumb comedies, some cerebral science fiction, and some action thrillers. A good system should pick out my trends within each of these to make suggestions within each genre, plus some crossovers, and really wouldn't be affected by the one oddball movie I'll never watch again.
It doesn't seem too out of line to assume that if you watch enough "mood movies", a good system will make several suggestions for the next time your wife is in the mood for a similar movie, as well as suggestions for your movie watching at other times. They aren't mutually exclusive. It's just up to you to look at the appropriate "Because you liked Romantic Comedies/Foreign Films/Summer Blockbusters/Erotic Thrillers" suggestion box for the mood you're in. The system isn't just telling you "Watch this one movie now, I know more than you", but it can give you a much more refined set of choices.
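A toy sketch of that per-genre idea, assuming every rating already comes tagged with a genre. The titles, genres, thresholds, and the function name `liked_genres` are all hypothetical, and a real recommender would infer these clusters rather than read them off a label.

```python
from collections import defaultdict

def liked_genres(ratings, min_avg=3.5, min_count=2):
    """Return {genre: average stars} for genres the user has rated
    at least min_count times with an average of at least min_avg."""
    by_genre = defaultdict(list)
    for _title, genre, stars in ratings:
        by_genre[genre].append(stars)
    result = {}
    for genre, stars in by_genre.items():
        avg = sum(stars) / len(stars)
        if len(stars) >= min_count and avg >= min_avg:
            result[genre] = avg
    return result

# Made-up history: three separate tastes plus one oddball rental.
history = [
    ("Dumb Comedy 1", "comedy", 4),
    ("Dumb Comedy 2", "comedy", 5),
    ("Cerebral SF 1", "scifi", 5),
    ("Cerebral SF 2", "scifi", 4),
    ("Action Thriller 1", "action", 4),
    ("Action Thriller 2", "action", 4),
    ("Movie X", "romance", 2),  # the one-off "hard week" pick
]

# Each surviving genre would get its own "Because you liked..." box.
print(liked_genres(history))
```

The single low-rated one-off never reaches `min_count`, so it doesn't create a suggestion box or drag down the others.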
Re:Contests (Score:3, Insightful)
> lazy people who just want cash for their clunker and to be left aloe to play Halo
Well, to be fair, aloe does help with the chafing after 48 hours of non-stop Halo.