Netflix Announces Second Data Mining Contest
John Snodgrass writes "Neil Hunt, Chief Product Officer at Netflix, has announced on the Netflix Prize Forums that they are planning to hold a new data mining competition. The second competition will have some twists and is expected to be shorter in duration. It will feature two grand prizes, to be awarded in 6- and 18-month time frames. A previous competitor still active on the board has already dubbed it: 'The Sparse Matrix: Reordered' and 'The Sparse Matrix: Factorizations.'"
Re:Contests (Score:5, Insightful)
Most of these are research groups that would publish their results and research anyway. This gives them a practical application and a chance for some fame and money -- the research still gets done and published.
Re:Contests (Score:5, Insightful)
I'll add one more thing. Netflix has done the community a favor by providing a large dataset for testing algorithms. Data mining requires data. It requires more than just raw data. It is really difficult to know how well your algorithm works without data that has known answers to compare to. A good test dataset lets you compare your results to other results.
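To make the "known answers" point concrete: the Netflix Prize scored entries by root-mean-square error (RMSE) against ratings that were held back from contestants. The sketch below shows that kind of comparison under stated assumptions. The ratings, the two "algorithms", and the function name `rmse` are all made up for illustration; none of it comes from the actual contest data.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical held-out ratings (the "known answers") on a 1-5 star scale.
known = [4, 3, 5, 1]

# Predictions from two hypothetical algorithms for the same four ratings.
always_three = [3, 3, 3, 3]   # naive baseline: guess 3 stars every time
tuned_model = [4, 3, 4, 2]    # stand-in for a smarter model

baseline_error = rmse(always_three, known)
tuned_error = rmse(tuned_model, known)

# Lower is better; a shared test set makes the two numbers comparable.
print(f"baseline RMSE: {baseline_error:.4f}")
print(f"tuned RMSE:    {tuned_error:.4f}")
```

Because both sets of predictions are scored against the same held-out answers, the two error figures can be compared directly, which is the benefit of a common test dataset.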
Re:Contests (Score:5, Interesting)
It allows the researchers to "cheat" a bit too via an argument by authority, which is not always good, but does at least make the researcher's job easier. A big issue in data mining is that it isn't purely a technical field, but one with both conceptual and technical issues. The over-arching goal is something like, "get useful and/or interesting information out of data". But what is "useful", what is "interesting", and how do we measure when we've gotten it or not? Usually you have to defend why your problem is the right one, why your metric is the right way to measure success on it, etc. Working on the Netflix competition lets you sidestep all that, because Netflix has already decreed exactly what the goal is, and what performance metric will be used to judge success at that goal, leaving only the technical problems.
Re: (Score:2)
I agree, but I think it is still useful. I see three main requirements for data mining research: data, algorithms, evaluation criteria. (Note: I don't do data mining myself, but know many people in the field and have studied it some.)
There are lots of algorithms, but they cannot be evaluated in a vacuum. In fact, the algorithms used tend to be highly customized and tuned for any specific problem. Really, the data, algorithms, and evaluation are a package deal.
Getting all of the necessary component
Re: (Score:2)
Yeah, I agree with that. I probably shouldn't have used a pejorative-sounding word like "cheat", even in scare-quotes. I meant just that it lets the researcher get for free some of the things they'd usually have to argue for. From a researcher's perspective, this is a real win: there are many technically solid papers that get rejected from conferences because the reviewers thought the problem wasn't interesting enough ("maybe I believe you solved this, but why?"), or the metric wasn't the right one. Nobody'
Re:Contests (Score:5, Insightful)
> Why do tens or possibly hundreds of thousands of dollars worth of work just for the chance that you might get paid? It seems absurd.
Challenge and notoriety.
For that matter, just about everything you do has a chance of failure, so why do anything?
=Smidge=
Re: (Score:2)
Obviously the odds do matter. At one end of the extreme is communism, where everybody gets the same regardless of what they accomplish. At the other extreme is "winner takes all," where somebody gets everything, even if they are only 0.01% better than the field. Between those extremes is quite a lot. Which is most motivating? I wouldn't claim to know, and I'm sure it depends on the situation, but it's an awfully
Re:Contests (Score:5, Insightful)
Re: (Score:2, Insightful)
Re: (Score:3, Insightful)
lazy people who just want cash for their clunker and to be left aloe to play Halo
Well, to be fair, aloe does help with the chafing after 48 hours of non-stop Halo.
Re:Contests (Score:4, Funny)
Why do tens or possibly hundreds of thousands of dollars worth of work just for the chance that you might get paid? It seems absurd.
Perhaps you have become lost on these internets.
I suggest trying this [myspace.com] website.
or perhaps this [icanhascheezburger.com] one.
You will likely find them much more aligned with your interests than slashdot.
Usefulness? (Score:1)
Re: (Score:2, Interesting)
As someone who used to watch 3 movies a day for about 3 years straight, I still found the system to be useful.
I thought I'd seen everything that was worth watching, but if you're really dedicated to finding more quality films then any help is good help, and this is one of the better systems for finding new films (more accurate than trawling IMDb, but maybe not quite as fun).
Re: (Score:2)
Re: (Score:2)
Re: (Score:3, Insightful)
I may pick Movie X to watch because the wife and I each had a hard week, but Movie X may be something that we'd never view under any other circumstance. A discrete system has a very hard time categorizing something as fluid as mood and could easily be led to make very inaccurate recommendations on the whole.
If it has a hard time categorizing it, it's because you gave it bad data with your ratings. If the movie's a one-off thing, either don't rate it, or rate it down.
That said, a sophisticated rating system should be able to recognize multi-modal distributions. I like some dumb comedies, some cerebral Science Fiction, and some action thrillers. A good system should pick out my trends amongst each of these to make suggestions within each genre, some crossovers, and really wouldn't be affected by the one oddba
Re: (Score:2)
Re: (Score:2)
True, but it's possible for a sufficiently advanced algorithm to guess which movies might have shared tags (without naming them), between yourself and others, and make them recommendations. It could even be better than letting humans fill out the tags (which results in tag-bombing, such as on Amazon), assuming a sufficiently large data set.
Think of it like this: you like Cerebral Lovestories such as ESOTSM. ESOTSM is rated highly by yourself and 49 other people and low or not at all by everyone else. If 25 of
Re: (Score:2)
Re: (Score:2)
According to these guys [blogspot.com], movie data's usefulness recedes as more sophisticated data mining algorithms are implemented.
Since they are part of the winning team, there's a good chance they're right. (They could also be lying about it to throw off the competition, but I believe they are required to publish their method so we will find out.)
Yes, this is an appeal to authority, but I only did it because the authority in question claims to have access to strong evidence.
Re: (Score:1)
Re:Usefulness? (Score:4, Interesting)
Re: (Score:2)
Everybody thinks they're unique. Really though, the range of human behaviour isn't all that wide. You can think of groups of people like circles in a Venn diagram... Even very different people can have a great deal of overlap.
Re: (Score:3, Insightful)
Apparently recommendations are important, otherwise they wouldn't put that much money towards it. There are tens of thousands of movies you have never heard of, but chances are you might like some of them.
Re: (Score:2)
Then again watching a diverse selection of films isn't terribly advantageous to them. While keeping you happy is concern #1, offering a diverse and perhaps unusual selection isn't any better than keeping you happy with a string of blockbusters.
Re: (Score:1)
I'm not sure what the purpose of these data mining contests is. However, as a member who prefers instant streaming over my Xbox 360 over waiting for the mailman to drop off a DVD, I hope the contest yields a better selection of instant playback material. Instant playback on Netflix currently suffers from a mediocre selection of obsolete, boring, B-grade movies. One can only watch Dolph Lundgren's "Retrograde" so many times before questioning whether or not the Netflix membership is even worth it.
Re: (Score:2)
the streaming selection on netflix is limited because netflix has to ink an often complex 'broadcast rights' deal with the studio for each movie in order to stream it. netflix has to compete with other broadcast companies (mostly the tv networks, but also companies like hulu, iTunes), and in many cases the deals are for exclusive rights over a given time period. none of this applies to shipping DVDs.
I believe that adding movies to your 'watch instantly' queue allows netflix to prioritize which movies it shou
Human reaction machines. . . (Score:4, Interesting)
There's nothing at all wrong with studying how the human automatic processes work, but "Psychology for Prizes" does have a very Neal Stephenson feel to it.
The public eagerly jumping for the chance to teach corporate bodies how to better advertise to them seems a little preposterous. In a world where everybody's objective is openness and self-study for the betterment of humankind, this sort of thing would be laudable, but here it's a bald-faced attempt to fine-tune manipulation techniques.
What would be cool would be if Netflix, upon offering you a suggestion, would also explain what reasoning they used to offer that suggestion to you. Open-source advertising. If every billboard had an explanation of the psychology behind it, we could learn much more about ourselves. The amount of free will that we use every day versus automatic behavior can only increase when the illusion of free will is broken down and examined.
-FL
Re: (Score:2)
The only reasoning that is used is "You liked [movie group A], other people who liked [movie group A] also like [movie B], so maybe you will too". There may be something in there to make the groupings by genre, but I doubt it: when the first contest started, Netflix reluctantly made genre information available after a couple of teams asked for it.
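That "people who liked A also like B" reasoning can be sketched in a few lines. This is only a toy illustration of the idea as the comment describes it, not Netflix's actual method. The function name `recommend`, the user names, and the movie labels are all hypothetical.

```python
from collections import Counter

def recommend(likes, user, top_n=3):
    """Suggest movies liked by people who share liked movies with `user`."""
    mine = likes[user]
    scores = Counter()
    for other, theirs in likes.items():
        if other == user:
            continue
        overlap = len(mine & theirs)   # how many liked movies we share
        if overlap == 0:
            continue                   # no shared taste, ignore this person
        for movie in theirs - mine:    # only movies `user` has not liked yet
            scores[movie] += overlap   # weight by amount of shared taste
    # Highest score first; ties broken by title so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [movie for movie, _ in ranked[:top_n]]

# Hypothetical data: each user maps to the set of movies they liked.
likes = {
    "alice": {"A1", "A2"},
    "bob": {"A1", "A2", "B"},
    "carol": {"A1", "C"},
    "dave": {"D"},
}

suggestions = recommend(likes, "alice")
print(suggestions)
```

Note that nothing here uses genre or any other information about the movies themselves. The grouping falls out of who liked what, which fits the comment's guess that genre was not central.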
Re: (Score:1)
There's a difference between getting people to make impulsive, thus irrational, decisions and providing targeted advertising that might actually be something the viewer wants and was not aware of. The latter arguably benefits the consumer, making people happier. I don't really see how Netflix is doing anything but this. I do agree on opening up ad psychology. Only the manipulative advertising stands to lose.
Re:Human reaction machines. . . (Score:4, Informative)
There's always going to be an argument which makes a manipulative and self-serving action sound benign and cheerful. I remember watching a news piece about one of the top McDonald's CEO types heading over to Russia to try to establish the golden arches there. In a candid shot, he described McDonald's as a sort of angelic entity whose mission was to bring hungry children meat, bread and milk. I wondered if he had really convinced himself of that horseshit or if it was just a face he put on for others.
It's all about spin. The problem is that when profit is the primary motivator, then you cannot ever trust a seemingly friendly face put forth by a company. They don't want to be your friend. They want you to think that they are your friend in order that you might feel comfortable in giving them your trust, money, time and energy.
Now, if the Netflix guys are actually motivated not by profit, but by an over-riding love of film and the desire to share film with the world, then it's a whole different story. You do see this sometimes. I've known several owners of private shops who really love what they sell, but when you scale things up past a certain number of employees, even a founding love takes a back seat to the corporate need to grow profit share and absorb wealth. It's almost like a company only has a single soul which is shared by every participant in the company and thus gets stretched thin.
-FL
Re: (Score:2)
I know that I personally *HATE* it when a company can offer me a service that I really want.
Something like this really pisses me off, though, because netflix is coming into my house and forcing me to watch their "ads"...err..I mean "suggestions".
I mean, it isn't like I signed up for the service and pay them a monthly fee exactly because they have a huge library of movies available to me! The worst part is that they charge me every time I rent another movie! A move like this is just an attempt to get me to rent more
Re: (Score:2)
Really? Makes perfect sense to me.
You might carry around your Minority Report-inspired retinal-scanning tinfoil hat, worried about the evils that faceless corporations can inflict upon us if they know our buying habits and personal preferences. I'm a bit more pragmatic: they're going to try to make money, and selling me things I want is a pretty good way to do that.
Here's the thi
Re: (Score:2)
I'm a consumer.
Yeah? I'm a person.
-FL
Re: (Score:2)
Consumers and People (Score:2)
Definitions again. Allow me to clarify. . .
I tend to think that to call oneself a "consumer" is the result of a stupendous and multi-generational maneuver of marketing which reduces the human to the status of a mindless eating machine with no other virtues or qualities of significant value. Sadly, for the most part, this is an accurate state of affairs, but I choose to deviate from that model. I refuse to see my purpose in the world as being simply to desire and work relentlessly towards the acquisition
Re: (Score:2)
The Need for This Contest (Score:5, Funny)
What better way (Score:2)
to take advantage of hordes of unemployed technologists than to get them to provide months of free work?
A Portia D/Olivia W luv scene w/2 hr tension bldup (Score:1)
> Netflix Announces Second Data Mining Contest
Oh thank god I've got another chance!
I was gonna solve the previous challenge, but never quite got around to it.
Advertising value (Score:1)
Seems like the only reason they keep coming up with such contests is their advertising value. Just my 2 cents.