Netflix Prize Contest Ends, Down To the Wire

suraj.sun updates us on the Netflix Prize now that the competition has officially closed. We discussed the new leader with one day to go in the contest: The Ensemble, taking the lead from long-time leader BellKor's Pragmatic Chaos, the first contestant to submit an entry that broke the 10% barrier. In the contest's final day, BellKor re-took the lead with 20 minutes to go, then The Ensemble apparently pulled a Michael Phelps with 4 minutes to go, squeaking ahead by 0.01%. At least so the leaderboard claims — but those numbers are posted by the competing teams. The NY Times reports that an official winner will not be named until September — Netflix needs that much time to pore through the complex entries and read the code. Netflix contacted BellKor on Sunday to tell them the team remained in first place; The Ensemble has had no such notification.
  • by Anonymous Coward on Tuesday July 28, 2009 @03:57PM (#28858857)

    They realized that all movies starring Matthew McConaughey and Kate Hudson were actually the same movie. The compression on that alone was enough.

  • by jeffmeden ( 135043 ) on Tuesday July 28, 2009 @04:01PM (#28858931) Homepage Journal
    What they need to start is a contest to improve their incredibly lousy on-demand service; the Silverlight player is beyond terrible. All this effort (and money) over getting 10% more accurate guesses that the same guy who liked "Terminator" will like "Terminator 2" is nice and all, but it's a bit of a time waster, don't you think?
    • Re: (Score:3, Informative)

      It's not about the algorithm for movie sorting. Imagine situations where having a 10% more accurate guess would actually count for something important. Now imagine licensing and patenting that algorithm and building revenue from it.
    • Except some of us also like movies that may not have had huge releases or were made before we had internets...

      If it wasn't for the internet I wouldn't have heard of Equilibrium (which was pulled back because of 9/11).

      Right now I go through nzbmatrix and open the IMDB links and see what I really like and give it a shot.

      Go, The Man Who Knew Too Little, Equilibrium, Dark City, etc. are all movies I've seen that I hadn't heard of because I was either in middle school or they had limited releases.

      • And if it weren't for you, I wouldn't have heard of Equilibrium. Thanks!

      • I heard Equilibrium had limited release because they blew through their budget on actors & effects, then couldn't afford marketing. Do you have a source?

        NB: I don't claim to have a source for *my* assertion, only that the movie is clearly NOT low-budget, but only ran for a short time in my local theaters and has one of the most cheaply-made DVD releases I think I've ever seen. (My buddy's brother made liner notes for it with a chapter listing because there was NOTHING in the box with the disc.)

        Also I
        • Well they certainly didn't blow through it on wardrobe and props. They rocked motorcycle helmets and unbadged Cadillacs in that twisted dystopian future state...
      • Considering it's such a terrible, terrible movie, is missing out on Equilibrium such a bad thing?

    • by furrer ( 29920 )

      Upgrading to Silverlight 3 improved the on-demand image quality and fixed the annoying screen lock issue (at least for me on 64-bit Vista). Now if they could just get some more good movies on there...

      • by lefiz ( 1475731 )
        Apparently Netflix is stuck on this side of the service (the on-demand shizz) due to the arcane Hollywood studio system, which has contracts with cable, premium stations, and others that lock up the movies for literally decades, leaving new service providers like Netflix with no options. See discussion here: http://slate.com/id/2216328/ [slate.com]
      • Am I the only one on slashdot who refuses to ever install silverlight?

        Yes, I'm an anti-MS fanboy, but after the things that happened during the OOXML ISO approval I just cannot support that company anymore. In any way whatsoever.

        If you too (speaking to the general audience here) feel this way about MS, I hope that you do not support them by installing silverlight or in any other way really.

    • Re: (Score:1, Interesting)

      by Anonymous Coward

      No joke.

      I was watching a show last night (Leverage, which Netflix suggested for me at 4.6 stars, and they were right; I love it). 3-4 times per hour-long episode it would reset and tell me that it was adjusting things. Often, it never came back until I hit Refresh. I noticed that it would only buffer a small amount ahead (30 seconds) and then try to keep it there, calling an incredibly laggy site called something like controls.netflix.com. Once that happened, it had about a 25% chanc

    • by peipas ( 809350 ) on Tuesday July 28, 2009 @07:17PM (#28860845)

      I, for one, think the Silverlight player is phenomenal.

      I have limited Internet options -- even though I live in an urban area, I am not close enough to a CO to get decent DSL speeds (the max Qwest offers is 1.5Mbps). Cable is not an option because my complex has a contract with the television provider who wired the buildings at construction, which is good for those who watch any TV since you get 50+ channels of cable television for free, but bad for Internet options.

      Long story short, my Internet connection has a very high bit error rate because I am getting my DSL over Qwest's line but from an ISP (AT&T via Covad) willing to boost the artificial 1.5Mbps limit Qwest imposes to 3Mbps, at the expense of signal quality. This means I can truly realize the faster speeds, but it also makes for a very bursty connection.

      I find the new Silverlight player to be far superior, with its buffering saving the day and allowing me to watch Netflix streaming at maximum quality. The fact that the Silverlight player adjusts quality on the fly is outstanding as well -- when I first start streaming content it may look like shit at first, but after a short time it is crystal clear, once it realizes my connection can support the data load with a little buffering.

      By contrast, with the old player, even before I had this error-ridden Internet connection, I would find myself initiating an instant streaming session only to find the stupid player would decide my connection was slow and give me piss-poor video quality. I would have to click the "Back to Browsing" button and reinitiate the streaming, sometimes several times, in order for it to give me high quality.

      The new player also provides a great new feature when seeking through the content: it shows freeze-frames of the content as you scroll forward or backward, which is perfect for skipping the intros of TV shows, for example.

      I only wish it would "back buffer" a little because currently when I rewind a little bit, rather than replaying it from memory it rebuffers altogether, as if I hadn't just watched those few seconds prior.

      • by adolf ( 21054 ) <flodadolf@gmail.com> on Wednesday July 29, 2009 @12:03AM (#28862461) Journal

        I, for one, think that the Silverlight player is crap.

        I have a dual-head machine with a very nice 1600x1200 IPS-panel NEC LCD as my primary monitor, and a nice (but far lesser) 24" 1920x1080 TN-panel Asus LCD as the secondary.

        I want to pop up a Netflix show on the secondary monitor, full-screen, and continue to do stuff like read Slashdot on the other display. Silverlight has no problem putting good-quality video up, full-screen, on the second display -- but as soon as I click outside that window (i.e., to browse Slashdot), it shrinks back down to windowed mode. My dual-head computer is therefore retarded into being effectively a single-head machine for the duration of the film, unless I either want to watch it in a little window or soak up a couple of cores' worth of CPU power zooming in with Ctrl-+.

        Allegedly, if I had Media Center on my computer, this could be worked around. But with Netflix + Silverlight, it cannot be accomplished. Of course, this situation works fine if I'm playing a DVD on my own computer -- it just doesn't work with Netflix's streaming service.

        It is therefore retarded (in a very literal sense of the word).

        I'd like also to note that Flash seems to have the same difficulty, and that its behavior is similarly inexcusable and retarded.

        The best I can do, if I want to watch a film in my office and occasionally fuck around on the Web, is fire up my 4-year-old laptop and use that to browse with instead of my badass dualhead desktop rig. Which, also (and obviously) is retarded.

        I've complained to Microsoft Silverlight developers directly about this, and the best they ever say is something like "You're right. It is retarded. Maybe we'll fix it some day. *harumph*" while months/years pass by and it's still an issue.

        • Mod parent +999 ... this is a serious PIA and it would be nice if this was not the behavior.
        • Re: (Score:3, Informative)

          I get this same behavior with every desktop video player I've used. The flaw is with Windows, not Netflix, not Silverlight, not VLC or any other video player.
          • It's user error if you're having the issue with VLC. Try moving the controller window to the other monitor.
          • by adolf ( 21054 )

            You read poorly.

            I specified that it works fine with a DVD on my own computer. And that it works fine with Media Center, which is just another player (though rather more far-reaching than most).

            If it's possible with some applications but not others, then it is plainly an application problem and not a Windows problem.

        • by Tablizer ( 95088 )

          For all this bickering about whether Silverlight is any good, in my experience offering multiple viewers is the better way to go: each viewer will work better on different machines. If they offer a choice of, say, 3 viewers, then one is bound to work.

      • Of course it is great, when all you know is Flash. Flash is one of the worst things that ever happened to internet video.
        I hope browser-video-integration (like in Firefox 3.5, but with all formats, like H.264, XviD, MP3, AC3, Matroska containers, etc.) will change this for the better, and kill off Silverlight too.

        Look at this (with Firefox 3.5), if you don't think it can kill Silverlight: http://people.mozilla.com/~prouget/demos/ [mozilla.com]

    • Re: (Score:3, Interesting)

      Comment removed based on user account deletion
    • Speak for yourself. I use the Silverlight service at home and it works great for me. I output from my computer to my LCD TV and it looks good to me. As a bonus, since it's not Flash-based I can full-screen it on my TV and still do stuff on my monitor.

  • team a makes algorithm improvement b

    team c takes algorithm improvement b and makes algorithm improvement b(+d)

    team e takes algorithm improvement b(+d) and makes algorithm improvement b(+d)->f

    the guy who squeaked out the extra 0.01% did that on top of someone else's code that eked out 0.05%, etc., ad nauseam

    so how do you ascertain who won? all the teams won

    they should take the final prize money, try to fractionate each incremental improvement in the algorithm, and proportionally dole out the money that way. anything else is unfair

    • by AlXtreme ( 223728 ) on Tuesday July 28, 2009 @04:27PM (#28859285) Homepage Journal

      so how do you ascertain who won? Netflix won

      There, fixed that for you.

    • by johnsonav ( 1098915 ) on Tuesday July 28, 2009 @04:30PM (#28859333) Journal

      so how do you ascertain who won? all the teams won

      No, they didn't, at all.

      Any bozo can get 5% improvement. It's the last 5% that's tough. And, of that last 5%, the first 2.5% is cake, compared to the last 2.5%.

      they should take the final prize money and try to fractionate each incremental improvement in the algorithm and proportionally dole out the money that aways. anything else is unfair

      As someone who participated, but did not win, the first place team deserves the entire million (if not more). This was a race, and second place is the first loser.

      • by DNS-and-BIND ( 461968 ) on Tuesday July 28, 2009 @11:09PM (#28862195) Homepage
        Uh, that's a pretty disgustingly American viewpoint of the issue. Can't we all agree that if you didn't come in first, then you can still be a winner? This has been taught in schools for a long time now, it still hasn't been internalized?
        • by ryturner ( 87582 )

          Uh, that's a pretty disgustingly American viewpoint of the issue. Can't we all agree that if you didn't come in first, then you can still be a winner? This has been taught in schools for a long time now, it still hasn't been internalized?

          No, this was a contest to see who could improve the algorithm the most. There can be only a single winner. If the contest was to improve the algorithm by x%, then there might be multiple winners. If you start calling everyone a "winner", you just cheapen the experience for the true winner.

        • Uh, that's a pretty disgustingly American viewpoint of the issue. Can't we all agree that if you didn't come in first, then you can still be a winner?

          No, we can't; not unless you want to redefine the word "winner" in such a way that it loses all meaning.

          That's not to say that the people who didn't win, don't get anything out of the experience. I learned a hell of a lot about recommender systems. But, I still did not "win". I'm okay with that.

          This has been taught in schools for a long time now, it still hasn't been internalized?

          And I hope it never will be.

        • It's this childish view that makes people think they should take up tasks they are not suited for, boldly proclaim "facts" they don't know anything about, and in general behave irresponsibly.

          Kids/People need to know that life isn't always fair, not everyone is a winner, and people just aren't equal (in an individual sense).

          Telling them this stupid crap just makes them have unrealistic expectations and gives them a sense of entitlement that won't do them any good and quite probably will cause problems duri

        • Actually, that's a pretty disgustingly realistic viewpoint of the issue. I'm sorry that you think Americans are competitive bastards who only care about winning, but you're wrong: if you don't come in first, then you lose. Pretty simple (even for a competitive bastard).

          Now, I agree that whether I win or lose (note there's only two categories there) is irrelevant when it comes to the experience. I can still enjoy a competition thoroughly and lose miserably.

          There have to be winners and there have to b
          • Why is it that we intellectual superiors are the rich, powerful, successful people, but the people who are just edging by are the happy ones with exciting lives and goals?
      • If you ain't first, yer last!

    • by flynt ( 248848 )

      The code was not shared unless a team wanted it to be. You couldn't take the current leader's code and make an improvement unless that team gave you their code. That was not how the contest was run.

      • Exactly.

        I do find it interesting that one of the winning teams (BellKor's Pragmatic Chaos) is a merger of three other top ten teams, and I wouldn't doubt it if you told me The Ensemble is a merged team as well (especially given its name). I certainly can't fault them for merging. Why risk winning nothing, when you can split a cool million with two other teams and have a much better chance of winning?

        What would be funny is if the last two teams merged at the last minute ^_^ (Not feasible, I'm sure, but it

        • BTW, "Ensemble" is a particularly cute name for a team in this contest since, in machine learning, ensemble methods [wikipedia.org] are compositions of other machine learning algorithms to achieve (hopefully) higher classification algorithm than any of the component algorithms alone.
        • Yes, The Ensemble is a merged team as well in an attempt to beat BellKor as the clock was winding down.
      • Unfortunately, this is not the case. Each year, the winner of the Progress Prize WAS obliged to publish details of their algorithms to date ... and it was very clear that all the other competitors jumped on these details.

        The contest was very clearly defined by these periods (sometimes of 11 months) where no real progress was made except by 1 or 2 top teams, and then 1 month right after publication of the latest ideas and algos, which caused everyone to catch up significantly by clearly copying and tweaking

        • > Unfortunately, this is not the case. Each year, the winner of the Progress Prize WAS obliged to
          > publish details of their algorithms to date ... and it was very clear that all the other competitors
          > jumped on these details.

          Unfortunately? :) What's wrong with it? They win money, and Netflix gets them to disclose the details.

          This is just great for humanity as a whole.

    • Re: (Score:3, Informative)

      by gurps_npc ( 621217 )
      Not exactly. Team A makes algorithm improvement Z. Team B hears about Team A's work and creates something SIMILAR, then fine-tunes it into Algorithm Y. Team C hears about A & B's work. They try to duplicate it, but can't; instead they come up with X, something they thought was Y but really wasn't. Team D hears about A, B, and C. They find a way to combine Y and X, making W. It is clearly original, but is based on similar ideas to Z/Y and X.
  • So The Ensemble ripped a bong with 4 minutes to go, which gave them the creativity to squeak ahead by 0.01%?

  • by kilraid ( 645166 ) on Tuesday July 28, 2009 @04:36PM (#28859415)
    Netflix calculates the score shown on the leaderboard from a set of rating predictions submitted by a team. The team does not, and will not, know the correct answers. For testing their algorithms, the teams use another dataset. The two datasets, part of the package made available to the competitors, are known as "qualifying" and "probe".
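    To make that concrete, a team's local workflow might look roughly like this (an illustrative sketch only; "model" and its predict() method are hypothetical stand-ins, not real contest tooling): tune against the probe set, whose answers you have, then submit predictions for the qualifying set, whose answers only Netflix holds.

      def probe_rmse(model, probe_pairs, probe_answers):
          # Local check against the "probe" set, for which the true ratings are provided
          errs = [(model.predict(u, m) - r) ** 2
                  for (u, m), r in zip(probe_pairs, probe_answers)]
          return (sum(errs) / len(errs)) ** 0.5

      def make_submission(model, qualifying_pairs):
          # One predicted rating per (user, movie) pair in the "qualifying" set;
          # Netflix scores these against answers that only it holds.
          return [model.predict(u, m) for (u, m) in qualifying_pairs]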
  • Go Bellkor!!! Sorry, I'm biased :)
  • by Spy Hunter ( 317220 ) on Tuesday July 28, 2009 @04:46PM (#28859539) Journal

    The reason BellKor is still first is that the published scores are irrelevant. The scores that matter for the prize are based on an unpublished data set known only to Netflix (to prevent people submitting answers that are optimized for the challenge data and work poorly on everything else). On this secret data set, BellKor's algorithm apparently performs better than The Ensemble's.

    • I don't think this is quite clear. The teams submit their predictions on a qualifying set of about 1.8 million user/movie pairs. Half of these are used for the published leaderboard rankings, and the other half are secretly scored for the actual prize. What the teams don't know is which user/movie ratings are in which half. The teams aren't submitting programs that Netflix runs, though; they're submitting predictions for the whole qualifying set.
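      Roughly, the scoring on Netflix's side could be sketched like this (illustrative only, not their actual code): the submission covers the whole qualifying set, and a split known only to Netflix decides which half feeds the leaderboard and which half decides the prize.

        import numpy as np

        def rmse(predicted, actual):
            return np.sqrt(np.mean((np.asarray(predicted) - np.asarray(actual)) ** 2))

        def score_submission(predictions, true_ratings, is_quiz):
            # predictions and true_ratings cover the full qualifying set;
            # is_quiz is True for the public "quiz" half and False for the
            # hidden "test" half that actually decides the prize.
            predictions, true_ratings = np.asarray(predictions), np.asarray(true_ratings)
            is_quiz = np.asarray(is_quiz)
            quiz_score = rmse(predictions[is_quiz], true_ratings[is_quiz])      # shown on leaderboard
            test_score = rmse(predictions[~is_quiz], true_ratings[~is_quiz])    # kept secret until judging
            return quiz_score, test_score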
  • I don't use Netflix; I'm a Blockbuster guy 'cause we happen to have one close to our house. But what is 10% of zero? In all seriousness, what is their accuracy now? How is it determined?

    • # of suggested matches people actually say they like / Total # of suggested matches.

      Was it really that hard to guess?

      • I don't guess when it comes to statistics. Because there are usually a ton of variables and long algorithms. It's not usually just x/y. Often, it's more like X(a+b-sqrt(z)(2))/Y(.8(z^2)).

        That can only work for people who vote. It's a selection bias or something, if I remember right from stats. For instance, I don't want to fill out an effing survey every time I buy a product. I just want the thing in my hands as soon as possible so I can use it. When its use is up, rentals get returned, disposables ge

        • I don't guess when it comes to statistics.

          Fair enough.

          Because there are usually a ton of variables and long algorithms. It's not usually just x/y. Often, it's more like X(a+b-sqrt(z)(2))/Y(.8(z^2)).

          That can only work for people who vote. It's a selection bias or something if I remember right from stats.

          It isn't necessarily a selection bias. It can create one, but it doesn't have to. If "people who vote" are representative of netflix customers in general, there is no selection bias. But you are right that selection bias is a concern.

          For instance, I don't want to fill out an effing survey every time I buy a product. I just want the thing in my hands as soon as possible so I can use it. When its use is up, rentals get returned, disposables get recycled, and Blackbutte porter gets turned into Bud "Lite".

          So 1) that means that people have to rent the suggested selection, 2) actually have watched it when they return it, and 3) vote on how well they liked it. Is it a 1-10 scale? Or a yes/no?

          Come on this is /. I expected at least a little more detail.

          You must be new here.

    • by Wildclaw ( 15718 ) on Wednesday July 29, 2009 @12:53AM (#28862713)

      They used root mean square error (RMSE) [wikipedia.org] for the competition.

      Basically, the difference between the guess and the real answer for each vote is squared, giving a value between 0 and 16 (as the biggest error is 4, when you guess 5 on a vote that is 1 or vice versa). These squared errors are summed over all votes in the test and divided by the number of votes in the test. Finally, you take the square root of that.

      The winning score in the competition is around 0.855, which is smaller than 0.9514 * 0.9, where 0.9514 was the score of Netflix's own algorithm.

      I hope that explains everything.
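      Here is the same calculation as a tiny worked example, with made-up numbers:

        guesses = [4.2, 3.0, 1.5, 5.0]      # made-up predicted ratings
        actual  = [4,   2,   1,   3]        # made-up true ratings (1-5 stars)
        squared_errors = [(g - a) ** 2 for g, a in zip(guesses, actual)]  # each in [0, 16]
        rmse = (sum(squared_errors) / len(squared_errors)) ** 0.5
        print(rmse)  # 1.15 for these numbers
        # The 10% target: beat the 0.9514 baseline by 10%, i.e. reach 0.9514 * 0.9 ~= 0.8563 or lower.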

      • I've always wondered why squaring is used for distance measurements. It seems it would over-magnify the influence of non-matches and outliers compared to what the "real world" would want. I realize it makes the math easier, but having easier math and having the best answer may not be the same thing. I've asked math experts about this, but it seems it's an under-studied question.
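        For a concrete (made-up) illustration of what I mean, compare RMSE with plain mean absolute error when one prediction is badly off:

          errors_without_outlier = [0.5, 0.5, 0.5, 0.5]
          errors_with_outlier    = [0.5, 0.5, 0.5, 4.0]   # one big miss

          def rmse(errs): return (sum(e * e for e in errs) / len(errs)) ** 0.5
          def mae(errs):  return sum(abs(e) for e in errs) / len(errs)

          print(rmse(errors_without_outlier), mae(errors_without_outlier))   # 0.5    0.5
          print(rmse(errors_with_outlier),    mae(errors_with_outlier))      # ~2.05  1.375

        The single outlier roughly quadruples the RMSE but raises the MAE far less, which is exactly the over-magnification I'm asking about.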

        • by Wildclaw ( 15718 )

          Well, you pretty much answered the reason why just by saying "measuring distance". Root mean square is the formula for measuring distance in an n-dimensional Euclidean space.

          But you are right to question whether it is the best solution to some real-world problems. Why should you represent a person's opinions about movies as a location in an n-dimensional Euclidean space?

          • by Tablizer ( 95088 )

            Perhaps I misunderstood the original message. I'm mainly thinking about regression formulas.

  • ...I'm sitting here wondering how stable these algorithms are over long periods of time. I'm assuming that the "practice" data set and the "test" data set are equal in terms of time distribution (date of movie release; date of review). But 10 years from now, 20 years from now, I see the RMSE numbers slowly drifting upwards as the algorithm was optimized to the 2000-2009 data set, not the 2000-2020 data set or the 2000-2030 data set. But this is not my area of expertise so I'm wondering what others have
    • by jdigital ( 84195 )

      Each year that the competition was running, Netflix awarded a progress prize to the best-ranked team at the end of the year. Part of the requirements of winning this prize was publication of scientific papers describing key elements of their algorithms. BellKor's Yehuda Koren presented a paper at SIGKDD in July describing the improvements they made to their algorithm to take advantage of predictable temporal dynamics of ratings. Check out the paper here [yahoo.com]

      From an article about the paper:

      While movies themselve
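      For a very rough cartoon of what "temporal dynamics" can mean here (my own simplification, not the paper's actual model): let the baseline prediction drift with time by binning rating dates and learning a per-bin offset for each movie.

        def baseline_rating(global_mean, user_bias, movie_bias, movie_bias_by_bin, date_bin):
            # rating ~ overall mean + this user's usual offset + this movie's usual offset
            #          + how this movie was being rated around that date
            return global_mean + user_bias + movie_bias + movie_bias_by_bin[date_bin]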

    • Ideally it wouldn't make any difference.

      The idea behind the solution is that it can be applied to any set of data where people express a preference, regardless of 'type'. That could be cars, politicians, or Slashdot comments.

      "Based on the way you previously have moderated slashdot comments and the way other moderators have previously ranked similarly to you we think these comments will be +5 insightful to you."

  • No fair! You changed the outcome by measuring it!
  • If ((MovieReleaseDate > 2000) OR (MovieProducer = "Bay, Michael")) AND (CustomerAge > 20) Then
      Score = Score * .5
    End If

    If (MovieFABRating = "Nudity") AND (MovieLeadActor != "Baron Cohen, Sacha") Then
      Score = Score * 10
    End If

  • http://www.netflixprize.com/community/viewtopic.php?id=1498 [netflixprize.com]

    "Thanks. In fact, this is a very happy day for us - our team is top contender for winning the Grand Prize, as we have a better Test score than The Ensemble. (Probably this is the first post revealing this in the forum smile)"

    Also, Yehuda Koren is at Yahoo now, not AT&T.
  • I was hoping one of these groups would invent some new and powerful algorithm for categorisation. It's a nearest-match/similarity problem, so any such algorithm would be very useful elsewhere. But they just seem to have taken many existing techniques and merged them, combining the votes/scores from each algorithm, in a way that probably fits just the movie-matching problem and not anything else.

