Mystery MLB Team Moves To Supercomputing For Their Moneyball Analysis 56

An anonymous reader writes "A mystery [Major League Baseball] team has made a sizable investment in Cray's latest effort at bringing graph analytics at extreme scale to bat. Nicole Hemsoth writes that what the team is looking for is a "hypothesis machine" that will allow them to integrate multiple, deep data wells and pose several questions against the same data. They are looking for platforms that allow users to look at facets of a given dataset, adding new cuts to see how certain conditions affect the reflection of a hypothesized reality."
This discussion has been archived. No new comments can be posted.

  • by mutantSushi ( 950662 ) on Saturday April 05, 2014 @06:20AM (#46668395)
    "Supercomputer, is baseball still boring as fuck?" "YES, DAVE."
    • Re: (Score:3, Insightful)

      by brxndxn ( 461473 )

      It's boring to those that see it but don't understand it.

      • by willoughby ( 1367773 ) on Saturday April 05, 2014 @06:51AM (#46668469)

        Ahhh... so it's like cricket, then?

        • Ironically your nick is the surname of a well known South African cricket player.
        • Cricket isn't really about the game. It's more about the small hamper of cream buns, punnets of strawberries, the chilled champagne, the lounging around in the sun with your girlfriend/boyfriend. That and being frightfully English.

          • In that case why not go have your picnic somewhere where you're not surrounded by a bunch of frightfully boring cricket fans?
            • by Anonymous Coward

              Because he's one of them :)

          • That and being frightfully English.

            Cricket might not even be the second most popular sport in England. (Association) football is a clear first, then either cricket or rugby (but which kind of rugby??).

            It's vastly more popular in India, Pakistan, Australia and the West Indies / Caribbean.

        • by bullgod ( 93002 )

          No. Despite the apparent similarities (ball, bat, runs, etc.) cricket and baseball are very different games.

          In baseball, there will be (at least) nine innings, each of which lasts until 3 outs. It's a competition between pitcher and batter that little can interrupt. I don't think many baseball fans realise that cricket is much more about managing resources than about a gladiatorial contest.

          So, for example, in a 5-day match every ball you decide to face is one less opportunity for you to dismiss the opposi

        • Not quite.

          Sadly the problem with baseball is it doesn't really give you enough time...

          I mean, at least with cricket I get the requisite 4 or 5 days, which is _just_ about enough time to get properly drunk (and eat cheese sarnies).

      • I can take it or leave it, but a minor league game from wooden bleachers is a much better time for me.
        What I find amusing is the obsession with statistics, considering the randomness of any particular game. But then I don't follow any particular team; it's the spectacle of seeing it done. (And the main thing I appreciate is how unimaginable it is relative to my own abilities.)

      • It's boring to those that see it but don't understand it.

        Negative. It is boring to hackers since we're likely to think a sport is something you do, not something you watch.

      • by idioto ( 259918 )

        I understand it, and that is what makes it boring. Most of the people who watch it do not understand baseball. I used to love baseball, but the more years pass, the less I like it. It's not like most people in the stadium can see whether a pitch is a ball or a strike, or a curveball or a fastball; you can't see it unless you have perfect seats. I will beat 9/10 baseball fans in baseball trivia, I will beat 9/10 baseball fans in a baseball video game, I will beat 9/10 of baseball fans in whiffleball (because 9/1

    • The million-dollar question is: why spend time looking at the past when they could use this to forecast the next winning Powerball ticket?
    • by jonwil ( 467024 )

      Just be glad you Yanks broke away from the motherland all those years ago, otherwise you would probably be doing what us Aussies are doing and playing the one team sport on this earth MORE boring to watch than baseball: cricket.

  • I'm sorry, but not even robots could make this game interesting. I say we sell it to the Japanese while they're still game.
  • Sorry, I RTFA, it was unavoidable.
    Looks like they might actually use the horsepower.

  • by techsoldaten ( 309296 ) on Saturday April 05, 2014 @07:20AM (#46668539) Journal

    My best guess is it's the Cubs.

    They are looking for minority investors in the club right now, and the cost of ballpark improvements is a smoke screen for taking on the cost of big data. Theo has not been the same without Tessie, and it's not cheap to recreate the analysis that system is capable of performing.

    I really wonder what the value of such a system is compared to updating / refining Nate Silver's PECOTA odds to play out hypothetical teams and transactions over a 5 year period. There is so much data available about players at this point, it's almost possible to predict regressions on a macro level.

    • My best guess is it's the Cubs.

      It's a good guess. I'd also say that the Red Sox and A's make the short list of teams that have people in the front office who might see some value in this, although the A's are run on the cheap so it is a little hard to think they'd pay for this.

      • It's not the Cubs, Red Sox, or A's.

        The original story about this said that it was "an organization that many might not expect." None of those, or the other teams who've shown marked interest in analytics or who have GMs known to be friendly to advanced analytics (off the top of my head that's the Yankees and Mets, Cleveland, Tampa, Baltimore, Toronto, Seattle, and Arizona to start with) would be particularly surprising. The other thing to note is that "buy a supercomputer!" is the sort of response that a
        • by sconeu ( 64226 )

          The last team you'd expect to be into analytics would be the Angels, given Scioscia's tendencies.

          And Arte may be panicking after the horrendous start.

          • by SpzToid ( 869795 )

            Here's a thought: maybe the Angels are so loath to fire everyone's favorite Mike Scioscia after his World Series win waaay back in 2002, even if maybe his time has come and gone, that management is considering analytics to micromanage Mike's calls?

            For example, player X at bat against Pitcher Y, 2 men on, no outs, count is 2 and 2. Mike says to bunt for the sacrifice, but what do you say DAVE?

        • Seattle? You've got to be kidding. They have an analytics department, but pretty much every decision made by the GM shows he doesn't listen to the analytics department.

          Now three or four years ago... Perhaps. But they fired their main stats guy and the GM has an entirely different group around him now.

        • Well, it says 5 years ago, they would not be a team you would expect. I still say it's the Cubs, and yeah, this is just a guess. But I really can't think who else would have a reason to do it.

          When I go down the list, here's the teams that have a front office with a strong, expressed interest in Big Data.

          - Athletics
          - Red Sox
          - Cubs
          - Padres (Jed Hoyer legacy)

          Here are the clubs that are known to have been investing in advanced metrics previously, in some cases at a limited scale.

          - Nationals
          - Dodgers
          - Rays
          - Phil

      • Yeah, and the Red Sox have Tessie. I don't think they are in the market for a replacement.

    • They don't need to spend all of that just to get the output of "NEXT YEAR".

    • Maybe it would be cheaper if they just obtained a copy of "The Cardinal Way". http://www.baseballprospectus....

  • they've been buying wins for almost 20 years now, nothing new
  • Why bother with moneyball? If your stadium is more than 10 years old, just whine you need a new one to provide the revenue to be competitive. You can threaten to leave for another city, promise to get an All-Star game, or just quit spending money on decent players for a while to convince the fan base that you really aren't competitive.

    The Twins did a combination of all these things, but of course, the owners decided that more money in their pockets was the real goal as the new money from their shiny, taxp

  • by Anonymous Coward
    It is likely the Boston Red Sox. There was talk of this at the Analytics conference in Boston a month ago.
  • by gatkinso ( 15975 ) on Saturday April 05, 2014 @08:39AM (#46668785)

    ...why haven't they been doing this from the start?

    • It was probably hard to find a super computer in 1876.
    • ...why haven't they been doing this from the start?

      Among baseball people, there is still great resistance to "nerds" and their "spreadsheets". I've heard it stated, on many occasions by many players and managers, you can't understand how to win baseball games if you didn't play the game yourself at the major league level. Last season, Seattle's manager Eric Wedge made more than one sneering comment about spreadsheet users who hadn't played the game since Little League, for example.

      There are a handful of players who DO look at advanced metrics in an attempt

  • "They are looking for platforms that allow users to look at facets of a given dataset, adding new cuts to see how certain conditions affect the reflection of a hypothesized reality."

    Hypothesized reality? Oh you mean if a coach wanted to give a player performance enhancing drugs that they know they can hide to analyze the wins, or do you mean simulating reduced gravity because you plan to bilk the entire nation in taxes to pay for the next baseball stadium on the moon?

    I don't think baseball needs a supercomputer to analyze just how bored I am watching men be paid millions of dollars to stand around 90% of the time in a grassy field, especially when that cost translates to the average A

  • They need to calculate what to do when players go on paternity leave.
  • by Rogue Haggis Landing ( 1230830 ) on Saturday April 05, 2014 @10:02AM (#46669191)
    One of the great pleasures of baseball is that it generates a vast amount of data for the analytically minded to use and abuse to their heart's content.

    This purchase is presumably related to MLB's recent announcement of a new system that will constantly track and measure the movement of the ball and every player on the field. Supposedly this is going to generate several terabytes of information each game, and some team has decided to buy a Cray as a way of processing all that data. Whether that's a better idea than the proverbial Beowulf cluster I don't know, but that seems to be this team's thinking.

    Most, maybe all, baseball teams have been doing some variant of advanced analytics for quite some time now. Most of this work is proprietary and secret, but there's been a lot of "open source" (or at least publicly available) work that's probably along the same lines. Sabermetricians (baseball stat people -- from "SABR", the Society for American Baseball Research) have gotten very good at measuring offense, and reasonably good at predicting hitters' future numbers. Nate Silver's PECOTA system is the most famous, but there are others that work about as well (ZiPS and Cairo being the ones I've spent time with, plus the "dumb as the monkey on Friends" system called Marcel). Pitching numbers are understood pretty well, at least as they relate to the Three True Outcomes, which are the results of a batter v. pitcher matchup that don't involve any defensive players (i.e., walks, strikeouts, and home runs).
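
    For anyone who wants to poke at the Three True Outcomes idea, here is a minimal Python sketch of a TTO rate: the share of plate appearances that end in a walk, strikeout, or home run. The `BattingLine` type, the function name, and the season line are all made up for illustration; they are not taken from PECOTA, ZiPS, or any team's system.

    ```python
    from dataclasses import dataclass


    @dataclass
    class BattingLine:
        """Hypothetical container for one season's counting stats."""
        plate_appearances: int
        walks: int
        strikeouts: int
        home_runs: int


    def three_true_outcome_rate(line: BattingLine) -> float:
        """Fraction of plate appearances ending in a walk, strikeout, or home run."""
        if line.plate_appearances <= 0:
            raise ValueError("plate_appearances must be positive")
        tto = line.walks + line.strikeouts + line.home_runs
        return tto / line.plate_appearances


    # Invented season line, not a real player.
    example = BattingLine(plate_appearances=600, walks=60, strikeouts=150, home_runs=30)
    print(f"TTO rate: {three_true_outcome_rate(example):.3f}")
    ```

    The same function works for a pitcher's line if you feed it batters faced instead of plate appearances, since the three outcomes are counted the same way from either side of the matchup.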

    The next great frontier of analytics is defense. There's been a lot of work in this field over the last decade, but the problem has always been in getting good data. If a ball is hit towards the shortstop and the shortstop doesn't get to it, why is that? Is it because the ball was hit too hard? Is it because the shortstop was badly positioned by his coaches? Is it because the shortstop isn't very good? Data that's not much more than "groundball to shortstop" can't really answer that question, but the new tracking system promises to answer that sort of question in full by precisely measuring reaction times, routes to the ball, and so forth. This in turn might lead to greater and greater changes in defensive positioning, different emphases in player acquisition, maybe even in-game changes based on small changes in wind patterns or whatever.

    Some of what we're already learning about defense is very surprising. For example, there has been a lot of work done recently on catchers' ability to "frame" pitches, that is, to make a borderline pitch look good. The most current results suggest that the pitch-framing difference between the best and worst catcher might be worth something on the order of 5 wins. That's roughly the difference between having a random scrub and an All-Star as your right fielder, and all from a catcher's ability (or inability) to fool the umpire. It's shocking.

    As for what team this is, when the news first broke it was claimed that the purchasing team "would surprise most people". That rules out the teams that are well-known to be friendly to advanced analytics -- starting with the Red Sox, Yankees, Cubs, and A's. The best guess I've seen is that it's the Phillies -- they have tons of cash and seem to be very behind on analytics, and seem likely to just go out and buy a supercomputer rather than have the MIT grads in their analytics department jerry-rig a bunch of Debian boxes into something cooler and weirder.
    • by tjb ( 226873 )

      Yup, defense is where it is at. The SF Giants won two WS in 3 years by (accidentally, I think) putting together a team that was focused on pitching and defense while everyone else was focused on offense.

      While offense is WAY more important, it is too well understood now to gain any advantage over other teams. In 2002, Billy Beane could flip a guy with a great swing or subjectively good defense for someone with better OPS+ and generate wins because everyone else valued the scouts' opinions and not the numbe

    • by mea2214 ( 935585 )
      The problem with baseball analytics is more in the misuse of mathematics than lack of processing power. Few of what they call "peripheral" stats like FIP and BABIP have any mathematical proofs whatsoever. Other stats try and normalize all external conditions to level the field for all players for evaluation. This introduces ambiguous concepts like "stadium effect" being calculated based upon another set of stats and assumptions. All assumptions along the way add up to introduce bias which is why there
  • by Anonymous Coward

    Psychohistory called, and they want their 5% of the profits.

  • A boffin to explain how these blinky light things work and if they can run Hadoop on the item of searing white hot technology (a LEO III) they have in the basement. In the hope that it can stop the English Cricket Team losing to the Dutch!
  • I thought they already knew which teams are going to win, like wrestling?
