Google Launches a Data Prediction API 70

databuff writes "Google has released a data prediction API. The service helps users leverage historical data to make predictions that can guide real-time decisions. According to Google, the API can be used for prediction tasks ranging from product recommendations to churn analysis (predicting which customers are likely to switch to another provider). The API involves three simple steps: upload the data, train the model, then generate predictions. The API is currently available on an invitation-only basis." Google also recently announced several other API additions, including Buzz, Fonts, and Storage.
This discussion has been archived. No new comments can be posted.

  • I Predict ... (Score:2, Insightful)

    ... that Google will do their own analysis on your data. They're nothing if not thorough.
  • 1. Upload the data.
    2. Train the model.
    3. Generate predictions.
    4. PROFIT!!!!
  • Psychohistory ? (Score:3, Interesting)

    by Vapula ( 14703 ) on Thursday May 20, 2010 @09:05AM (#32278624)

    What about feeding it historical events, training it on the outcomes of those events, and trying to get a glimpse of which way the future will evolve?

    • Re:Psychohistory ? (Score:4, Interesting)

      by 0100010001010011 ( 652467 ) on Thursday May 20, 2010 @09:14AM (#32278776)

      Or use the last half of your data set as blind data. Train the model on 1900-1990 and see if it can predict 1990-2000.

      How far can you predict? 1%, 10%, 50%?

      If you really want to see how good it is, feed it stock market data and see how well it predicts that.
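
A minimal sketch of the blind-data test described above, assuming a plain least-squares trend line as a stand-in for whatever model the API actually trains. The series is synthetic and made up for illustration (its slope changes in 1990), and the helper names `split_by_year`, `fit_line`, and `mean_abs_error` are hypothetical.

```python
from statistics import mean

def split_by_year(series, cutoff):
    """Split (year, value) pairs into training (< cutoff) and holdout (>= cutoff)."""
    train = [(y, v) for y, v in series if y < cutoff]
    holdout = [(y, v) for y, v in series if y >= cutoff]
    return train, holdout

def fit_line(points):
    """Ordinary least-squares fit of value = slope * year + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def mean_abs_error(model, points):
    """Average absolute gap between the fitted line and the observed values."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) for x, y in points)

# Synthetic series (illustration only): a steady trend that steepens in 1990.
series = [(year, 2.0 * (year - 1900)) for year in range(1900, 1990)]
series += [(year, 180.0 + 5.0 * (year - 1990)) for year in range(1990, 2001)]

train, holdout = split_by_year(series, 1990)   # train on 1900-1989
model = fit_line(train)                        # the model never sees 1990-2000
train_error = mean_abs_error(model, train)
holdout_error = mean_abs_error(model, holdout)
print(f"train error: {train_error:.2f}, holdout error: {holdout_error:.2f}")
```

The fit on the training years is essentially perfect, yet the error on the held-out decade is large because the trend changed after the cutoff. That gap is exactly what a blind test is meant to expose.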

      • Stock market data by itself is insufficient to predict the stock market because of all the external variables. It would be impossible to predict the post 9-11 crash for instance because there is nothing in the markets that changed leading up to it. It would be difficult to predict the more recent meltdown because it was caused by a combination of lax oversight, repealed laws, semi-legal trading techniques, and a culture of over borrowing. It's possible that you may be able to predict the minute to minute

        • Re: (Score:3, Interesting)

          by Kilrah_il ( 1692978 )

          The nice thing about the stock market is that when everything is fine the analysts say that their models are great, but when something unexpected happens they go all "but we couldn't have foreseen that. Except for this unexpected incident, our models are great!". The problem is that these "unforeseen incidents" are what drives most of the extreme changes in the stock market, and more generally, in our entire society.
          Just look at 9/11 (to use your example): It not only affected the economy, it affected (and

          • Re: (Score:3, Interesting)

            by fusiongyro ( 55524 )

            +1! "Past performance is not a predictor of future success." Taleb is my hero. Everyone should read Fooled by Randomness, which I didn't find repetitive at all.

          • Re: (Score:3, Insightful)

            by S-100 ( 1295224 )
            Like I heard a seasoned stock trader once say: "Technical analysis works great, until it doesn't".
    • There is a 97.5% chance that this could all end badly.

      Good book :)

    • Nothing new there. The risk analysis used by most Wall Street firms to calculate their risk exposure does just that. And we all know how that turned out.
  • by cntThnkofAname ( 1572875 ) on Thursday May 20, 2010 @09:06AM (#32278634)
    Given my family history... is there a girl for me?
  • by psbrogna ( 611644 ) on Thursday May 20, 2010 @09:15AM (#32278778)
    I can't wait to take my Droid to Vegas once this launches!
  • Taking into account they released it, and they probably used it to predict its own success; this will either:

    - work, and be a success
    - not work, and fail.

    The future is here!

  • Data mining (Score:3, Informative)

    by JayJayEm ( 220851 ) on Thursday May 20, 2010 @09:18AM (#32278842)

    When I used to work in the financial services industry we used to call this "data mining". The result is usually at best worthless and at worst dangerous as it is so often misused.

    It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".
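
The saying can be illustrated with a small experiment. This is a sketch under stated assumptions: every series is pure random noise, and the sizes (1000 candidates, 20 points to search on, 20 points to check on) are arbitrary choices made for illustration. The `pearson` helper is written out by hand so that nothing beyond the standard library is needed.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n_fit, n_check, n_candidates = 20, 20, 1000

# Pure noise: the target and every candidate "indicator" are independent.
target = [rng.gauss(0, 1) for _ in range(n_fit + n_check)]
candidates = [
    [rng.gauss(0, 1) for _ in range(n_fit + n_check)]
    for _ in range(n_candidates)
]

# "Look hard enough": keep whichever candidate fits the first 20 points best.
best = max(candidates, key=lambda c: abs(pearson(c[:n_fit], target[:n_fit])))
in_sample = abs(pearson(best[:n_fit], target[:n_fit]))

# The honest check: the same candidate on the 20 points it was not picked on.
out_of_sample = abs(pearson(best[n_fit:], target[n_fit:]))
print(f"in-sample |r| = {in_sample:.2f}, out-of-sample |r| = {out_of_sample:.2f}")
```

Searching through enough unrelated series reliably turns up one that looks strongly correlated on the points used for the search. Since the data is noise by construction, there is no reason for that relationship to hold on the remaining points.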

    • Re:Data mining (Score:4, Interesting)

      by LizardKing ( 5245 ) on Thursday May 20, 2010 @09:57AM (#32279536)

      It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".

      A friend of mine works as a quant at one of the big investment banks. He admitted that the models his team creates are useless at predicting the unexpected (as you'd probably expect). Adding in a degree of randomness rarely produces better models, as there are too many possible sources of such unpredictability and the reactions to them depend on many unquantifiable forces. This results in models that are OK at telling traders what they want to know - that they're doing the right thing by all doing the same thing. As soon as something undesirable or unexpected happens, then all hell breaks loose and the traders panic. Having mulled this over for a bit, I suggested his job was pointless, to which he agreed, but pointed out that the pay's great. So much wasted mathematical genius.

    • It's absolutely data mining, but it's far from worthless.

      Every time you go to Amazon and it recommends something to you, guess what, that's data mining using basically the same techniques that this service will use. And as you might expect, that equates to big $$$ for them (or else they wouldn't be bothering).

      Many many fields use the technology, particularly the medical fields for analyzing the relationships between a large number of input variables (which may or may not be correlated) and some desired out

      • Re: (Score:2, Insightful)

        by JayJayEm ( 220851 )

        OK - I'll admit it - I did engage in a little bit of hyperbole.

        But you have to admit that "at best worthless" has a better ring to it than "at best, when combined with a qualitative analysis of the model itself, and some testing with out of sample data, can be a useful tool in decision making".

        You are right that no investment bank will go anywhere near this.

    • Re: (Score:3, Interesting)

      Often misused, definitely. But that does not invalidate the importance of any tool, including data mining.

      One good example is the Netflix recommendation engine. I know it's far from perfect (as there is nothing perfect about prediction), but is it useful? Hell yeah. It's the best recommendation engine I have used, and I have benefited greatly from it.

      The problem comes when it's applied to areas where the stakes are higher, like risk analysis by the investment banks.

      And that brings me to mention an interesting (old) and r
    • It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".

      I was just thinking that this automation will save unscrupulous scientists all the trouble of fudging the models to make the prediction fit their expected results.

    • by afeeney ( 719690 )

      I prefer the more direct: "Numbers are like people. Torture them enough and they'll say whatever you want to hear."

      More seriously, though, a solid predictive system usually needs both the qualitative and the quantitative analyses. These tools can inform decision-making, but can't make the decisions for anybody, unless the decisions are in the same discrete closed system. There aren't that many entirely closed systems in the world.

    • The way one of my co-workers puts it: If you torture the data long enough, it'll tell you anything you want to hear.
  • Google requires you to have a current Storage-For-Developers account, which is currently available only to US parties.

  • I predict that within the next year someone's blog or the Wall Street Journal will feature a cage match between Google's Prediction API, a chimp with a dartboard, and a magic 8-ball.

  • I know that word is "churn", but the first 3 times I read it as "chum". Anyway, is this similar logic to how Google is able to advertise based on what is discussed in your email?
  • What could possibly go wrong?

    • Given that actively managed funds often outperform the market at large, I would actually expect you to do decently with this strategy, even if Google only gives you random stocks.
    • Well, you could lose everything and kill yourself ... worst case.
  • Please post your privacy concerns in the form of an outraged screed. :-)

  • predict(data)
    {
        delete data
        prediction = random()
        return prediction
    }

    • Quick!

      Patent it in Germany!

      Actually, change the last line to "return whatClientWantsToHear" and you've really got something!
  • Google probably wants to use the data for their own analysis. So, I suggest all of Slashdot team together and forge a large volume of the most bullshit data that will convince Google that, without a doubt, they need to make every first search result named "Frosty P1ss!" linked to goatse in order to make their customers happy.
  • Now I see why the Amazon Cloud people have been so insistent that people in Hacker Dojo's machine learning class run problems on their "cloud".

    This stuff is actually fairly routine by now. It's much the same technology that's behind spam filters.

  • The post made me think of Asimov's old Foundation series books.

  • The API predicts that there will be an empty niche/opportunity in a day, then everyone who uses it jumps there, so the prediction fails because the niche becomes overcrowded. It is very easy to turn predictions for everyone into predictions for no one if all try to take advantage of that knowledge.
  • It's interesting to see this coming, as in Google becoming a digital Hari Seldon. But while it's good to have plenty of info on which to base decisions, it's becoming what in the Army is referred to as "paralysis by analysis". At some point, you need to trust your instincts, and do it. Poring over the amount of data Google can provide, filtering what is relevant (Google isn't perfect), and then deciding what to do would likely take longer than going with your gut, or the smaller amount of available data
  • Well? Can it predict the rest? :)
  • The use of the word "predict" is for ease-of-understanding for the business market and those not familiar with machine learning. Many of the comments here are getting lost in that word. The algorithms behind the API are most likely the same basic ones that have been around for a long time: naive bayes, svm, knn, etc. The actual novelty of this service is that it puts these methods in easy reach for people who otherwise wouldn't know where to start looking, or wouldn't know how to use one of the many availab
  • That's actually a pretty nice idea. The thing seems to have some caveats though: only categorical labels are allowed, training sets are limited to 100 MB, and no sparse features can be used. There's also no info on whether things like cross-validation are done or which algorithm will be chosen. I also wonder how fast the prediction phase will be. Still pretty neat.
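
For readers unfamiliar with the terms in the two comments above, here is a rough sketch of k-fold cross-validation wrapped around a tiny k-nearest-neighbours classifier with categorical labels. Nothing here reflects what the Prediction API does internally, which the thread notes is undocumented. The functions `knn_predict` and `cross_validate` and the toy data are all made up for illustration.

```python
from collections import Counter

def knn_predict(train, point, k=3):
    """Label `point` by majority vote among its k nearest training rows."""
    by_distance = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

def cross_validate(rows, folds=4, k=3):
    """Return mean accuracy over `folds` held-out slices of `rows`."""
    scores = []
    for i in range(folds):
        test = rows[i::folds]  # every folds-th row, starting at i
        train = [r for j, r in enumerate(rows) if j % folds != i]
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / folds

# Toy data (illustration only): two well-separated clusters of 2-D points.
rows = [((float(i), float(i % 3)), "low") for i in range(8)]
rows += [((float(i) + 20.0, float(i % 3)), "high") for i in range(8)]

accuracy = cross_validate(rows)
print(f"mean held-out accuracy: {accuracy:.2f}")
```

Each row is scored only by a model that never saw it during training, so the averaged accuracy estimates how the classifier would do on new data. On clusters this cleanly separated, every held-out point is classified correctly.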
  • I asked the Google Prediction API what the next Google API would be, and it said "Google Prediction API".

  • Past performance does not guarantee future results.
