New Google Tool To Find Trend Correlations
Kilrah_il writes "In 2008 Google found a correlation between seasonal flu activity and certain search terms, a finding that allowed it to track flu activity better and more rapidly than previous methods. Now, Google is offering a new tool, Google Correlate, that allows researchers to do the same for other trends. 'Using Correlate, you can upload your own data series and see a list of search terms whose popularity best corresponds with that real world trend.' Of course, Google reminds us that correlation does not imply causation."
Never Forget: Heisenberg Rules (Score:3)
This is a wonderful tool. In the short term, it should allow a lot of people to track interesting trends.
In the long term, though, Heisenberg Rules. If I may paraphrase, "Knowledge of the model, invalidates the model."
Want a real world example today? Stock market. This is why automated make-money tools don't work nearly as well as they should.
Re: (Score:2)
Hell, he almost has me convinced as well.
Re: (Score:3)
Are you suggesting coconuts migrate?
Re: (Score:3)
Sure they migrate. Up to approx 100', mostly vertically.
Re: (Score:1)
This is a wonderful tool. In the short term, it should allow a lot of people to track interesting trends.
In the long term, though, Heisenberg Rules. If I may paraphrase, "Knowledge of the model, invalidates the model."
Want a real world example today? Stock market. This is why automated make-money tools don't work nearly as well as they should.
They work precisely as well as they should. It's just that because they're not adding any information, they will do about as well as the market.
If you use automated tools and are satisfied with relatively low growth, you can split your money into a portfolio that will make a tidy return over time. That's what mutual funds and so forth basically do. You can even steadily shift your money to lower risk funds as you approach retirement. You can make plenty of money this way for the effort you put in, it just t
Re: (Score:1)
Don't drop a bomb like that and then leave us in suspense!
How well should they work?
US only? (Score:3)
Unfortunately the service appears to be limited to US search data. Hopefully this will be extended in the future.
Re: (Score:3)
Other places have privacy laws that Google isn't ready to lawyer up over.
Google == free stuff! (Score:3, Interesting)
I'm really starting to like this company. Free web browser, free word processor (and spreadsheet?), free language translation, free nudie pics, free scanned books, free email, free Usenet reader, and now this cool Dataset research tool.
Still not sure I want to store my documents on the internet though. (1) Not secure. (2) Government can review the documents without having to ask a judge for a warrant. But overall I guess Google is a decent company. Why pay for stuff you can get for free and legally?
Re: (Score:2, Informative)
“If you are not paying for it, you’re not the customer; you’re the product being sold.”
Re: (Score:2)
That's odd, I don't feel sold. I don't owe Google or its advertisers anything. No slave traders are knocking on my door.
Perhaps you were searching for a less alarmist term?
Re: (Score:2)
And if the ads don't make you buy things, then your cost is zero. Meaning the stuff you get in return for looking at the ads is really free.
Re: (Score:2)
Nope, you are still the product. Just because you do or don't click the ads doesn't mean much. Your demographic information is their largest product. Everyone who uses Google is a data point that is used for this new product. Where do you think they come up with these cool associations?
Re: (Score:3)
>>>Those things you mention are not Google's product and you are not Google's customer. YOU are in effect Google's product. They're selling you to advertisers and "paying" you with those things.
Same is true with free TV and radio.
Your statement is 100% true, but I don't worry about it. (shrug)
Re: (Score:3)
>>>Explain how is local storage would be more secure than remote storage?
HUGE difference. It requires a warrant to enter my house and obtain the files. A warrant requires probable cause (we suspect he's a murderer, because we smell dead bodies), and review by an impartial judge to approve the warrant.
Remote storage is subject to random snooping by a bored FBI agent browsing through Google's or Apple's or Microsoft's servers. (Thanks to the Patriot Act.)
Re: (Score:3)
With local storage, you have choices. Who can use my computer? Do I use an encrypted volume? Do I use Windows, Linux/*nix, or Mac? What program(s) do I do it with?
With Google Docs, your spreadsheet is in their format. Your letter is in their format. You can export it, print it, and whatever else makes you feel good. They retain your browsing and activity history. They have every email you've sent and received. In theory it's all yours privately. In reality, it's yours, and
Re: (Score:2)
Free in terms of cash, yes, but cash isn't the only possible form of payment. With Google you barter your personal information and habits for all those 'free' things.
Re: (Score:1)
Blah blah blah blah blah blah blah. Blah blah blah, blah blah blah (blah blah?), blah blah blah, free nudie pics, blah blah blah, blah blah, blah blah blah, blah blah blah blah blah blah blah.
what?!? where?!?
Re: (Score:3)
So when do they release the next product: Google Causation?
There is actually a very strong possibility of this. Because sites are ranked by popularity of selection, Google itself could well amplify trends. If you do a search for "family health" and the top results are news reports on increased rates, what would your next search be? If you get sick, what are you likely to put it down to?
Correlation is causation. (Score:1)
Just think of all the things I'll be able to prove with this!
Re: (Score:2)
dear god... it's only slightly less correlated than babies near airports!
Re: (Score:2)
Shouldn't there be a 9 month period between the cause and effect? It should be that Excel causes people to make babies. :)
Re: (Score:1)
I agree with parent's sarcasm - while this is pretty cool, I can't help but feel that it's more of a toy than a tool...
More simply put (Score:2)
From TFA: "like Google Trends but in reverse."
Misleading name? (Score:2)
This tool finds an association between categorical data, namely a search word, and counts of searches for that word. "Correlation" refers to a special type of association, i.e. one between two quantitative variables, which, correct me if I'm wrong, this tool does not measure. Am I being pedantic here? Or should we take a stand for correct and precise usage of statistical terms?
Re: (Score:2)
No, it's correlation. You have one data set (numbers of searches through time for the inputted term) and it compares that with other data sets (the number of searches through time for each term available).
If you click on "Search By Drawing" you can see the two lines - data sets - in the graph: the one you draw and the one with the best correlation from their search terms.
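To make the comparison above concrete, here is a minimal sketch of ranking candidate series by Pearson's r against a target series. The weekly counts and the names `term_a`, `term_b`, `term_c` and `rank_by_correlation` are made up for illustration; nothing here is taken from how Google Correlate is actually implemented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient r between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)

def rank_by_correlation(target, candidates):
    """Return (term, r) pairs sorted from most to least correlated with target."""
    scored = [(term, pearson(target, series)) for term, series in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented weekly search counts, purely for illustration.
target = [10, 12, 15, 30, 45, 28, 14]
candidates = {
    "term_a": [1, 2, 2, 6, 9, 5, 2],  # rises and falls with the target
    "term_b": [9, 8, 8, 3, 1, 4, 8],  # moves opposite to the target
    "term_c": [5, 6, 5, 6, 5, 6, 5],  # small wiggle, mostly unrelated
}
ranking = rank_by_correlation(target, candidates)
for term, r in ranking:
    print(f"{term}: r = {r:+.3f}")
```

Both inputs are quantitative series over the same time steps, which is why calling the result a correlation is reasonable.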
Re: (Score:2)
Correlation does not imply association; but correlation is interesting nonetheless. If you can predict one thing via another, more predictable or measurable thing, then you have a way to track elusive data. That it isn't a cause or an associated thing is immaterial.
Think about paid time off. Sick leave is associated with the flu; but paid time off is a condensation of sick leave, vacation time, etc. So people take a vacation day for a vacation, or a vacation day for being sick... now it's no longer as
xkcd (Score:2)
Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'.
http://xkcd.com/552/ [xkcd.com]
Re: (Score:2)
Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'.
http://xkcd.com/552/ [xkcd.com]
Actually, correlation does IMPLY causation. However, correlation does not EQUAL causation.
Take the following as an example:
I mow my yard without my shirt on. The neighbor lady sees me and pukes. We can IMPLY from the given data that seeing my giant uni-moob (gut), my white, saggy, hairless chest, and my man-scaped back hair landing strip caused my neighbor to lose her lunch.
As it turns out, she had a stomach bug and had been bio-matter from both ends all day, but without all of the data, we were left wi
Assumptions out the window (Score:3)
Correlation is one of those simple statistical terms that lots of non-technical people like to throw around without actually knowing what it means. Google has provided a wonderful tool for everyone, but people need to remember the basic assumptions behind correlations, namely a relatively normal distribution of scores and independence of observations. Independence is especially important if you're tracking search engine results: the number of times people Googled Randy Savage's name the day he died would influence the count for the subsequent day, ultimately biasing whatever other variable you decided to correlate it with.
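One rough way to check the independence concern is to look at the lag-1 autocorrelation of a series, i.e. how well each day's count predicts the next day's. The sketch below uses invented daily counts (a spike that decays over several days); the function name `lag1_autocorrelation` and the numbers are assumptions for illustration only.

```python
def lag1_autocorrelation(series):
    """Correlation of a series with itself shifted by one step."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("a constant series has no defined autocorrelation")
    num = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return num / denom

# Invented daily counts: a news spike on day 4 that fades over the following days.
spike = [10, 10, 10, 100, 60, 35, 20, 12, 10, 10]
# Invented counts that flip every day, so each day predicts the opposite next day.
alternating = [10, 20] * 5

spike_r1 = lag1_autocorrelation(spike)
alternating_r1 = lag1_autocorrelation(alternating)
print(f"spike lag-1: {spike_r1:+.2f}, alternating lag-1: {alternating_r1:+.2f}")
```

A clearly positive value for the spike series means neighboring days carry overlapping information, so the series contains fewer effectively independent observations than its length suggests.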
Re: (Score:2)
Ooh, and I'm sure Google's neat linky home page logos on special days weigh on the results too!
Re: (Score:1)
Yep:
http://correlate.googlelabs.com/search?e=Sir+Arthur+Conan+Doyle's+birthday&t=weekly# [googlelabs.com]
Look through the past doodles (US/Global ones) here: http://www.google.com/logos/ [google.com] and then search for them here: http://correlate.googlelabs.com/ [googlelabs.com] to see how powerful the doodle is at generating search volume.
Re: (Score:1)
http://correlate.googlelabs.com/search?e=Alfred+Hitchcock&t=weekly# [googlelabs.com] is another good example, especially given that this one is just his name and doesn't include the term "birthday" and that searches for his name probably number quite highly anyway.
Correlation vs Causation? (Score:2)
Correlation != Causation (Score:2)
Re: (Score:2)
So I guess there is some truth in the age old saying, Correlation != Causation.
That's true as far as it goes, but in this case it's because they're selecting from a very large ensemble of data series to find the one with the highest R. If you account for the number (and sizes) of data series they evaluated to do that, you can estimate the "true" significance level in a realistic way. Statistics leads us astray only when we fail to apply it properly.
Correlation of a preliminary kind may not "imply" causation, but it can certainly suggest it, sometimes very strongly. A repeatable correl
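The selection effect described above can be seen in a small simulation: generate pure noise, then report only the best correlation found among many candidates. This is a sketch under assumed parameters (2000 candidate series, 20 points each, Gaussian noise, fixed seed); `best_of_many` is a hypothetical helper, not anything Google describes.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient r between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def best_of_many(n_series, n_points, seed=0):
    """Largest |r| between one noise target and n_series unrelated noise series."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_series):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best

single = best_of_many(1, 20)     # one candidate: typically a weak correlation
many = best_of_many(2000, 20)    # best of 2000: looks impressive, means nothing
print(f"one candidate: |r| = {single:.2f}, best of 2000: |r| = {many:.2f}")
```

None of the series are related, yet the winner of a large enough search will look strongly correlated, which is why the number of series examined has to enter the significance estimate.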
Weird science (Score:4, Insightful)
Trends in online web search query data have been shown useful in providing models of real world phenomena. However, many of these results rely on the careful choice of queries that prior knowledge suggests should correspond with the phenomenon.
Yes, that is how science is done; hypothesis, predict, test, evaluate.
Here, we present an online, automated method for query selection that does not require such prior knowledge. Instead, given a temporal or spatial pattern of interest, we determine which queries best mimic the data. These search queries can then serve to build an estimate of the true value of the phenomenon.
So we have a backwards type of science: Evaluate, test, predict, hypothesis. Cuz hey, if there's a correlation, there must be a relation, and if there's a relation, we can build an estimate of the value of the relation, right? The marketing manager is gonna LOVE this....
Re:Weird science (Score:5, Insightful)
Absolutely correct, this is going to swamp us in false positives. Remember, in order for science to work the way it's supposed to, we have to report the negative results as well as the positive results. If 20 groups do the same experiment and only one gets a result significant at p=.05, that "positive" result doesn't mean anything. p=.05 means that even with no real effect, a result at least this strong would turn up by chance about one time in 20.
It's the same thing here. If Google goes out looking for positive results, and ignores all the negative results this is going to be so skewed as to be worthless.
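The arithmetic behind the 20-groups example can be sketched as follows, assuming the tests are independent and there is no real effect. The Bonferroni threshold shown is one standard correction, added here for illustration; the comment itself does not name it.

```python
def prob_at_least_one_false_positive(alpha, n_tests):
    """Chance that at least one of n independent null tests passes at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the overall false positive rate at most alpha."""
    return alpha / n_tests

fwer_20 = prob_at_least_one_false_positive(0.05, 20)
per_test = bonferroni_threshold(0.05, 20)
print(f"20 tests at p=.05: {fwer_20:.2f} chance of at least one false positive")
print(f"Bonferroni per-test threshold: {per_test:.4f}")
```

With 20 tests the chance of at least one spurious "significant" result is about 0.64, so a single hit among 20 attempts is close to what chance alone would produce.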
Re: (Score:1)
It's not designed to answer questions. It's designed to help you form hypotheses. Correlation is not causation but it *might* imply a relation. That's why a human being with a brain looks at the results and goes "hmm, that's interesting".
As an example, "google" correlates well with "kratom", a plant used for herbal remedies. The correlation is very high, too, 0.98! I don't blindly assume they are related, though. But I could parse down to see if any other connections make sense and could then test tha
Re:Weird science (Score:5, Insightful)
Yes, that's true, except there's a step before hypothesis. Observe. You're not allowed to use data from your observations that generated the hypothesis to support it, but you are allowed to use data to build a hypothesis in the first place.
As their comic points out, property values correlate to liposuction searches. That's an interesting fact that you might base a socioeconomic hypothesis on. You could then turn to other avenues of research to validate your hypothesis.
Not everything in science is a race to conclusions.
Re: (Score:2)
That entirely depends on the precision of the IQ number, doesn't it?
What if your IQ number is a float number with a non terminating non repeating decimal component? I suppose then people would have an infinite number of digits in their IQ. Maybe we could draw a correlation between the number of digits people mention when reporting their IQ and how anal they are?
Re: (Score:2)
Then it's not a float. That's an infinite precision number. Floating point numbers have a fixed precision, but not a fixed magnitude.
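A quick illustration of "fixed precision, but not a fixed magnitude", assuming the usual IEEE 754 double that CPython uses for `float` on common platforms:

```python
import sys

# Number of significant binary digits a float can hold (53 for IEEE 754 doubles).
mantissa_bits = sys.float_info.mant_dig

# Precision is fixed: above 2**53, adding 1 no longer changes the value.
big = float(2 ** 53)
lost_unit = (big + 1.0 == big)

# Magnitude is not fixed: very large and very small values are both representable.
huge, tiny = 1e300, 1e-300

print(mantissa_bits, lost_unit, huge > 1.0 > tiny > 0.0)
```

The same number of significant bits is available at every scale, so a non-terminating IQ would need something other than a float.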
Re: (Score:2)
How do you know the SIZEOF() my float?
Are you saying *your* computer doesn't have arbitrary precision floats? Step into the 25th century!
Correlate the Correlation (Score:2)
This is great! Now we can finally analyse what people are correlating in Google Trends that tells us what people are searching, then we can use this correlation search data to build Google Correlate Correlate, then we can use this to analyse what people are correlating on things that other people are correlating, then.. then the thing goes on and on and on..
1) Google Search
2) Google Trends
3) Google Flu Trends
4) Google Correlate
5) Google Correlate Correlate
6) Google Correlate Correlate Correlate
7) ???
8) Prof
Google Causation? (Score:1)
Possible cause of recession found (Score:1)
Highest normalized weekly search term volume? (Score:1)
Here's a quick game. Try and find a term with the highest weekly search volume when normalized against the usual search volume for that term.
Here are a few that I tried:
http://correlate.googlelabs.com/search?e=inauguration&t=weekly# [googlelabs.com] - 19.637
http://correlate.googlelabs.com/search?e=Michael+Jackson&t=weekly# [googlelabs.com] - 14.537
http://correlate.googlelabs.com/search?e=Olympics&t=weekly# [googlelabs.com] - 11.656
http://correlate.googlelabs.com/search?e=new+year's+eve&t=weekly# [googlelabs.com] - 8.355
Also, check out the "Search by Drawing"
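For anyone playing the game above offline, here is one possible reading of the scores, sketched under the assumption that they measure how many standard deviations the peak week sits above the term's mean (the comment does not say how the numbers are computed). The weekly counts and the name `peak_z_score` are invented for illustration.

```python
import statistics

def peak_z_score(weekly_counts):
    """Standard deviations between the busiest week and the term's mean volume."""
    mean = statistics.fmean(weekly_counts)
    stdev = statistics.pstdev(weekly_counts)
    if stdev == 0:
        raise ValueError("a constant series has no defined z-score")
    return (max(weekly_counts) - mean) / stdev

# Invented data: a term that is flat all year except for one huge week.
one_off_event = [100] * 51 + [2000]
# Invented data: a term that swings between two levels every week.
regular_swing = [100, 200] * 26

event_score = peak_z_score(one_off_event)
swing_score = peak_z_score(regular_swing)
print(f"one-off event: {event_score:.3f}, regular swing: {swing_score:.3f}")
```

Under this reading, terms with a single isolated burst score highest, because the burst barely moves the mean and everything else sits in a narrow band.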
Correlation is evidence of causation (Score:2)
A proof that correlation is evidence of causation,
even though correlation does not imply causation:
http://kim.oyhus.no/CorrelationAndCausation.html [oyhus.no]
Autism and Christmas (Score:2)
We see people, not topics (Score:2)
just for fun (Score:1)