Google Snaps Up Stats Tool from Swedish Charity
paulraps writes "A stats program that began as a teaching aid for a university lecture has just been bought by Google for an undisclosed sum. The statistics tool, Trendalyzer, was developed by a professor and his son at Stockholm's Karolinska Institute. Unfortunately for the developers, the project has been run under the auspices of a charity, Gapminder, and financed over the last seven years by public money. Maybe that seemed smart at the time, but the professor, admitting that he won't see a dime of Google's cash, now seems regretful. As for what Google has purchased: 'Public organizations around the world invest 20 billion dollars a year producing different kinds of statistics. Until now, nobody has thought of collecting all the information in the same place. That should be possible with Trendalyzer, which will be able to present that quantity of data in a clear way as well as giving the user the ability to compare many different kinds of information.'"
huh... (Score:2, Insightful)
Re: (Score:1)
Google is buying non profit orgs.... must make them evil!!!.....nah nah ok
I for one welcome our non profit buying overlords!
that works a little better..... on a serious note, all the developers who were working on this for free are now kicking themselves
Re: (Score:1, Troll)
Re: (Score:1)
Ouch (Score:1, Redundant)
Major bummer.
Re: (Score:1, Offtopic)
Re: (Score:1, Insightful)
Any Regrets? (Score:2, Interesting)
At least he can be content to know that Google will be the bestest, most very perfect company ever, since they come right out and say, at every opportunity, that their policy is "don't be evil".
And since they say they won't be evil, we know they can't be lying! (Please ignore how they help totalitarian right-wing regimes to identify people who speak out against them, and empower governments to clamp down on free speech)
Wait a minute.. (Score:4, Insightful)
And how did this software get under the control of the non-profit? Is the prof getting a salary from them?
That the summary says Google "snapped up" the software seems to suggest that Google snatched it out of their hands or something. I've got a feeling that money changed hands somewhere along the line. Somebody got paid, and I'm betting it was a bundle. Anybody who's smart enough to write an important bit of software ought to be able to read a contract before he signs it. And if he thought that just because an organization is non-profit it means that it's not looking to get a pile of cash then maybe he's been vacationing on Pluto for the past few decades or doesn't read the business section of the newspaper. If he didn't write the software to make money, then he shouldn't cry because he didn't make money. If he wanted to make money from his software, then he should have asked a few questions before releasing the project.
I'm among the most anti-big corporation commenters around here, but I'm more intrigued by what's not in this article than by what's there. I'm not ready to hang an evil jacket on Google just for buying something that was for sale.
actually... (Score:1)
Re: (Score:2)
Ulterior Motives (Score:5, Insightful)
Google, I dig you for now, but I'm not really sure that I care for the idea of having google own nearly all of the search data for every search done by every individual around the planet in the history of google and beyond combined with all of the world-wide traffic analysis data.
And as someone who would be targeted for this service -- why would I bother? There are plenty of free open source utilities out there that provide every ounce of data you could ever want and they're incredibly easy to configure and deploy.
No, the benefit here seems to be less for the end-users deploying the service and more for whoever google then turns around and sells the massive amounts of correlated information to. For instance, let's see every bit of data about a specific user so we can see everything from each search he does to his entire browsing trail. Bet we could sell that for a lot of money!
Hopefully you will still have a simple way as a user to prevent google from collecting this information just like you can do with their stupid Urchin service (by blocking it). And, sadly, people will still continue to use this new service because they'll sell out their mother's medical history and offer up a sample of their own blood and cholesterol ratings if it means getting something "for free".
Re:Ulterior Motives (Score:5, Informative)
"Targeted" advertisements are still group-based efforts. Your individual browsing history is only valuable up to the point where you can be lumped into a marketing stereotype.
About ten years ago, I went online searching for prices on printer ribbons for an IBM Proprinter II. The email address I supplied one website is still receiving spam from that one encounter, not for Proprinter ribbons, not for dot matrix supplies, but for inkjets and toner cartridges. I got lumped into a "shops for printer supplies online" marketing group; nobody's ever sent me an offer for supplies for my Proprinter II. (Though, once he found out I had a use for it, a guy handed me a box of 8.5"x11" tractor feed paper yesterday.)
But get the groups down to enough detail... (Score:2)
* late 20s/early 30s
* female
* elementary school teacher
* teaches English/ESL
* looking for an activity for teaching sight words
* needs it for this Friday
* searching during either her lunch hour, a prep period, or from home after she puts the kids to bed
I sell Bingo Card Creator (http://www.bingocardcreator.com), which conveniently has sight words bingo built into i
Re: (Score:2)
Do you have any idea why gender and need ("by this Friday") are such important factors, or are these simply the results of your data collection?
Do you have enough data to tell how much of an influence each variable is on demand? Can you say, for example, that a female with all of those attributes is twice as likely to buy as a male with the same attributes, or a 50-year-old teacher is 1/10th as likely to buy as a 25-year-ol
Re: (Score:1)
Google have made their billions based on the idea of per-individual marketing. Amazon have grasped the "long tail" firmly and are still around to sell books that I would like, not necessarily those that the larger public would want.
I see this as another feather to Google's cap, one that they can make wads of cash from, selling not only individuals' clicks, but also the generic trends too. Google is becoming
Re: (Score:1)
Re: (Score:1)
I was countering your comment that it's not worthwhile trying to market to individuals, and providing Google and Amazon as evidence.
Re: (Score:1)
Re: (Score:2)
Anyways, I was just thinking the other da
Re: (Score:2)
It's heresy, I know, but perhaps Google is beginning to deserve a borg icon?
Re: (Score:1)
Given the attitude of the general public to Google, I think it would be apt to make it the Emperor from Star Wars:
"This is how [the internet] dies. With thunderous applause..."
So we can look forward to more accurately.. (Score:1, Offtopic)
Re: (Score:2)
huge waste of space, and I block it. If they're friendly little links and actually interesting stuff, I let them be.
Few sites survive the crap-test, since I'm an intolerant asshole
But oh, so nice it is once the ads are gone - you can actually see content then
Re: (Score:3, Insightful)
Now it seems t
2400 baud, and I feel fine. (Score:3, Funny)
Back before Google, or even Yahoo. Back when a T1 cost $1500/mo or more, making entry into the ISP business difficult. Back before multimedia content (shareware games) pushed your average home user's bandwidth above 2400 baud.
Yeah, commercialization of the Internet really destroyed its value.
Re: (Score:3, Funny)
$30K? ouch. I was spending that kind of money on my web service also, until I managed to negotiate a volume discount with the escort service I use. I even had enough money left over to buy a much better webcam, and a professional-grade fireman costume.
Re: (Score:2)
What does it do? (Score:5, Informative)
Re: (Score:1)
I think we all presumed it was some sort of web data collecting tool. But maybe it's about collecting random stats like the political party makeup of each country and how many people in each country own what kind of car or have a certain carbon footprint. But who knows. Either way, that is hardly new either. As fo
Re:What does it do? (Score:5, Interesting)
Don't dismiss this without knowing anything about it.
Re: (Score:2)
Re:What does it do? (Score:5, Informative)
Re:What does it do? (Score:4, Funny)
So now it's lies, damn lies, and Pacman on acid?
see Rosling demonstrate it himself (Score:4, Interesting)
Re: (Score:2)
Prof Rosling gives an entertaining lecture and casually explains the state of the world today with a few animated visualization tools - fabulous stuff. His personal mission seems to be to enable the connection of masses of statistics available all over the world to his neat visualization tool set to help the world 'see'. That is to help anyone who is interested study numerical data from public domain databases be they entrepreneurs or academics. Rock on tommy!
Google gets its slice of the act
A nice visualization tool (Score:2)
It crams five axes into a single window, using the "usual" two (x & y) axes, plus color, size and animation for the other three axes. Works fine when you use something size related for size, time for animation, and something discrete for color, as in the example.
Developers (Score:4, Insightful)
From the Gapminder site [gapminder.org]:
To me, this seems to imply that the professor and his son were the original developers, not the maintainers. Or perhaps just his son is going to Google?
Hopefully they'll hire him (Score:1)
Re: (Score:1)
Re:Hopefully they'll hire him (Score:4, Interesting)
sure, like everyone would love to work for google (Score:1, Insightful)
Nobody thought about it before? (Score:1)
Re: (Score:2, Interesting)
Check out This video [google.com] as can be found on one of Zonk's links. [gapminder.org]
The idea is NOT to collect all the data of the world centrally, it is to link to the pre-existing data and display it in a useful way. The software looks incredibly innovative, I doubt there is anything similar for two reasons (1) Google wouldn't have bought it (2) TV stations here in Australia would be showing trends with the software just as they now show various parts of the earth with Googl
If this was developed with public money... (Score:5, Insightful)
Re: (Score:1, Interesting)
Because it would be a violation of privacy.
Re:If this was developed with public money... (Score:5, Interesting)
The law regarding software and publicly-funded inventions has not always been as it is now. It used to be the case that most significant publicly-funded software HAD to be in the public domain, which AFAIK is why we have the BSD license today. Also witness early versions of Gaussian (quantum chemistry).
These days lots of 100% publicly-funded software is not automatically released to the public domain but instead held ransom by the author or university with a separate license permitting unlimited government use. This directly affects me: essentially ALL of the current quantum chemistry code that produces publishable results is no longer free for everyone to use. Though most programs come with source (they have to for some of the systems we need to run it on), their license restrictions are very onerous for developers: only the PI can register to download it, or it costs 5000 euros per seat, or it cannot be ported to other platforms, etc. One program even revokes licenses from academics who use competing software in the same domain! And this is almost ALL software written by tenured professors and their graduate students funded from government grants.
I think we all did much better with the old formula. University-developed code should be available for everyone to use, even if that means someone can later come along and compete with a closed-source version.
I'm curious if the Swedish system more closely resembles the current USA system or the old USA system.
Re: (Score:2)
Re: (Score:1, Funny)
Lack of 1337 skillz, of course.
Re: (Score:2)
True. That should be too.
That said, why isn't that medical history itself public domain? While we're on that, why am I not able to walk into a public library and read your driver's license, birth records, marriage records, medical history, criminal records, and so forth?
Actually, a lot of that winds up in court records, at least here in the US. You can walk in and read any public court
Re: (Score:2)
Why isn't the software that manages your medical history public domain, given that the public healthcare system funded it.
True. That should be too.
No. It shouldn't. The potential damage that could occur with random Joe Q Public having access to the entire methodology behind the storage of people's most private data, without even the legal protection of an NDA is just... astronomical. I think the poster of the root of this particular thread is just another of those anti-copyright zealots who think that every single thing developed should be public domain.
That said, I'll address some of your other points as well, since I do agree with some. With
Re: (Score:2)
What kind of damage could Joe Q Public do if they had access to database schema and application code? Medical records are already protected (or so we think) by elaborate security measures, without the proper passwords just having the codebase poses no risk to the data its
Re: (Score:2)
You say most government grant funded software should be public domain. How exactly do you decide what fits into the "most" and what fits into the "no" categories? Should that be your decision? In these cases, governments should be weighing the pros and cons of release of information. In the case of the software that manages your medical records, there are no pros to releasing th
Re: (Score:1)
Why isn't the software that manages your medical history public domain, given that the public healthcare system funded it. ...
Actually it already is public domain [vistasoftware.org] (warning for PDF). Or which countries were you talking about? If you don't have it, you can download [hardhats.org] it and set it up.
Re: (Score:2)
Re: (Score:2)
Re: (Score:1, Informative)
There is no bad guy in this so called drama.
So
Re: (Score:2)
Internal medicine, perhaps? I'm struggling to think of what international medicine could be other than "diseases of people who travel".
That's why they call it charity (Score:2, Flamebait)
Re: (Score:1)
Re: (Score:2)
That's a rather limited outlook (Score:2)
I suspect most people who donate time/money to a charity do so under the assumption that nobody will get a big payoff from it (unless you consider the beneficiaries of the charity as a whole to have gotten a big payoff). It's the same reason people feel conned when they find out a charity they donated to uses 90% of its received donations to pay for administrative overhead.
Re: (Score:1)
I do question the way it smells like the author of the software will no longer be involved with it. That part seems foolish.
Re: (Score:2)
People Do Things For Different Reasons (Score:5, Insightful)
Yet life is not fair and often people have regrets and indulge in "what if" fantasies.
For something like this, even if the fellow gets no money, he can get publicity and recognition and might be able to leverage that into something to get him more money if that's what he wants.
The past is past and the price for obtaining "justice" and "fairness" can be quite high and more than one should have to pay; you can lose your future doing it.
Learn from the past and develop a plan to move forward and leverage on the lessons learned; the best revenge is always living well.
Significance levels and missing data (Score:5, Interesting)
Shortly thereafter, a site called Nation Master [nationmaster.org] cropped up, with a bit flashier and simpler user interface, but focused on CIA World Fact Book data, rather than the States of the US. (The same folks later did State Master [statemaster.org] using similar UI technology.)
Finally, Google tested Gapminder [google.com] with an even spiffier and simpler UI -- again focusing on by-nation correlations.
Aside from the usual complaints about "The Ecological Fallacy" [wikipedia.org] (a fallacy that cuts both ways BTW) there are two big pitfalls for this stuff:
What I did about missing data was simply eliminate any data points where data was missing from one or both of the variables being correlated. This reduces the sample size, hence statistical significance, but it bypasses arguments over what sort of missing data should be used. The Netflix Prize [netflixprize.com] is coming up with really good algorithms to compute missing data efficiently and accurately so maybe there is hope for something more effective here.
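For anyone who wants to see what that kind of pairwise deletion looks like, here is a minimal sketch in plain Python. It is not the code behind Laboratory of the States; the function names and the use of `None` as the missing-value marker are assumptions made for illustration.

```python
import math


def drop_missing_pairs(xs, ys):
    """Keep only the positions where BOTH variables have a value.

    Assumption: missing values are represented as None.
    """
    kept = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return [x for x, _ in kept], [y for _, y in kept]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant variable has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def correlate_complete_cases(xs, ys):
    """Return (r, sample size) after dropping incomplete pairs."""
    cx, cy = drop_missing_pairs(xs, ys)
    return pearson(cx, cy), len(cx)
```

Returning the surviving sample size alongside the coefficient keeps the cost of the deletion visible: every dropped pair shrinks the sample that any later significance statement rests on.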
Statistical significance is more difficult to deal with. Usually one must look at tables for statistical significance of correlations under the assumption that the variables each follow a normal distribution. Unfortunately, many variables follow polynomial (like squared) or exponential distributions, so you have to do things like take the sqrt or log of one or both of the variables to try to normalize them. However, when you are looking for correlations, sometimes it is the relationship that is polynomial or exponential -- in which case you can apply sqrt or log to get the maximum correlation coefficient at the sacrifice of normality of one or both of the variables. Unfortunately, there is no simple arithmetic formula for calculating the significance level of a correlation given a non-normal distribution -- you can't just plug in the skewness, kurtosis, etc. as well as sample size and correlation coefficient, and get out a valid statistical significance. Therefore it is hard to make good statements about many very important correlations without watering them down to meaninglessness.
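One way around the lack of a closed-form formula is to estimate significance by resampling instead. The sketch below is a permutation test, which is a different technique from the table lookup described above and not something the sites mentioned here are known to use. It shuffles one variable many times and counts how often the shuffled correlation is at least as strong as the observed one. The function names, the default of 2000 permutations, and the fixed seed are illustrative assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant variable has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Pearson correlation.

    Makes no normality assumption: the null distribution is built by
    shuffling ys, which breaks any pairing between xs and ys.
    """
    rng = random.Random(seed)  # fixed seed only so runs are repeatable
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so float noise does not hide exact ties.
        if abs(pearson(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

The trade-off is compute time and a p-value whose resolution is limited by the number of permutations, but it sidesteps the question of which distribution the variables follow.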
Also, a complaint about the "simple" user interfaces:
Some of the worst reporting from news media comes when they refuse to report statistics in terms remotely related to anything meaningful -- for example you will frequently hear statements to the effect that "California has the most orange trees in the nation." or some such. Such statistics are nonsense for the purposes of correlation studies since the size of the ecology (California state) is all you are really measuring with such statements. You have to divide by the population or divide by the total GDP or something to rationalize the ecology against other ecologies.
In Laboratory of the States, I did this with all my variables but I also left the raw variables around and allowed people to do arithmetic on them -- like dividing them -- to get their own rational comparisons if for some reason my choices were not adequate. This problem isn't as bad with Gapminder as it is with Nation Master and State Master -- but Gapm
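As a rough illustration of the "divide by population" point, here is a small Python sketch. The region names and figures in the usage note are made up purely for illustration, and `per_capita` is a hypothetical helper, not part of any of the sites mentioned.

```python
def per_capita(raw_counts, populations, per=1000):
    """Convert raw totals into rates per `per` inhabitants.

    raw_counts and populations are dicts keyed by region name.
    Regions missing from either dict, or with a non-positive
    population, are skipped rather than guessed at.
    """
    rates = {}
    for region, count in raw_counts.items():
        population = populations.get(region)
        if population is None or population <= 0:
            continue
        rates[region] = count * per / population
    return rates


def rank_regions(values):
    """Region names sorted from the largest value to the smallest."""
    return sorted(values, key=values.get, reverse=True)
```

With made-up totals such as `{"A": 100, "B": 50}` and populations `{"A": 1000, "B": 100}`, region A leads on the raw count while region B leads once the figures are expressed per 1000 inhabitants, which is the reversal the "most orange trees" style of reporting hides.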
Re: (Score:2)
True, but isn't that what rank correlations are for? Sure, they aren't as efficient as the Pearson (or similar) correlations, but their strength is precisely that they don't rely on questionable assumptions of norma
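A minimal sketch of one such rank correlation, Spearman's rho, written in plain Python: it replaces each value by its rank (ties get the average rank) and then computes an ordinary Pearson correlation on the ranks. The function names are illustrative, and this is only one of several rank-based measures.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant variable has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

Because only the ordering matters, any monotone transform such as log or sqrt leaves the result unchanged, so there is no need to pick a transform to "normalize" the variables first.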
Watch out for data dredging (Score:2)
How do you use this? (Score:1)
Re:How do you use this? (Score:4, Interesting)
It's also not as new as people are making it out to be: it's a variant of a scatter plot, and those have been around for a while. To murder a quote from Hamlet:
There are more things in infographic design, OldBaldGuy, Than are dreamt of by Microsoft Excel.
Re: (Score:1)
I prefer tools like Ggobi http://www.ggobi.org/ [ggobi.org], its predecessor XGobi http://www.research.att.com/areas/stat/xgobi/ [att.com] or commercial products like SAS' insight or jmp. I'm sure there are others. They allow you to tour and manipulate the data through linked plots and displays and selectively turn on and off elements.
Re: (Score:2)
Re: (Score:1)
Hans Rosling and Trendalyzer video from TED 2006 (Score:4, Informative)
I think one has to see Rosling work with Trendalyzer to appreciate what that piece of software can do. He got standing ovations for his presentation at the TED conference in 2006 [ted.com]. Very cool.
Wrong license? (Score:3, Insightful)
Re: (Score:3, Interesting)
http://osflash.org/pipermail/osflash_osflash.org/2006-September/011238.html [osflash.org]
Gapminder appears to be made from mostly open source code:
"mtasc, hamtasc, swfmill, eclipse, swftools, Flash Javascript Integration kit (right now using SWFObject) are some of the tools we've used."
The design solutions are unique but the code that was developed seems trivial. Why not open source it? Perhaps the university calculated that selling to google for a small(?) sum was worth more in publicity than open sourcing the project. too bad.
Always thought statistics was boring.. (Score:3, Informative)
Re: (Score:2)
60% don't and
5% just make your brain implode.
"Don't be evil" (Score:2)
Of course then they would have less money for the gourmet food for their employees.
Re: (Score:2)
Gapminder TechTalk (Score:3, Interesting)