'There is No Such Thing as Data' (ft.com) 84
What we have are innumerable different collections of information, each of them specific to a particular application. Technology analyst Benedict Evans writes: Technology is full of narratives, but one of the loudest and most persistent concerns artificial intelligence and something called "data." AI is the future, we are told, and it's all about data -- and data is the future, and we should own it and maybe be paid for it. And countries need data strategies and data sovereignty, too. Data is the new oil. This is mostly nonsense. There is no such thing as "data," it isn't worth anything, and it doesn't belong to you anyway. Most obviously, data is not one thing, but innumerable different collections of information, each of them specific to a particular application, that can't be used for anything else. For instance, Siemens has wind turbine telemetry and Transport for London has ticket swipes, and those aren't interchangeable. You can't use the turbine telemetry to plan a new bus route, and if you gave both sets of data to Google or Tencent, that wouldn't help them build a better image recognition system.
This might seem trivial put so bluntly, but it points to the uselessness of very common assertions on the lines of "China has more data" -- more of what data? Meituan delivers 50mn restaurant orders a day, and that lets it build a more efficient routing algorithm, but you can't use that for a missile guidance system. You can't even use it to build restaurant delivery in London. "Data" does not exist -- there are merely many sets of data. Of course, when people talk about data they mostly mean "your" data -- your information and the things that you do on the internet, some of which is sifted, aggregated and deployed by technology companies. We want more privacy controls, but we also think we should have ownership of that data, wherever it is. The trouble is, most of the meaning in "your" data is not in you but in all of the interactions with other people. What you post on Instagram means very little: the signal is in who liked your posts and what else they liked, in what you liked and who else liked it, and in who follows you, who else they follow and who follows them, and so on outwards in a mesh of interactions between millions of people.
No such thing as money (Score:5, Interesting)
But you can trade data for money and vice versa.
Re:No such thing as money (Score:5, Funny)
Re: (Score:2)
What the moron in the article is trying to say is that data isn't a fungible commodity, which everyone already new.
Headline is stupid (Score:2)
Yes, the headline is stupid and transparently incorrect. Yes, of course there is such a thing as data.
I have little patience for people who say deliberately stupid things to be provocative. Yes, there is such a thing as "data," yes, it is worth something (or, some of it is worth something to some entities for some purposes), and, yes, he got the last part right: most of it doesn't belong to you. The particular data he's talking about is your "personal" data, which, as he said, is mostly amalgamating stuff
Re: (Score:2)
I have little patience for people who say deliberately stupid things to be provocative.
Sometimes adding a wrong answer to your thread is the best way to get a response on Stackoverflow or Reddit. But now we're stepping away from the territory of technical analysts and into sociology and psychology.
I suspect most journalists, and of course the editors who control the publishing of these articles, are carefully crafting headlines for the clicks. It doesn't seem to matter if the reader response is positive or negative, because the metric we're using is very broken.
Re: (Score:1)
How so? It seems to me that it measures "engagement", which is exactly what you want if you're Slashdot. You don't come here for answers, you come here for discussion. If you're intelligent, you may be able to separate some wheat from the chaff.
Re: (Score:2)
That's a bit like measuring value in dollars. Practical in some narrow ways, philosophically empty in others.
Re: (Score:2)
I have little patience for people who say deliberately stupid things to be provocative
Millions of baby birds will die because you wrote that! Bird Hitler!
Re: (Score:2)
yes, he got the last part right: most of it doesn't belong to you
That's probably the most interesting part of the entire story. The general public now has in their mind that the data about them also belongs to them. That is problematic when nobody has a clear idea of what data that is.
For example you end up with silly notions like "right to be forgotten" which implies you should have some power over what exists in other people's brains (and yes, I do consider a personal computer to be an extension of a person's brain), which is clearly a violation of much more important
Re: (Score:2)
Can "the moron in the article" tell the difference between "new" and "knew"?
Just funning you, but I do find it slightly ironic.
My response to the story? Something like "This is not the idea of data you are looking for."
(But I nearly wrote "This is not the data you are looking for" when any fool knows it should be "These are not the data you are looking for". Or is that one of these British English versus American things? Anyone have C-3PO's email address handy?)
Re: (Score:2)
Typos tend to happen when they're maximally embarrassing.
Or is that one of these British English versus American things?
We seem to have a lot of trouble with plurals that don't end in 's'. We'll write things like "the data shows that ..." instead of "the data show that ..." even though we know better.
Re: (Score:2)
Just the ACK, though I rather doubt that most of us know better.
This is a stupid idea (Score:5, Insightful)
That means at my intermediary application layers, where I'm transforming data from the lower layer into something I can apply higher-level reasoning to, before that it's not actually data?
The immediate usefulness or applicability of data doesn't reduce its essential data-ness. This is hardly a question of information or data science, even, and more one of ontology.
Re: (Score:2)
While the argument is summarized as "there's no such thing as data", what he more accurately could have said is "There is no single thing called "data", rather there are many, many forms of what could be called "data" and each form has a very different value. As such, it's illogical to assume data requires a strategy or is an
Re: (Score:2)
Author knows he has to clickbait because his data told him so.
Re: (Score:2)
Clickbaiting aside, a distinction I like is that data is what happened and when it happened, as registered by some sensor (or a person); information is data transformed in a context-dependent way that allows an algorithm (or a person) to make a decision.
This is not unlike what the author is arguing, but I think it sets things more clearly. The mystery that remains is what is the "context-dependent way" but that process is at this point largely art, not science.
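A minimal sketch of that distinction, assuming made-up wind-speed readings and a hypothetical `to_information` helper (neither comes from the article): the raw tuples are the "what happened and when", and the threshold passed in stands in for the context that turns them into a decision.

```python
from datetime import datetime

# Hypothetical raw data: what happened (wind speed in m/s) and when it happened.
readings = [
    (datetime(2021, 5, 1, 8, 0), 12.0),
    (datetime(2021, 5, 1, 9, 0), 19.5),
    (datetime(2021, 5, 1, 10, 0), 27.0),
]

def to_information(readings, shutdown_threshold):
    """Transform raw readings into something a decision can be made from.

    The threshold is the context: the same data yields different
    information depending on who is asking and why.
    """
    peak = max(speed for _, speed in readings)
    return {"peak_speed": peak, "shut_down": peak >= shutdown_threshold}

# Same data, two contexts, two different decisions.
print(to_information(readings, shutdown_threshold=25.0))
print(to_information(readings, shutdown_threshold=30.0))
```

The readings never change between the two calls; only the context does, which is roughly the "context-dependent way" that remains more art than science.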
Technically correct, best kind of correct (Score:5, Funny)
The Only Kind of Correct (Score:2)
There is no spoon (Score:5, Funny)
Re:There is no spoon (Score:5, Insightful)
From the summary it sounds like a lot of semantic hair-splitting to try to silence common gripes about being tracked, spied-on, and victimized by identity theft. Obviously, there IS a subset of data that people have good reason to care about and want to control, and no amount of fancy words will talk anyone out of this.
Re: (Score:2)
Most obviously, data is not one thing, but innumerable different collections of information, each of them specific to a particular application, that can't be used for anything else. For instance, Siemens has wind turbine telemetry and Transport for London has ticket swipes, and those aren't interchangeable. You can't use the turbine telemetry to plan a new bus route, and if you gave both sets of data to Google or Tencent, that wouldn't help them build a better image recognition system.
The data isn't interchangeable but can be collected. But isn't the collection, which is itself data, actually a new use of the "different collections" of data (contradicting his assertion "can't be used for anything else")?
Re: (Score:3)
Came here to say just this. Thank you. It reeks of verbal masturbation to not run afoul of inherent rights.
Re:There is no spoon (Score:5, Insightful)
This is some really vapid philosophical bullshit.
Written by someone who has a vested interest in preventing any privacy protections.
Yeah, data doesn't exist, and if it did it's not yours. Therefore nothing to protect.
Re: (Score:2)
"Data" does not exist -- there are merely many sets of data
"Birds" do not exist. There are merely many species of bird.
That is perhaps the most oxymoronic statement I have ever come across and is self-defeating. An abstract concept does not cease to exist simply because it serves a general utility. The concept of "tools" does not cease to exist because you can't use a screwdriver as a wrench. Never mind that he asserts that something doesn't exist, then immediately proceeds to use that term in reference to something he asserts does exist.
Re: (Score:3)
This is some really vapid philosophical bullshit.
And not even correct. For example, you *can* use the (wind) turbine telemetry to plan a new bus route, if you're planning a quiet route, or one for riders who are afraid of giant spinning things, or simply want to keep riders from getting cancer from the noise [politifact.com] :-)
Re: (Score:2)
This is some really vapid philosophical bullshit.
And not even correct. For example, you *can* use the (wind) turbine telemetry to plan a new bus route, if you're planning a quiet route, or one for riders who are afraid of giant spinning things, or simply want to keep riders from getting cancer from the noise [politifact.com] :-)
I was going to berate you and say that you'd use GIS [wikipedia.org] data to plan your new bus route, not telemetry data, but then I realized that you could dynamically change your route if the turbine was moving or not. And for that you do need the telemetry data.
Re: (Score:2)
I was going to berate you ...
You were going straight to "berate"? How about starting out with something kinder when you follow up.
Also, I was kind of joking about the bus route. My main point was "who is this guy to say how data can/can't be used..." He seems to lack imagination.
Re: (Score:2)
Jeez, you leave off one simple /s and people jump straight to conclusions. /s
Re: (Score:2)
Well, wind turbine telemetry may give you information (to avoid the term "data") about the average wind strength and number of windy days in some location. Knowing this, you would for example design stronger bus shelters in very windy areas.
Of course, turbine telemetry is not the best way to get meteorological information, but if it's the only source you have (or if it's considerably cheaper than the other one), you could indeed use it to design bus routes.
Re: (Score:2)
Stupid rant by a grumpy old man.
Re: (Score:2)
Stupid rant by a grumpy old man.
More likely it is either:
1. A clueless analyst who couldn't find something meaningful to report on by their deadline
2. An analyst whose goal is to be an apologist for how companies and governments are violating the privacy of customers and citizens
Re: (Score:2)
Grumpy old man here. Yes, it's a stupid rant. And get off my lawn.
--
Ceci n'est pas un sig. --AR
timecube (Score:1)
This story is not that dissimilar from this very important information that I read about a decade ago [archive.org]. I also thought maybe the data is antisemitic, racist, chauvinist or triggers people in some other manner and so we cancelled it? I don't know.
Re: (Score:1)
100%
Re: (Score:1)
Pointless Article (Score:5, Informative)
The whole summary (the article is paywalled) is just a load of semantic BS. Just because your data includes interactions with others doesn't mean you should lose your right to privacy. And large companies and governments have large amounts of your data AND context surrounding that data, so it is still very valuable. After just typing two sentences here I feel it's a waste of time to even comment about this article, so I'm going to stop here.
Re: (Score:1)
> just a load of semantic BS.
It kind of reminds me of https://wiki.c2.com/?DataAndCo... [c2.com]
Freedom of speech (Score:1)
The author is clearly stoned (Score:5, Funny)
... no such thing as data ... then describes all the different kinds of data.
Re: (Score:1)
Re: (Score:2)
Knowledge is power. Pretending that data never has application across disciplines shows ignorance of the history of science.
Re: (Score:2)
... no such thing as data ... then describes all the different kinds of data.
He describes data sets, which he says are not identically equal to data. Which as mentioned above is semantic bullshit, used to convince you "nothing to see here; move along" while he picks your pocket...
Re: (Score:2)
Does a click-bait headline also count as "data"?
Does this imply no such thing as a database? (Score:2)
It will take a while to get used to saying "innumerable different collections of information base"
Re: (Score:2)
tl;dr (Score:1)
Nonsense from yet another "analyst" (Score:2)
TFA is paywalled, so let's just fisk TFS:
There is no such thing as "data," it isn't worth anything
No such thing? He, himself lists examples of data. WTF?
Not worth anything? Funny, how much money changes hands, purchasing personal data.
You can't use the turbine telemetry to plan a new bus route
And you can't use a hammer to turn a screw. What's his point? Absolutely no one ever claimed that different data sets were interchangeable.
most of the meaning in 'your' data is not in you but in all of the interactions with other people
Obviously. What web pages did you look at? What products have you bought? Except for identity theft, nearly all uses of your personal data involve interactions with others.
We want more privacy controls, but we also think we should have ownership of that data
Yes, and why should th
That depends (Score:2)
Meituan delivers 50mn restaurant orders a day, and that lets it build a more efficient routing algorithm, but you can't use that for a missile guidance system.
That depends on what you're trying to blow up. If I want to take out all the people who ate sticky buns last Thursday, it should suffice. Or if I want to guide a missile down a street, which someone might actually want to do. (It probably won't be an actual missile, but I wouldn't bet on that either.)
The most important line (Score:4, Insightful)
that this shill salesman wants to sell you is "it isn't worth anything, and it doesn't belong to you anyway". With that statement, this guy wants the people who care to ignore data collection. Why? I don't know, maybe he is a salesman for the data industry. Either way he wants you to think that privacy doesn't matter, as does every company that collects data. Don't believe them: advocate for privacy and demand fair use of your personal data.
But that statement that data isn't worth anything is simply false. If data didn't matter, we wouldn't have entire industries that make the bulk of their revenue from collecting, processing and selling data. There are marketing companies that WILL PAY YOU for your data, because the more user data companies have, the more powerful they are; they can influence consumer or business decisions, for example. I would say data is worth even more nowadays because it can be used to train AI, and that can be used to generate revenue.
This is true ... (Score:2)
self refuting BS (Score:2)
This is nothing more than someone looking for clicks, data, to boost readership and increase paywall subscriptions. The "data" is being collected and used. If the data doesn't exist, why try to collect it? Tear down this paywall!
Short sighted 'analyst'. (Score:5, Insightful)
Having data is one thing, knowledge of how to use it is another topic, for which this analyst is clearly unequipped.
Have you got the numbers on that ? (Score:1)
Just click bait/PR (Score:2)
Cue that code is data, data is code joke ... (Score:2)
Guess this yahoo never heard of Lisp or that old code/data joke:
Recursion Error (Score:5, Insightful)
"Data" does not exist -- there are merely many sets of data.
The CompSci major in me is screaming that someone failed in planning their recursive algorithm.
But both my poor attempts at humor, and this author's attempt at making a thought-provoking article, are all a ruse. This article is attempting to obfuscate the definition of "data". Don't let it fool you. Data clearly does exist. Any CompSci major can tell you that. So can the GDPR, which clearly defines data as "any information which [is] related to an identified or identifiable natural person".
Data is as real as the problem of it being harvested without our consent, and don't let industry shills like Benedict Evans tell you otherwise.
There is no such thing as data (Score:2)
There ARE such things as data
Who Owns You? (Score:2)
Worthless? (Score:1)
If data is worthless, then why are so many companies going through great effort to collect it and why are the same companies fighting to keep out new laws that would prevent them from using "Your" data anyway they wish?
There is no such thing as X (Score:2)
Data = Value + Context (Score:2)
This is stupid argument (Score:2)
....which sounds very much like the sorts of fervent-but-empty-headed crap that you hear from college freshmen.
Tell us you don't understand subsets (Score:2)
... without saying you don't understand subsets.
This guy will probably argue that there's no such thing as square rectangles, too.
Re: (Score:2)
Truly. I'll bet mathematicians would be interested in the author's brand new discovery. After all they have to keep track of a lot of numbers and groups and things and would likely need some theories on sets of things and subsets. Might even need some new notation and a couple handy proofs. If only someone had thought of this earlier....
"Benedict Evans is a consultant and long-time mobile analyst and pundit. He has been working in the media and tech industries for 15 years on the analytical/strategic sid
Tickets and windmills is a straw man example. (Score:3)
What we're really interested in is using data in a way that provides a justifiable basis for making informed predictions. For that you need multiple *kinds* of data that bear on a *single nexus of concern*, say an individual or a household. If you choose an example of two datasets expressly chosen to have no common focus of concern, naturally amalgamating those datasets seems pointless, because it *is*.
When you have a complex entity like an individual or household, it leaves behind a vast, messy trail of *apparently* unrelated information, except that it *is* related: it's all about that entity. You can use that data to *classify* that entity, to make statistically likely inferences about data on that entity you don't actually have. All you have to do is to train an AI algorithm on similar datasets for individuals you *do* have the relevant information for, then the algorithm will fill in those blanks for everyone else.
This gives data aggregators the power to peer into aspects of your life you haven't told anyone about. When they get it wrong, it can be problematic. When they get it *right* it can also be problematic.
This inferential power is intrusive, dangerous, but it is undeniably *useful*. That is why private companies and state security agencies want to become data hoarding superpowers. Individually much of it is sure to be garbage, but collectively it is a rich source of both algorithm training datasets and classification targets.
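A minimal sketch of that fill-in-the-blanks step, under loud assumptions: the features, labels and numbers below are invented for illustration, and a plain k-nearest-neighbours vote stands in for whatever "AI algorithm" an actual aggregator would use.

```python
from collections import Counter

# Hypothetical, made-up records for people whose answer is already known:
# (weekly_grocery_orders, late_night_logins, streaming_hours) -> known attribute
labeled = [
    ((5, 0, 2), "has_children"),
    ((4, 1, 3), "has_children"),
    ((1, 6, 9), "no_children"),
    ((0, 5, 8), "no_children"),
    ((2, 7, 10), "no_children"),
]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def infer_label(features, examples, k=3):
    """Guess the missing attribute by majority vote of the k most similar people."""
    nearest = sorted(examples, key=lambda ex: squared_distance(features, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Someone who never told anyone whether they have children.
print(infer_label((4, 0, 3), labeled))
```

None of the three input columns says anything about children on its own; the guess comes entirely from resemblance to people whose answer is known, which is why the aggregate is worth more than any single record in it.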
In related news today (Score:4, Funny)
There is no such thing as a "Technology analyst", Technology analysts aren't worth anything, and they do not belong to the human race anyway.
Deliberate nonsense is better (Score:2)
There are quite a few people who enjoy writing nonsensical pieces without realising they're nonsensical. This unfortunately ruins the enjoyment that readers might have gotten from the piece had the author actually understood they were writing nonsense.
It's just other people's hard disks. (Score:2)
It's just other people's hard disks.
READY. (Score:2)
10 READ D$
20 PRINT D$
30 GOTO 20
40 DATA "FUCK OFF!"
Poor Mr. Evans (Score:2)
Why, then (Score:2)
Appropriate GIF (Score:2)
https://tenor.com/view/star-tr... [tenor.com]
There's only output (Score:2)
There is No Such Thing as Data
Translation: There's no such thing as input, only output. Some people might find your outputs (eg. 'likes' and being 'liked') as valuable but that's not you or yours. So shut-up and piss-off.
Data doesn't exist? (Score:2)
"Data" does not exist -- there are merely many sets of data.
How can you have sets of something that doesn't exist then?
There is no such thing as Data (Score:1)
Then who does Captain Picard ask for scientific advice?
What are companies collecting and selling? (Score:2)
TL;DR (Score:2)
blah blah blah blah. nice essay for 7th grade english.
For someone who uses data and information (Score:2)
For someone who uses data and information, he cannot seem to distinguish the difference between the two.
Data is raw. It is stored in logs, processed in intermediate layers, and then becomes useful. At that point we call it information.
We might have raw Apache server logs. It will not help anyone as they are (except for HDD manufacturers). Then by just applying magic we get very valuable information (sorry about the language, but not sure author will know things like snort, nagios, or even "grep" and "