How Data Brokers Sell Access To the Backbone of the Internet (vice.com) 64
An anonymous reader writes: ISPs are quietly distributing "netflow" data that can, among other things, trace traffic through VPNs. There's something of an open secret in the cybersecurity world: internet service providers quietly give away detailed information about which computer is communicating with another to private businesses, which then sell access to that data to a range of third parties, according to multiple sources in the threat intelligence industry. The information, known as netflow data, is a useful tool for digital investigators. They can use it to identify servers being used by hackers, or to follow data as it is stolen. But the sale of this information still makes some people nervous because they are concerned about whose hands it may fall into. "I'm concerned that netflow data being offered for commercial purposes is a path to a dark fucking place," one source familiar with the data told Motherboard. Motherboard granted multiple sources anonymity to speak more candidly about industry issues.
Re: Is there a way to break it? (Score:2)
Re: Is there a way to break it? (Score:5, Informative)
I am surprised they think it can pierce VPNs. If set up properly, it cannot. Since we still get untraceable SPAM calls though, I doubt their actual ability to know what you do.
Most VPNs do not pad traffic; just encrypt it. If that's what you mean by "set up properly" then I'm going to apologise in advance
No. Netflow is pretty detailed, and any junk one program can make, another program can filter out.
What lots of the systems are looking for is not the content of the packets, just the length and timing. A series of short packets back and forth, one after another, is a handshake. Match that handshake exactly with the time of an incoming connection to a server and you can work out who connected to the server.
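A rough sketch of the timing match described above, in Python. Everything here is an illustrative assumption: the record layout, the 0.5 second window, and the example addresses are made up, and a real analysis would also compare packet counts and sizes rather than start times alone.

```python
from collections import Counter

def likely_clients(vpn_entry_flows, server_events, window=0.5):
    """Count, per client, how many server-side connections line up in time.

    vpn_entry_flows: list of (start_time_seconds, client_ip) for flows entering the VPN.
    server_events:   list of timestamps at which the server saw a new connection
                     arriving from the VPN exit.
    window:          assumed maximum delay between the two observations.
    """
    hits = Counter()
    for t in server_events:
        # A client is a candidate if one of its flows began shortly before the event.
        candidates = {ip for start, ip in vpn_entry_flows if 0 <= t - start <= window}
        hits.update(candidates)
    return hits

flows = [
    (10.00, "198.51.100.7"),
    (10.02, "203.0.113.9"),
    (42.10, "198.51.100.7"),
    (60.00, "203.0.113.9"),
    (77.31, "198.51.100.7"),
]
events = [10.05, 42.18, 77.40]
scores = likely_clients(flows, events)
print(scores.most_common())
```

One coincidence proves little, since two clients match the first event here. The point is that repeated matches over time single out one client.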
I run all of my connectivity from my home back through a site to site VPN to colo space, and it breaks a lot of functionality, but keeps me private. Just how I like it. My only leak is when my phone decides it's had enough of VPN, or Apple decides VPNs are not for their services.
You are exactly in the situation where junk traffic could hide the data. Make sure your VPN always pads to full packet length, no matter what the incoming packet length. However, you have a different problem - all your traffic comes from one location / your colo. All the trackers have to do is work out who logs into Google/Facebook/OnlyFans/whatever from that IP address and they can work out who you are. You'd want to have a second layer of protection if you wanted to stay private.
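For illustration, padding every payload to one fixed size could look roughly like this. The frame size, the 2-byte length prefix, and the function names are assumptions for the sketch, not how any particular VPN does it. The padding has to be applied before encryption so the filler is indistinguishable from data on the wire.

```python
import os
import struct

FRAME_SIZE = 1400  # assumed fixed payload size, chosen to fit under a typical MTU

def pad_frame(payload: bytes, frame_size: int = FRAME_SIZE) -> bytes:
    """Prefix the real length, then fill the remainder with random bytes."""
    if len(payload) > frame_size - 2:
        raise ValueError("payload too large for one frame; fragment it first")
    filler = os.urandom(frame_size - 2 - len(payload))
    return struct.pack("!H", len(payload)) + payload + filler

def unpad_frame(frame: bytes) -> bytes:
    """Recover the original payload using the 2-byte length prefix."""
    (length,) = struct.unpack("!H", frame[:2])
    return frame[2:2 + length]

short = pad_frame(b"GET / HTTP/1.1")
long_ = pad_frame(b"x" * 1000)
print(len(short), len(long_))  # both frames have the same length
```

Padding hides sizes only. Timing and the single source address, as noted above, are separate problems.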
You've broken stuff without gaining anything (Score:4, Insightful)
> I run all of my connectivity from my home back through a site to site VPN to colo space, and it breaks a lot of functionality, but keeps me private
You said that breaks a lot of stuff. I'll trust you on that.
What you've accomplished is that the trackers know your IP to co.lo.ip, rather than ho.me.ip. That's all you've done - changed your IP. You haven't reduced tracking at all. They are just tracking you under one IP rather than the other.
It's like changing your name from John Smith to James Smith.
Okay, now everyone knows you as James, but they still know you.
Re: (Score:3)
I am surprised they think it can pierce VPNs.
Netflow gives you flow-level data. In order to trace VPN connections, you would need packet-level sizes and the VPN set-up would need to preserve packet-sizes. Nobody sets up Netflow to report on packet level except in special situations, because a) Netflow traffic can easily exceed the traffic it aggregates in volume in that case and b) you would use a (possibly reduced) packet trace instead as that would be far more efficient.
I think somebody misunderstood something here. Netflow is both old and not magic
Re: (Score:3)
a) Netflow traffic can easily exceed the traffic it aggregates in volume
Ah no it can't.
Re: (Score:2)
a) Netflow traffic can easily exceed the traffic it aggregates in volume
Ah no it can't.
Actually, it can. It is a common concern when deploying it.
Re: (Score:2)
Actually, it can. It is a common concern when deploying it.
It is nothing of the sort.
Average flow size of less than about 54 bytes would be required including packet headers of all packets comprising the flow for this to happen. This does not occur in the real world.
It is possible to go overboard with templated flow formats or to design a crummy exporter that does not properly pack flow packets. While not physically impossible the phrase "can easily exceed the traffic it aggregates in volume" is not true.
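A back-of-the-envelope version of this break-even argument, assuming NetFlow v5 export (24-byte header, 48-byte records, up to 30 records per datagram) carried over IPv4/UDP. With these assumptions the figure comes out near 50 bytes per flow, the same ballpark as the "about 54 bytes" above. The exact number depends on which overheads are counted and on how full each export datagram is.

```python
V5_HEADER = 24        # bytes, NetFlow v5 export header
V5_RECORD = 48        # bytes per NetFlow v5 flow record
V5_MAX_RECORDS = 30   # flow records per export datagram
UDP_IP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte UDP header

def export_bytes_per_flow(records_per_datagram: int = V5_MAX_RECORDS) -> float:
    """Export traffic generated per flow record, including its share of headers."""
    datagram = UDP_IP_OVERHEAD + V5_HEADER + records_per_datagram * V5_RECORD
    return datagram / records_per_datagram

def export_ratio(avg_flow_bytes: float, records_per_datagram: int = V5_MAX_RECORDS) -> float:
    """Export bytes divided by the bytes of the traffic being described."""
    return export_bytes_per_flow(records_per_datagram) / avg_flow_bytes

print(export_bytes_per_flow())   # well-packed exporter
print(export_bytes_per_flow(1))  # badly packed exporter, one record per datagram
print(export_ratio(10**9))       # one flow carrying a 1 GB download
print(export_ratio(40))          # flows consisting of a single 40-byte packet
```

The ratio only exceeds 1 when the average flow is a single tiny packet, which is what scans and small-packet floods produce and ordinary traffic does not.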
Re: (Score:2)
You have never heard of port or host scanning? Or of small-packet flooding attacks? Or of attacks specifically targeted at Netflow sensors? Fascinating. Why do you think you can contribute to this discussion again?
Re: (Score:2)
You have never heard of port or host scanning? Or of small-packet flooding attacks? Or of attacks specifically targeted at Netflow sensors? Fascinating. Why do you think you can contribute to this discussion again?
Average packet payload length for the VPN would have to be under 34 bytes which is not likely to happen. Of course as mentioned it is possible but for that your link would have to be hopelessly saturated with garbage.
Re:Is there a way to break it? (Score:4, Informative)
Fill it with junk that can't be separated?
Absolutely. Netflow basically packs IP header information from IP packet flows into a packet and ships it off to a collector.
It collects data from every IP packet:
srcip + dstip + srcport + dstport + ip proto + bytes + packets
Exporters rely on flow caches to aggregate related packets. Say you download a 1GB file there may only be a few flows even though there were millions of packets associated with the download. The exporter is going to only have to send a few flow records instead of millions.
What you can do to really fuck with non-aggregated systems is intentionally cause lots of small flows to be created by sending small packets everywhere so the exporters can't do any meaningful aggregation and are forced to spam collectors with garbage. This also means that random destinations are getting garbage so pick your targets carefully perhaps to networks you know are not active like the old bogon lists.
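A minimal sketch of the flow cache described above, and of why a spray of tiny flows defeats it. The tuple layout and the example addresses are assumptions for illustration. Real exporters also expire entries on timeouts and TCP flags, which is left out here.

```python
from collections import defaultdict

def aggregate(packets):
    """Collapse packets into flow records keyed by the 5-tuple.

    packets: iterable of (srcip, dstip, srcport, dstport, proto, length).
    Returns {5-tuple: (bytes, packets)}.
    """
    cache = defaultdict(lambda: [0, 0])  # key -> [byte count, packet count]
    for srcip, dstip, srcport, dstport, proto, length in packets:
        entry = cache[(srcip, dstip, srcport, dstport, proto)]
        entry[0] += length
        entry[1] += 1
    return {key: tuple(counts) for key, counts in cache.items()}

# A bulk download: many packets, one 5-tuple, so one flow record.
download = [("192.0.2.10", "198.51.100.5", 443, 51000, 6, 1500)] * 1000

# Small packets sprayed across ports: every packet becomes its own flow record.
spray = [("192.0.2.10", "198.51.100.5", 40000 + i, 80, 6, 40) for i in range(1000)]

print(len(aggregate(download)), len(aggregate(spray)))
```

The same 1000 packets cost the exporter one record in the first case and 1000 in the second.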
Fixing the root. (Score:3)
The information, known as netflow data, is a useful tool for digital investigators. They can use it to identify servers being used by hackers, or to follow data as it is stolen.
Yes, well till someone fixes society people have to work with what they have.
Time for privacy is over... (Score:2, Interesting)
... the internet is one big spy machine, precisely because software is copyrighted not owned by the customer. That allowed companies to turn the network of pc's into a virtual mainframe that they own and control by backending the shit out of everything. That was the whole goal since the rise of "MMO's" (aka rpg's with stolen networking code) that began in 1997 with the likes of Ultima online in 97 and Everquest in 99, we got steam in 2003, then Uplay/Origin/battle.net/rockstar social club post 2010's.
Stea
Re: (Score:2, Insightful)
Re: (Score:3)
But we got a beautiful "copyright sux" rant out of it. Why spoil it with...facts.
Re: (Score:1)
Re: (Score:2)
Yeah he was conflating copyright with closed source with networking. Very confusing.
No dumbass, the reason we have DRM like steam and mmo's is precisely because they can steal the files/networking code legally out of the game and trap it on another PC.
Two PC's connected in a network form and behave as a single computer, so they are literally selling you broken programs with missing files and holding those files hostage. AKA fraud, not giving you a complete fucking program, that is only possible because software is licensed to you not owned.
Re:Time for privacy is over... (Score:5, Insightful)
Um, no. Networks were NEVER PRIVATE. I am shocked people are so ignorant about networks. Network monitoring has been going on from day one.
People were confused because they had "privacy by obscurity."
It's like if you go to a clearing deep in the woods. There is nobody else around; so far as you may know. You might even feel comfortable engaging in private activities. But it is not actually private; another person may pop out of the woods at any time. A hunter may be on a tree blind watching you from halfway up a tree. You may be being recorded on a trail cam. The people managing the forest may, as here, later build a trail that comes right past "your" clearing. You may find that hiking has become so popular, it is difficult to find a "private" moment on the trail. But it was never actually "private," merely remote.
Re: (Score:2)
We were the Slacker Generation before we were even Gen X.
Re: (Score:2)
Um, no. Networks were NEVER PRIVATE. I am shocked people are so ignorant about networks. Network monitoring has been going on from day one.
Dude you're missing the fact that STEAM/MMO's were only possible because software files are licensed, not owned as property. AKA pre internet every game developer/os maker was forced to give us the entire program on floppy/cdrom. When living in an internetworked world with PC's and cell phones connected wirelessly, the files for any application can be split into two sets, and they can control your application remotely by not giving you a complete application to begin with.
That is why quake 1-3 had modding an
Re: (Score:2)
Re: (Score:2)
and yet here you are, using a web browser to interact with a site that lives on a server you do not control.
That doesn't invalidate what was said though.
I also suspect that had we had high speed networks in the 80s and 90s into the home, we would have had a similar scenario as we have now.
Very sad times, but at least for now they still sell computer hardware to build your own machine and run your favorite flavor of OSS.
Re: (Score:1)
and yet here you are, using a web browser to interact with a site that lives on a server you do not control.
You're missing the fucking point, valve and everyone has been stealing our games and OS for the last 20 years buddy, there's no reason for client-server PC games or client-server OS where part of the program is sitting on some remote server so the application can be shut down remotely. With trusted computing hardware they can finally lock memory areas on your PC and file system away from you. Microsoft has been engineering NTFS/Active directory to remotely control file and byte access we're seeing their e
Re: Time for privacy is over... (Score:1)
Re: (Score:2)
Apple is the only one trying to offer privacy.
Apple isn't doing anything of the sort, any client-server app on your phone or computer == you have no privacy because they can see you are broadcasting out over the network and can glean all sorts of info from that.
Re: (Score:3)
But you don't even have metadata privacy in the physical world. Are you aware that the handy USPS Informed Delivery service (that emails you scans of the envelopes of all the mail coming to your house) is actually a side-effect/fun offshoot of a program the USPS had been running for years, in which the exterior of every mail article is photographed for law enforcement purposes? https://www.nytimes.com/2013/08/03/us/postal-service-confirms-photographing-all-us-mail.html [nytimes.com]
Metadata that is used to route and del
Re: (Score:2)
Re: (Score:2)
Imagine if UPS started selling all the From/To information it had to advertisers?
They might. Do you know they don't? We, people on the street, have really no idea what all the data sources are that contribute to the advertising models that target us.
As far as I have read about this, the envelope data seems to live in at best a gray area. What's inside the envelope unarguably has an expectation of privacy. What's written on the outside, less so. What if you put your envelope on a pile of outgoing mail at the office, where anyone could glance at it? What guarantees do you have about who i
Re: (Score:2)
They might. Do you know they don't?
They explicitly say they don't. I'm aware of no evidence to the contrary, are you?
We, people on the street, have really no idea what all the data sources are that contribute to the advertising models that target us.
As far as I have read about this, the envelope data seems to live in at best a gray area. What's inside the envelope unarguably has an expectation of privacy. What's written on the outside, less so. What if you put your envelope on a pile of outgoing mail at the office, where anyone could glance at it? What guarantees do you have about who is actually involved in sorting it? You might put your outgoing mail on a table for your landlord to collect; he might sort all local mail into one pile and all distant mail into another before taking it to the post office. What about living with roommates - roommate A brings all the mail in from the box, and sorts it into piles for each resident; roommate A therefore knows all the mail that's arriving for each resident. TLDR - there aren't any global, ironclad guarantees about who is allowed to look at the envelope - only about who is allowed to open it.
I don't care about "ironclad guarantees" this was never the issue. Nobody is suggesting whether it is postal mail, postcards or packets that headers or payloads are in any way secure or private from all threats including spooks conducting LOVEINT. The issue is industrial scale systematic data collection, aggregation and dissemination.
Re: (Score:3)
Metadata that is used to route and deliver your payload arguably "belongs" to the infrastructure, not to you.
What's the difference between metadata and full contents of packet? The "infrastructure" needs to read payload same as packet header in order to forward a packet. The only exception I'm aware of is lambda switching. What makes a packet header belong to the infrastructure and payload not? What's the difference?
Certainly it has to be readable by the infrastructure, which means even if it's not legal, someone, somewhere will be collecting it for analysis.
Just because something is done in public or a third party is involved does not automatically mean anything with respect to whether or not one gets carte blanche to do whatever they damn well please.
Re: (Score:3)
What's the difference between metadata and full contents of packet?
Permissible use. I continue to fall back on the envelope/contents analogy (see my other reply just posted a moment ago in this thread) because much of what I've read about this (starting back in the Clipper chip era), the courts tried to work through all these spicy new questions using the same analogy. The outside of the envelope is "To whom it may concern: please do anything you can to help deliver this message from X to Y" and considered almost public; certainly it is at least a special case, and not who
Re: (Score:2)
Permissible use.
Is this an abstract idea or is there some kind of specific law or legal doctrine that separates header from payload?
If I send an email using SMTP protocol the from, to and subject fields are contained within the packet payload so therefore should I conclude message metadata enjoys the same level of "permissible use" protection as the body of the message WRT to "permissible" use of my packets?
recording verbal conversations (wiretapping), use of telephone records (CPNI)
CPNI stuff, you have again a very limited set of parties involved and they have a business relationship that includes non-disclosure of data. I'll note here that even the act of 1996 only covers the relationship between the subscriber and the carrier. If I use my AT&T phone to call a business switchboard
The fact is information about calls are protected under CPNI as they travel throughout the telephone network even as
Re: (Score:2)
Whether a call terminates at a corporate PBX or a mobile handset CPNI applies just the same. Whether local or across country CPNI restrictions are unchanged.
At termination point CPNI rules do not apply because rules concern telephone providers not subscribers unless subscribers happen to be acting as telecom providers.
Do you see the contradiction there, and see how it applies to the specific example I gave? A corporate PBX is not a telco. Some VoIP plays are not telcos either. They're not governed by this legislation.
Regardless, an important difference between ISPs and upstream providers is that the ISP is the entity with information necessary to correlate network identifiers with subscriber records.
Once again the PBX analogy is highly relevant. MyCorp who runs the PBX does not, it is true, have your subscriber information. However they DO have your phone number via CLI, and there are numerous resources they can use to obtain all your subscriber information using only that number. Similarly, an intermed
You need 2 VPN provider (Score:4, Funny)
In two countries that hate each other and would never willingly exchange data with each other.
Re: (Score:2)
You need a McDonald's parking lot.
Re: (Score:3)
You need a McDonald's parking lot.
Interestingly enough, if you're running some shovelware like Windows, that probably isn't going to help you as much as you might think. So you sit in the parking lot behind the public IP address that supposedly isn't associated with you. You fire up Tor. Then all of a sudden Microcrap sends some "telemetry" or some piece of software phones home for an update. Now your identity is associated with that IP address and anyone looking at the local network can associate the MAC address to that data. Arguably thi
Re: (Score:2)
Sure if you're just going to announce yourself to the world then trying to be sneaky isn't going to do you much good.
Minimal install of Linux with restrictive iptables on some device with a MAC address that's not trackable to you. And if you're going to show up in anything other than a black hoodie you can ditch in the nearest camera blindspot, at least also bring a cantenna so you can use the wifi from the Starbucks a block away.
Re: (Score:2)
Ditching the hoodie is a good idea, but not where it is likely to be found. It's highly unlikely you won't leave some kind of physical evidence on the hoodie itself, so best not to ditch it near the scene of the crime.
Re: You need 2 VPN provider (Score:2)
You mean like the secure core functionality in Proton VPN?
Re: (Score:2)
That sounds like a painfully slow network to me.
This is why I love Tor, allow no middlemen! (Score:1)
Especially no middlemen that cannot be trusted. Our privacy is being violated by every single party in the middle. ISPs, eCommerce vendors, search vendors, social networks. They are all equal culprits. Avoid a link to a real ID, irrespective of the address. Address is a dummy in most IDs.
Different levels of security (Score:3)
.
Then you have to worry about your home ISP. They can look at everything too. So, again, you're trading one threat for another. But at least unless you have some crazy setup nothing unencrypted is ever being transmitted through the air.
Then you have your VPN provider. Do you really think they aren't tracking stuff too and probably selling the network metadata? Sure they are. So maybe you got past your ISP's surveillance and you avoided the guy sitting in the hotel room next to you, but the VPN being the endpoint is thrust into that position.
Now, if you do all that then use Tor, you're probably better off. But then again, you could just use Tor in the hotel room and accomplish much of the same thing. None of this is perfect, and anybody who goes out and sends terror threats or child porn thinking they're safe because they paid ExpressVPN or are using Norton Lifelock VPN (for examples) is an idiot.
Re: (Score:2)
All Tor exit nodes are monitored.
Probably. But unless you're engaging in international espionage or something, it's unlikely that the TLAs would fire off a fireworks display intentionally revealing that this is so.
Re:Different levels of security (Score:5, Interesting)
Their ad budget seems to be unlimited, and their selling point is "your ISP can't see what you do online!", conveniently leaving out the fact that you now have to trust the VPN company with all your traffic.
It does seem quite shady.
Re: (Score:2)
Not necessarily run by, but some very willingly cooperating. Source: trust me...
How much encryption is necessary/appropriate? (Score:1)
I am not an encryption expert, but I have worked with networks and VPN(s) for a long time. The payload (containing the remote target IP) to/from the VPN point should be encrypted. Therefore: while ISP(s) etc. can see traffic to/from a customer to/from a VPN point, those ISP(s) etc. should not be able to determine which target IP was related to the customer. Sorry if my verbiage is imperfect.
Re: (Score:2)
Good points. (I should have replied to Random361.) One good setup is using my home VPN point and querying OpenDNS (or ???): not perfect, but good. How do you feel about DNS-Sec (no flames, please)?
Re: (Score:2)
Good points. (I should have replied to Random361.) One good setup is using my home VPN point and querying OpenDNS (or ???): not perfect, but good. How do you feel about DNS-Sec (no flames, please)?
Generally, you'd route the DNS queries through the VPN. There's no reason to leak that information to your ISP. (Again, the VPN provider is then put in the position where they see it all. So as another poster said, your ISP knows what VPN you're using, and the VPN provider knows everything else. If you tunneled through to another VPN, then your first provider knows who the second one is. Rinse. Repeat.)
Re: (Score:1)
I agree. What I was suggesting/asking, given that "I route the DNS queries through . . . [my home] VPN" point, what is a good method for my home VPN point to query DNS?
Re: (Score:2)
I agree. What I was suggesting/asking, given that "I route the DNS queries through . . . [my home] VPN" point, what is a good method for my home VPN point to query DNS?
Oh. I'd use one of the public nameservers, and not the ISP's servers. Someone is going to see the DNS queries, but ISPs have been particularly douchy about this. (Some) ISPs have a tendency to not follow the rules and back ads to invalid domain queries. That is less of an ISP-centric problem now that every freaking possible domain out there is bought by some scalper who sets up a "parked domain" to try to sell it to you.
There is a list here at https://www.lifewire.com/free-... [lifewire.com] . It isn't like some of the
Re: (Score:2)
I am not an encryption expert, but I have worked with networks and VPN(s) for a long time. The payload (containing the remote target IP) to/from the VPN point should be encrypted. Therefore: while ISP(s) etc. can see traffic to/from a customer to/from a VPN point, those ISP(s) etc. should not be able to determine which target IP was related to the customer. Sorry if my verbiage is imperfect.
Basically this. My ISP can see I have an encrypted tunnel to a VPN server. That does not really concern me much, unless it results in traffic shaping that affects performance.
You do have to trust the VPN service does not keep logs. If you are really paranoid you can run a second VPN tunnel inside the first. Or use Tor over your VPN (or VPN over Tor).
https://www.privateinternetacc... [privateint...access.com]
For downloading stuff off TPB I find a single VPN works just fine.
Reminder: "We Kill People Based on Metadata." (Score:3)
Former Director of the NSA and CIA, General Michael Hayden, ladies and gentlemen.
Solution is a global misinformation network (Score:3)
It is reasonably easy to break their model and use all that excess data on your plans to obscure your traffic.
1 Have a client which simply simulates sending traffic to an endpoint. TCP and UDP can be asymmetric so a full connection doesn't need to exist
2 Create a bogus VPN client which creates dummy endpoints allowing you to dial in a percentage of your ISPs traffic to create spurious traffic.
3 Use the above to create orders of magnitude more metadata and overflow their collection capabilities.
4 Get FSF or similar organisation to host the project with instances across both desktops and phones and Bob's your uncle.
Business model buried
Netflow is just metadata (Score:1)
Remember that netflow is just flows and their metadata, not contents. So it is either a mix-up or incorrect usage of the term.