	Tim Berners-Lee Urges New Open-Source Interoperable Data Standard, Protections from AI (theguardian.com) 29
			
		 	
Tim Berners-Lee writes in a new article in the Guardian that "Somewhere between my original vision for web 1.0 and the rise of social media as part of web 2.0, we took the wrong path."
Today, I look at my invention and I am forced to ask: is the web still free today? No, not all of it. We see a handful of large platforms harvesting users' private data to share with commercial brokers or even repressive governments. We see ubiquitous algorithms that are addictive by design and damaging to our teenagers' mental health. Trading personal data for use certainly does not fit with my vision for a free web.  On many platforms, we are no longer the customers, but instead have become the product. Our data, even if anonymised, is sold on to actors we never intended it to reach, who can then target us with content and advertising...
 
We have the technical capability to give that power back to the individual. Solid is an open-source interoperable standard that I and my team developed at MIT more than a decade ago. Apps running on Solid don't implicitly own your data — they have to request it from you and you choose whether to agree, or not. Rather than being in countless separate places on the internet in the hands of whomever it had been resold to, your data is in one place, controlled by you. Sharing your information in a smart way can also liberate it. Why is your smartwatch writing your biological data to one silo in one format? Why is your credit card writing your financial data to a second silo in a different format? Why are your YouTube comments, Reddit posts, Facebook updates and tweets all stored in different places? Why is the default expectation that you aren't supposed to be able to look at any of this stuff? You generate all this data — your actions, your choices, your body, your preferences, your decisions. You should own it. You should be empowered by it...
 
We're now at a new crossroads, one where we must decide if AI will be used for the betterment or to the detriment of society. How can we learn from the mistakes of the past? First of all, we must ensure policymakers do not end up playing the same decade-long game of catchup they have done over social media. The time to decide the governance model for AI was yesterday, so we must act with urgency. In 2017, I wrote a thought experiment about an AI that works for you. I called it Charlie. Charlie works for you like your doctor or your lawyer, bound by law, regulation and codes of conduct. Why can't the same frameworks be adopted for AI? We have learned from social media that power rests with the monopolies who control and harvest personal data. We can't let the same thing happen with AI.
Berners-Lee also says "we need a Cern-like not-for-profit body driving forward international AI research," arguing that if we muster the political willpower, "we have the chance to restore the web as a tool for collaboration, creativity and compassion across cultural borders.
 
"We can re-empower individuals, and take the web back. It's not too late."
 
Berners-Lee has also written a new book titled This is For Everyone.
Horse and gate (Score:5, Insightful)
Re: (Score:3)
Agreed. This seems hopelessly idealistic at this point in time.
Re:Horse and gate (Score:4, Interesting)
It's not even a horse and a gate. It's trying to get the horse to stand on one of those little platforms they use in circuses and the horse eventually gets tired of it.
There are still free web pages that live up to Berners-Lee's ideal. I've got a couple, as I expect a lot of people with the skill to make one do. CERN has one. Lots of big public institutions do. But web pages ain't free, even if you make and host them yourself. Those free web pages are either volunteer efforts, paid for by public funds, or made as a public service or unobjectionable advertising by private money.
Altruism only goes so far though. When it gets to be too much work it needs to make money, and the options there are subscriptions or advertising, both of which are widely available today.
Ok ? But who's going to host it ? (Score:5, Interesting)
And we're right back at square one. The reason people consolidate on these big tech platforms like X, Facebook, YouTube, the Bank is because hosting your own data and hoping people find it is the issue, not the format it's stored in.
If all your tweets in Solid format are in one place, you still have to tell X where to fetch that data. Now suddenly, instead of X performing simple, local database queries to build a feed to display, it has to perform multiple network queries to various remote endpoints to fetch all the tweets from various "Solid Hosts", which can introduce a lot of latency in retrieving the data.
And since people aren't tech savvy enough to host the data themselves, or don't have proper infrastructure to do so, where do you think all these repositories of data will end up ? Azure, AWS, GCP. Now instead of a free X account or Instagram account, I need to pay Amazon to host my stuff that Reddit fetches to display to people. And I can still get banned from everywhere for wrong think.
Seems like a good idea on paper that would utterly fail in practice.
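To put rough numbers on the latency point above, here is a minimal sketch. It assumes a feed needs posts from four remote hosts; the latencies, the `fetch_from_pod` helper and the host count are all made up for illustration, and no real Solid API or network call is involved. It only shows that fanning out sequentially costs roughly the sum of the round trips, while fetching concurrently still costs roughly the slowest one.

```python
import asyncio
import time

# Hypothetical round-trip times in seconds, one per remote host.
# Real values would depend on the hosts and the network.
REMOTE_POD_LATENCIES_S = [0.05, 0.08, 0.12, 0.30]


async def fetch_from_pod(pod_id: int, latency_s: float) -> str:
    """Stand-in for an HTTP request to a remote host (sleeps, no network)."""
    await asyncio.sleep(latency_s)
    return f"posts from pod {pod_id}"


async def build_feed_sequential() -> list:
    """Query each host one after another: total time is about the sum."""
    feed = []
    for pod_id, latency_s in enumerate(REMOTE_POD_LATENCIES_S):
        feed.append(await fetch_from_pod(pod_id, latency_s))
    return feed


async def build_feed_concurrent() -> list:
    """Query all hosts at once: total time is about the slowest host."""
    tasks = [
        fetch_from_pod(pod_id, latency_s)
        for pod_id, latency_s in enumerate(REMOTE_POD_LATENCIES_S)
    ]
    return list(await asyncio.gather(*tasks))


def timed(coro_fn):
    """Run a coroutine function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = timed(build_feed_sequential)
    _, t_con = timed(build_feed_concurrent)
    print(f"sequential: {t_seq:.2f}s, concurrent: {t_con:.2f}s")
```

Even in the concurrent case the feed is only as fast as the slowest host, which is the tail-latency problem a single local database does not have.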
Re:Ok ? But who's going to host it ? (Score:4, Insightful)
This has been a problem for a long time. In an ideal world, hosting data from one's own personal internet connection, be it from one's phone, work, wherever, should not be as difficult as it is. And it shouldn't have to involve anyone other than one's primary service provider. We should be able to make direct connections, and receive email without a 3rd party service... at least no one other than whoever runs the connections along the route. It should be, by and large, private except to the sender and recipient, with the exception of any routing data needed to get it to the correct address.
But although this could in theory happen (and occasionally does from knowledgeable power-users with the right configuration and equipment), for the vast majority, it doesn't. And the problem is legion. No fixed IP addresses, NAT (even worse carrier grade NAT), DDOS attack protection, having to config a DMZ on a router, even just setting up a data source/website are some of the many, many problems. So of course, naturally all this gets offloaded to companies like AWS, Cloudflare, Google, and the big social networks because they make it easy. You likely have to pay, or pay by giving up your privacy, or both, but they take care of everything. You can find your own webhost, but again, it's like paying a third party to get your postal mail for you...because your mailbox is moving around, hundreds of people are trying to look through it or destroy it, or find a way into your house using it.
It doesn't have to be this way. But someone has to make it trivially easy to solve these problems without the big tech companies, and I'm not sure I see that happening anytime soon. Can we ban carrier grade NAT? Can we require internet providers to allow hosting and stop DDOS attacks? Is the cure worse than the pain now? I don't know what the answer is, but he has a point. We need to take back ownership of our data and the web.
Re:Ok ? But who's going to host it ? (Score:4, Interesting)
Not only that, but people are relying on third parties to keep the data available 24/7/365 until the end of time. I can tell you for a fact that if a company goes under, so does your data. I had an SVN repository hosted by a third party. The company went tits-up and all of that data is now gone. There might be a backup of it somewhere but it's inaccessible to the company's customers. This is really no different than relying on some physical media to store data. Long gone are 7-inch, 5-inch, 3.5-inch floppy disks. Gone are Syquest disks. Gone are magneto-optical disks. Gone are Zip drives. Gone are magnetic tape drives of bunchteen flavors. CD-ROMs are probably still readable... if you can find a drive for them. Compact Flash cards probably still work. SD cards and Micro SD cards, plenty of those around... if you can remember what was on them, because you can't easily label them. Oops, did you roll over one with your desk chair? Sayonara. External hard drives? Oh, did it use some long-dead interface like SCSI? Heh. And the drive is also hopelessly stuck. Not to worry though. Most of that data wasn't important anyway.
Re: (Score:3)
When I woke up this morning the birds were singing. Absolutely wonderful sound.
Tomorrow the birds, maybe the same, maybe others, will be singing a comparable song.
Nothing lasts forever. Not even the mountains or the stars.
Re:Ok ? But who's going to host it ? (Score:4, Interesting)
It's not worth it anymore... even if you set up a PHP or whatever server and host the site on your *own* computer, complete with forwarding it through your modem and firewall, and you post useful self-made programs and a blog and have a full message board and chatroom... because you're small, nobody will find it in a search unless you give them the direct address to it.
And, these days, anything you post online is going to get digested by an LLM as "input data" to be regurgitated later for no real reason, and Gods help you if you post a unique program (especially if you post the code)... someone *will* find it and make it theirs.
Re: (Score:2)
nobody will find it in a search unless you give them the direct address to it.
I don't think this is a problem. Random people online can't find my Facebook posts in a search either. Hell, I doubt anyone would even be able to find this reply in a search.
Re: (Score:2)
Very true.
Guess it's just the thought of an AI scraping my site for training data, and then regurgitating it without crediting me for that information.
Re: (Score:2)
Can we ban carrier grade NAT?
I remember learning about IPv6 in my IT training class back in the late '90s or early 2000s, and it feels to me that the universe you describe is what we thought it would enable. Everything on one big network, no NAT, unlimited address space, everything automagically configured. So much potential.
Re: Ok ? But who's going to host it ? (Score:2)
Re: (Score:3)
Not scaling is a good problem to have. Let me worry about that later. Right now we can't even host.
Re: (Score:1)
Re: (Score:2)
Barely anyone uses that outside of progressive circles (even worse than Bluesky), and it's still a bunch of servers hosting other people's content, not people hosting their own content.
Will it increase profits? (Score:2)
Protections from AI (Score:3)
Re: (Score:2)
Just build your web site with cursive script.
Sure, if the bad guys agree to play nice (Score:5, Insightful)
It's an impossible job from a technology perspective. It requires the bad guys to play nice. You can make a secure system that keeps your data out of the hands of everyone, that's not an issue. But you don't want to keep it out of the hands of everyone. You have it online so you can give it out selectively to people and companies. As soon as you let someone see any part of it, though, that part is no longer under your control. I don't care what fancy permissions and terms of use you have on it, you're just trusting that your wishes are respected. Let's face it, if we could trust companies to play nice we wouldn't be in this situation to start with.
Not possible, technically. It might be possible legally, if lawmakers create and enforce penalties for non-compliance. Europe might do it, but no way such anti-business legislation is going to pass in the USA. Not for another decade at least.
Sorry Tim (Score:3)
The billionaire tech bros have killed the Internet. I used to be excited at the possibilities. Now I only curse the outcomes. Google, amongst others, has become a cancer that will only get worse unless people wake the hell up.
Re: (Score:1)
The billionaire tech bros didn't build the internet. A lot of smart individuals did. Nothing changes here either. All it needs is some smart individuals to make a beginning.
Re: (Score:2)
Oh I know, that's why I said they killed it. Their insatiable greed blinds them from the marvel that the Internet actually is.
Owning your data isn't the problem... (Score:3)
Not that data ownership and control isn't a big issue that needs to be addressed. It is.
The issue is how social media has injected itself into our culture. So many people, without a single thought, document their day to day in the form of text, photos and videos with no thought to why and to what the consequences are. Because they have been trained to do so. To support levels of consumerism that are not sustainable in the long term.
The solution is sociological. We have to put social media and related technology at a proper distance. Banning access to smart phones in schools is a start. It needs to be coupled with a lot of media literacy, parental and governmental support to reduce and largely eliminate the usage of social platforms by those under 18.
And no, this isn't about unworkable laws for age verification. We have to break some fundamental patterns here, because our overall health and our functioning as a society are being negatively impacted to a degree we can't ignore.
Re: (Score:2)
"The issue is how social media has injected itself into our culture". You have that backward. The issue is how we have injected our culture into social media. We collectively gave them the data to enslave us.
Everyone needs to suggest something new now (Score:2)
Some companies want to introduce micropayments for crawlers
The RSS people want to create RSL to define licensing
Then there is the suggestion for a content-policy in robots.txt
Now Tim Berners-Lee comes with his own suggestion
Line on the left. One cross each. Next.
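For context on the robots.txt item in the list above, here is a minimal sketch of how a per-crawler rule is evaluated today, using Python's standard `urllib.robotparser`. The crawler name `ExampleAIBot`, the URLs, and the `Content-Policy` line are all hypothetical: the exact syntax of the proposed extension is not given here, so that line is only a placeholder, and the standard parser simply ignores directives it does not recognise.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. "ExampleAIBot" is a made-up crawler name and
# "Content-Policy" is a placeholder for a proposed, non-standard directive.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
Content-Policy: no-ai-training
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines; unknown directives are skipped.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/post/1"
    print("ExampleAIBot allowed:", can_crawl("ExampleAIBot", page))
    print("SomeBrowser allowed:", can_crawl("SomeBrowser", page))
```

Either way this is advisory: nothing in the file format forces a crawler to honour it, which is the same trust problem raised elsewhere in the thread.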
More like New Path not wrong path (Score:2)
This process describes how online platforms and digital media initially offer high-quality, user-focused experiences but gradually degrade into exploitative, click-driven environments prioritizing ad revenue and shareholder profits over genuine content.
Major corpora
it was America and the cancer (Score:2)
IEEE P1912 (Score:1)
Narrator: (Score:2)
It was too late.