AI Junk Is Starting To Pollute the Internet (wsj.com) 55
Online publishers are inundated with useless article pitches as websites using AI-generated content multiply. From a report: When she first heard of the humanlike language skills of the artificial-intelligence bot ChatGPT, Jennifer Stevens wondered what it would mean for the retirement magazine she edits. Months later, she has a better idea. It means she is spending a lot of time filtering out useless article pitches. People like Stevens, the executive editor of International Living, are among those seeing a growing amount of AI-generated content that is so far beneath their standards that they consider it a new kind of spam.
The technology is fueling an investment boom. It can answer questions, produce images and even generate essays based on simple prompts. Some of these techniques promise to enhance data analysis and eliminate mundane writing tasks, much as the calculator changed mathematics. But they also show the potential for AI-generated spam to surge and potentially spread across the internet. In early May, the news site rating company NewsGuard found 49 fake news websites that were using AI to generate content. By the end of June, the tally had hit 277, according to Gordon Crovitz, the company's co-founder. "This is growing exponentially," Crovitz said. The sites appear to have been created to make money through Google's online advertising network, said Crovitz, formerly a columnist and a publisher at The Wall Street Journal.
Researchers also point to the potential of AI technologies being used to create political disinformation and targeted messages used for hacking. The cybersecurity company Zscaler says it is too early to say whether AI is being used by criminals in a widespread way, but the company expects to see it being used to create high-quality fake phishing webpages, which are designed to trick victims into downloading malicious software or disclosing their online usernames and passwords. On YouTube, the ChatGPT gold rush is in full swing. Dozens of videos offering advice on how to make money from OpenAI's technology have been viewed hundreds of thousands of times. Many of them suggest questionable schemes involving junk content. Some tell viewers that they can make thousands of dollars a week, urging them to write ebooks or sell advertising on blogs filled with AI-generated content that could then generate ad revenue by popping up on Google searches.
Starting? (Score:5, Insightful)
Re:Starting? (Score:5, Insightful)
The internet has been filled with useless Junk since inception.
That is true, but it used to be human-produced. Now it is mass-produced in large volume.
Orders of magnitude easier to produce means orders of magnitude more.
Re: (Score:2)
The internet has been filled with useless Junk since inception.
That is true, but it used to be human-produced.
Indeed, now we may actually see a jump in quality of the internet junk as we remove humans from the loop.
Re:Starting? (Score:4, Interesting)
"Junk" has always been a problem. And there have always been solutions to it, from the Internet JunkBuster [wikipedia.org] in the late 90s to today's tools of NoScript, Adblock and their myriad friends. The problem isn't the tools, or the junk, it's the fact that most people are too stupid to use the tools to make "junk" unprofitable.
Re: (Score:2)
Both Mosaic and NCSA HTTPd (precursor to Apache) were released in 1993. The internet itself is older of course, and I saw a demo of it in a room full of electronics in 1980. The technician spent 15 minutes trying to get some number. Finally he typed a command (not ping) and the number and thereafter proudly declared that what he had showed was a working internet over the Atlantic, and this was the future. The first fiber cable across the Atlantic cam
Re: (Score:2)
The internet has been filled with useless Junk since inception. But now instead of junk it is useless money grabs via advertisements and flat out bad information; junk is an understatement.
Wouldn't it be nice if we could perhaps identify an ideal virtual community that would be known as the only legal place online useless Junk can be marketed? A sort of spam dumping ground, located far away from the useful portions of the web.
I hear Social Media is quite toxic this time of year. Why just last week Aunt Karen was buying Grandma's freedom with a Nigerian wire transfer, after those horribly realistic kidnappers deepfuked their way into her Meta-world.
Sounds perfect.
Re: (Score:2)
>an ideal virtual community that would be known as the only legal place online useless Junk can be marketed
Ooo Ooo I know one! It has existed since day one ... the dot-com domain. Limit ALL marketing (ads, spiels, buying, selling of all kinds) to [whatever].com.
New users must OPT-IN to .com .
Anyone who BREAKS the LAW by marketing elsewhere (after they finish their jailtime) gets an IP-filtered connection to .com ONLY ... FOR LIFE, without parole.
Re: (Score:2)
That's certainly an interesting reason for people to (ab)use a VPN service...
Starting to? (Score:5, Insightful)
Google has returned almost exclusively crap for two years now, and the problem has gotten seriously worse since about 2016
Re: (Score:2)
Re: (Score:2)
Between the rising ad placement and the ever-improving censorship it was already on its way down. The flooding of crap content was also well underway prior to the release of AI content generators.
Day'll come when the Internet 'shrinks' from the individual's perspective because you'll only want to visit and participate in verified local sites - anything else will just involve too much filtering effort and still be too untrustworthy to be useful.
Re: (Score:2)
I wonder how many people realize they are only seeing a part of the internet. The part their corporate overlords agree with.
Re: (Score:2)
People used to brag how they rarely had to go beyond the first page to find what they wanted on google. These days you disregard the entire first page as nothing but promoted sales pitches.
Re: (Score:2)
I stopped using Google about 3 years ago. Why are you still using it?
Re: (Score:3)
Re: (Score:2)
Most likely using nothing instead of Google, the open internet having been whittled down to a handful of "content platforms" each with their own search tool.
Re: (Score:2)
There is nothing that comes close to the original Alta Vista, sadly. There are some search engines that do deliver less crap though. Google is really "worst of breed" at this time. What I am currently using is DDG, but I am open to move again
I feel so sad... (Score:2)
My heart goes out to the Russian penis-pill sellers, Indian and Filipino ersatz-blogger ad farmers, and Chinese domain squatters being displaced by this latest entrant into the Search Enshittification Optimization sweepstakes.
The New Term- AILie (Score:2)
Welcome to the new world of the AILie. What the hell did you think was going to happen?
Re: (Score:1)
Re: (Score:2)
Proper nouns have no proper pronunciation.
Re: The New Term- AILie (Score:2)
The proper pronunciation is whatever the owner wants it to be. Now, where is my luxury yacht?
Solution: Use AI to filter out worst ideas (Score:1)
AI just allows anyone to flesh out an idea in a way that is comprehensible to others.
So maybe the first step is to use AI to put ideas into buckets you can review to filter out entire categories of ideas as being fundamentally bad, then have a human decide which submissions are worth keeping from the summaries that remain...
AIs spamming each other (Score:5, Insightful)
You know how holding a microphone near a speaker creates feedback noise? That's what is going to happen with all the AIs spamming themselves and learning from each other. You'll get some weird shit as the result. Multiple generations of feedback loops result in bias errors being amplified until they crowd out everything else.
Re:AIs spamming each other (Score:4, Interesting)
Re: (Score:3)
Re: (Score:2)
Good thing that no greedy people will ever be able to think of a solution to this problem!
Re: (Score:2)
Wait and that's why you never worried about this? It seems like a perfectly good reason to actually worry about it.
Re: (Score:2)
Re: (Score:2)
I explained that problem here back in March [slashdot.org], the hot new term for it is model collapse [arxiv.org].
Now is the time to start building a 'clean' content database. It only gets harder from here.
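For anyone who wants to see the feedback loop in miniature, here is a toy sketch. It is not the setup from the linked paper: it assumes a one-dimensional Gaussian stands in for the "model", and the function name, sample size, and generation count are all made up for illustration. Each generation is fitted only to samples drawn from the previous generation's fit, and the spread of the fitted distribution tends to shrink as the generations pile up.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Repeatedly refit a Gaussian to samples drawn from the previous fit.

    Hypothetical toy model: returns the fitted standard deviation for
    generation 0 (the original data distribution) through the last generation.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: stands in for human-written data
    history = [std]
    for _ in range(generations):
        # Each new "model" only ever sees output of the previous one.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)  # maximum-likelihood estimate
        history.append(std)
    return history


if __name__ == "__main__":
    spread = simulate_collapse()
    for gen in (0, 50, 100, 200):
        print(f"generation {gen:3d}: std = {spread[gen]:.3g}")
```

The small sample size is deliberate: the fewer fresh samples each generation sees, the faster the tails of the original distribution get lost. Real language models are far more complicated, so treat this as an analogy rather than a prediction.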
Re: (Score:2)
that's what is going to happen with all the AIs spamming themselves and learning from each other. You'll get some weird shit as the result.
That's exactly what's going to happen. Probably for a long time to come. Or perhaps until the weird shit it just happened to randomly regurgitate turns out to be useful. Who knows.
Ask why ChatGPT stopped training data in 2021 (Score:2)
IMHO, I believe that it is because tech like this started filtering into the training data, making the results worse, not better. Companies that are locking down access for a $$ like Reddit may have already let the horse out of the barn, as the datasets that are useful have already been scraped from their sites.
The internet was already polluted (Score:5, Insightful)
Ah yes Re:The internet was already polluted (Score:1)
You must be speaking of the before-time, before the September that never ended [wikipedia.org].
Re: (Score:2)
Here's the real problem:
In the 90's, only scammers were flooding the net with pop-ups and pop-unders. That was annoying but manageable.
In the 2000's, only legitimate companies were flooding the net with ads. That was annoying but manageable.
Today, everyone and their grandma is flooding the net with sponsorships and begging for patrons. Now I don't "surf" the Internet anymore.
Yeah, I know their pain (Score:2)
Usually online publishers are the source, not the victim, of such junk, though.
And this junk will train the next LLMs (Score:3)
Just imagine, ChatGPT-5, Llama-2 and Bard-2 will be trained on all the crap generated by ChatGPT-4, Llama and Bard, + decades of SEO Optimized crap written by Humans.
Definitely, a recipe for a "Bad Experience" tm
WSJ discovers SEO (Score:1)
AI In-breeding (Score:4, Insightful)
A local radio personality warned that as AI generated content is put on the internet and used to train other AI systems, we will end up with the AI equivalent of inbreeding. I liked that analogy a lot and thought it appropriate for this conversation.
Re: (Score:2)
I already see this on image boards. All those AI images look cool from a technical perspective, but... they all look the same.
Some people don't understand why quickly-created sketches like Trollface were so popular. Sometimes, you just got to throw something out of left field.
Was this slashdot post generated by AI? (Score:1)
Someone soon ... (Score:1)
Then there will be AI tools for spoofing the AI junk-detection tools. And another turtle on top of that one. Then another. Until the system collapses, crashes, and destroys a few dozen "paper billionaires".
We may be seeing the first such melt-down with Musk not paying his bills for Twitter - and when that collapses, the lawsuits (for unpaid wages, unpaid compute-farm bills, etc) will move on to disembowel his other const
Hey! (Score:2)
Your sewage is contaminating my cess pool.
Source is not the issue. (Score:2)
It's not like the rest of the internet is so perfectly useful.
What if we had "identity" on the internet?!?! (Score:2)
I know it'd be crazy, but what if you forced companies to CARE about what they hosted. And then... PUNISHED people for bad stuff.
I know it's crazy on the internet to even think stuff like that, but... Otherwise... I'll keep generating more stuff you hate. I swear it!
Re: (Score:2)
Aren't we seeing 'authors' using AI... (Score:3)
...to 'write' their posts?
I'm seeing an uptick in odd spelling errors, consistent and repeating grammar errors, etc. It seems that AI is being used by these authors to transcribe their thoughts.
Ugh.
The spelling errors aren't outright misspelling, but wrong spelling for the use. I see errors in tense. Odd stuff.
fwiw, I do not use AI for my posts. All the odd stuff is mine and mine alone.
Re: (Score:2)
-Bingo. You won some thing. Automated story telling
And it'll be much worse (Score:1)
Just a Thought. (Score:1)
Another Neal Stephenson prediction (Score:1)
What about the factcheckers? (Score:1)
Publishing to generate ad revenue (Score:2)
If doing it with AI is more efficient, i.e. generates substantially more ad revenue than it costs, then it is an effective business model & the internet as an ad revenue generating platform is working as intended.
If inciting indignant outrage with articles about the world & the people in it regardless of whether it is true or not is the most efficient & competitive wa