


Google is Using YouTube Videos To Train Its AI Video Generator (cnbc.com) 31
Google is using its expansive library of YouTube videos to train its AI models, including Gemini and the Veo 3 video and audio generator, CNBC reported Thursday. From the report: The tech company is turning to its catalog of 20 billion YouTube videos to train these new-age AI tools, according to a person who was not authorized to speak publicly about the matter. Google confirmed to CNBC that it relies on its vault of YouTube videos to train its AI models, but the company said it only uses a subset of its videos for the training and that it honors specific agreements with creators and media companies.
[...] YouTube didn't say how many of the 20 billion videos on its platform or which ones are used for AI training. But given the platform's scale, training on just 1% of the catalog would amount to 2.3 billion minutes of content, which experts say is more than 40 times the training data used by competing AI models.
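The figures in the summary can be sanity-checked with a few lines of arithmetic. This is only a sketch: the three constants come from the report as quoted above, and the variable names and the derived values (an implied average video length and an implied ceiling for competitors' training data) are assumptions worked out here, not numbers stated by CNBC or Google.

```python
# Figures quoted in the summary above.
TOTAL_VIDEOS = 20_000_000_000      # "20 billion YouTube videos"
SAMPLE_PERCENT = 1                 # "just 1% of the catalog"
SAMPLE_MINUTES = 2_300_000_000     # "2.3 billion minutes of content"
COMPETITOR_MULTIPLE = 40           # "more than 40 times" competing models' data

# Derived values (assumptions implied by the figures, not stated in the report).
sample_videos = TOTAL_VIDEOS * SAMPLE_PERCENT // 100
implied_avg_minutes = SAMPLE_MINUTES / sample_videos
# "More than 40 times" means competitors would have less than this many minutes.
competitor_minutes_ceiling = SAMPLE_MINUTES / COMPETITOR_MULTIPLE

print(f"videos in a 1% sample: {sample_videos:,}")
print(f"implied average length: {implied_avg_minutes} minutes")
print(f"implied competitor ceiling: {competitor_minutes_ceiling:,.0f} minutes")
```

Taken at face value, the numbers imply 200 million videos in a 1% sample, averaging 11.5 minutes each.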
And how do I opt out? (Score:3)
Better yet, this is opt-in right? And Google pays us for using our videos in this way? Right? Google?
Re: (Score:2)
Even if you were to opt out, there are still millions of videos from channels that were dead long before the terms changed. There's also nothing stopping third parties from scraping YouTube for videos to train on. The only difference is that YouTube has far better metrics about the videos on its own platform and knows which parts of a video are skipped or replayed multiple times.
Re: (Score:3)
Re: (Score:2)
They're now using that very content to try to replace the people who made it.
That's fucking classically juicy. I love it.
Re: (Score:2)
Google has always had an irrevocable right to use what you literally store on their property in any way they like. They're now using that very content to try to replace the people who made it. That's fucking classically juicy. I love it.
Has Google figured out where the stock-propping revenue is gonna come from when A.I. Greed convinces Capitalism that paying humans is old-fashioned and outdated? Watching Greed get bitch-slapped by Common F. Sense is predictably ironic. I hate it.
Re: (Score:2)
It might slow down our march toward idiocracy, though.
Re: (Score:2)
Say anything right-wing or pro-trump and they'll happily demonetize you and then make sure nothing from your vid will corrupt their pure model.
Re: (Score:2)
Given the way that YT has "shaped" the content it hosts, by way of its community guidelines and how it considers certain types of software to be "harmful" content etc... the odds are that the output of any AI system trained on YT videos will not be totally balanced. How would it handle this prompt:
"Create a video of an anti-LGBTQ zealot installing ad-blocking software on their computer with a swastika on the wall behind them"
Sorry, I have no matching material in my training data
Re: (Score:2)
Easy. Don't post content to YouTube. Any more questions?
Re: (Score:2)
So, Gemini has learned ... (Score:2)
Re:Gemini mass trained on cute kittens (Score:2)
You say that like it's a bad thing.
Re: (Score:2)
Nobody loves cute kittys playing in a cardboard box [imgur.com] more than I.
Google already roughly 1/3 AI, won't that (Score:1)
magnify AI flaws? I suspect they use bots to detect and skip likely AI or CGI.
Re: (Score:2)
That concern is about uncurated content, where training amplifies the errors in bad outputs. Those are the outputs that are (usually) not uploaded anywhere, because people would not want to see them.
YouTube has all the (proxy) metrics for quality, much more than just ratings and view numbers. People stopped looking at the crappy CGI? Don't train on it. People watched the AI clip to the end? It can't be that bad.
What did you think? (Score:2)
Seriously, what did you think they would do? What did you think after you saw the result, a video generator well ahead of the competition? Why do you think you have to grant them so many usage rights when you upload something? Of course they use YouTube.
Everyone else does as well, until Google blocks their crawlers. :P
Probably it's been used by everyone else too ... (Score:2)
... at least if it's any kind of public video. What was the question?
The platform holders are going to be the only ones (Score:2)
What that means is AI as a technology is going to belong to the very wealthiest people on the planet and everyone else gets nothing.
Re: (Score:2)
If you think that's bad (Score:2)
Microsoft trains their AI on corporate data, and most likely YOUR private data that you have to give your employer, produce for your employer under your real name using real information, and your LinkedIn.
Microsoft's AI shit is a disaster waiting to happen because they've collected a lot of actual, sensitive, valuable data through Windows, Office 365, Teams and all the other cloudy garbage most companies use now, and because Microsoft has proven utterly incompetent with security for the past 50 years.
I'm mu
How is this surprising? (Score:2)
Google is Using YouTube Videos To Train Its AI Video Generator
What did they think Google would use, old VHS tapes?
Google missed the AI boat (Score:3)
Why is it not yet possible to search YouTube based on video content rather than going by the title and description? I mean we should be able to search things like "videos that have boats moving in the background and the sky is cloudy". They don't even have that working properly for regular image search.
Re: (Score:1)
Why is it not yet possible to search YouTube based on video content rather than going by the title and description? I mean we should be able to search things like "videos that have boats moving in the background and the sky is cloudy". They don't even have that working properly for regular image search.
Because that's precisely the type of request that would require an "AI". How do you think it'd be done?
Re: (Score:2)