Getty Images is Suing the Creators of AI Art Tool Stable Diffusion for Scraping Its Content (theverge.com)
Getty Images is suing Stability AI, creators of popular AI art tool Stable Diffusion, over alleged copyright violation. From a report: In a press statement shared with The Verge, the stock photo company said it believes that Stability AI "unlawfully copied and processed millions of images protected by copyright" to train its software and that Getty Images has "commenced legal proceedings in the High Court of Justice in London" against the firm. Getty Images CEO Craig Peters told The Verge in an interview that the company has issued Stability AI with a "letter before action" -- a formal notification of impending litigation in the UK.
The AI cat is out of the bag (Score:2)
Re: (Score:3, Insightful)
It's not even just AI. Pastiches are commonplace these days, and learning from existing art is how art is taught. I worked twenty years in information technology for a school district and the district boardroom and associated building's interior walls were covered in student art displays that called-out the masters from whom the students found inspiration, literally, "In the manner of Matisse," or, "in the manner of Dalí."
If Getty made these images available on the Internet without putting them behin
Re:The AI cat is out of the bag (Score:5, Interesting)
learning from existing art is how art is taught.
If Warner Bros. animators didn't imitate the Disney style, we literally wouldn't have the Looney Tunes and Merrie Melodies cartoons we know today.
If you like learning about that stuff, KaiserBeamz' "The Merrie History of Looney Tunes" on YouTube, an ongoing series, is quite a rabbit hole to go down (no pun intended).
Re:The AI cat is out of the bag (Score:5, Informative)
Of course, those animators actually drew new art. E.g. you wouldn't have seen the Disney logo appear in the work. You also didn't have the characters reproduced verbatim.
AI 'learning' is a bit of an overstatement. The models have the ability to detect that part of their training set is applicable to the description and how to convincingly blend that into other elements it pulls out of its training set, but it's not making a new creative work. The article illustrates this by showing examples where the AI just plopped the Getty Images watermark right into the result. The AI doesn't "know" that watermark is not part of the image, so it just pulled that in with the rest of the assets.
Re: (Score:3)
No no no, the AI was just drawing images in the style of Getty [slashdot.org].
(Sarcasm, as always.)
Re: (Score:3)
It may draw a watermark in the same style but the model does not store images. That's not at all how it works.
Re: The AI cat is out of the bag (Score:2)
*You* acquiring a skill is memorising. All you're doing is training weights and biases in your brain over hundreds of iterations of looking, and trying.
Re: (Score:2)
The "learning" that an AI does is not "acquiring a skill". It's memorizing.
What do you think acquiring a skill is?
Those who adopt your point of view will be forced to argue in court for the existence of an immortal soul, if they pursue this line of reasoning.
Re:The AI cat is out of the bag (Score:4, Informative)
It doesn't save the original images verbatim as a whole, it saves chunks of the original images. You don't have to copy an *entire* work verbatim to run afoul of copyright.
Anyone who has worked with "AI" knows that the training data "leaks through" in very recognizable, obvious chunks. Not "Hey, they drew a picture in the same style as some other picture" but "hey, that has bits of some other picture pasted in". Same with text: the models frequently dump out very distinct chunks of text verbatim from the training dataset.
This is a huge reason to be wary of using these approaches except in cases where you *know* the entire dataset is licensed/copyrighted in a fashion allowing for this sort of manipulation. If the AI code generation is trained on open GitHub projects, brace yourself for potentially violating licenses. Same here, if AI has chewed on Getty Images or Shutterstock... brace yourself for having your output land you on the wrong side of a copyright case...
Re: (Score:2)
It doesn't save the original images verbatim as a whole, it saves chunks of the original images.
Are you sure about that? I do not believe that's the case. My understanding is that AI like Stable Diffusion breaks an image down to pixels to learn the relationship between those pixels and create contexts, then later when it's provided with contexts it'll reverse diffuse what it has come up with until it gets an image. You make it sound like it's a piece of software that's good at photoshopping a bunch of sections cut out of other images. If by sections, you mean down to the pixel level, then I guess
Re:The AI cat is out of the bag (Score:4, Insightful)
Ultimately that 'learned relationship between pixels' amounts to an encoding. A very lossy encoding with some interesting hooks for the model to be able to reference when and how it should be combined with others, but nonetheless an encoding.
It's not literally opening up Photoshop and copy/pasting, but it is combining the contexts in a way analogous to how that would work. The visualizations referenced in the article illustrate this principle pretty clearly.
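For anyone following the "diffuse / reverse diffuse" argument above, here is a minimal sketch of the forward (noising) half of a diffusion process. It assumes a DDPM-style linear noise schedule and a toy five-pixel "image"; the function names and schedule values are illustrative assumptions, not Stable Diffusion's actual code (which, as far as I know, works on compressed latents rather than raw pixels). Training teaches a network to undo this noising step by step; nothing in the sketch settles whether the learned weights count as an "encoding" of the training images.

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule (assumed values)."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(pixels, alpha_bar, rng):
    """Forward step: scale the clean pixels down and mix in Gaussian noise."""
    signal_weight = math.sqrt(alpha_bar)
    noise_weight = math.sqrt(1.0 - alpha_bar)
    return [signal_weight * p + noise_weight * rng.gauss(0.0, 1.0) for p in pixels]

rng = random.Random(0)
image = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical toy "image"
alpha_bars = make_alpha_bars(1000)

slightly_noisy = add_noise(image, alpha_bars[0], rng)   # almost the original
mostly_noise = add_noise(image, alpha_bars[-1], rng)    # almost pure noise
```

At the first step nearly all of the signal survives; at the last step the signal weight is close to zero, so the result is essentially random noise. The model's job is to learn the reverse direction.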
Re: (Score:2)
Seems like the illiterate rube is convinced of the opposite, actually.
Re:The AI cat is out of the bag (Score:4, Interesting)
Storage of the images is not required for a copyright claim.
Getty needs only to establish that Stable Diffusion, A.) copied the work without their permission, and B.) said copying affects the commercial value of the work.
The first part establishes copyright infringement, and the second part negates the fair use defense. There is no necessity of distribution of copyrighted works to establish infringement of copyright. (Indeed, the RIAA lawsuits showed that they didn't need to prove that distribution had occurred to win a copyright claim - the copying was enough.)
Re: (Score:2)
Just wait until college textbook publishers hear about this. They'll be looking for royalties for any time someone remembers a phrase or concept during their careers.
Re: (Score:2)
Fair use is a defense against copyright infringement. The copying itself is considered infringement, unless a fair use defense can be raised. The courts consider (among other things) the commercial impact of the copying on the value of the work when deciding whether or not copying is fair use. But yes, the act of copying alone is presumed to be copyright infringement unless a fair use defense can be raised.
This is why, for example, Rick Beato's YT videos get taken down. Even though many lawyers have
Re: The AI cat is out of the bag (Score:3)
Please define creativity. No matter how you define it, if it becomes law I can use it as a bludgeon on other artists to make sure that if their style even remotely resembles one in Getty's portfolio, it will be going to court.
I really think artists should be very careful what they wish for.
Re: (Score:3)
Getty isn't characterized by a 'style'. We aren't talking about visually similar, we are talking about being able to verbatim identify copy/pasted chunks of the source material.
There's a lot of room for ambiguity when it comes to law and creative works, but copy/paste chunks of other pictures with a touch of photo editing to blend it with the rest of the picture would universally be seen as running afoul of copyright, unless there's some parody or fair use exemption in play. This is effectively how somethi
Re: (Score:2)
The article illustrates this by showing examples where the AI just plopped the Getty Images watermark right into the result. The AI doesn't "know" that watermark is not part of the image, so it just pulled that in with the rest of the assets.
IMO to not acknowledge that Getty has public domain images with their watermarks all over them on their site would be kind of problematic, as that phrasing implies it's all "their" images having watermarks on them, and ignoring Getty's dodgy past with trying to claim licensing rights, if not outright ownership, over public domain works.
Re: The AI cat is out of the bag (Score:2)
Most likely, if you want to look at the art online to learn from it, there's no copyright issue. It's only when you go mass scrape the site that you have broken the T&C and therefore are now subject to copyright infringement.
For example, if you download an image and save it to your hard drive, then make other images from it using an AI, you almost certainly infringed. Because AI cannot by law be a creator, an AI probably also cannot creatively transform a work, which would at least protect you from non-
Re: (Score:2)
And what if the scraping-process simply analyzed each image for its constituent elements at several levels (think image software and filters for things like finding edges, mapping color palettes, facial recognition) where the original image could not be reconstructed from the analysis data, and then built images off of that analysis data instead?
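A toy illustration of what "analysis data the original can't be reconstructed from" could look like, assuming the simplest possible feature: a color-palette histogram. The pixel lists and function name are hypothetical, and this is nowhere near what a real model extracts; it only shows that some analyses are many-to-one.

```python
from collections import Counter

def palette_histogram(pixels):
    """Count how often each color appears, discarding where it appears."""
    return Counter(pixels)

# Two different hypothetical "images" built from the same colors.
original = ["red", "red", "blue", "green", "blue", "red"]
rearranged = list(reversed(original))

images_differ = original != rearranged
same_palette = palette_histogram(original) == palette_histogram(rearranged)
```

Both images produce the same histogram, so the histogram alone cannot tell you which one it came from. Whether a trained model's weights are that lossy is exactly what the rest of the thread is arguing about.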
Re: The AI cat is out of the bag (Score:3)
If you never put one of the generated images online, it's probably fine. That's personal use and it's covered by fair use. But if you use that tool to generate derivative works then publish those, you may be infringing. The rule is written such that this is ok only if you have significantly transformed the work in a creative way.
For example, if you make a tool that scrapes all of Getty's images, then processes each one by adding a picture of your dog using a script, then this is almost certainly a violation o
Re: The AI cat is out of the bag (Score:2)
For the "dog" example, assume you create dogsoftheweb.net to host the generated images. Obviously, if you wanna wallpaper your house in downloaded Getty images, that's ok, too.
Re: The AI cat is out of the bag (Score:2)
Since the creativity is in the prompt, and the AI is just a tool, there is no need for the AI to be a creator. One could argue that the fact it's using images isn't really relevant either. So neither is copyright.
Re: (Score:3)
you have broken the T&C and therefore are now subject to copyright infringement.
But wouldn't this conflate a T&C violation with a copyright violation?
I mean, a T&C violation can be a copyright violation, but that doesn't automatically mean every T&C violation constitutes copyright infringement. For example, if a licensing agreement for a movie prohibits copying bits and pieces, and I use small bits in a review, sure, it'd need to be litigated to be certain, but this is the classic, bona fide example of fair use.
Re: The AI cat is out of the bag (Score:2)
What I mean is that your activity - which might be construed as copyright infringement by the courts otherwise - may be protected by the T&C. But as soon as you break the T&C, you lose any extra protections it may have granted you.
Re: (Score:2)
Most likely, if you want to look at the art online to learn from it, there's no copyright issue.
I don't think it's an issue of "learn" as much as it is style. Boris Vallejo [duckduckgo.com] has a very distinct style, as does Marc Chagall [duckduckgo.com]. If you go look at those and then paint something in one of those styles, you'd still be infringing, just like if you as much as use the first ten consecutive notes from Stairway to Heaven it'd be considered infringing.
I've played with AI image generation. If you prompt it to "draw me a painting of two cats playing chess", it will and will use its knowledge to create the painting.
Re: (Score:2)
It's only when you go mass scrape the site that you have broken the T&C
US law currently allows screen scraping, regardless of T&C restrictions attempted by the site owner. Which makes sense, since a screen scraper doesn't do anything different than a web browser. It is just an HTTP request, and then the web server sends the content it wishes to. The client needs to store the text and images on the client either way.
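To make the "it is just an HTTP request" point concrete, here is a small sketch using Python's standard `urllib.request.Request`. Nothing is sent over the network; the URL and both User-Agent strings are placeholders I made up. It only shows the technical similarity and says nothing about what the law allows.

```python
from urllib.request import Request

URL = "https://example.com/photo.jpg"  # placeholder URL, not a real Getty asset

# Two requests for the same resource; only the self-reported client name differs.
browser_like = Request(URL, headers={"User-Agent": "Mozilla/5.0 (hypothetical browser)"})
scraper_like = Request(URL, headers={"User-Agent": "hypothetical-scraper/0.1"})

def describe(req):
    """Return the parts of the request that select the resource: method and URL."""
    return (req.get_method(), req.full_url)

same_request_line = describe(browser_like) == describe(scraper_like)
```

Both objects describe a plain GET for the same URL. The server can only tell them apart by headers the client chooses to send, or by behavior such as request volume.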
Re: The AI cat is out of the bag (Score:3)
Under copyright law you can scrape. But under copyright law you also cannot send that image to an online tool for analysis.
Re: (Score:2)
Under copyright law you can scrape. But under copyright law you also cannot send that image to an online tool for analysis.
What leads you to believe that? Copyright law just says no one can publish that image to make money off of it. An artist, art critic, student, AI researcher, etc. can do whatever analysis they want to on that image.
Re: The AI cat is out of the bag (Score:2)
You can't send an image to a third party for processing. Copyright law does not even clearly give you the right to make backup copies.
I don't know how everyone got so confused about copyright. You can't make copies. That's the whole point. It's pretty fucking basic.
https://copyrightalliance.org/... [copyrightalliance.org]
Re: (Score:2)
If Getty made these images available on the Internet without putting them behind some kind of login then they published them for all to look at. Undoubtedly some human art students have taken inspiration from what they've seen, and then turned around and created art based on that. If the plaintiff cannot demonstrate relationships between original and new works that show the exact same arrangement of elements, or cannot demonstrate how the AI went about generating its art, then it may be a tough case to actually prove that the AI is simply copying and then editing the existing images.
Do they have to prove the relationship? I'm not sure Fair Use (or Fair Dealing in the UK) allows images to be downloaded, preprocessed, and then used as input for a for-profit ML model.
This is going to be a deeply influential ruling either way as this new class of generative models really depends on those kinds of data sources. Not to mention other forms of big data scraping.
Re:The AI cat is out of the bag (Score:4, Insightful)
The software is open source, even if the official models get taken down by DMCA, "AI Pirates" will continue to make models and will distribute them through more uncensored means. The copyright space is getting more and more loose every year. It looks like copyright lawyers are practicing for Mickey in 2024.
Not necessarily, the tough part isn't the software (at least once it's written), the tough part is building the massive datasets [springboard.com] and then getting the giant cluster to train the models [openai.com].
If the models themselves get out then certainly anyone can run them, but no one is going to recreate those models without a big team and some very deep pockets.
Re: (Score:2)
The government won't even allow its copyright laws to be interpreted in a way which will significantly restrict AI research and development. The US is not going to want to cede technological superiority in this sector to other countries with less restrictive copyright laws.
Using copyrighted material to train AIs is not commercial activity. It is training and education. Selling the results of the AI's work is commercial activity. I will be very surprised if future legal cases result in anything other than th
Re: The AI cat is out of the bag (Score:2)
You've never heard of fair use, have you?
Cat out of bag, finds box, fits, sits (Score:3, Insightful)
> The images are not open sourced.
Seems completely irrelevant to me.
Any artist can go to the Getty site, or any other site, that is showing even just thumbnails, much less the larger watermarked versions that are more typical, and brew up derivative ideas. Almost all visual art is highly derivative. Likewise music, another form of art. Considerable written fiction. Etc.
ML applications like Stable Diffusion are doing exactly this. No more, no less. They're not copying works. They're creating derivative wo
Re: (Score:2)
illegally accessing
Allegedly. Alleging a claim is true doesn't automatically make it so, that's what the litigation is for.
That is the sort of overconfidence that led to Sony being beaten twice legally on emulation in the US, and the music industry failing to take on decentralized p2p clients like Grokster & BitTorrent clients.
Re: The AI cat is out of the bag (Score:2)
They're not AI pirated anyway. Copyright bans the distribution of the artwork. The AI isn't distributing the artwork, and nor are the creators of the AI.
Would Linkedin v HiQ be relevant? (Score:4, Informative)
This is also rich coming from a company that appears to attempt to license out/claim the ability to license public domain works (look at the licensing plan options existing on public domain works that are on their site/smeared with their logo feces), and whose subsidiary tried to claim rights over a public domain image of the coronavirus.
I'm guilty too (Score:2)
Re: I'm guilty too (Score:2)
That's fair use. No issue here.
But I assume you didn't then put those images up on a website for others to see, which would clearly be copyright infringement.
Re: I'm guilty too (Score:2)
It depends. The courts decide on a case by case basis whether the "inspiration" counts as a creatively transformative work. For example, if you take 4 Andy Warhols and you put them in a 2x2 tile, that's clearly infringement. For something a bit more complicated, see the discussion here on photo mosaics: https://www.avvo.com/legal-ans... [avvo.com]
Fair is fair (Score:2)
If you can't claim copyright through AI usage you shouldn't be liable for infringement from it either.
it's a joke, meta-mod, give me AI moderation please!
Open the door please Hal.
Shalmaneser save us! Can you still Stand On Zanzibar?
The AI is not copying content (Score:2)
Just watching and learning.
Re: (Score:2)
True but Getty then have a habit of going after people using the same public domain image for copyright infringement. Including at times the original creator of the image.
Using knowledge of prior work is not infringement (Score:3)
Just having seen and studied prior art does not make all future art produced by an artist infringement.
This was predictable (Score:2)
I wrote about copyright vs AI [aardvark.co.nz] a day or two ago and pointed out that copyright issues are going to be a real problem with the way AI systems are being trained and with the results they're producing:
Need More Input (Score:2)
No Disassemble, Getty Images!!