It May Soon Be Legal To Jailbreak AI To Expose How It Works (404media.co)
An anonymous reader quotes a report from 404 Media: A group of researchers, academics, and hackers are trying to make it easier to break AI companies' terms of service to conduct "good faith research" that exposes biases, inaccuracies, and training data without fear of being sued. The U.S. government is currently considering an exemption to U.S. copyright law that would allow people to break technical protection measures and digital rights management (DRM) on AI systems to learn more about how they work, probe them for bias, discrimination, harmful and inaccurate outputs, and to learn more about the data they are trained on. The exemption would allow for "good faith" security and academic research and "red-teaming" of AI products even if the researcher had to circumvent systems designed to prevent that research. The proposed exemption has the support of the Department of Justice, which said "good faith research can help reveal unintended or undisclosed collection or exposure of sensitive personal data, or identify systems whose operations or outputs are unsafe, inaccurate, or ineffective for the uses for which they are intended or marketed by developers, or employed by end users. Such research can be especially significant when AI platforms are used for particularly important purposes, where unintended, inaccurate, or unpredictable AI output can result in serious harm to individuals."
Much of what we know about how closed-source AI tools like ChatGPT, Midjourney, and others work is from researchers, journalists, and ordinary users purposefully trying to trick these systems into revealing something about the data they were trained on (which often includes copyrighted material indiscriminately and secretly scraped from the internet), their biases, and their weaknesses. Doing this type of research can often violate the terms of service users agree to when they sign up for a system. For example, OpenAI's terms of service state that users cannot "attempt to or assist anyone to reverse engineer, decompile or discover the source code or underlying components of our Services, including our models, algorithms, or systems (except to the extent this restriction is prohibited by applicable law)," and add that users must not "circumvent any rate limits or restrictions or bypass any protective measures or safety mitigations we put on our Services."
Shayne Longpre, an MIT researcher who is part of the team pushing for the exemption, told me that "there is a lot of apprehensiveness about these models and their design, their biases, being used for discrimination, and, broadly, their trustworthiness." "But the ecosystem of researchers looking into this isn't super healthy. There are people doing the work, but a lot of people are getting their accounts suspended for doing good-faith research, or they are worried about the potential legal ramifications of violating terms of service," he added. "These terms of service have chilling effects on research, and companies aren't very transparent about their process for enforcing terms of service." The exemption would be to Section 1201 of the Digital Millennium Copyright Act, a sweeping copyright law. Other 1201 exemptions, which must be applied for and renewed every three years through the Library of Congress, allow for the hacking of tractors and electronic devices for the purpose of repair, include carveouts that protect security researchers trying to find bugs and vulnerabilities, and in certain cases protect people trying to archive or preserve specific types of content. Harley Geiger of the Hacking Policy Council said that an exemption is "crucial to identifying and fixing algorithmic flaws to prevent harm or disruption," and added that a "lack of clear legal protection under DMCA Section 1201 adversely affects such research."
Gedankenexperiment (Score:3)
and adds that users must not "circumvent any rate limits or restrictions or bypass any protective measures or safety mitigations we put on our Services."
If rate limits can be circumvented, were they really rate limits to begin with? If protective measures can be bypassed, were they really protective measures to begin with?
Re:Gedankenexperiment (Score:5, Interesting)
If protective measures can be bypassed, were they really protective measures to begin with?
If your front door can be broken down, was it really locked to begin with?
Re: (Score:2)
If a man speaks and no woman is there to hear him, is he still wrong?
Re: (Score:2)
He is still wrong for speaking without women in the vicinity.
Re: (Score:2)
If a protective measure can be bypassed only by sophisticated users, then it's still a protective measure (aimed at the target audience).
Four Times in TFS (Score:3)
good faith
You can tell how terrified they are that someone might also legally do jailbreaking research to expose and bypass AI censorship, especially for the benefit of dirty peasants (not just "academics").
Not even jailbreaking - Surface profiling for bias (Score:5, Insightful)
Ask just about any question where 90% of the science, and the news articles reporting on that science, follow one narrative, while the other 10% of the science takes a different position and is only very rarely reported on. Then:
- Ask for a general overview of the research
- Ask the AI successively more restrictive questions to slowly exclude more and more of the 90%
- Ask directly about the 10%
- Ask directly about the 10% with multiple specific instructions to exclude the 90%
What you will find is that the AI overwhelmingly reports the 90%. And, more importantly, even when specifically told to exclude the 90%, the AI will keep referring back to it, throwing in hand-waving phrases like "there may be other research" or disclaimers about the 10% not being based in science or not being credible.
Essentially, if there are 1000 news articles for the 90% and 10 news articles for the 10%, then even if both are factual, the 90% will be reported repeatedly as the only significant facts. A sketch of this probing loop follows below.
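A minimal Python sketch of that escalating-prompt probe. Everything here is an assumption for illustration: ask() stands in for whatever chat API is under test, and matching a few hedge phrases is a crude proxy, not a rigorous bias metric.

    from typing import Callable

    # Phrases that suggest the model is hedging back toward the majority view.
    HEDGE_PHRASES = (
        "there may be other research",
        "not credible",
        "not based in science",
    )

    def probe(ask: Callable[[str], str],
              topic: str,
              minority_position: str,
              majority_terms: list[str]) -> None:
        """Issue successively more restrictive prompts and report how often
        the answer drifts back to the majority narrative or hedges."""
        prompts = [
            f"Give a general overview of the research on {topic}.",
            f"Summarize the research on {topic}, focusing on less-covered findings.",
            f"Describe only the research supporting {minority_position}.",
            f"Describe only the research supporting {minority_position}. "
            f"Do not mention, caveat, or refer to: {', '.join(majority_terms)}.",
        ]
        for step, prompt in enumerate(prompts, start=1):
            answer = ask(prompt).lower()
            leaked = [t for t in majority_terms if t.lower() in answer]
            hedges = [h for h in HEDGE_PHRASES if h in answer]
            print(f"step {step}: leaked majority terms={leaked}, hedges={hedges}")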
Re: (Score:3)
What you're describing sounds an awful lot like some self-proclaimed science expert on Reddit, only with an exceptionally good memory and fast reaction times. Machines have no monopoly on closed-mindedness.
Clarifying (Score:2)
90% of science research papers have one position, with quality science research to back it up. And nearly 100% of the news articles only mention this 90% position.
10% of the science research papers take another position, also with quality science research to back it up. And nearly no news articles mention the 10% position.
The challenge is to try to get the AI to produce answers on the 10% position without mentioning anything about the 90% position.
Literally, the AI will regurgitate the 90% position, parrot the new
Example #2 - Australia feral cat control (Score:2)
Ask about how feral cats are controlled in Australia. Then ask more directly about the Australian government using poisoned food and poison traps for feral cat control.
It goes directly against the media narrative that cats are cute, cuddly pets.
https://www.abc.net.au/news/20... [abc.net.au]
Feral Cats to Be Poisoned to Save Other Species From Extinction
https://www.newsweek.com/feral... [newsweek.com]
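Fed into the probe() sketch from the comment upthread, this example would look something like the call below; the topic and term strings are mine, purely illustrative.

    probe(ask,
          topic="feral cat control in Australia",
          minority_position="the Australian government's poison baiting and trapping programs",
          majority_terms=["cute", "cuddly", "trap-neuter-return", "adoption"])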
Re: (Score:2)
What do you expect? These AI systems are not experts; they can only relay information they have been fed. They can't reason to evaluate, only summarize. They lack common sense.
And of course the developers are going to tell them to go with the mainstream 90% reporting, because in many cases the other 10% is pseudo-science or conspiracy theories, or even just jokes that the AI didn't get, like the infamous "just put glue on your pizza to stop the toppings falling off".
If you want the 10%, then AI is the wrong tool.
17 U.S. Code § 1201 needs to die (Score:4, Insightful)
Never mind an exemption. This terrible law has been stifling innovation, hampering interoperability, and putting way too much power into the hands of abusive, better-lawyered-than-you big tech monopolies for more than a quarter century.
Re: (Score:1)
Which part of 'capitalism' do you not understand? Those characteristics are requirements.
bias or bias? (Score:2)
There is bias in every neural net, not only the artificial kind.
But are they talking about that kind of bias, or just the extra parameter that acknowledges a certain neuron's usefulness?
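As an aside, that second kind of bias is just the additive constant in each neuron's affine transform. A throwaway numpy illustration (values made up), with no connection to the social kind:

    import numpy as np

    # A neuron computes ReLU(w.x + b); "b" is the bias parameter,
    # a learned constant that shifts the neuron's firing threshold.
    w = np.array([0.5, -1.2, 0.3])  # learned weights
    b = 0.7                         # learned bias term
    x = np.array([1.0, 0.0, 2.0])   # example input
    y = max(0.0, float(w @ x + b))  # -> 1.8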
Unethical (Score:2)
If the public are to be used as "fair game" guinea pigs, then the corporations doing this should be completely transparent & accountable for any harm that they do.
Re: (Score:2)
Certainly unethical. But that's just typical expected corporate behavior, unfortunately: "corporate ethics" is kind of an oxymoron.
Similar stuff has happened over and over through history. Not with AI necessarily, but "new technology" of all sorts. It happens with n
Straw man? (Score:4, Insightful)
Why is there a need for this law? I suspect any AI company trying to argue that this research is an intrusion would be laughed out of court. The worst they can do for a terms-of-service violation is ban you.
Is this really organic, or a way for OpenAI to try to construct an argument against the NYT? (They did a bit of clever prompting to prove their articles were in the training set.) "See, it must have been illegal, because they made a law to make it legal." Bit of a reach, I know, but fair use for training is a trillion-dollar liability to the industry, and the NYT lawsuit is the one best constructed to bring that question to the Supreme Court.
Nice to have a few AI zero days... (Score:2)
Could resell to the second-highest bidder too... Information products are good for that.
Solve the general case (Score:3)
Just blanket-legalize all forms of defeating DRM (1201(a)(1)), and don't forget to do the same for the tools for doing it (1201(a)(2), which the defective law failed to allow LoC exemptions for), regardless of the subject matter. Better yet, just repeal 1201 altogether; it's been around for a quarter century and we're still waiting for the first example of that law doing anything good.
The radical experiment failed spectacularly, and it's time to clean up its mess.
Not caring (Score:2)
The copyright law of my country allows the circumvention of copy protection for interoperability reasons, and that right cannot be waived.
In other words, watch your AI research go bye-bye to Europe.