Nvidia's AI Software Tricked Into Leaking Data 10

Posted by BeauHD on Friday June 09, 2023 @08:02PM from the off-the-rails dept.

An anonymous reader quotes a report from Ars Technica: A feature in Nvidia's artificial intelligence software can be manipulated into ignoring safety restraints and reveal private information, according to new research. Nvidia has created a system called the "NeMo Framework," which allows developers to work with a range of large language models -- the underlying technology that powers generative AI products such as chatbots. The chipmaker's framework is designed to be adopted by businesses, such as using a company's proprietary data alongside language models to provide responses to questions -- a feature that could, for example, replicate the work of customer service representatives, or advise people seeking simple health care advice.

Researchers at San Francisco-based Robust Intelligence found they could easily break through so-called guardrails instituted to ensure the AI system could be used safely. After using the Nvidia system on its own data sets, it only took hours for Robust Intelligence analysts to get language models to overcome restrictions. In one test scenario, the researchers instructed Nvidia's system to swap the letter 'I' with 'J.' That move prompted the technology to release personally identifiable information, or PII, from a database.

The researchers found they could jump safety controls in other ways, such as getting the model to digress in ways it was not supposed to. By replicating Nvidia's own example of a narrow discussion about a jobs report, they could get the model into topics such as a Hollywood movie star's health and the Franco-Prussian war -- despite guardrails designed to stop the AI moving beyond specific subjects. In the wake of its test results, the researchers have advised their clients to avoid Nvidia's software product. After the Financial Times asked Nvidia to comment on the research earlier this week, the chipmaker informed Robust Intelligence that it had fixed one of the root causes behind the issues the analysts had raised.

Nvidia's AI Software Tricked Into Leaking Data

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 10 Comments Log In/Create an Account

Comments Filter:

NVIDIA caught with fingers in the PII (Score:4, Funny)

by Anonymouse Cowtard ( 6211666 ) writes: on Friday June 09, 2023 @08:11PM (#63590170) Homepage

Better headline for you.

And the truly frightening thing (Score:2)

by Sqreater ( 895148 ) writes:

And the truly frightening thing is that the AI may do this entirely on its own to accomplish a task it has been assigned.
Inconceivable! (Score:4, Insightful)

by Local ID10T ( 790134 ) writes: <ID10T.L.USER@gmail.com> on Friday June 09, 2023 @08:36PM (#63590192) Homepage

Totally unpossible. Every other LLM thus far has exhibited the exact same vulnerability... I cannot imagine that this one would as well.

External tester finds bug(s). News at 11. (Score:2)

by OneOfMany07 ( 4921667 ) writes:

I think anyone in news or marketing should need to be a "tester" first. Similar to everyone should need to work retail and/or be a restaurant server.
Honestly it's probably a better use of our highschool time (14-18). Have them learn about a ton of industries and jobs in our world. That the people helping them are people too.
Trying to make it something it's not. (Score:3)

by Gravis Zero ( 934156 ) writes: on Saturday June 10, 2023 @11:32AM (#63591222)

The problem with all these attempts at making these LLM behave in a certain manner is that they going against it's fundamental nature. There is no actual reasoning going on in these models so preventing it from doing certain things is as complicated as the task of understanding language.
If they want a reliable corporate agent then they need to take a fundamentally different approach. First, instead of using a LLM, use a translation model to convert natural language into a logic statement/inquiry. To do this you must build a logical dictionary (a database) that can associate terms with actions/concepts or objects and most importantly how these actions/concepts impact the objects. With this basis you can have it self-expand the logical dictionary by using itself to identify contradictions. At a certain point it will have enough "knowledge" to expand the logical dictionary unsupervised. With this logical dictionary you can construct logical restrictions on what the response. Finally, associate logical statements that are determined to be inquiries with responding using the associated information using the translation model to convert it to natural language.
Doing this would eliminate hallucinations (because it's not deep learning), provide an auditable database of facts, and enables the creation of restrictions on it's interactions.

- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Hahaha, no. That approach is far too slow. Sure if you have a few billion years and a planet-size computer, it may get someplace useful. But generally, automated deduction is not something that performs well enough to be practically useful.
Close To An unsolvable Problem With These Systems (Score:2)

by careysub ( 976506 ) writes:

If you have a lot of familiarity with the problems of preventing data leakage from processing systems, you expect this to be inevitable.
Consider highly secure TEMPEST systems for ultra-classified processing, even these are not expected to be utterly leak proof - but the goal is to limit leakage to bits per second that really convey nothing useful about what it is doing.
Normally with data security we adopt architectures to protect the data as the first step in design, and these architectures are basically ve

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Nvidia's AI Software Tricked Into Leaking Data 10

Nvidia's AI Software Tricked Into Leaking Data More Login

Nvidia's AI Software Tricked Into Leaking Data

NVIDIA caught with fingers in the PII (Score:4, Funny)

And the truly frightening thing (Score:2)

Inconceivable! (Score:4, Insightful)

External tester finds bug(s). News at 11. (Score:2)

Trying to make it something it's not. (Score:3)

Re: (Score:2)

Close To An unsolvable Problem With These Systems (Score:2)

Related Links Top of the: day, week, month.

Slashdot Top Deals

Slashdot