Google AI Security

Google's Big Sleep LLM Agent Discovers Exploitable Bug In SQLite (scworld.com) 36

spatwei writes: Google has used a large language model (LLM) agent called "Big Sleep" to discover a previously unknown, exploitable memory flaw in a widely used software for the first time, the company announced Friday.

The stack buffer underflow vulnerability in a development version of the popular open-source database engine SQLite was found through variant analysis by Big Sleep, which is a collaboration between Google Project Zero and Google DeepMind.

Big Sleep is an evolution of Project Zero's Naptime project, which is a framework announced in June that enables LLMs to autonomously perform basic vulnerability research. The framework provides LLMs with tools to test software for potential flaws in a human-like workflow, including a code browser, debugger, reporter tool and sandbox environment for running Python scripts and recording outputs.

The researchers provided the Gemini 1.5 Pro-driven AI agent with the starting point of a previous SQLite vulnerability, providing context for Big Sleep to search for potential similar vulnerabilities in newer versions of the software. The agent was presented with recent commit messages and diff changes and asked to review the SQLite repository for unresolved issues.

Google's Big Sleep ultimately identified a flaw involving the function "seriesBestIndex" mishandling the use of the special sentinel value -1 in the iColumn field. Since this field would typically be non-negative, all code that interacts with this field must be designed to handle this unique case properly, which seriesBestIndex fails to do, leading to a stack buffer underflow.
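
The sentinel problem described in the summary can be sketched in a few lines. This is a simplified, hypothetical model and not SQLite's code: the names `COLUMN_START`, `slot_for_unchecked`, `slot_for_checked` and `record_constraints` are invented for illustration, and the three-slot table is an assumption. Python is used only to keep the sketch runnable; here a negative index silently wraps to the wrong slot, whereas in C the equivalent write would land in memory before the start of the array.

```python
# Hypothetical, simplified model of a "best index" routine. Not SQLite's code.
ROWID = -1          # sentinel: the constraint is on the rowid, not a real column
COLUMN_START = 1    # assumed: first column number that owns a slot in the table


def slot_for_unchecked(i_column):
    """Buggy mapping: assumes i_column is never the sentinel."""
    return i_column - COLUMN_START


def slot_for_checked(i_column, n_slots=3):
    """Safer mapping: returns None for the sentinel or any out-of-range column."""
    if i_column == ROWID:
        return None
    slot = i_column - COLUMN_START
    if not 0 <= slot < n_slots:
        return None
    return slot


def record_constraints(columns, slot_for):
    """Record, per slot, the position of the last constraint that uses it."""
    a_idx = [-1, -1, -1]
    for pos, col in enumerate(columns):
        slot = slot_for(col)
        if slot is None:
            continue  # constraint does not map to a slot, so skip it
        a_idx[slot] = pos  # a negative slot writes to the wrong place
    return a_idx


buggy = record_constraints([ROWID, 1], slot_for_unchecked)
fixed = record_constraints([ROWID, 1], slot_for_checked)
```

With the unchecked mapping, the rowid constraint computes slot -2 and overwrites an unrelated entry; the checked mapping skips it.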

This discussion has been archived. No new comments can be posted.

  • It was looking at recent commits.

    Did it find something other current tools would not have found?

  • Risky to disclose (Score:5, Informative)

    by idontusenumbers ( 1367883 ) on Tuesday November 05, 2024 @12:36PM (#64921371)

    This seems risky to disclose considering the nature of sqlite being embedded and how many things that use SQL don't use a shared library or get updated often, if ever.

    • You're saying the disclosure is the risky thing, and not using the software in that way?

    • ... "vulnerability in a development version" makes me think this is not an old bug in existing releases.

      My tinfoil hat thinks the bug was committed just to sell the idea of AI 'finding' it.
    • Re:Risky to disclose (Score:5, Informative)

      by swillden ( 191260 ) <shawn-ds@willden.org> on Tuesday November 05, 2024 @02:13PM (#64921619) Journal

      This seems risky to disclose considering the nature of sqlite being embedded and how many things that use SQL don't use a shared library or get updated often, if ever.

      Embedded sqlite libraries (and lots of other stuff) not being updated often is the problem there, not the disclosure. It's a broad and deep problem, and a serious one, but holding back disclosures is not how you fix or mitigate it. Holding back disclosure just ensures that more devices/systems are vulnerable for more time.

      In cases where it's feasible to notify developers who use a vulnerable library before public disclosure that's the right way to do it, but for widely-used open source libraries like sqlite, there's no way. Any notification to developers using sqlite is a public notification. The best you can do with sqlite is to let the sqlite team notify all paid support contract holders, and it seems likely that was done since the sqlite team was notified a month ago and the public announcement was last week.

      • by reanjr ( 588767 )

        "All historical vulnerabilities reported against SQLite require at least one of these preconditions:
        The attacker can submit and run arbitrary SQL statements.
        The attacker can submit a maliciously crafted database file to the application that the application will then open and query."

        I can't think of anyone using SQLite in a way that would actually present a risk.
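
The first precondition quoted above (an attacker getting their own SQL executed) usually comes down to queries assembled from untrusted strings. A minimal sketch using Python's standard `sqlite3` module follows; the `users` table and the two `find_user_*` helpers are hypothetical, and the payload only alters a WHERE clause rather than running arbitrary statements.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # String formatting lets the input rewrite the statement itself.
    return conn.execute(
        f"SELECT id FROM users WHERE name = '{name}'"
    ).fetchall()


def find_user_safe(conn, name):
    # The "?" placeholder binds the input as a value, never as SQL text.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchall()


# Hypothetical in-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
leaked = find_user_unsafe(conn, payload)   # condition is always true
bound = find_user_safe(conn, payload)      # no user has this literal name
```

The unsafe helper returns every row for the crafted input, while the bound version treats the same input as an ordinary, non-matching name.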

        • Comment removed based on user account deletion
          • Yeah, I've always felt like CVEs were mostly security theater. As a system administrator, I am going to concentrate on layered security and keeping everything patched. There's almost never any reasonable action I can take on a CVE. That shit is upstream. But then the bosses want to take up my time proving why each CVE doesn't apply. And even when it does, they're not willing to let me shutdown the server while upstream patches it, so WTF am I gonna do with that? Maybe once in a blue moon there's a config to

        • I can't think of anyone using SQLite in a way that would actually present a risk.

          We can hope.

          That said, I agree that sqlite is an impressively high-quality piece of software, and its APIs don't encourage app developers to write code in ways that enable arbitrary SQL injection, so... maybe.

          I also want to plug paid sqlite support (honestly the main reason I decided to reply). I don't know what it costs, but I've had to use it twice and Dr. Hipp and his team are fantastic. Very responsive and extremely capable. No "Did you try this list of obvious things" after your carefully-written

    • "in a development version"

      Caught before release according to TFS.

    • It's actually pretty damn difficult if not impossible to exploit in most cases, in particular for the very embedded uses you mention, so the risk appears to be pretty minimal.
  • Seems that sentinel values are a bad idea in the first place.

  • by gillbates ( 106458 ) on Tuesday November 05, 2024 @12:59PM (#64921419) Homepage Journal

    Maybe in the near future, software engineers will rely on AI to write the test cases and run the tests... Because, let's face it, how many software engineers like writing test cases?

    • by suutar ( 1860506 )

      I have seen some AI generated test cases - trivial stuff, like "that method returns a string. Verify that the string is equal to " but (a) good for coverage numbers and (b) good for catching accidental modifications to user-visible text.
      I look forward to when it can do more sophisticated checking of the code flow and build test cases that exercise different paths. I don't expect it to be perfect but it'd be a nice starting point.
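
For readers who have not seen this kind of generated test, a sketch of the trivial style described above might look like the following. The `greeting` function and its text are hypothetical stand-ins; the point is only that the test pins the exact user-visible string.

```python
def greeting(name):
    # Hypothetical method under test; the name and wording are illustrative only.
    return f"Welcome back, {name}!"


def test_greeting_returns_string():
    # Coverage-style check: the method returns a string at all.
    assert isinstance(greeting("Ada"), str)


def test_greeting_text_is_unchanged():
    # Pins the user-visible text so an accidental edit fails the test.
    assert greeting("Ada") == "Welcome back, Ada!"


test_greeting_returns_string()
test_greeting_text_is_unchanged()
```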

    • by reanjr ( 588767 )

      I find writing tests to be relaxing, as long as the code being tested is functional. Object-oriented software test cases are a fucking nightmare.

  • by Tim the Gecko ( 745081 ) on Tuesday November 05, 2024 @01:59PM (#64921591)

    I wonder if the LLM is named after the movie [nytimes.com].

    The Big Sleep is one of those pictures in which so many cryptic things occur amid so much involved and devious plotting that the mind becomes utterly confused. And, to make it more aggravating, the brilliant detective in the case is continuously making shrewd deductions which he stubbornly keeps to himself. What with two interlocking mysteries and a great many characters involved, the complex of blackmail and murder soon becomes a web of utter bafflement. Unfortunately, the cunning script-writers have done little to clear it at the end.

  • An area with lots of mostly-automatable work where the result can be checked by humans and false positives are no big deal. Perfect use case for AI.

  • text should be "...a widely used software *package*" ... there is no such thing as "a software" just as you do not have "one information." You have a piece of software. Grammar. *sigh*
  • While this is good news that LLMs are used to discover potential 0-days, it would be much better if AI could be trained to spot such flaws directly in the code instead of just getting better at running a fuzzer against a binary.

    Discovering exploits by analyzing the source code would be not only a real breakthrough but also major progress toward a more secure code base.

  • "However, the team emphasized that Big Sleep remains “highly experimental” and that they believe a target-specific fuzzer “would be at least as effective” at detecting vulnerabilities as the AI agent in its current state."

    It was also only a bug in recently-committed development code, never pushed to release, and there's nothing to say it wouldn't have been caught before then.

    Sorry, but this is more AI hyperbole even as the authors literally say "Yeah, you could also find this with a
