Why DARPA is Funding an AI-Powered Bug-Spotting Challenge (msn.com) 43
Somewhere in America's Defense Department, the DARPA R&D agency is running a two-year contest to write an AI-powered program "that can scan millions of lines of open-source code, identify security flaws and fix them, all without human intervention," reports the Washington Post.
But as they see it, "The contest is one of the clearest signs to date that the government sees flaws in open-source software as one of the country's biggest security risks, and considers artificial intelligence vital to addressing it." Free open-source programs, such as the Linux operating system, help run everything from websites to power stations. The code isn't inherently worse than what's in proprietary programs from companies like Microsoft and Oracle, but there aren't enough skilled engineers tasked with testing it. As a result, poorly maintained free code has been at the root of some of the most expensive cybersecurity breaches of all time, including the 2017 Equifax disaster that exposed the personal information of half of all Americans. The incident, which led to the largest-ever data breach settlement, cost the company more than $1 billion in improvements and penalties.
If people can't keep up with all the code being woven into every industrial sector, DARPA hopes machines can. "The goal is having an end-to-end 'cyber reasoning system' that leverages large language models to find vulnerabilities, prove that they are vulnerabilities, and patch them," explained one of the advising professors, Arizona State's Yan Shoshitaishvili.... Some large open-source projects are run by near-Wikipedia-size armies of volunteers and are generally in good shape. Some have maintainers who are given grants by big corporate users that turn it into a job. And then there is everything else, including programs written as homework assignments by authors who barely remember them.
"Open source has always been 'Use at your own risk,'" said Brian Behlendorf, who started the Open Source Security Foundation after decades of maintaining a pioneering free server software, Apache, and other projects at the Apache Software Foundation. "It's not free as in speech, or even free as in beer," he said. "It's free as in puppy, and it needs care and feeding."
40 teams entered the contest, according to the article, and seven received $1 million in funding to continue on to the next round, with the finalists to be announced at this year's Def Con.
"Under the terms of the DARPA contest, all finalists must release their programs as open source," the article points out, "so that software vendors and consumers will be able to run them."
Not "sees flaws", just an easy reproducible target (Score:2)
If it were a closed-source target, it would be hard for multiple groups to evaluate, and there would be no guarantee that issues found could be fed back into the code (or even acknowledged by the author). Open source allows competitors to compare code bases directly to make sure they are all using the same configuration and version, and ensures the taxpayer funding goes into something that can verifiably benefit everyone.
Why the assumption of doom-and-gloom?
Re: (Score:2)
It's also a case of "doing something" for more security theater. No doubt there's a Rust summary in the works seeking to propagandize.
If DARPA is actually serious about this: It's about elimi
Re: (Score:2)
In either case, I think this may have some use for attacks on low-quality code, of which there is a lot, same as low-quality coders. But as defense? That may be an exceptionally bad idea, especially when the machine is tasked with the fixes as well.
Is it the "Smaller Stack Case" here? (Score:2)
Are these supply chain attacks, FOSS bug bounty programs, the CrowdStrike BSOD, Log4j, SolarWinds, and the Python package site takeover just a roundabout way down the path where more and more FOSS packages will become "unverified" security-wise and be dropped from use?
Re: (Score:2)
I don't think so. "Have a care what you download" already applies. You are still better off and likely will be better off with FOSS if you have some clue. Sure, many IT people are lacking that clue. But COTS is really not doing any better.
Re: Not "sees flaws", just an easy reproducible ta (Score:1)
Most binaries can be de-compiled pretty accurately back into source code. The problem for humans is that compiler optimizations are applied and symbols aren't included (unless the devs forgot to turn off that option at compile time).
This makes the result very hard to read: function 3748 calls function 12029 with argument 12283, loops are unrolled, and lots of time is spent renaming variables to be able to mentally keep track of it all. There is also a lack of comments on what the code actually is intended to do vs
Re: (Score:2)
An LLM (not AI) could easily convert variable names, add comments and find common flaws (such as potential null-based exploits) and some work on this is already being done.
Actually, it cannot. At least not in a useful way, because that would require understanding and insight. It can make things worse by adding misleading, wrong and incomplete comments though. Stop attributing magical powers to LLMs. They do not have those.
Re: (Score:1)
It doesn't require understanding and insight to tell you what most code does. It's called language parsing, and LLMs are pretty good at it. Hell, compilers can often tell you where mistakes are made.
What you're talking about is logic and what the developer intended to do, which is what I specifically said is a problem in closed source and which is of course more difficult for an LLM. But much code is just copy-pasta from online sources, so it is likely you can find statistically similar matches to other code, whic
Re: Not "sees flaws", just an easy reproducible t (Score:2)
Well, if you restrict this to simplistic beginner-level code, sure. But that is not what the story is about.
Re: (Score:1)
Most code is beginner level code at the function level. Most code calls other well-defined library functions for the heavy lifting. If you are re-inventing the crypto algorithm in your closed source product, you've got other problems.
Find? Maybe. (Score:5, Insightful)
Fix? No. Or rather a very bad idea. That is going to put in patterns of security bugs when it fails (and it will often enough) and that is worse, because suddenly you can attack different code in the same or similar way. Also remember that attackers will routinely evaluate code changes. And they will find out fast if AI fails at this in some typical ways and then go specifically after those.
And "no human intervention"? That is just really stupid. Maybe when (if) we get AGI, but not before.
I really do not get it. IT security is really not "just use the right pattern" or "just use the right component or software or algorithm" at this time. That may work when IT finally becomes a respectable engineering discipline and uses mostly standardized components with assured characteristics, say in 100-300 years. But until then, insight and understanding are mandatory, and no current AI can do either.
Re: Find? Maybe. (Score:2)
Re: (Score:2)
You think? I think not at all.
Re: (Score:2)
AI bug spotting is certainly in the "not yet possible" category, but it's nowhere close to being "on the near horizon". LLMs might be able to generate text that looks like analysis, but they are clearly not capable of actually performing any.
DARPA is fully aware of this, so why hold the contest? With luck, we'll get some new static and dynamic analysis techniques, maybe we'll even find a few tricks for automatically correcting some common problems. AI is there just to attract attention. Traditional algo
Re: (Score:2)
> AI bug spotting is certainly in the "not yet possible" category,
AI has been able to do bug spotting and fixing for quite some time already, but it can't find all bugs and it can't fix all of them. But neither can humans. So it is just a question of how close it can get compared to the best humans and how often it makes errors when it tries.
Me: Are there any bugs in this code:
#include
int main(){
int width = 2;
int height = 5;
int area = width + height;
Re: (Score:2)
You can't be serious. Yes, you'll get correct output for the many homework problems found in the training set. However, LLMs are not capable of reason and analysis. This is an objective fact.
Re: (Score:2)
Indeed. No idea why so many people are deep in denial about this solidly established fact.
Re: (Score:2)
Nope. AI can pattern-recognize known and well published bugs and can pattern-replace them with known and well-published changes, which may or may not be fixes. That is all it can do. But guess what, modern security scanners can do the same, only that the fixes are actually reviewed and they can also find some things that are not well-published.
Re: Find? Maybe. (Score:2)
Re: (Score:2)
Neural networks can be used to do many things
They're a lot less capable than you realize. People seem to think that NNs are magical electronic brains, poised to soon become more advanced than any normal computer. Nothing could be further from the truth. In terms of computational power, neural networks are down at the level of combinational logic -- equivalent to lookup tables.
This should be obvious. A neural network is just a function that maps inputs to outputs. That's it. Neural networks are also pure functions. They do not change when we us
Re: (Score:2)
That nicely sums it up. A NN or LLM is about as "intelligent" as an encyclopedia. Sure, search is better, but not fundamentally so. And they are _old_ tech. The current hype has basically been created only by the higher efficiency that allowed larger training data sets, nothing else. Hallucinations, for example, have been known for a long time and cannot be fixed with this technology, no matter how much training data is used. There is also no reason to believe this is "early days" and there are a lot of
Re: Find? Maybe. (Score:2)
Re: (Score:2)
In actual reality, _none_ of them are like biological brains at all. You seem to _not_ be familiar with things at all.
Re: Find? Maybe. (Score:2)
Re: (Score:2)
Not even "fish-level".
Re: (Score:2)
Neural networks cannot "analyze" and cannot do "insight". Period. Look at the mathematics. Limitations do not get more certain than this.
As to other AI areas, yes, automated proof generation can do limited insight and limited analysis. It is the only known AI area that can do it. (The approaches can be transferred to other application areas but essentially stay the same and have the same limitations.) But automated proof generation gets bogged down in complexity so fast, it cannot even do slightly advan
Re: (Score:2)
Maybe DARPA is just serving the hype. Or maybe they get political pressure to use AI "to fix the world" and do this contest to demonstrate that this is not actually possible at this time.
Not even wrong. (Score:3)
The code isn't inherently worse than what's in proprietary programs from companies like Microsoft and Oracle, but companies are too stingy to hire skilled engineers to help maintain the open-source code that they rely on because they are freeloaders.
FTFY. Let's put the blame where blame belongs. If you need solidly tested software then it's your responsibility to ensure that it's being solidly tested.
Re: (Score:2)
What a funny load of crap! (Score:2)
I will say this - it took balls for MSN to publish this so soon after the Windows + Crowdstrike fiasco.
Will the NSA, CIA and FBI have a say? (Score:2)
"scan millions of lines of open-source code, identify security flaws and fix them"
Will the NSA, CIA and FBI have a say on the fixing part? Maybe they would want to use the vulnerability/bug against an adversary?
Re: (Score:2)
Maybe they would want to use the vulnerability/bug against an adversary?
Ha ha don't be silly, the NSA/CIA/FBI/DIA would never ever do anything naughty.
(You've heard about the first 3, but you probably never heard of the DIA (the Defense Intelligence Agency) and trust me, that's the way they like it. Let's just say that the NSA, FBI, and CIA get a lot of their ideas from the lads in the DIA.)
Humans make errors.. Get rid of 'em, Right? (Score:2)
Ask CrowdStrike how that went.
NP complete and the Halting Problem (Score:1)
If you know anything about computer science, you know this is a failure, and that makes it a fund-the-gov-contractors scam.
If you don't, just google "the halting problem" and how it's impossible to detect bad code past a certain very useless point.
If you don't want to google and just want to vote, downvote me. I speak truth. This happens to be real and dangerous,
but hey, your head, your sand, stick deep.
Re: (Score:2)
It is impossible to find all bad code with 100% certainty in any code base. But it is very easy to find some of the bad code in a large code base, with no certainty of finding it all. I know this, because I have personally done this several times. I don't see any harm in trying, and I think they will find a few new bugs, but probably not many.
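A minimal sketch of that point, assuming a hypothetical three-entry rule list (`RISKY_CALLS`) and a made-up C sample (`SAMPLE`): a plain pattern scan flags a couple of well-known risky library calls, yet says nothing about a logic bug like the `width + height` area calculation quoted earlier in the thread. It is only an illustration of "finds some, misses others", not how any contest entry or real scanner works.

```python
import re

# Hypothetical rule set for illustration only: C library calls that are
# widely documented as risky because they do not check buffer bounds.
RISKY_CALLS = {
    "gets": "no bounds check on input",
    "strcpy": "no bounds check on destination",
    "sprintf": "no bounds check on destination",
}

# \b keeps "gets(" from matching inside "fgets(".
CALL_PATTERN = re.compile(r"\b(" + "|".join(RISKY_CALLS) + r")\s*\(")


def scan(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, function name, reason) for each flagged call."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CALL_PATTERN.finditer(line):
            name = match.group(1)
            findings.append((lineno, name, RISKY_CALLS[name]))
    return findings


# Made-up sample: two pattern-detectable problems and one logic bug
# (area computed with + instead of *) that no pattern here can see.
SAMPLE = """\
char buf[16];
gets(buf);
strcpy(buf, user_input);
int area = width + height;
"""

if __name__ == "__main__":
    for lineno, name, reason in scan(SAMPLE):
        print(f"line {lineno}: {name}() - {reason}")
```

The scan reports lines 2 and 3 and is silent on line 4, which is the whole argument in miniature: cheap checks catch some bad code, and nothing about them promises they catch all of it.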
Re: (Score:2)
I once convinced my boss to let me do a week's worth of work for a FOSS project to find and fix bugs in it that made it hard to use in our company. I got permission to do the work, but I was told to do it under my own name, not the company's name (possibly for fear of lawsuits).
Free Is Not Gratis (Score:1)
Boy, if only all the freeloaders of OSS realized that open source software is not gratis and ponied up for the value they extract without paying anything for it. If as an organisation you're afraid that something is not tested properly, then contribute testing people to it. But no. Let's put some money into AI bullshit and pretend that's a good contribution to OSS.
Not only in Open Source software (Score:3)
...but I dare say in all software.
Which raises the question why software offered to the public is not mandated to be Open Sourced, so it _can_ be checked for problems at all.
Re: (Score:2)
...but I dare say in all software.
Certainly. By framing the challenge in terms of open source software, the competitors don't need to be vetted for special access and you get solutions that can be thoroughly checked by many eyes. That's a good project.
However, once a solution is found (if a good solution is found), the same process can be applied to closed source and even secret code. There are never enough eyes looking at secret code. DARPA's first priority is "defence" so securing secret code is most dear to them.
A machine that can find bugs ... (Score:2)
Is a machine that can insert exploits and back doors just as easily as it could implement a fix. Even better, it could insert evil code disguised as a fix or cooperating with a fix. This has obvious military and intelligence applications. I'm not saying I'm opposed, necessarily, though that is my default position, I'm just saying that DARPA spends money for the benefit of the US government, not out of the goodness of their open-sourced hearts.
AI Lint (Score:2)