What Happened After Google Retrofitted Memory Safety Onto Its C++ Codebase? (googleblog.com)
Google's transition to Safe Coding and memory-safe languages "will take multiple years," according to a post on Google's security blog. So "we're also retrofitting secure-by-design principles to our existing C++ codebase wherever possible," a process that includes "working towards bringing spatial memory safety into as many of our C++ codebases as possible, including Chrome and the monolithic codebase powering our services."
We've begun by enabling hardened libc++, which adds bounds checking to standard C++ data structures, eliminating a significant class of spatial safety bugs. While C++ will not become fully memory-safe, these improvements reduce risk as discussed in more detail in our perspective on memory safety, leading to more reliable and secure software... It's also worth noting that similar hardening is available in other C++ standard libraries, such as libstdc++. Building on the successful deployment of hardened libc++ in Chrome in 2022, we've now made it default across our server-side production systems. This improves spatial memory safety across our services, including key performance-critical components of products like Search, Gmail, Drive, YouTube, and Maps... The performance impact of these changes was surprisingly low, despite Google's modern C++ codebase making heavy use of libc++. Hardening libc++ resulted in an average 0.30% performance impact across our services (yes, only a third of a percent) ...
In just a few months since enabling hardened libc++ by default, we've already seen benefits. Hardened libc++ has already disrupted an internal red team exercise and would have prevented another one that happened before we enabled hardening, demonstrating its effectiveness in thwarting exploits. The safety checks have uncovered over 1,000 bugs, and would prevent 1,000 to 2,000 new bugs yearly at our current rate of C++ development...
The process of identifying and fixing bugs uncovered by hardened libc++ led to a 30% reduction in our baseline segmentation fault rate across production, indicating improved code reliability and quality. Beyond crashes, the checks also caught errors that would have otherwise manifested as unpredictable behavior or data corruption... Hardened libc++ enabled us to identify and fix multiple bugs that had been lurking in our code for more than a decade. The checks transform many difficult-to-diagnose memory corruptions into immediate and easily debuggable errors, saving developers valuable time and effort.
The post notes that they're also working on "making it easier to interoperate with memory-safe languages. Migrating our C++ to Safe Buffers shrinks the gap between the languages, which simplifies interoperability and potentially even an eventual automated translation."
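For readers wondering what "hardened libc++" changes in practice, here is a minimal sketch (my own example, not from Google's post) of the class of spatial-safety bug it targets: an out-of-bounds operator[] on a standard container, which is normally silent undefined behavior but becomes an immediate, debuggable trap once library hardening is enabled.

```cpp
// Sketch only: the exact build flag depends on your libc++ version (newer
// releases expose a hardening mode, older ones an assertions-style macro),
// so treat your toolchain's documentation as authoritative.
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> prices = {100, 200, 300};
    std::size_t index = 3;  // off-by-one bug: valid indices are 0..2

    // Without hardening: undefined behavior -- typically a silent read of
    // whatever happens to sit past the buffer.
    // With hardened libc++: the bounds check on operator[] fails and the
    // process traps immediately, turning a hard-to-diagnose memory
    // corruption into an obvious crash.
    std::printf("%d\n", prices[index]);
    return 0;
}
```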
Bugs prevented per line of C++ code (Score:2)
Lots of numbers on bugs found and estimates of bugs that will be prevented, but nothing on how many bugs were found per line of code.
Re: (Score:3)
Shouldn't they always have been running bounds checking in debug builds and testing all this time? Or made that a lot better? (A sketch of the toolchain flags for this is below.) We all could have used way better dev tools all this time, and it took Rust and decades of security holes to finally make security a priority. Finally C++ has been making some official moves after the US gov shifted policy to promote memory-safe languages.
I can see how in the past leaving in a ton of memory checks could hurt, but now the CPU is so starved for input they run another virtual
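For what it's worth, a fair amount of this is already available in debug and test builds today through ordinary toolchain defines; a rough sketch (flag spellings vary by compiler and standard-library version, so verify against your toolchain's docs):

```cpp
// Example debug-build flags (names vary by toolchain and version):
//   g++     -D_GLIBCXX_ASSERTIONS ...                        (libstdc++ checks)
//   clang++ -stdlib=libc++ -D_LIBCPP_ENABLE_ASSERTIONS=1 ... (older libc++ spelling)
//   clang++ -fsanitize=address,undefined -g ...              (sanitizers for tests)
#include <vector>

int first_element(const std::vector<int>& v) {
    // With library assertions enabled, an out-of-range v[0] on an empty
    // vector aborts immediately instead of reading garbage.
    return v[0];
}
```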
Re: (Score:2)
Still not willing to go to a microkernel!
According to some, MacOS is a microkernel. The kernel handles some stuff like memory management that a "real" microkernel like Mach might not, but device drivers and filesystems run in a separate space.
Hurd is a microkernel. It'll be replacing Linux real soon.
Re: (Score:2)
Hurd that!
Re: (Score:2)
That's GNU/Hurd, you insensitive clod!
Re: (Score:2)
Finally C++ has been making some official moves after the US gov shifted policy to promote memory-safe languages.
Basically, it bruised Bjarne Stroustrup's ego. What he's doing really isn't enough though. He made this kind of dumb comment about there being more to safety than memory safety, but after you listen for a few minutes it's obvious that he's only talking about temporal pointer safety, which C++ won't ever have without breaking backwards compatibility, which is his entire argument for keeping C++ around to begin with and was the premise behind his "call to arms" speech. So it seems like he's just pretending th
Re: (Score:2)
Re: Bugs prevented per line of C++ code (Score:2)
It's not stupid to disagree with bogus "safety" checks. That shit killed Java's performance. If you want to put redundant bounds checking everywhere
While you don't need redundant bounds checks (nobody is asking for that), you should be doing bounds checks everywhere. Anything less is hubris. C++ developers, more than anybody else in my experience, always seem to think they're the only person in the room who never makes mistakes. Yet they do anyway, and typically blame it on somebody else. That's hubris.
and sprinkle mutexes all over just in case, you'll force an exodus from people who actually know what they're doing and a fork eventually.
Mutexes everywhere for what? Why would anybody do that? For data race safety? There's no need for this. Anybody who says otherwise doesn't know what the
Re: (Score:1)
Early Java had performance problems because most implementations were 100% bytecode interpreters.
Java has been JIT-compiled for decades and is pretty much on par with C or C++.
It is only slower in areas where they intentionally made a trade-off, e.g. generational garbage collection to avoid heap fragmentation, or having "Generics" instead of templates, even though a real template implementation (see the Pizza compiler) had already been done.
Re: (Score:2)
It seems like there is an endless list of typical programming problems that are prevented in Rust. Just last week (in a memory-safe language) I added a new class member but forgot to use it when making object fingerprints and comparisons. In Rust we can trivially prevent that mistake: those methods should start with a destructuring assignment that slurps up all the fields of the object (a rough C++ analog is sketched below). And Rust fixes so many other potential bugs.
That's not to say Rust is perfect, though as an amateur all I can say wit
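Since the rest of the thread is C++-centric, it's worth noting that the same "the compiler yells at you when you forget a field" trick has a rough C++ analog via structured bindings, which must name every non-static data member of an aggregate. A sketch (my example, not the parent's actual code):

```cpp
#include <string>
#include <tuple>

struct Item {
    std::string name;
    int price_cents;
    // Adding a member here breaks the structured bindings below at compile
    // time, forcing every fingerprint/comparison function to be revisited.
};

bool items_equal(const Item& a, const Item& b) {
    const auto& [a_name, a_price] = a;  // must list every member of Item
    const auto& [b_name, b_price] = b;
    return std::tie(a_name, a_price) == std::tie(b_name, b_price);
}
```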
Re: (Score:2)
Oh and:
I can see how in the past leaving in a ton of memory checks could hurt
What's really stupid about that is that these days compilers can actually safely optimize out some bounds checks. Not only that, but the way modern processors do branch prediction and caching makes bounds checking barely even relevant anymore, and that's before you even get to the fact that instruction throughput is so insanely high compared to back then.
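A concrete sketch of that point (mine, with the caveats in the comments): when the loop bound already comes from the container's size, the optimizer can often prove checked accesses are in range and drop the per-iteration check; where it can't, the never-taken failure branch is cheap under modern branch prediction.

```cpp
#include <cstddef>
#include <vector>

int sum(const std::vector<int>& v) {
    int total = 0;
    // The index is bounded by v.size(), so a modern optimizer can usually
    // prove v.at(i) never throws and hoist or eliminate the range check;
    // where it can't, the check is a highly predictable compare-and-branch.
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v.at(i);
    }
    return total;
}
```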
Back to 40 years ago (Score:4, Interesting)
When I was programming in Turbo Pascal in the 80s, we had the compiler flag {$R+} for automatic range checking and {$Q+} for overflow checking, but since we were working on 8088s and 8086s, we used to turn them off to reduce the code size and make the code run faster. It's funny how we've come full circle.
Re: (Score:2)
People (mostly C and asm programmers) used to rip on Ada's bounds checking overhead. Being fast and small and cheap was more important than being more reliable.
Re: (Score:2)
And it seriously did make a difference on the hardware of the time. CPUs are not the same any longer, though. They can chew on massively more bloat without blinking these days. Bounds checking has become insignificant compared to everything else that keeps piling on.
Re: (Score:2)
Re: (Score:2)
Then the assembler comes out.
Re: (Score:1)
Bounds checking in C/C++ or assembly is just the same.
On modern processors it is pretty tough to write assembly code that beats C/C++ in speed.
However, there was an article about a week ago where they hand-coded GPU code and got extreme performance boosts in some areas.
Re: (Score:2)
That's simply incorrect. Performance still matters when there's real money at stake.
You can say many things about Google, but not being "real money" isn't one of them. Anyway, they're operating at the kind of scale where it is "real money", but the fact is that on a modern processor with out-of-order execution, speculative execution and branch prediction, bounds checking is often completely free. The happy path is predicted correctly and the chances are your code doesn't have quite enough ILP to completely fill up, s
Re: (Score:2)
Re: (Score:1)
On an ARM a bounds check is usually only one instruction, as it includes the branch.
So if you have a loop with 100 instructions, it adds only a single one.
No idea about current Intel processors. If you can keep the bound (or the last valid address of an array) in a register, and the index as well, this is simple and fast (a source-level sketch is below).
On old 8/16-bit hardware you usually didn't have that many registers, so for every comparison you had to load the bounds into the processor again.
I think on a 6502, that would be roughly ten inst
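For the curious, this is roughly what such a check looks like at the source level; on a modern ISA it typically compiles down to a compare plus a well-predicted conditional branch (a sketch, not output from any particular compiler):

```cpp
#include <cstddef>

// Returns data[i] if i is in range, otherwise a fallback value. With the
// length kept in a register, the check is a single unsigned comparison
// followed by a branch the predictor almost always gets right.
int get_or(const int* data, std::size_t len, std::size_t i, int fallback) {
    if (i < len) {
        return data[i];
    }
    return fallback;
}
```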
Re: (Score:2)
I think on a 6502, that would be roughly ten instructions to do a bounds check.
Instructions or clock cycles?
You couldn't use a register (not enough), but zero page is available for faster access
Good and bad (Score:4, Interesting)
"Thinking in terms of invariants: we ground our security design by defining properties that we expect to always hold for a system"
Yeah, that's the right approach. Thinking about what you want your code to do, then proving (or demonstrating) that it does it; a small sketch of that idea in code is below. Unfortunately they also have this:
"The responsibility is on our ecosystem, not the developer"
This is false. You need to train your developers (unless they're already skilled). Programming languages give enough power to write security holes, and that's not going to change (for example, a PII leak can be written into almost any API call). A security program that doesn't take developers into consideration is destined to fail.
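As a tiny illustration of the "invariants" framing quoted above (my sketch, not Google's code): encode the property in a type and check it once at the boundary, so the rest of the code can rely on it unconditionally.

```cpp
#include <stdexcept>

// Invariant: 0.0 <= value <= 100.0, established at construction and never
// violated afterwards, so callers can rely on it without re-checking.
class Percentage {
public:
    explicit Percentage(double value) : value_(value) {
        if (value_ < 0.0 || value_ > 100.0) {
            throw std::invalid_argument("percentage out of range");
        }
    }
    double value() const { return value_; }

private:
    double value_;
};
```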
Re: (Score:3)
A security program that doesn't take developers into consideration is destined to fail.
And that is just it. Whenever you do engineering (and coding is), engineer skill is the one critical factor you absolutely must keep in mind and must keep high. If you do not, your software (as the example at hand) will be insecure, unreliable and generally badly made.
These days, security problems stemming from unsafe memory are not even the majority of issues. It is just that they can be especially devastating if the people involved are really incompetent (see the recent CrowdStrike disaster for an example of how
Just means C++ was the wrong choice (Score:1)
If you need memory safety or expect significant benefits from it, use a memory-safe language and stop complaining that not all languages are. Some languages are not memory-safe by design; that has its place and also has significant benefits in some scenarios.
The very definition of polishing a turd (Score:2)
I know you can't trivially rewrite your hundreds of thousands of lines of C/C++ code in safer languages, so this sort of thing is a decent mitigation strategy.
But it is also the very definition of polishing a turd. You are using fundamentally insecure languages (because that was not even a concern at the time, fair enough, I have also written tens of thousands of lines of C and C++ code). It is impossible to ever actually make them secure. Normally this does not matter, but it does when you are a big fat target like
Re: (Score:2)
You are using fundamentally insecure languages (because that was not even a concern at the time, fair enough, I have also written tens of thousands of lines of C and C++ code). It is impossible to ever actually make them secure.
Which language do you write in to make your code secure?
Re: (Score:2)
Currently, if you need C-level performance, the only real option for safe code is Rust. There are other wannabes like Zig, Fil-C, and TrapC, but they all have significant caveats - if you actually want production code, use Rust. The C++ industrial complex claims you can use the latest version with the STL, but no, it's just more turd polishing.
If you don't need C-level performance, there are lots of options like Python or C#. In this case the only vulnerabilities are in their VMs, and they've had millions of people
Re: (Score:2)
Re: (Score:1)
It is impossible to ever actually make them secure.
What do you mean by secure? Secure as in not attackable, or secure as in not crashing, i.e. not having memory bugs?