Google Says CPU Patches Cause 'Negligible Impact On Performance' With New 'Retpoline' Technique (theverge.com) 120
In a post on Google's Online Security Blog, two engineers described a novel chip-level patch that has been deployed across the company's entire infrastructure, resulting in only minor declines in performance in most cases. "The company has also posted details of the new technique, called Retpoline, in the hopes that other companies will be able to follow the same technique," reports The Verge. "If the claims hold, it would mean Intel and others have avoided the catastrophic slowdowns that many had predicted." From the report: "There has been speculation that the deployment of KPTI causes significant performance slowdowns," the post reads, referring to the company's "Kernel Page Table Isolation" technique. "Performance can vary, as the impact of the KPTI mitigations depends on the rate of system calls made by an application. On most of our workloads, including our cloud infrastructure, we see negligible impact on performance." "Of course, Google recommends thorough testing in your environment before deployment," the post continues. "We cannot guarantee any particular performance or operational impact."
Notably, the new technique only applies to one of the three variants involved in the new attacks. However, it's the variant that is arguably the most difficult to address. The other two vulnerabilities -- "bounds check bypass" and "rogue data cache load" -- would be addressed at the program and operating system level, respectively, and are unlikely to result in the same system-wide slowdowns.
You can't "patch" hardware (Score:1, Interesting)
Re: You can't "patch" hardware (Score:2, Informative)
You can fix the microcode. You can also include software workarounds for hardware flaws. An example was the Pentium F00F bug, which was addressed by the operating system.
Re:You can't "patch" hardware (Score:5, Informative)
Geez... You make it sound like this is the first time ever that someone has had to write a software patch to work around a hardware flaw. Driver developers have had to come up with clever workarounds for hardware defects since the dawn of computing.
These Intel firmware fixes are just going to become part of yet another security update that will be required to keep systems secure.
Re:You can't "patch" hardware (Score:4, Insightful)
Re: (Score:1)
Based on the summary, this is a fix that dramatically reduces the impact of Meltdown (I'm too lazy to read up, as it doesn't directly affect me). If they found a way to keep Meltdown's cost at the lower bound, they're doing all right.
The lower bound is about 5% (the initial patch on a PCID-supporting processor cost 7% in an artificial Postgres benchmark that was more prone to slowdown than real life). If they found a way to get older chips to that point, and shave a little bit off there, it dramatically reduces the problem.
It
Re: (Score:2)
I don't understand this talk about 'dramatically reducing the problem'. Either there is an exploitable flaw or not. If the fix only makes implementing the type of exploit harder, then it's not going to help at all. Some assembler freak and malware author somewhere in the world will still make it work.
I'm not claiming that there is no fix, only that mere workarounds may be of limited value. What I've read so far hasn't really reassured me. The same can be said about rowhammer, btw. What's so worrying about
Re: (Score:2)
Sure, but it's kind of like the Intel Pentium F00F bug. The underlying hardware issue will always be there, but the OS kernel can prevent that instruction from being run on the system.
amd needs desktop level server chips / ipmi boards (Score:2)
AMD needs desktop-level server chips and IPMI boards, like Intel's Xeon E3.
Ryzen PRO chips fully support ECC, so we just need a few boards with IPMI.
Threadripper is a nice workstation system.
Threadripper boards with IPMI would be nice, as it has higher clocks with fewer cores than EPYC chips.
A full EPYC board is overkill for smaller site hosts.
Re: amd needs desktop level server chips / ipmi bo (Score:5, Informative)
Re: (Score:2, Informative)
Sorry, but ARM says it does apply to some of the ARM models. Variant 3: rogue data cache load (CVE-2017-5754) is Meltdown.
https://developer.arm.com/support/security-update
For AMD's sake, I hope their assessment about Ryzen's different architecture is 100% correct. If someone should come up with a POC working on these, AMD would be completely screwed.
"Lesser" is subjective. It appears that Meltdown can be mitigated if not negated by the KAISER patches to operating systems but Spectre needs to have software (
Re: (Score:1)
This is a hardware level problem. This will be continued to be exploited pretty much indefinitely.
Have you looked at the actual retpoline patches rather than simply inserting foot? It is an interesting approach to block speculative fetching by using indirect jumps/calls/returns.
time flies (Score:5, Funny)
Pentium 4.99989 disaster seems like yesterday.
Or just Buy AMD & get no slow down with more p (Score:5, Informative)
Or just buy AMD and get no slowdown, plus more PCIe lanes.
Re: (Score:2, Insightful)
This probably offers a false sense of security. It's very possible that there are bugs lurking in AMD hardware that are just as severe. Just because AMD processors aren't susceptible to Meltdown doesn't mean there aren't other vulnerabilities unique to AMD processors.
And sticking with Intel even after this patch probably offers a false sense of security. It's very possible that there are more bugs lurking in Intel hardware that are just as severe. Just because Intel processors have been patched for Meltdown doesn't mean there aren't other vulnerabilities unique to Intel processors.
Re:Idiotic Moderation (Score:5, Insightful)
Re: (Score:3)
Because it doesn't make sense: Intel has a KNOWN UNFIXABLE FLAW in Meltdown. It cannot be fixed. You are saying "don't switch to AMD because they might have a major flaw too at some point". Meltdown is a much larger problem than Spectre is.
Except that I read the write-up by the team and it did NOT say that AMD was immune to Meltdown. It actually said that they were able to get AMD processors to execute the pipelines but were unable to read it before the cache was invalidated. They speculated that a more optimized attack may be able to read the cache but they did not know for sure if it was possible. Thus they were not able to use their existing attack against AMD but that does not mean that it is not possible. AMD claimed that those pipel
Re: (Score:1)
Based on what I read.
AMD said they're immune to Meltdown because they enforce the protection of kernel memory more strictly.
Intel said 90% of the last five years, not 90% of vulnerable.
This isn't to shit on Intel; the 5ish percent slowdown on CPUs that support PCID isn't so bad. It's just a clarification of how I've understood the news.
Re:Idiotic Moderation (Score:5, Informative)
Correction, they [meltdownattack.com] speculated that they were able to get AMD chips to do that. Their toy attack (within a process) succeeded, showing that AMD chips will do speculative ordering. There is no actual security risk there, because processes can read their own memory.
BUT, they didn't know for a fact why they didn't succeed in attacking the kernel.
We've now had statements from AMD (after the paper was released) - namely, that permission bits are checked BEFORE issuing instructions so kernel memory isn't readable, even speculatively.
So... yeah, remember the paper is only what they think could be happening.
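The distinction described in this comment (permission bits checked before the load is issued, versus a fault raised only after the speculative work has already touched the cache) can be sketched as a toy model. This is an illustration of the idea as stated in the thread, not a description of any real microarchitecture; `ToyCPU`, `speculative_load`, `probe`, `attack`, and `KERNEL_BASE` are all hypothetical names.

```python
# Toy model only: real CPUs do not work like this Python class.
# Every name here is hypothetical and exists purely for illustration.

KERNEL_BASE = 0x1000  # addresses at or above this count as "privileged"


class Fault(Exception):
    """Raised when user-mode code touches a privileged address."""


class ToyCPU:
    def __init__(self, memory, check_before_load):
        self.memory = memory                    # dict: address -> byte value
        self.check_before_load = check_before_load
        self.cached_lines = set()               # stands in for cache state

    def speculative_load(self, addr):
        """User-mode load of addr, followed by a value-dependent access."""
        privileged = addr >= KERNEL_BASE
        if privileged and self.check_before_load:
            # Permission checked first: nothing is loaded, nothing is cached.
            raise Fault(hex(addr))
        value = self.memory[addr]
        # The dependent access leaves a value-indexed trace in the "cache".
        self.cached_lines.add(value)
        if privileged:
            # Permission checked late: the fault arrives after the trace exists.
            raise Fault(hex(addr))
        return value


def probe(cpu):
    """Attacker side: report which line ended up cached, if any."""
    return next(iter(cpu.cached_lines), None)


def attack(cpu, addr):
    """Attempt the privileged read, swallow the fault, then probe the cache."""
    try:
        cpu.speculative_load(addr)
    except Fault:
        pass
    return probe(cpu)
```

In this sketch, `attack(ToyCPU(mem, check_before_load=False), addr)` recovers the byte stored at a privileged `addr`, while the same call with `check_before_load=True` recovers nothing, which is the behaviour the AMD statement quoted in the thread claims for its hardware.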
Re: (Score:3, Interesting)
AMD pushed a patch [1] to disable the workaround for Meltdown on AMD CPUs. That means they are 100% sure that their CPUs are immune.
[1] https://lkml.org/lkml/2017/12/27/2
Re: (Score:2)
In short, they're both full of bullet holes, but AMD at least has fewer of them.
Re: (Score:1, Interesting)
The shill gets modded up while posts get modded down for pointing out why the shill is giving bad advice.
Given the choice of buying one of two x86-64 processors you would choose the Intel one that has a known critical security flaw that can only be mitigated with a performance crippling software patch rather than the AMD one that does not have this flaw. I think it's quite obvious who the shill is on this one and he/she is in some pretty serious damage control at the moment.
Re: Idiotic Moderation (Score:4, Interesting)
I take it you didn't read AMD's press release explaining exactly what you say you want to hear.
It's true that all processors have errata and can have bugs/flaws/security weaknesses... but, the Meltdown flaw which does not affect AMD is a specific kind which can't affect AMD because of architecture differences. Specifically, AMD checks to make sure user land code doesn't try to access kernel data without the correct permissions before executing predictive branches on it. Intel doesn't -- it goes ahead and runs the illegal code before flagging an exception to dump the branch after the fact. So, for a short time, there's data in cache on an Intel chip that should NOT be there because it should never have been accessed by the system to begin with.... and a specially crafted program can read it before it's flushed. This is because Intel (and ARM and others) chose a certain optimization for their speculative engine while AMD chose a different, more secure architecture.
https://www.pcgamesn.com/intel... [pcgamesn.com]
AMD's fix is -- no fix needed b/c we weren't stupid enough to let even speculative code run without checking its permissions first.
Per AMD for the initial Linux kernel patch:
AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault.
AMD is definitely vulnerable to lesser exploits: some are patched, others are mitigated, and some are obfuscated because they are specific to a processor generation. But, by design, they are not vulnerable to Meltdown or any variant like it.
Now remember... the fix for Meltdown is to flush the cache -- all levels -- when switching from user mode to kernel mode or vice versa.... every single time. That's a heck of a hit for some use cases. I believe Intel has found some ways to mitigate it with their 8th gen core series and will likely tinker with a better patch in the future.
It is absolutely a great idea to purchase an AMD processor if it suits the needs of one's business for those use cases where it will perform better than an Intel chip that is crippled by this horrendous bug -- all things being equal. Obviously, businesses have contracts with 3rd party suppliers and don't necessarily get to pick and choose every aspect of hardware, nor is AMD a savior necessarily if their total cost of ownership is higher because of servicing more varieties of equipment, dealing with more motherboard types and vendors, electricity / Air conditioning costs, etc.
One doesn't have to be a shill for AMD to notice it's obvious that Intel has a serious hardware flaw that AMD lacks and while any CPU can have errata, most can be patched with negligible effects. Intel having to flush caches between modes is a serious flaw if one runs programs that switch modes constantly. For average users and even gamers, there's not a huge impact. I'm running the patch right now for Windows and I can tell it affects Virtual Machines and a bit of file serving, but not enough for me to be too upset about it. If I had a high-end cluster for databases, a 20% hit to that would definitely make me want to check out AMD as an alternative... b/c even IF AMD has a bug that needs patching, it's unlikely to ever affect performance like this one does by requiring cache flushes to avoid having processes of user and kernel modes running at the same time for fear of one stealing data from the other.
Re: Idiotic Moderation (Score:1)
Re: (Score:1, Interesting)
A flaw has just been discovered in my car where it has a 20% chance of spontaneously bursting into flames when you turn on the ignition. However, I've decided to keep buying the same model of car because other cars likely have equally severe issues that just haven't been discovered yet.
Be smart - keep buying shit.
Re: (Score:1, Informative)
Re: Idiotic Moderation (Score:5, Insightful)
Is there a compelling reason to believe that AMD processors are less likely to be vulnerable in the future than Intel processors?
Right now only Intel is massively exposed on one security issue where other manufacturers are not. So yes - this makes it appear that AMD design philosophy values security over performance. Whether that is proved out remains to be seen.
If one manufacturer is cutting corners with the engineering and the other isn't, then there's a logical reason.
Intel seems to be the one cutting corners, and has been for decades. You do remember the FDIV and F00F bugs in early Pentiums? I don't recall other manufacturers having problems so severe (sure, mainly PR with FDIV) that a recall was required.
Otherwise, there isn't a logical basis for using that as a reason to change your behaviour in the future.
Intel cannot provide CPUs to retail without this flaw for another 18 months or so. That should most certainly influence short-term future behaviour IF the fix causes significant performance issues with your workload.
It's also entirely possible that, faced with backlash and distrust, the manufacturer might take additional steps to ensure that no such similar issues occur in the future. If there was demonstrable evidence of this, it might be a good reason not to switch.
It sounds strange not to switch to a vendor that doesn't suffer from this vulnerability, in the hope that Intel will fix its processes to ensure this doesn't happen again. Right now, though, there's no good reason to specify Intel for your CPUs.
The important question is whether there is any reason to believe Intel processors will be more vulnerable in the future.
Why is that important? All manufacturers will have problems. You make plans with known data today. Intel messed up big time, and until the problem is fixed they should absolutely have this issue in the 'known problems' pile when consideration of CPU choice is done.
Re: (Score:1)
Re: (Score:2)
I recall the FDIV bug quite well, and it had nothing to do with cutting corners. The design of the circuit was correct. In the transfer to manufacturing, some relatively insignificant bits in a hardware lookup table were truncated erroneously. The rarity of the failures allowed the mishap to escape detection in the validation phase.
Intel's test probably should have been stronger in this area,
Re: (Score:2)
I think this would make sense if you had the vendors at rough sales parity and the virtualization vendors had healthy experience on both platforms so all the gotchas of moving live workloads between CPU vendors were understood and mitigated.
It might actually not work well or require heterogeneous vendor-specific clusters to avoid CPU feature masking that dumbed both vendor platforms to some lowest common denominator.
Re: (Score:2)
Google, Microsoft, and Amazon dwarf Intel. They should not be waiting around for sales parity. They should be creating vendors if the vendors they need aren't there.
In past industries, powerful industries would foster competition amongst their suppliers even if it involved significant loss. It is a necessary business expense that leads to many benefits including competition, diversity in supply (we are vulnerable to terrorists taking out foundries and countries cutting chip supplies today), and diversity in
Re: (Score:2)
My guess is that the broadest explanation is that Google, Microsoft and Amazon largely want x86 compatibility because of the efficiencies associated with the network effect of a widely adopted processor, both in terms of software availability and in terms of platform stability.
As AMD (and failed competitors) have shown, a competing platform to Intel's CPUs isn't easy to pull off. Google, et al, could pay a subsidy to AMD to produce a competing product but there's no guarantee they would get one and they wo
Re: (Score:2)
Some cpu generations will have both issues. Some one issue. Very few will not have any problem.
More lies (Score:2)
Re: (Score:3)
I definitely don't see how requiring you to replace GCC and recompile every single binary is "chip-level".
Re:More lies (Score:5, Interesting)
Re: (Score:1)
Re: (Score:2)
Exactly, you can't provide a general fix to chip-level security problems by changes to "programs". People can compile their own programs and have root access on VMs that they control.
However, Google controls the hypervisor and presumably, it's at this level that the attack can be blocked or mitigated.
Re: (Score:3)
Google's technique requires patching binaries/code (Score:5, Interesting)
Re:Google's technique requires patching binaries/c (Score:5, Insightful)
Google's technique ... has a small performance hit but much smaller than KPTI.
Keep in mind Google's technique (retpoline) is not an alternative to KPTI. Retpoline addresses Variant 2. KPTI addresses Variant 3. Both are required.
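Since both mitigations are needed, it can help to check what a given machine reports for each variant. On Linux kernels that carry the vulnerability-reporting patches, status files appear under `/sys/devices/system/cpu/vulnerabilities`. The sketch below reads them; the function name `mitigation_status` is hypothetical, and on older kernels or other operating systems the files are simply absent.

```python
from pathlib import Path

# Linux exposes one status file per vulnerability in this directory on
# kernels that include the reporting patches. Older kernels lack it.
SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigation_status(directory=SYSFS_DIR,
                      names=("meltdown", "spectre_v1", "spectre_v2")):
    """Return {name: status text, or None if the kernel does not report it}."""
    status = {}
    for name in names:
        try:
            status[name] = (Path(directory) / name).read_text().strip()
        except OSError:
            # Missing file or directory, or no permission to read it.
            status[name] = None
    return status


if __name__ == "__main__":
    for name, text in mitigation_status().items():
        print(f"{name}: {text or 'not reported by this kernel'}")
```

Here `meltdown` corresponds to Variant 3 and `spectre_v2` to Variant 2, so a machine is covered in the sense of this comment only when both report a mitigation.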
Retpoline is for Spectre (Score:1)
Meltdown patch (KPTI) will still hurt applications with lots of syscalls, or lots of userspace->kernel context switches.
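A back-of-the-envelope way to see why the syscall rate matters: if each user/kernel transition costs some extra time under KPTI, the total overhead scales linearly with how many transitions a workload makes per second. The sketch below assumes a single fixed extra cost per syscall, which is a simplification; the 500 ns figure is a made-up placeholder and not a measurement, and `kpti_overhead_fraction` is a hypothetical name.

```python
def kpti_overhead_fraction(syscalls_per_second, extra_ns_per_syscall):
    """Fraction of one core-second spent on the added per-syscall cost.

    Assumes a fixed extra cost per transition, which is a simplification.
    """
    return syscalls_per_second * extra_ns_per_syscall * 1e-9


# Placeholder cost per syscall; substitute a number measured on your system.
ASSUMED_EXTRA_NS = 500

for rate in (1_000, 100_000, 1_000_000):
    fraction = kpti_overhead_fraction(rate, ASSUMED_EXTRA_NS)
    print(f"{rate:>9} syscalls/s -> {fraction:.2%} of one core")
```

Under this model a compute-bound job that rarely enters the kernel sees almost nothing, while an I/O-heavy service making many small calls pays proportionally more, which matches the point made in the comment.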
Summary not very helpful, here's my attempt. (Score:5, Informative)
Google has created "retpoline", a technique that lets an indirect branch (e.g. a vtable call) execute without attacker-influenced speculation, by steering branch target prediction into a harmless loop that has no side effects. This addresses Variant 2 (branch target injection, one of the two Spectre variants).
Retpoline does not depend on or assist a CPU or an OS patch: it is done purely at the software level, per-app, by a compiler. There is no simple OS-wide patch.
Google says a retpoline call has performance "within cycles" of a regular old mispredicted branch. The zero-cost predictions we're used to are a thing of the past, because it effectively forces misprediction. I'd be curious to see a benchmark of an indirection-heavy platform like .NET.
This does not help address or optimize Variant 3, which is what the big kernel patches for Page Table Isolation are needed for. So, your I/O-dependent apps like databases are still going to take a big performance hit. Nor does it address Variant 1.
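The cost argument in this summary can be put as simple arithmetic: a normal indirect call pays the cheap path when the predictor is right and the expensive path when it is wrong, whereas a retpoline call always pays roughly the misprediction price. The sketch below is a rough model under that assumption; the cycle counts in the example are invented placeholders, and both function names are hypothetical.

```python
def avg_indirect_call_cycles(hit_rate, hit_cycles, miss_cycles):
    """Expected cycles per ordinary indirect call with branch prediction."""
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


def retpoline_call_cycles(miss_cycles, extra_cycles=0.0):
    """Rough model: every call costs about one misprediction, plus a few
    extra cycles for the thunk itself."""
    return miss_cycles + extra_cycles


# Invented numbers for illustration only; real costs depend on the CPU.
before = avg_indirect_call_cycles(hit_rate=0.95, hit_cycles=1, miss_cycles=20)
after = retpoline_call_cycles(miss_cycles=20, extra_cycles=2)
print(f"predicted call: {before:.2f} cycles, retpoline call: {after:.2f} cycles")
```

The gap grows with the predictor's hit rate, which is why indirection-heavy code with well-predicted branches is the interesting case to benchmark.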
Re: Summary not very helpful, here's my attempt. (Score:2)
EXACTLY. The summary is horrible. It made it sound like Google invented a novel technique that makes the KPTI/Variant 3 (Meltdown) mitigation slowdown "negligible". But actually the blog post simply says:
Google is connected to Intel at the hip (Score:5, Insightful)
Re: (Score:2)
Google is dependent on Intel CPUs at the moment and has a vested interest in not saying "well, our cloud just got 5-30% slower."
Exactly the same as their competitors, including in-house data centers as well as other cloud providers.
I think they're not looking at the big picture (Score:1)
These three exploits are instances, not three different principles. The principle is the same, and there is no reason to suspect that there won't be more instances that follow that principle. CPUs speculatively execute code and load cache lines based on that execution. Intel CPUs can furthermore access privileged memory when unprivileged code is executed speculatively. That's the principle. The way the speculatively executed code is guarded and the speculative window is widened differs between the three exp
Seriously misleading (Score:3)
Not only do they misspell the name of the mitigation technique, the "retpoline" technique only protects against the indirect branch variant of Spectre. The fix for Meltdown is still KPTI, with all the same overhead that involves. The "negligible impact on performance" is on top of the KPTI changes.
Re: (Score:1)
I don't think many ARM CPUs use out-of-order execution.
Posting primarily to be corrected if I'm wrong.
Just installed the Win 10 patch on my i5 7500 (Score:2)
Re: (Score:1)
SSD I/O seems to get hit the hardest. Check there and see where you're at. On an ancient dual Xeon system I took a 30% hit.
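One way to "check there" is to time a large number of tiny unbuffered reads before and after installing the patch. The sketch below is a minimal harness, with the hypothetical name `small_reads_per_second`. Note its limits: the file will normally be served from the OS cache, so this measures the per-call overhead of entering the kernel rather than the drive itself, and results vary from run to run.

```python
import os
import tempfile
import time


def small_reads_per_second(num_reads=20_000, read_size=1):
    """Time tiny unbuffered reads; each iteration issues a seek and a read."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * 4096)
        os.close(fd)
        # buffering=0 gives a raw file object, so Python does not batch reads.
        with open(path, "rb", buffering=0) as f:
            start = time.perf_counter()
            for _ in range(num_reads):
                f.seek(0)
                f.read(read_size)
            elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return num_reads / elapsed


if __name__ == "__main__":
    print(f"{small_reads_per_second():.0f} small reads per second")
```

Running it a few times on the same machine before and after the update, and comparing the averages, gives a rough sense of how much the syscall path has slowed down.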
How about compressed/encrypted code? (Score:2)
Compiler support? (Score:2)