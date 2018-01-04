

 


Google Says CPU Patches Cause 'Negligible Impact On Performance' With New 'Retpoline' Technique (theverge.com) 78

Posted by BeauHD from the good-news-for-chipmakers dept.
In a post on Google's Online Security Blog, two engineers described a novel chip-level patch that has been deployed across the company's entire infrastructure, resulting in only minor declines in performance in most cases. "The company has also posted details of the new technique, called Retpoline, in the hopes that other companies will be able to follow the same technique," reports The Verge. "If the claims hold, it would mean Intel and others have avoided the catastrophic slowdowns that many had predicted." From the report: "There has been speculation that the deployment of KPTI causes significant performance slowdowns," the post reads, referring to the company's "Kernel Page Table Isolation" technique. "Performance can vary, as the impact of the KPTI mitigations depends on the rate of system calls made by an application. On most of our workloads, including our cloud infrastructure, we see negligible impact on performance." "Of course, Google recommends thorough testing in your environment before deployment," the post continues. "We cannot guarantee any particular performance or operational impact."

Notably, the new technique only applies to one of the three variants involved in the new attacks. However, it's the variant that is arguably the most difficult to address. The other two vulnerabilities -- "bounds check bypass" and "rogue data cache load" -- would be addressed at the program and operating system level, respectively, and are unlikely to result in the same system-wide slowdowns.
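For the "bounds check bypass" variant, a program-level fix usually means making sure an out-of-range index cannot be used even on a mispredicted path. Below is a minimal sketch in C of one such approach, branchless index masking. The helper names `index_mask` and `load_clamped` are hypothetical, and the sketch assumes sizes below 2^63 and a compiler that implements right shift of negative signed values as an arithmetic shift (GCC and Clang do). It illustrates the idea and is not a vetted mitigation.

```c
#include <stdint.h>

/* Hypothetical helper: returns all-ones when idx < size, zero otherwise,
 * using only arithmetic so there is no branch for the CPU to mispredict.
 * Assumes size < 2^63 and arithmetic right shift of negative values. */
uint64_t index_mask(uint64_t idx, uint64_t size)
{
    /* If idx < size, both idx and (size - 1 - idx) have a clear top bit.
     * If idx >= size, the subtraction wraps around (or idx itself has the
     * top bit set), so the top bit of the OR is set. */
    int64_t t = (int64_t)~(idx | (size - 1 - idx));
    return (uint64_t)(t >> 63);
}

/* Hypothetical bounds-checked load. The `if` is the ordinary architectural
 * check; the mask forces idx to 0 on a path where that check was
 * speculatively bypassed, so no out-of-bounds address is formed. */
uint8_t load_clamped(const uint8_t *table, uint64_t size, uint64_t idx)
{
    if (idx >= size)
        return 0;
    idx &= index_mask(idx, size);
    return table[idx];
}
```

For a four-entry table, `load_clamped(table, 4, 2)` reads entry 2 as usual, while `load_clamped(table, 4, 9)` returns 0 without touching memory.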

  • Pentium 4.99989 disaster seems like yesterday.

  • Or just Buy AMD & get no slow down with more p (Score:5, Informative)

    by Joe_Dragon ( 2206452 ) on Thursday January 04, 2018 @08:02PM (#55866139)

    Or just buy AMD and get no slowdown, plus more PCIe lanes.

    • This incident highlights the importance of maintaining vendor diversity in data centers. Modern processors are complex enough that it is not unlikely that any given design has problems waiting to be discovered. It would seem wise for large-scale clients to hedge their bets by having a mix of devices carrying their workload. Imagine the damage if someone discovered a means of bricking Intel processors and added the payload to one of the better viruses.

    • Re: (Score:2)

      by AHuxley ( 892839 )
      Think of the problem as a Venn diagram, with the two CPU "vulnerabilities" as sets of CPUs within it.
      Some CPU generations will have both issues, some only one. Very few will have neither.
  • This isn't a "chip-level" patch. The spin control here is admirable.

    • Re: (Score:2)

      by nyet ( 19118 )

      I definitely don't see how requiring you to replace GCC and recompile every single binary is "chip-level".

      • It isn't "chip level". The Intel PR spin is out in full effect. Meltdown is a major flaw that can only be fixed by removing the flawed Intel processor and replacing it with a processor that doesn't contain the flaw. If you don't do that, the best you can do is mitigate the effects. There is no microcode fix either. What Google is doing is recompiling everything, which is fine, but hackers aren't going to do that.

    • Exactly, you can't provide a general fix to chip-level security problems by changes to "programs". People can compile their own programs and have root access on VMs that they control.

      However, Google controls the hypervisor and presumably, it's at this level that the attack can be blocked or mitigated.

      • Exactly. The funny thing is these "cloud companies" always control their own infrastructure, so these types of "fixes" make sense. Everyone else is screwed.

  • Google's technique requires patching binaries/code (Score:4, Informative)

    by JoeyRox ( 2711699 ) on Thursday January 04, 2018 @08:08PM (#55866177)
    Google's technique is to patch binaries so that branches/calls don't use the branch prediction mechanism of the CPU, which has a small performance hit but much smaller than KPTI. I suppose the presumption is that harmful code relying on branch prediction would have to be compiled into the attacker's own binary, since most OSes prevent the self-modification of code segments/TLB entries once they've been placed into memory by the OS loader. But what about code segments generated entirely at runtime, including from interpreters and libraries like libjit?
    • How is patching software a 'chip-level patch?' Is the summary that wrong?
    • It works for Google because they run everything on their own infrastructure and have full control over it. They don't run it on someone else's "cloud". Rather ironic.

    • Google's technique ... has a small performance hit but much smaller than KPTI.

      Keep in mind Google's technique (retpoline) is not an alternative to KPTI. Retpoline addresses Variant 2. KPTI addresses Variant 3. Both are required.

  • Summary not very helpful, here's my attempt. (Score:4, Informative)

    by PhrostyMcByte ( 589271 ) <phrosty@gmail.com> on Thursday January 04, 2018 @08:47PM (#55866367) Homepage

    Google has created "retpoline", a technique which allows an indirect branch (e.g. a vtable call) to occur in a way that effectively disables speculative execution by isolating branch target prediction into a safe effectless loop. This addresses Variant 2 (branch target injection, one of the two Spectre variants).

    Retpoline does not depend on or assist a CPU or an OS patch: it is done purely at the software level, per-app, by a compiler. There is no simple OS-wide patch.

    Google says a retpoline call has performance "within cycles" of a regular old mispredicted branch. The zero-cost predictions we're used to are a thing of the past, because it effectively forces misprediction. I'd be curious to see a benchmark of an indirection-heavy platform like .NET.

    This does not help address or optimize Variant 3, which is what the big kernel patches for Page Table Isolation are needed for. So, your I/O-dependent apps like databases are still going to take a big performance hit. Nor does it address Variant 1.
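To make the above concrete, here is a minimal sketch in C of the kind of indirect branch a retpoline-enabled compiler rewrites. The `shape` type and the function names are hypothetical. The source is ordinary C; the rewrite happens at build time, for example with GCC's `-mindirect-branch=thunk` or Clang's `-mretpoline` on x86. The assembly in the comment roughly follows the sequence Google described and is shown for orientation only; actual output varies by compiler and version.

```c
#include <stddef.h>

/* Hypothetical "vtable": one function pointer per object. */
struct shape {
    long (*area)(const struct shape *self);
    long w, h;
};

long rect_area(const struct shape *s) { return s->w * s->h; }
long tri_area(const struct shape *s)  { return s->w * s->h / 2; }

/*
 * The call through s->area is an indirect branch. Normally it compiles to
 * something like `call *%rax`, whose target the CPU predicts and runs
 * speculatively. With retpoline enabled, the compiler routes it through a
 * thunk along these lines instead:
 *
 *       call  set_up_target
 *   capture_spec:
 *       pause
 *       jmp   capture_spec
 *   set_up_target:
 *       mov   %rax, (%rsp)
 *       ret
 *
 * The `ret` is predicted to return to capture_spec, so speculation spins
 * in that loop, while the real return address has been overwritten with
 * the intended target held in %rax.
 */
long total_area(const struct shape *const *shapes, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += shapes[i]->area(shapes[i]);   /* indirect call site */
    return sum;
}
```

Every pass through the loop pays the thunk cost, which is why indirection-heavy code is the interesting benchmark case.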

  • Google is connected to Intel at the hip (Score:3)

    by bongey ( 974911 ) on Thursday January 04, 2018 @08:52PM (#55866407)
    Google is dependent on Intel CPUs at the moment and has a vested interest in not saying "well, our cloud just got 5-30% slower."

    • Google is dependent on Intel CPUs at the moment and has a vested interest in not saying "well, our cloud just got 5-30% slower."

      Exactly the same as their competitors, including in-house data centers as well as other cloud providers.

  • These three exploits are instances, not three different principles. The principle is the same, and there is no reason to suspect that there won't be more instances that follow that principle. CPUs speculatively execute code and load cache lines based on that execution. Intel CPUs can furthermore access privileged memory when unprivileged code is executed speculatively. That's the principle. The way the speculatively executed code is guarded and the speculative window is widened differs between the three exp

  • Not only do they misspell the name of the mitigation technique, the "retpoline" technique only protects against the indirect branch variant of Spectre. The fix for Meltdown is still KPTI, with all the same overhead that involves. The "negligible impact on performance" is on top of the KPTI changes.
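Since the two mitigations stack, a back-of-the-envelope model can help reason about where a slowdown would come from: KPTI cost scales with the system-call rate, retpoline cost with the rate of indirect branches. The sketch below, in C, is only such a model. The function name and all per-event costs are hypothetical placeholders, not measurements, and a linear model ignores second-order effects such as extra TLB misses after a syscall returns; real overhead has to be measured on the workload in question.

```c
/* Hypothetical model: fraction of each CPU-second lost to mitigations.
 *   syscalls_per_sec      - system calls issued per second (drives KPTI cost)
 *   kpti_ns_per_syscall   - assumed extra nanoseconds per syscall under KPTI
 *   indirect_per_sec      - indirect branches executed per second
 *   retpoline_ns_per_call - assumed extra nanoseconds per retpolined branch
 */
double mitigation_overhead(double syscalls_per_sec, double kpti_ns_per_syscall,
                           double indirect_per_sec, double retpoline_ns_per_call)
{
    double extra_ns = syscalls_per_sec * kpti_ns_per_syscall
                    + indirect_per_sec * retpoline_ns_per_call;
    return extra_ns / 1e9;   /* overhead nanoseconds per second of CPU time */
}
```

With made-up inputs of 200,000 syscalls/s at 500 ns extra each and 10 million indirect branches/s at 5 ns each, the model gives 0.10 + 0.05 = 0.15, i.e. about 15% overhead. A compute-bound job making 1,000 syscalls/s would see almost nothing from the KPTI term, which matches the observation that impact depends on the syscall rate.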

