Google Security AMD Intel Operating Systems Software Hardware

Google's Project Zero Team Discovered Critical CPU Flaw Last Year (techcrunch.com) 124

An anonymous reader quotes a report from TechCrunch: In a blog post published minutes ago, Google's Security team announced what they have done to protect Google Cloud customers against the chip vulnerability announced earlier today. They also indicated their Project Zero team discovered this vulnerability last year (although they weren't specific with the timing). The company stated that it informed the chip makers of the issue, which is caused by a process known as "speculative execution." This is an advanced technique that enables the chip to essentially guess what instructions might logically be coming next to speed up execution. Unfortunately, that capability is vulnerable to malicious actors who could access critical information stored in memory, including encryption keys and passwords. According to Google, this affects all chip makers, including those from AMD, ARM and Intel (although AMD has denied they are vulnerable). In a blog post, Intel denied the vulnerability was confined to their chips, as had been reported by some outlets. The Google Security team wrote that they began taking steps to protect Google services from the flaw as soon as they learned about it.
This discussion has been archived. No new comments can be posted.


  • I wonder who else they informed. There is quite a zero-day hole here.

    • Last year was 4 days ago, so probably not a huge number of people.
  • by Anonymous Coward

    > Recent reports that these exploits are caused by a “bug” or a “flaw” and are unique to Intel products are incorrect.

    By joining the two claims with AND, the statement lets a casual reader think these are not bugs or flaws in Intel's processors. A close logical reading shows that the statement is accurate only because its second half (that the exploits are unique to Intel products) is incorrect.

    • by NicknameUnavailable ( 4134147 ) on Thursday January 04, 2018 @02:52AM (#55860927)
      There are two separate issues. One is specific to pulling things from core memory in Intel chips; the other is an architectural issue which impacts all chips made in the last decade or so and cannot be patched. They're focusing on the Intel one because that can be patched, whereas the architectural issue requires a redesign that isn't in place yet, will probably take years to pass QA properly and have the masks manufactured, and will require a complete recall of every chip made after the '90s. From the Snowden and other leaks we learned that all the hacker tools can leak without issue because nobody actually cares to exploit them but governments and corporations anyway, and they're pretty quiet about it most of the time. Additionally, we've known that between Intel ME and AMD's equivalent all the chips were already compromised. This is nothing new. We're already running through barbed wire naked and nobody gives a shit. If anything, this revelation of a security hole which can be patched is to make people believe things will be safe if they stick some more spyware on their machine, because the quality of data from people who know they're spied on is lower than from those who don't.
    • Intel PR seems to forget that they (usually) sell CPUs not to the final end user but to PC manufacturers or computer-skilled people. None of these people would be fooled by PR speeches.
  • Link to technical details for those that want it: https://security.googleblog.co... [googleblog.com]

  • If you aren't running virtual machines then it isn't an issue.

    This is more of a server attack and a web host attack.

    • by 93 Escort Wagon ( 326346 ) on Thursday January 04, 2018 @02:47AM (#55860917)

      This is more of a server attack and a web host attack.

      You might want to read this Mozilla blog post.

      https://blog.mozilla.org/secur... [mozilla.org]

    • by DrYak ( 748999 ) on Thursday January 04, 2018 @04:26AM (#55861089) Homepage

      This is more of a server attack and a web host attack.

      No, it's not specific to web servers.
      They do use web servers as an example of where the exploit might be applied, but it's not specific.

      Basically, this exploit abuses the way speculative execution is done to leak information out of kernel space into user space.
      (And there were presentations at the CCC of successful abuses done... in JavaScript. In a browser.)

      For more details :

      Most modern CISC processors (Intel, except for Atoms and Xeon Phi; AMD; etc.) are pipelined and do out-of-order execution.
      Executing a CISC instruction requires several steps (micro-ops) and, for performance reasons, they keep several instructions in flight (once instruction A goes out of step 1 and into step 2, you can already try pushing instruction B into step 1).
      To gain even more performance, CPUs try to be clever about this (instruction B actually needs results of instruction A, so it needs to wait. But the next instruction C actually can already be started, it doesn't depend on anything still in the pipeline).
      Bordering on crystal ball-clever (the next instruction B is a conditional jump. But it looks like a loop, so there's a high chance that it will jump back and repeat. We might as well start working back on instruction A, in case we are correct about this jump).
      That's speculative execution : working in advance on stuff that might not even be needed.
      (Sometimes, you end up needing to bail out of your speculation, throw the work away and restart because you got your crystal ball wrong. But it's better than just sitting there waiting).
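
The "it looks like a loop, so it will probably jump back" guess described above can be illustrated with a toy predictor. This is only a minimal sketch, assuming a textbook two-bit saturating counter; real CPUs use far more elaborate predictors, and the function name and the loop pattern below are made up for illustration.

```python
def simulate_two_bit_predictor(outcomes, state=2):
    """Count correct predictions of a 2-bit saturating counter.

    outcomes: iterable of booleans, True = branch taken.
    state: counter in 0..3; values 2 and 3 predict "taken".
    """
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


# A loop whose backward branch is taken 9 times and then falls through,
# executed 3 times in a row (hypothetical workload).
loop_branch = ([True] * 9 + [False]) * 3
hits = simulate_two_bit_predictor(loop_branch)
print(f"{hits}/{len(loop_branch)} predictions correct")  # 27/30 predictions correct
```

The three wrong guesses are the loop exits: those are the moments where the speculative work has to be thrown away.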

      now about memory :
      any modern processor worth its salt has memory protection, meaning it handles access rights : Which process can read-write which virtual addresses ?
      Usually, sensitive information in the OS is shielded away from the regular software.
      On a modern Linux, you can't crash the whole system by writing junk at the wrong address, like you used to do in the old MS-DOS days.
      If your software attempts to read something out of the system, the read attempt will be rejected.

      The exploit relies on how these two play together.

      It so happens that, in the case of Intel's processors (but not AMD's), the step where the data is loaded from the DRAM stick into the cache happens before the check of whether the read is valid.
      By the time the Intel CPU does the check, notices that the read is invalid and rejects it, quite a lot has happened.
      (Things got loaded into cache, other instructions have started their speculative execution in the pipeline, etc.)
      These things are measurable (you can measure the timing of some computation to guess what's in the cache and what's not).

      Meaning that it's possible to leak sensitive information that normally belongs to the OS and shouldn't be application-accessible, by doing a ton of such speculative executions and timings.
      At CCC there was a presentation of this done in JavaScript: technically, your browser right now could be executing some random JavaScript shit from some shady website in one of your background tabs and trying to learn as much about your OS as possible.
      Such information could further be used while mounting privilege escalations, or other attacks.

      In the specific situation of AMD processors, the check is done much earlier (according to their LKML post), so not much else has happened by then, and there's not much leakage from which you could learn anything.

      I have no idea how ARM64 chips are affected. (But it might also be the cache getting populated before the read attempts get rejected.)

      • by fubarrr ( 884157 ) on Thursday January 04, 2018 @05:38AM (#55861281)

        Yes, the problem is that if you check for page faults before starting to execute a branch, you must check for page faults on all branches, but if you check after the fact you only need to do the page fault check for the correct branch, which greatly reduces the performance penalty of memory protection checks.

      • by Shinobi ( 19308 )

        There's still a lot to be tested.

        One thing that's not been tested is the leakiness where you have mixed levels in a process, like hardware acceleration in browsers, or games using the GPU, on the Linux side. DSPs etc. also need testing.

      • by Anonymous Coward

        The important observation is AMD succumbed to this attack only when they were able to run user controlled code with kernel privileges, bypassing AMD rejecting the loads early. Tightening up on kernel exec exploits can fix the problem for AMD and spoils Intel's attempt to spread the blame.

        • by Shinobi ( 19308 )

          The problem is, due to the Unix architecture, a lot of the GPU system lives in the kernelspace while still executing userspace code, and a process can thus straddle both.

          On Windows, due to the GPU drivers being usermode, that's mitigated somewhat, but still not entirely safe.

          • The problem is, due to the Unix architecture, a lot of the GPU system lives in the kernelspace while still executing userspace code, and a process can thus straddle both.

            Yeah, but actually... Nope. Not at all.

            The only tiny bit that is running in kernel is the driver that receives the command stream and passes it to the actual physical GFX hardware for rendering.
            That's the DRM module, the tiny stuff with ".ko" at the end ("amdgpu.ko", etc.)
            Everything else in the rendering stack is handled by libraries (mesa's "libGL.so", "libdrm.so" and its hardware specific variants). All these libraries are in charge of handling all the simpler and nicer language and API that your software

            • by Shinobi ( 19308 )

              Yes, the tiny DRM bit, that controls mode setting, memory management via DMA-BUF (which conveniently also allows for CPU access...) and a whole lot of other neat kernelspace stuff.

              DMA-BUF's kmap in particular, used together with Spectre, will definitely need to be tested.

              Also, with CUDA, some OpenCL implementations, and some Vulkan implementations, you can build Compute Kernels that run both on GPU and CPU.

              Couple all those above, with the move towards UMA, and you have some serious testing that needs to be do

              • Yes, the tiny DRM bit, that {... long list skipped ..}

                None of which executes arbitrary code provided by the end-user, which was the entire point of the discussion.
                All the long list you give are fixed functions that the DRM performs when called.

                When given arbitrary code, none of it gets executed in kernel space. The kernel code only performs the task of pushing that code to the GPU for execution, it doesn't execute anything itself.

                (Also, compared to the actual Mesa userland, the DRM bit *is* small, even the parts which are not concerned by the execution of arbi

  • It was well known that on Pentium-line CPUs a speculatively executed branch can access protected memory, but it will just cause a page fault in the end.

    The first practical access timing exploit was discovered in 2016. Googlers just found out an even easier way last

  • What about my Commodore 64?

    • by DrYak ( 748999 ) on Thursday January 04, 2018 @05:13AM (#55861203) Homepage

      What about my Commodore 64?

      In all seriousness :

      - old, in-order, non-pipelined CPUs like the 6502 in your good old trusted C64 don't do speculative execution and thus aren't specifically affected by such exploits.

      but:

      - your 6502 doesn't do any form of memory protection: any piece of software can access any part of the whole system (because poking weird memory locations is how you control the hardware on such old systems), so any software has full access to anything.
      So your C64 is leaking sensitive information.

      (Later Motorola 68k CPUs (68030 and up) eventually started to include a built-in MMU to protect memory access, and thus later Amiga machines featuring them (A2500/30, A3000) can be made immune to OS information leaking into userland. That would be the first Commodore hardware, a vaguely remote cousin of your C64, to do so.)

      Yup, I'm giving a technical answer to a joke.

      • - old, in-order, non-pipelined CPUs like the 6502 in your good old trusted C64 don't do speculative execution and thus aren't specifically affected by such exploits.

        If I'm reading this correctly, older Intel Atoms are safe because they are in-order CPUs ( https://spectreattack.com/#faq [spectreattack.com]). I still have an Atom from 2010, and it's already slow enough so I'd rather leave it without KPTI. Of course, my important servers are all AMD.

        • If I'm reading this correctly, older Intel Atoms are safe because they are in-order CPUs ( https://spectreattack.com/#faq [spectreattack.com]). I still have an Atom from 2010, and it's already slow enough so I'd rather leave it without KPTI. Of course, my important servers are all AMD.

          ...and same for Xeon Phi.
          (Which take basically the same kind of in-order approach as Atoms, but linked together with a ginormous SIMD unit, the AVX512, some kind of ultra-SSE/AVX on steroids that borders on GPU territory. That shouldn't be a surprise, as Xeon Phi is basically what Intel salvaged out of their failed Larrabee GPU experiments.)

          According to the Wikipedia article [wikipedia.org] about Atom architecture, there's only one single micro-ops ever in flight from a given process (though they DO hyperthreading an

      • Actually, Commodore had a CPU expansion board for the A2000, the A2620, which carried an MC68020 with a 68851 MMU.
  • by Anonymous Coward on Thursday January 04, 2018 @04:12AM (#55861059)

    There are two exploits revealed here: Meltdown and Spectre.

    Intel, AMD, and some/all ARM chips are vulnerable to at least one of the two Spectre attacks, but patching Spectre will not produce any significant performance reductions.

    At this time, only Intel systems have exhibited vulnerability to Meltdown. Patching Meltdown comes with serious consequences.

    So AMD is basically correct in stating that they are not in the same position as Intel.

    • Boundary violations (Score:5, Interesting)

      by DrYak ( 748999 ) on Thursday January 04, 2018 @05:00AM (#55861177) Homepage

      Basically :

      AMD checks access rights first, and if the access is rejected nothing much happens.
      Meaning no leaks of kernel information into software running in user space.
      - Google only demonstrated user-space software accessing its own user-space info.
      - And by using some non-standard settings, it's possible to give bytecode to the kernel, and that piece of in-kernel software will access its own in-kernel info. (But you're already on the other side of the kernel fence.)
      Nothing gets across the kernel fence.

      Intel checks access rights much later on. By that time quite a lot has happened (e.g.: things could have been loaded in the cache, etc.). By measuring those things, you can deduce information that you should not have access to.
      It means that user-space software could end up getting sensitive information that normally should stay in the kernel.
      These subtle cache timings enable you to get information across the kernel fence into user-land.
      To mitigate this, each time user-land software calls into a kernel function (e.g., filesystem access), the OS needs to flush all of its own space from the accessible space. This comes at a big performance cost.
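
To get a feel for why hiding the kernel's mappings on every call is expensive, here is a rough sketch, assuming a toy LRU translation cache (TLB) that is emptied on each kernel entry. The trace, the TLB size and the function name are invented for illustration; real hardware and real kernels are more subtle than this model.

```python
from collections import OrderedDict


def count_tlb_misses(trace, tlb_entries=8, flush_on_syscall=False):
    """Count TLB misses for a trace of page numbers and "syscall" markers.

    flush_on_syscall=True models switching page tables on every kernel
    entry, which (in this toy model) empties the TLB each time.
    """
    tlb = OrderedDict()                  # page -> None, kept in LRU order
    misses = 0
    for event in trace:
        if event == "syscall":
            if flush_on_syscall:
                tlb.clear()
            continue
        if event in tlb:
            tlb.move_to_end(event)       # refresh LRU position on a hit
        else:
            misses += 1
            tlb[event] = None
            if len(tlb) > tlb_entries:
                tlb.popitem(last=False)  # evict the least recently used page
    return misses


# Hypothetical I/O-heavy workload: touch 4 pages, make a syscall, repeat.
trace = ([0, 1, 2, 3] + ["syscall"]) * 100
print(count_tlb_misses(trace), count_tlb_misses(trace, flush_on_syscall=True))  # 4 400
```

In this model the penalty scales with the number of kernel calls, which fits the observation that syscall-heavy and I/O-heavy workloads are the ones expected to suffer most.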

      • by swilver ( 617741 )

        I don't see how a different timing would allow you to deduce anything from something that was loaded into the cache. A cache load on modern CPUs consists of loading multiple bytes at a time (8, 16, 32), and a slight timing difference won't tell me the value of those, just whether something was loaded or not, not any of the individual hundreds of bits.

        Now, *if* you could do a second read on the same location *after* it was loaded into the cache (perhaps bypassing the security check as it won't hit DRAM)

        • by Pulzar ( 81031 )

          I don't see how a different timing would allow you to deduce anything from something that was loaded into the cache. A cache load on modern CPUs consists of loading multiple bytes at a time (8, 16, 32), and a slight timing difference won't tell me the value of those

          The way this works is that you don't load the protected data into cache, but you use it in a subsequent instruction to load one of the two addresses that you do have access to into cache. I.e., in some pseudo-assembly:

          load r1, [protected_addr]
          and
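
The "one of two addresses" idea in the reply above can be modelled without touching real hardware. The following is a toy sketch, assuming a cache represented as a Python set and made-up latencies; `LINE_A`, `LINE_B` and the function names are hypothetical and only stand for two memory lines the attacker is allowed to read.

```python
LINE_A = 0x1000  # hypothetical readable address; stands for secret bit 0
LINE_B = 0x2000  # hypothetical readable address; stands for secret bit 1

HIT_NS, MISS_NS = 40, 200  # made-up latencies for a cached vs. uncached read


def transient_gadget(secret_bit, cache):
    """Model of the dependent load: the secret only chooses WHICH line is touched."""
    cache.add(LINE_B if secret_bit else LINE_A)


def timed_read(addr, cache):
    """Return a simulated latency and leave the line cached, like a real load."""
    latency = HIT_NS if addr in cache else MISS_NS
    cache.add(addr)
    return latency


def recover_bit(secret_bit):
    cache = set()                        # start with both probe lines flushed
    transient_gadget(secret_bit, cache)
    # The attacker never reads secret_bit, only which probe line is fast.
    return 1 if timed_read(LINE_B, cache) < timed_read(LINE_A, cache) else 0


print([recover_bit(b) for b in (0, 1, 1, 0)])  # [0, 1, 1, 0]
```

So the timing never reveals the contents of a cache line; it reveals which line got loaded, and the secret decided that.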

  • Google did not test these vulnerabilities on any Zen based CPUs. They tested only on older processors:
    "AMD FX(tm)-8320, AMD PRO A8-9600 R7"

    https://googleprojectzero.blog... [blogspot.com.es]

    • If you dig into the details :

      AMD actually doesn't violate boundaries.
      As in their LKML post, they do the access rights checks before anything else, and if rejected nothing much happens that can be timed.
      Meaning there's no leaking of kernel information into user-space programs.

      The only thing that Google successfully demonstrated is :
      - leaking a user-space program's own information (yay!...). There's not much boundary violation here.
      - using some non-standard linux kernel settings, to send eBPF (the bytecode

      • I think I understand it better now.
        There are actually 3 vulnerabilities: 2 Spectre and 1 Meltdown.
        AMD Zen CPUs are actually affected by the first Spectre vulnerability and they admit to that: https://www.amd.com/en/corpora... [amd.com]

        The other Spectre vulnerability and Meltdown don't affect Zen. Meltdown is the vulnerability that needs the KPTI patch. Presumably there is some other patch on the way to fix Spectre.

        • Presumably there is some other patch on the way to fix Spectre.

          And according to the cited article, the mitigation to fix Spectre is much less costly.

          Also, Spectre exploits basically only work around things like array-bounds checks.
          I.e., the check that verifies you're not reading out-of-bounds memory might not have finished yet when the actual invalid read has already entered the pipeline.
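
The race between the bounds check and the read can be pictured with a toy model. This is a sketch under loud assumptions: the "CPU" is a Python function, the `predict_in_bounds` flag stands in for a branch predictor that guessed wrong, and the arrays and names are invented for illustration. It shows the idea only, not a working attack.

```python
PUBLIC = [1, 2, 3, 4]             # array the victim code is allowed to index
SECRET = [ord(c) for c in "key"]  # hypothetical data sitting right after PUBLIC
MEMORY = PUBLIC + SECRET          # flat toy address space


def victim(index, cache, predict_in_bounds):
    """Bounds-checked read; the toy CPU may run the body before the check resolves."""
    in_bounds = index < len(PUBLIC)
    if in_bounds or predict_in_bounds:
        # Executed for real, or transiently on a wrong "in bounds" guess.
        cache.add(MEMORY[index])  # stands in for touching a probe line chosen by the value
    # Only an architecturally valid read returns data.
    return MEMORY[index] if in_bounds else None


def leak(offset):
    cache = set()
    result = victim(len(PUBLIC) + offset, cache, predict_in_bounds=True)
    assert result is None         # the check "worked" as far as the program can see
    (value,) = cache              # but the cache footprint names the byte
    return chr(value)


print("".join(leak(i) for i in range(len(SECRET))))  # key
```

The bounds check still does its architectural job here, since the function never returns the out-of-bounds value. What escapes is the side effect of the read that ran before the check resolved.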

          Basically, it's a slap in the face of all "rust-trolls" who are touting array limits check, whenever there's a buffer overflow exploit mentioned.

          Using bound checking doesn't excus

      • by Alioth ( 221270 )

        Leaking a user space program's own information can be a serious risk, especially if that program can also execute arbitrary code. A web browser is an example of such code. They have done a proof-of-concept where Javascript running on Chrome can leak information within Chrome's memory space to a remote attacker. This could include sensitive information such as authentication tokens, private keys, the content of Chrome's password manager, etc.

        • Leaking a user space program's own information can be a serious risk, especially if that program can also execute arbitrary code.

          Yes, but although the fact that checks (e.g.: array limit checks done in software) don't work perfectly is a problem per se, the fact that YOU ARE RUNNING ARBITRARY CODE in the first place is the main problem here.

          In other words, using rust is a nice thing, but it doesn't stop you from writing stupid code in the first place.
          (to play with all the usual "rust-troll" that come screaming for out-of-bound checks whenever there's memory overflow exploit mentioned)

          A web browser is an example of such code. They have done a proof-of-concept where Javascript running on Chrome can leak information within Chrome's memory space to a remote attacker. This could include sensitive information such as authentication tokens, private keys, the content of Chrome's password manager, etc.

          This is the main reasons why there's been efforts

  • The vulnerabilities discovered in the Intel CPUs will never be exploited, as the Intel Management Engine already provides all the necessary backdoors.

  • Intel have been very slow in producing new CPUs the past months. This issue (they've known for a year) is likely related to the decreased production.
  • by Master Of Ninja ( 521917 ) on Thursday January 04, 2018 @04:59AM (#55861173)
    There's a lot of interesting language being used here, and if everyone is so coy it just strikes me that this is a serious thing. Couple of observations:

    (1) There seem to be two [theregister.co.uk] separate [theregister.co.uk] exploits [theregister.co.uk], which you need to dig into the reporting to work out. The Register's coverage is quite good and explains it all. "MELTDOWN" seems to be the more problematic one, and affects Intel and ARM chips. "SPECTRE" seems less problematic and affects AMD chips as well.

    (2) AMD affected or not? Google says yeah, AMD says nay. However the wording from the LKML list is that "AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against" [lkml.org]. I think this references that the kernel patch is targeted against MELTDOWN, which does not affect AMD chips (see point 1)

    (3) Although everyone's kicking Intel down, the main problem is that no-one can really trust each other now. I know there is a claim of "defective by design", but a lot of things can be described that way if they aren't used in their intended manner. In a "sane" world there would be no malicious actors trying to exploit what seems like quite a clever trick relying on timings (not a chip designer/expert). I've read that a lot of the web's issues came about because, when it was designed, everyone on the internet trusted each other, so security against bad apples wasn't designed in. As things have been commercialised you can see the effects, to the point that the only sane way to browse is with ad blockers and NoScript.

    My thoughts on people suing Intel are a bit conflicted. Probably based on US law they would lose, but my analogy is like blaming (insert car manufacturer here) for selling you a car which crashes only when someone throws stones at it. We need stronger laws and protections against the rise in hostile actors.

    (4) It's interesting that the Google blog post couldn't wait for the embargo-ed deadline of 9th January. They and their customers must have been getting really spooked. I suspect that this was being worked on and known by multiple parties, and a bit of coordination would have been good rather than panic.

    (5) It'll be interesting to see what happens with regards to performance - from my understanding the SPECTRE variants just need code recompilation. Most home workloads should not be affected by the two exploits; however, I think if you are I/O heavy then it may be an issue.

    Interesting times indeed.

    • (1) There seem to be two separate exploits, which you need to dig into the reporting to work out.

      There are at least three separate exploits so far, so you didn't actually dig.

      AMD affected or not? Google says yeah, AMD says nay.

      AMD is affected, at least by 2/3 exploits, but mitigation will be cheap because of architectural differences.

      Although everyone's kicking Intel down, the main problem is that no-one can really trust each other now.

      AMD seems trustworthy.

      I know there is a claim of "defective by design", but a lot of things can be described that way if they aren't used in their intended manner.

      Intel is bad at branch prediction. Remember the P4? That's why that thing sucked. They're still bad at it, so they took shortcuts, which turned out to come back to bite them — and their customers. And here I am just using AMD processors, but that's none of my business.

      In a "sane" world there would be no malicious actors trying to exploit what seems like quite a clever trick relying on timings (not a chip designer/expert).

      In that case, we don't live

      • by west ( 39918 )

        And based on UK law they will probably win, because they actually have a standard of fitness for purpose. If you bought it on the premise that you would have a certain level of performance and now you won't, you should be able to return it.

        Care to lay odds on Intel actually being ruled against? 2:1? 3:1?

        If it weren't for the fact that guilt is immaterial compared to PR so this will never get ruled upon, and Intel will settle for something that enriches a few lawyers and gives owners a $0.50 discount on th

        • Care to lay odds on Intel actually being ruled against? 2:1? 3:1?

          No, but I still think they might actually be punished overseas. They are seen as a US company first and an Israeli company second, and everything else somewhere much further down the line. No one outside of one of those two nations will hesitate to insist that Intel actually make good without big, big bags of money.

  • by enriquevagu ( 1026480 ) on Thursday January 04, 2018 @05:29AM (#55861259)

    There are three different attacks in the blog post [blogspot.com.es] by Google's Security team. The first one, for example, works as follows: it loads from a kernel memory address; this will generate an exception, but before the exception is generated (because the page permission check is delayed to improve performance) the subsequent instructions are executed speculatively. None of the following instructions will ever commit, but they can have a noticeable impact on the processor state, as follows: they speculatively execute a load whose address depends on the contents of the location loaded from kernel space. The load is issued (but not committed), which caches a given memory location. The specific location is based on one bit of the value loaded from kernel memory.

    When the first load is detected to be illegal, the instructions in the pipeline are flushed, but (the following is the critical part) the cached address remains in L1. By timing a memory access to the corresponding address, they can infer one bit of the given kernel memory. By repeating this, they can subsequently infer the whole word, one bit at a time.
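
    The bit-at-a-time recovery described above can be walked through in a toy model. This is a minimal sketch, assuming a made-up `ToyCPU` class in which the permission fault is raised only after a dependent load has already touched the cache; every name and address is hypothetical, and no real processor is this simple.

```python
class ToyCPU:
    """Toy model: the permission check lands only after a dependent load
    has already changed the cache. No real hardware behaves this simply."""

    def __init__(self, kernel_memory):
        self._kernel = kernel_memory   # address -> integer; off limits to user code
        self.cache = set()             # cached user-space probe addresses

    def flush(self, addr):
        self.cache.discard(addr)

    def is_cached(self, addr):
        # Stands in for "time the access and compare against a threshold".
        return addr in self.cache

    def illegal_load_then_probe(self, kernel_addr, bit, probe_addr):
        # Transient part: runs before the fault is raised.
        value = self._kernel[kernel_addr]
        if (value >> bit) & 1:
            self.cache.add(probe_addr)  # the cache fill survives the rollback
        # Architectural part: the load faults and `value` is never returned.
        raise PermissionError("user-mode read of kernel address")


def leak_word(cpu, kernel_addr, width=16, probe_addr=0x5000):
    word = 0
    for bit in range(width):
        cpu.flush(probe_addr)
        try:
            cpu.illegal_load_then_probe(kernel_addr, bit, probe_addr)
        except PermissionError:
            pass                        # the attacker expects and swallows the fault
        if cpu.is_cached(probe_addr):
            word |= 1 << bit
    return word


cpu = ToyCPU({0xFFFF0000: 0xBEEF})
print(hex(leak_word(cpu, 0xFFFF0000)))  # 0xbeef
```

    Every read in this model faults, exactly as memory protection demands, yet the word is still reconstructed from which probe accesses came back fast.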

    How can they solve this issue? I can only foresee two alternatives:
    - Perform permission checks earlier in the pipeline, but this requires modifying the processor microarchitecture. AMD cores are not affected by this attack, so their uarch probably checks permissions before issuing the load.
    - Completely or partially flush the contents of the cache after a processor mis-speculation. This is probably the solution being implemented in the patches being developed.

    Note that mis-speculations are VERY frequent, since most of the execution of Out-of-Order processors is speculative to improve performance. This explains the VERY significant performance penalties caused by the patches.

    • by tlhIngan ( 30335 )

      Note that mis-speculations are VERY frequent, since most of the execution of Out-of-Order processors is speculative to improve performance. This explains the VERY significant performance penalties caused by the patches.

      No, they're caused because of the page table isolation. Resetting the page tables is a very expensive operation, which is why most OSes mapped kernel memory into the process space. This includes having to flush the page table caches (known as the Translation Lookaside Buffer, or TLB) far mor

    • by Junta ( 36770 )

      As someone said more thoroughly, the KPTI patch doesn't make the behavior impossible; it just takes the teeth out of it by unmapping all the juicy things an attacker would want.

      Note that partially flushing the cache after a misprediction would certainly mitigate this, but there would still be a window between the memory being cached speculatively and the fault being detected. You would instead have to maintain a mask of cached-but-not-yet-valid-for-issue cache memory, and ignore it in specific scenarios until everyth

    • by swilver ( 617741 )

      Perhaps this entire class of problem can be solved by not providing such accurate timers in the CPU, or by randomizing them somehow, making them useless for measuring these kinds of tiny variations but still useful for measuring things on the order of microseconds.
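
The coarse-timer idea can be tried out on paper with a small model. This is a sketch only, assuming a timer that rounds down to a fixed resolution and made-up hit and miss latencies; an attacker may be able to recover the difference by repeating measurements or building their own counter, so treat it as an illustration of the intent rather than proof that the approach is sufficient.

```python
def read_timer(true_time_ns, resolution_ns):
    """Return the timestamp a timer with the given granularity would report."""
    return (true_time_ns // resolution_ns) * resolution_ns


def measure(start_ns, duration_ns, resolution_ns):
    """Elapsed time as seen through the coarse timer."""
    end_ns = start_ns + duration_ns
    return read_timer(end_ns, resolution_ns) - read_timer(start_ns, resolution_ns)


HIT_NS, MISS_NS = 40, 200   # made-up latencies for a cache hit and a miss

for resolution in (1, 1000):
    hit = measure(5_000, HIT_NS, resolution)
    miss = measure(5_000, MISS_NS, resolution)
    print(f"resolution {resolution} ns: hit={hit} miss={miss}")
# resolution 1 ns: hit=40 miss=200
# resolution 1000 ns: hit=0 miss=0
```

With a 1 ns timer the hit and the miss are easy to tell apart; with a 1 microsecond timer both readings collapse to the same value in this example, while anything lasting several microseconds is still measurable.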

  • They say [amd.com]

    Branch Target Injection: Differences in AMD architecture mean there is a near zero risk of exploitation of this variant. Vulnerability to Variant 2 has not been demonstrated on AMD processors to date.

    Near zero implies that it is possible! What are the differences, and why do they make it unlikely? Could enhancements to the attack make it feasible?

    • by AHuxley ( 892839 )
      "“Meltdown” and “Spectre”: Every modern processor has unfixable security flaws" (1/4/2018)
      https://arstechnica.com/gadget... [arstechnica.com]
      has the Spectre news re "with proof-of-concept attacks being successful on AMD, ARM, and Intel systems"
  • Intel stock sold (Score:4, Informative)

    by sad_ ( 7868 ) on Thursday January 04, 2018 @06:49AM (#55861459) Homepage

    It has also come to light that Intel CEO sold $24M in stock when he was aware of the issue.

    http://www.businessinsider.com... [businessinsider.com]

  • by PPH ( 736903 ) on Thursday January 04, 2018 @10:14AM (#55862251)

    So, five days ago?

    • by HiThere ( 15173 )

      Reports say the info was sent to Intel, AMD, a few others (not all named) last June. So 6 months. Additional info was sent later, but the report didn't say what additional info, or when.

  • That's 5 days ago.
