Intel Technology

Researchers Invent a Way to Speed Intel's 3D XPoint Computer Memory (ieee.org) 43

Memory modules using Intel's 3D XPoint technology are on their way, and researchers in North Carolina have already figured out how to make them better. New submitter mnemotronic writes: At the 45th ICSA (International Symposium on Computer Architecture), a group of researchers from North Carolina State University led by Prof. Yan Solihin proposed a method called lazy consistency to speed up write operations to 3D XPoint memory. XPoint, developed by Intel and Micron, is non-volatile, cheaper and denser than DRAM but requires more power and writing takes longer. The method proposed reduces write overhead times from 9% to 1% by incorporating a checksum to the cache memory system. The researchers were not able to verify their approach on actual XPoint memory, as those products only recently started sampling. They tested using simulations and DRAM and plan to verify when Intel's modules become more widely available.

  • It's ISCA.

  • 3D XPoint memory is in between NVRAM [anandtech.com], which is RAM backed by supercapacitors, running some kind of machine/rack-level UPS to ensure RAM is saved to "regular" flash drives or just persisting against NVMe drives before declaring the transaction complete. So there are faster and more expensive options and slower and less expensive options and it also depends on how many components you want involved. But that's always a discussion, if a disgruntled data center worker takes a sledgehammer to your machin

  • Won't Work (Score:5, Interesting)

    by sexconker ( 1179573 ) on Wednesday June 20, 2018 @11:29AM (#56816724)

    The delay is because XPoint doesn't work. The writes usually take, but sometimes they don't. Intel hasn't figured out why.

    The current practice is to verify all the writes and simply redo them if they don't take.
    This means you're tying up the bus, and this is why Intel now recommends dedicating entire memory channels to XPoint instead of mixing and matching with DRAM. If you have XPoint in all of your channels, your latencies go through the roof and your performance tanks.

    Wait for generation 3 before considering XPoint NVDIMMs.
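
The verify-and-redo practice described in this comment can be sketched as a simple loop. This is only an illustration of the commenter's claim, not Intel's actual mechanism: the dictionary standing in for memory, the helper `make_flaky_write`, the function `write_with_verify`, and the `max_attempts` limit are all hypothetical names chosen for the example.

```python
from itertools import chain, repeat


def make_flaky_write(drops):
    """Build a raw write function that silently drops some writes.

    `drops` is an iterable of booleans (hypothetical test input): True means
    the corresponding write is lost, False means it lands. Once the scripted
    outcomes run out, every later write succeeds.
    """
    outcomes = chain(drops, repeat(False))

    def raw_write(memory, addr, value):
        if not next(outcomes):
            memory[addr] = value

    return raw_write


def write_with_verify(memory, addr, value, raw_write, max_attempts=8):
    """Write, read back, and redo the write until the stored value matches.

    Returns the number of attempts used. Every extra attempt is another
    round trip on the bus, which is the cost the comment complains about.
    """
    for attempt in range(1, max_attempts + 1):
        raw_write(memory, addr, value)
        if memory.get(addr) == value:  # read-back verification
            return attempt
    raise RuntimeError(f"write to {addr:#x} failed after {max_attempts} attempts")


memory = {}
# Assumption for the demo: the first two writes are lost, the third lands.
attempts = write_with_verify(memory, 0x10, 42, make_flaky_write([True, True]))
print(attempts, memory[0x10])
```

In this toy model a single logical write cost three write/read-back round trips, which shows why verification traffic would compete with DRAM traffic on a shared channel if the comment's description is accurate.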

    • by swb ( 14022 )

      Why would you mix and match DRAM and Xpoint on the same bus anyway when Xpoint is so much slower than DRAM? Even without extra verification and writes it's still much slower than DRAM and would seem to clog the channel.

      I'll admit that maybe I don't know something about existing DRAM access paths/channels/buses, AFAIK the NUMA node was basically the smallest subdivision. Or is it possible to address individual DRAM modules/pages in parallel with others on the same NUMA node?

      I guess I had figured that DRAM i

      • Comment removed based on user account deletion
        • No. Existing NVMe SSDs already use DRAM for the cache, which is faster and more reliable than Xpoint.
          And you're bottlenecked by NVMe performance and latency.

          Xpoint was supposed to replace both DRAM and storage. Now it's a very expensive replacement for storage that's better in one metric (MAYBE two) and a very poor replacement for DRAM that's worse in every metric except cost. And until Intel actually starts shipping these things for sale to end users, that cost benefit is up in the air.

          I for one expect

          • by torkus ( 1133985 )

            You're missing the situations where you need to handle extremely large data sets.

            It's not common, but neither is the need for a Xeon 8180 yet they certainly exist (well, for the price of a cheap car).

            3d-xpoint RAM isn't going to outright replace DRAM any time soon, but it certainly can supplement it.

      • by Junta ( 36770 )

        Note you can make numa domains smaller. It is quite plausible that 3d xpoint in 'memory' mode appears as a different NUMA domain with SLIT indicating higher penalty for use. NUMA is relatively expressive and can describe the high level divisions and implications of this sort of approach.

        A challenge for Intel is justifying sitting in a rather inconvenient DIMM slot. It may be able to deliver better performance than PCIe, but even PCIe attached NVME has had a very protracted adoption cycle as the r

        • by swb ( 14022 )

          I kind of figured XPoint DIMM would wind up in alternative form factors oriented around hyperconverged architectures, something (much) smaller than blade systems that would allow many (6+) nodes in a 2-4U space without losing storage density in the process. Blown XPoint DIMMs would just be replaced by pulling the whole blade and swapping out DIMMs. By having many nodes you don't worry about losing compute capacity or redundancy.

          But I can never make hyperconverged work from a cost basis -- too many nodes r

          • by Junta ( 36770 )

            The problem is that xpoint dimms even as advertised capacity wise will lose out to U.2 drives for storage per unit volume.

            Never mind the probably-not-going-to-succeed-either ruler form factor which tries to get more SSD storage density by elongating the drives for optimizing SSD storage per unit volume without incurring crazy number of connectors.

            Hyperconverged in practice loves 2U servers with lots of drives. A lot of people love trying to make the argument for storage rich blade-dense form factors, but i

      • Why would you mix and match DRAM and Xpoint on the same bus anyway when Xpoint is so much slower than DRAM?

        Because Intel originally said you should do this and that it would be awesome and that the memory controller knew how to make it all work out so you get DRAM speed for DRAM stuff and multi-channel XPoint speed for NV stuff.

        Nope.

    • Citation needed? I've not found anything that says 3D XPoint doesn't work as you describe.
      • https://semiaccurate.com/2018/... [semiaccurate.com]

        Charlie has been following and detailing Xpoint and its failures for a while now. He's got a half dozen articles or so with more specifics, including official marketing BS from Intel and how it's changed over time. I haven't seen Charlie sink his teeth this deep into a story since bumpgate. If you remember that one, he was basically the one guy that bothered to do the legwork to prove a large number of Nvidia's GPUs were defective. This resulted in lawsuits against Nvidi

  • The summary includes NO link to any cited article, just links for defining the conference name, school, the professor, etc.

    • Re:No cited article (Score:4, Informative)

      by Solandri ( 704621 ) on Wednesday June 20, 2018 @02:15PM (#56817790)
      Ever since the Slashdot redesign a few years back, the main link to TFA appears to the right of the title, where it's easy to miss and not at all obvious that it's a link. But I imagine they paid some designer handsomely to make the site less usable.
    • The summary includes NO link to any cited article, just links for defining the conference name, school, the professor, etc.

      Also, to add insult to injury, TFS includes this nugget:

      The method proposed reduces write overhead times from 9% to 1% by incorporating a checksum to the cache memory system.

      That makes 0 Ohms sense.

      Clue for submitter & editor: the SI unit for time is the second, last time I looked.

      • by torkus ( 1133985 )

        Also, to add insult to injury, TFS includes this nugget:

        The method proposed reduces write overhead times from 9% to 1% by incorporating a checksum to the cache memory system.

        That makes 0 Ohms sense.

        Clue for submitter & editor: the SI unit for time is the second, last time I looked.

        This makes perfect sense. Write overhead time as a function of the overall write operation is absolutely the correct measurement in this context.

        It communicates three critical information points (the previous overhead, reduction, and new overhead) simply and directly without extraneous information like the write operation overall time which is tangential to the article. If I said they reduced write overhead from 9ms to 1ms you'd have no clue what that meant in relation to the overall write operation. I'd
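
The point about relative versus absolute overhead can be checked with a little arithmetic. This is a rough sketch under one stated assumption: the 9% and 1% figures are taken as overhead relative to the baseline write time without the consistency mechanism. The summary does not say which denominator the researchers used, and the function names and the 10 ms / 1000 ms base times are made up for the example.

```python
def overhead_fraction(base_time, overhead_time):
    """Overhead expressed as a fraction of the baseline write time."""
    return overhead_time / base_time


def total_time_speedup(old_overhead, new_overhead):
    """Ratio of old total write time to new total write time.

    Both overheads are fractions of the same baseline (an assumption),
    so the baseline cancels out of the ratio.
    """
    return (1 + old_overhead) / (1 + new_overhead)


# The same absolute saving (9 ms down to 1 ms) means very different things
# depending on the size of the underlying write, here 10 ms versus 1000 ms.
small_before = overhead_fraction(10, 9)
small_after = overhead_fraction(10, 1)
large_before = overhead_fraction(1000, 9)
large_after = overhead_fraction(1000, 1)
print(small_before, small_after, large_before, large_after)

# With percentages the comparison needs no extra context.
ratio = total_time_speedup(0.09, 0.01)
print(round(ratio, 3))
```

Under that assumption, going from 9% to 1% overhead makes writes roughly 8% faster overall, whatever the absolute write time is. Quoting only milliseconds would hide that.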
