Data Storage Math Technology Hardware

Taking a Hard Look At SSD Write Endurance

New submitter jyujin writes "Ever wonder how long your SSD will last? It's funny how bad people are at estimating just how long '100,000 writes' are going to take when spread over a device that spans several thousand of those blocks over several gigabytes of memory. It obviously gets far worse with newer flash memory that is able to withstand a whopping million writes per cell. So yeah, let's crunch some numbers and fix that misconception. Spoiler: even at the maximum SATA 3.0 link speeds, you'd still find yourself waiting several months or even years for that SSD to start dying on you."
This discussion has been archived. No new comments can be posted.

  • Holy idiocy batman (Score:4, Insightful)

    by Anonymous Coward on Tuesday February 19, 2013 @08:46AM (#42943825)

    100000 writes? 1M writes?

    What the fuck is this submitter smoking?

    Newer NAND flash can sustain maybe 3000 writes per cell, and if it's TLC NAND, maybe 500 to 1000 writes.

    • by Anonymous Coward on Tuesday February 19, 2013 @08:50AM (#42943847)
      • SLC NAND flash is typically rated at about 100k cycles (Samsung OneNAND KFW4G16Q2M)
      • MLC NAND flash used to be rated at about 5k – 10k cycles (Samsung K9G8G08U0M) but is now typically 1k – 3k cycles
      • TLC NAND flash is typically rated at about 1k cycles (Samsung 840)
      • by craznar ( 710808 ) on Tuesday February 19, 2013 @09:02AM (#42943949) Homepage

        Obviously the TLC NAND is named for the Tender Loving Care you need to give it during use.

        I think the Slack Lazy Careless stuff is more robust.

        • Obviously the TLC NAND is named for the Tender Loving Care you need to give it during use.

          I think the Slack Lazy Careless stuff is more robust.

          I will just stick with the My Little Crony NAND for now.

      • by afidel ( 530433 )

        True, those are typical values for value oriented parts, there's also high endurance SLC at ~1M cycles and eMLC at ~30k cycles, the downside is a much higher $/GB so it only makes sense to use them in environments where you know you'll have long periods of high write intensity (like write cache for a SAN or ZIL for a ZFS volume).

    • by CajunArson ( 465943 ) on Tuesday February 19, 2013 @09:04AM (#42943967) Journal

      The AC is dead-on right. At 25nm the endurance for high-quality MLC cells is about 3,000 writes. That's a relatively conservative estimate so you are pretty much guaranteed to get the 3K writes and likely somewhat more, but it's a far far cry from the 100K writes you can get from the highly expensive SLC chips. Intel & Micron claimed that one of the big "improvements" in the 20nm process was hi-K gates that are claimed to maintain the 3K write endurance at 20nm, which otherwise would have dropped even more from the 25nm node.

      The author of the article went to all the time & trouble to do his mathematical analysis without spending 10 minutes to find out the publicly available information about how real NAND in the real world actually performs....

      • Exactly, I was going to talk about how the 10K number was from years ago at a much larger process size.

        The thing to keep in mind is that as write endurance decreases due to process shrinks, the capacity for the given area of chip increases so the overall write endurance (as measured in block erases per squared area) remains about the same.

        An average capacity SSD can be written to at maximum erase rates for weeks without wearing it out.
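
A rough sanity check of the "weeks" claim above, as a minimal sketch. The 256 GB capacity is a hypothetical figure, the 600 MB/s rate approximates the SATA 3.0 payload ceiling (6 Gbit/s with 8b/10b encoding), and ideal wear leveling with no write amplification is assumed:

```python
def days_to_wear_out(capacity_bytes, pe_cycles, write_bytes_per_s):
    """Days of continuous writing until every cell has used its rated cycles.

    Assumes ideal wear leveling and a write amplification of 1.
    """
    total_writable = capacity_bytes * pe_cycles
    return total_writable / write_bytes_per_s / 86_400

CAPACITY = 256e9      # hypothetical 256 GB drive
SATA3_RATE = 600e6    # ~600 MB/s, approximate SATA 3.0 payload limit

mlc_days = days_to_wear_out(CAPACITY, 3_000, SATA3_RATE)
slc_days = days_to_wear_out(CAPACITY, 100_000, SATA3_RATE)
print(f"MLC, 3k cycles:   {mlc_days:.1f} days")
print(f"SLC, 100k cycles: {slc_days:.1f} days")
```

Under these assumptions the 3k-cycle case comes out at roughly two weeks of nonstop writing at link speed, and the 100k-cycle case at well over a year. Real drives rarely sustain that rate, so this is a lower bound rather than a prediction.
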
    • Re: (Score:2, Interesting)

      by Anonymous Coward

      A quick glance at wikipedia tells me that you're being rather pessimistic...

      "Most commercially available flash products are guaranteed to withstand around 100,000 P/E cycles before the wear begins to deteriorate the integrity of the storage. Micron Technology and Sun Microsystems announced an SLC NAND flash memory chip rated for 1,000,000 P/E cycles on 17 December 2008."

      http://en.wikipedia.org/wiki/Flash_memory#Memory_wear [wikipedia.org]

    • On the other hand, while you're right that they're an order of magnitude and a half out with that, they're also deliberately 3-4 orders of magnitude or more out with the rate at which you write data, so in reality, the likelihood is actually lifespans much longer than those listed in the article.

    • by tlhIngan ( 30335 ) <slashdot&worf,net> on Tuesday February 19, 2013 @11:46AM (#42945747)

      100000 writes? 1M writes?

      What the fuck is this submitter smoking?

      Newer NAND flash can sustain maybe 3000 writes per cell, and if it's TLC NAND, maybe 500 to 1000 writes.

      Actually, NAND flash doesn't "die" when you try to do the N+1 erase-write cycle (it's cycles, not writes. A cycle consists of flipping bits from 1 to 0 (aka write), and then from 0 to 1 (aka erase)). In practically all controllers, you do partial writes. With SLC NAND, it's fairly easy - you can write a page at a time, or even half pages. MLC lets you do page at a time as well - given typical MLC "big block" NAND of 32 4k pages, a block can be written 32 times before it's erased (once per page - you cannot do less than a page at a time).
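
The page/block distinction described above can be sketched with a toy model. This is only an illustration: the class name and its methods are hypothetical, and real NAND has further constraints (for example, pages usually have to be programmed in order) that are ignored here:

```python
class NandBlock:
    """Toy model of one erase block: pages are programmed once, erased together."""

    def __init__(self, pages_per_block=32):
        self.programmed = [False] * pages_per_block
        self.erase_count = 0

    def program(self, page):
        # A programmed page cannot be rewritten until the whole block is erased.
        if self.programmed[page]:
            raise ValueError(f"page {page} already programmed; erase block first")
        self.programmed[page] = True

    def erase(self):
        # Erase resets every page and consumes one program/erase cycle.
        self.programmed = [False] * len(self.programmed)
        self.erase_count += 1

block = NandBlock()
for page in range(32):
    block.program(page)      # 32 page writes, no erase needed yet
writes_before_erase = sum(block.programmed)
block.erase()                # only now is one P/E cycle used
print(writes_before_erase, block.erase_count)
```

The point of the model is that the rated cycle count applies to block erases, not to individual page writes.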

      And... other dirty little secret - the quoted cycle life is guaranteed. It means your part will be able to be written and erased 3000 times. Most typically, they're an order of magnitude more conservative - so a 3000 cycle flash can really get you 30,000 with proper care and tolerance.

      Of course, a really big problem with cheap SSDs is lame firmware, because what you need is a good flash translation layer (FTL) which does wear levelling, sector translations, etc. These things are VERY proprietary and HEAVILY patented. A dirt cheap crappy controller you might find on low end thumbdrives and memory cards may not even DO translation or wear levelling. The other problem is the flash translation table must be stored somewhere so the device can find your data (because of wear levelling, where your data is actually stored and where your PC thinks it is stored are different - again, the FTL handles this). For some things, it's possible to just scan the entire array and generate the table live, but generally it's impractical at the large scale because it requires time to perform the scan. So usually the table is stored in flash as well, which of course is not protected by the FTL. Depending on how things go, this part could corrupt itself easily, leading to an unmountable device or, basically, a dead SSD.

      For some REAL analysis, some brave souls have been stressing cheap SSDs to their limits until failure - http://www.xtremesystems.org/forums/showthread.php?271063-SSD-Write-Endurance-25nm-Vs-34nm [xtremesystems.org]

      Some of those SSDs are actually still going strong.

      The best bet is to buy from people who know what they're doing - the likes of Samsung (VERY popular with the OEM crowd - Dell, Lenovo, Apple, etc.), Toshiba, and Intel - who all make NAND memory and thus actually do have experience on how to best balance speed and reliability. Everyone else is just using the datasheet and just assembling them together like they would any other PC part.

  • 100,000? (Score:5, Informative)

    by rgbrenner ( 317308 ) on Tuesday February 19, 2013 @08:50AM (#42943849)

    100,000 is only for SLC NAND. MLC, what is currently in most SSDs, is only 3,000, and TLC (found in usb drives, samsung 840, and probably more SSDs soon because it's cheaper) is only 1,000.

    Is 1,000 fine for most people, yes.. but you should be aware of it. I have a fileserver that writes 200gb per day.. which would kill a Samsung 840 in about 6-7 months.
    http://www.anandtech.com/show/6459/samsung-ssd-840-testing-the-endurance-of-tlc-nand [anandtech.com]
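
One set of assumptions that lands in the 6-7 month range quoted above, as a hedged sketch. The comment does not state the drive capacity or the write amplification, so the 120 GB capacity and the factor of 3 below are illustrative guesses, not figures from the poster or from the linked article:

```python
def wearout_days(capacity_gb, pe_cycles, host_gb_per_day, write_amplification):
    # Flash actually written per day is host writes times write amplification.
    nand_gb_per_day = host_gb_per_day * write_amplification
    return capacity_gb * pe_cycles / nand_gb_per_day

days = wearout_days(capacity_gb=120, pe_cycles=1_000,
                    host_gb_per_day=200, write_amplification=3)
print(f"{days:.0f} days, about {days / 30.4:.1f} months")
```

With a write amplification of 1 the same drive would last three times as long, which shows how sensitive such estimates are to the workload.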

    • Which technology is Amazon using for their AWS instances? Their instance description page (http://aws.amazon.com/ec2/instance-types/) doesn't say one way or the other.

      • Re:100,000? (AWS?) (Score:4, Informative)

        by rgbrenner ( 317308 ) on Tuesday February 19, 2013 @09:11AM (#42944055)

        Almost certainly MLC. SLC is really only found in industrial SSDs these days. Enterprise and consumer SSDs are all MLC, with the exception of Samsung 840, the first SSD to use TLC.

        • by afidel ( 530433 )

          There are plenty of enterprise SSD's that use SLC, both FusionIO and STEC offer SLC options and since STEC has been replaced by Samsung in many applications I assume they do as well. I know HP also offers them as an option for their Proliant servers.

          • There are plenty of enterprise SSD's that use SLC, both FusionIO and STEC offer SLC options

            Yeah but who is using them?

            These aren't used for massive server farms because regular drive failures are inevitable regardless of what you use. SLC flash mainly sees industrial and embedded use where only small amounts of space actually get used. If you have enormous amounts of storage then it would be very foolish to use expensive SLCs.

          • Apparently our definitions of plenty are different. FusionIO has ONE product w/ SLC. STEC has a few, but they are marked as industrial SSDs, just as I said in my original post.. and they are priced like industrial SSDs too.. about $1140 for 100gb. Industrial SSDs are typically $5-10 per GB... so that's right in line.

    • by Luthair ( 847766 )
      The numbers are no doubt mean time before failure, so inevitably many drives will fail before this.
    • Heavy database caching kills MLC SSDs in a couple of months max. TLC won't last more than a few weeks.

    • Re:100,000? (Score:5, Interesting)

      by beelsebob ( 529313 ) on Tuesday February 19, 2013 @09:47AM (#42944401)

      Luckily, while he's about 30 times out for the write endurance on the bad side, he's about 100-1000 times out on the speed at which you're likely to ever write to the things, on the good side, so in reality, SSDs will last about 3-30 times longer than he's indicating in the article. The fact that he's discussing continuous writes at max sata 3 speed suggests that he's really concerned with big ass databases that are writing continuously, and use SLC NAND. The consumer case is in fact much better than that, even despite MLC/TLC.
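
The two error factors in the comment above combine by simple division, since lifetime scales with endurance and inversely with write rate. A minimal sketch using only the comment's own rough figures:

```python
endurance_overestimate = 30          # article's cycle count is ~30x too high
rate_overestimate = (100, 1000)      # article's write rate is ~100-1000x too high

# Lifetime scales with endurance and inversely with write rate.
low, high = (r / endurance_overestimate for r in rate_overestimate)
print(f"lifetime is roughly {low:.0f}x to {high:.0f}x the article's figure")
```

This agrees with the "3-30 times longer" estimate to within rounding.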

  • by Zorpheus ( 857617 ) on Tuesday February 19, 2013 @08:59AM (#42943905)
    But if your SSD is nearly full with data that you never change, wouldn't all the writing happen in the small area that is left? This would significantly reduce lifetime.
    • by Colonel Korn ( 1258968 ) on Tuesday February 19, 2013 @09:07AM (#42944021)

      But if your SSD is nearly full with data that you never change, wouldn't all the writing happen in the small area that is left? This would significantly reduce lifetime.

      I believe all the major brands actually move your data around periodically, which costs write cycles but is worth it to keep wear balanced.

      • by Luckyo ( 1726890 )

        Indeed. This is called "wear leveling". It is aimed at preventing a scenario where a chunk of data that is never moved or deleted takes up a lot of drive space, so that all wear focuses on a small area which is then worn out very quickly.

        • by afidel ( 530433 )

          There's also spare capacity, say you have 1TB of raw flash space, typically you'll only see 750GB of pre-formatted space available with the remaining 25% set aside for housekeeping tasks like wear leveling, sparing, and most importantly pre-erase which is the only way to get decent write performance on MLC.

    • Re: (Score:3, Informative)

      by Anonymous Coward

      Actually, they thought about that. Newer SSD drives have a special wear leveling algorithm: if it notices that you write some parts a lot while the remainder of the disk is static, it moves the static part to the used-up space and uses the underused (ex-static) part of the disk for data that changes a lot. More or less, you can expect every cell to be used an equal number of times, even if you write to just one 1MB file and the rest is static.

      • If you use TRIM, then your drive will know what parts of the disk are empty, and what parts are not. With wear leveling, the SSD will always write to free blocks with the most write cycles available first, and it will just remap blocks in whatever order it wants (blocks don't need to be in linear order like on HDDs). I think they start moving data around once the cells get to the end of their write cycles or it thinks drive is full (no TRIM or the drive is actually full).
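
The "write to the least-worn free block and remap" idea can be sketched as a toy flash translation layer. Everything here is hypothetical and heavily simplified: real controllers are proprietary, work on pages as well as blocks, and also relocate static data, which this sketch does not model:

```python
class TinyFTL:
    """Toy flash translation layer with erase-count-based block selection."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.free = set(range(num_blocks))
        self.mapping = {}            # logical block -> physical block

    def write(self, logical):
        # Pick the free physical block with the fewest erases so far.
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        old = self.mapping.get(logical)
        self.mapping[logical] = target
        if old is not None:
            self._reclaim(old)

    def trim(self, logical):
        # TRIM tells the drive the logical block no longer holds live data.
        old = self.mapping.pop(logical, None)
        if old is not None:
            self._reclaim(old)

    def _reclaim(self, physical):
        # Erasing the stale block costs one cycle and returns it to the free pool.
        self.erase_counts[physical] += 1
        self.free.add(physical)

ftl = TinyFTL(num_blocks=4)
for _ in range(12):
    ftl.write(logical=0)     # hammer a single logical block
print(ftl.erase_counts)
```

Even though only one logical block is ever written, the erases end up spread almost evenly across all four physical blocks.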

    • I could be wrong and it probably depends on the SSD itself. A lot of SSD's these days have a reserved area that's used when cells start to die (Which is why you'll see SSDs with say 120GB of storage instead of 128GB). They all attempt to evenly write over all of the cells as well, instead of just hammering a select few. Of course you're probably right about when the SSD itself is nearly full but as far as I'm aware, ultimately what starts to happen is either the space decreases slowly over time or the SSD j

      • SSDs with say 120GB of storage instead of 128GB

        And then they use drivermakers' gigabytes instead of regular ones, so people see a nice round number like 128 and assume they don't get cheated.

        Oh, sorry, they sponsored some commission to redefine pi^Hkilobyte, so when they get sued, they can claim they don't falsely advertise.

        • Appropriate username is appropriate.

        • by Jeremi ( 14640 )

          Oh, sorry, they sponsored some commission to redefine pi^Hkilobyte, so when they get sued, they can claim they don't falsely advertise.

          As annoying as it may be to admit it, the drive manufacturers have a point.

          The definitions of "kilo", "mega", "giga", etc, were defined quite explicitly back in the 18th century, to refer to powers of 10. Computer manufacturers later misused them to refer to powers of two, which was (and is) simply incorrect, no matter how comfortable computer people have become with it in the meantime. It's better to fix the problem now and have consistent, well-defined terms in the future than to live with ambiguity and

    • by higuita ( 129722 ) on Tuesday February 19, 2013 @09:13AM (#42944083) Homepage

      SSDs should be filled to at most 75% of their capacity... 50% or less is recommended

      some chips try to move blocks to rotate the writes, and have a lot of spare zones so they can remap/use other sectors on write... but that is a problem: working on a full SSD will shorten its life

  • by urbanriot ( 924981 ) on Tuesday February 19, 2013 @09:02AM (#42943951)
    Our company experienced what we believe was its first age-related failure in October of 2012, an office PC with an Intel SSD drive in the value oriented line of 2008 (which was still high at the time). Basically the drive behaved as a mechanical drive would behave with an occasional bad sector and we were able to successfully image the data to a new one. Out of 200 Intel drives, that's pretty good. (We did have one failure in 2010 but that was an outright dead drive and we were able to RMA it). Not sure if this contributes anything to the conversation but I figured I'd throw this out there.

    The Intel X25's in my PC, from 2009, are still humming along nicely and my last benchmark produced the same results in 2012 as they did in 2010. But I've gone so far as to set environment variables for user temp files to a mechanical drive, internet temp files to a RAM drive and system temp files to a RAM drive, offsetting the wear leveling.
    • by Luckyo ( 1726890 )

      Aye, intel drives are known for two things: their reliability and their high prices.

      If you tried budget vendors like OCZ, you'd likely have a very different story to tell us.

    • by AmiMoJo ( 196126 ) *

      I had an X25 from 2008 that died after about 8 months with exactly the same problem. I do some sqlite database stuff from time to time, but otherwise mostly just browsing and normal C development. The replacement lasted about the same time before running out of blocks.

      It seems that certain workloads accelerate the ageing process massively.

    • I've had no failures yet, but I did as the OP - I put all the high write frequency stuff someplace else, be it a ram drive or a spinner. I only use the (intel) SSDs for read-only or read-mostly stuff, and they seem fine being used for that - put the write-pounding stuff someplace else. Linux makes that fairly easy to do, though I have had issues where it boots so fast off the SSD that some of the places it wants to write - on a spinner - haven't spun up yet and it takes some interesting sysadmin work to
  • by Anonymous Coward on Tuesday February 19, 2013 @09:09AM (#42944045)

    meaningful life specs are tough to come by for flash. Yes, as noted above, SLC NAND has a rated life of 100k erases/page on the datasheet, but that's really a guaranteed spec under all rated conditions, so in reality, it lasts quite a bit longer. If you were to write the same page once a second, you'd use it up in a bit more than a day.

    However, in real life, the "failure" criterion is when a page written with a test pattern doesn't read back as "erased" in a single readback. Simple enough, except that flash has transient read errors: that is, you can read a page, get an error, read the exact same page again and not get the error. Eventually, it does return the same thing every time, but that's longer than the "first error".

    There's also a very strong non-linear temperature dependence on life. Both in terms of cycles and just in terms of remembering the contents. Get the package above 85C and it tends to lose its contents (I realize that the typical SSD won't be hot enough that the package gets to 85C, although, consider the SSD in a ToughBook in Iraq at 45C air temp..)

    In actual life, with actual flash devices on a breadboard in the lab at "room temperature", I've cycled SLC NAND for well over a million cycles (hit it 10-20 times a second for days) without failure. This sort of behavior makes it difficult to design meaningful wear leveling (for all I know, different pages age differently) and life specs, without going to a conservative 100k/page uniform standard, which, in practice, grossly understates the actual life.

    What you really need to do is buy a couple drives and beat the heck out of them with *realistic* usage patterns.

    • The temperature dependence is a very strong factor that does seem to be missing from the analysis- to add to what the AC parent said, my experience is that the minimum number of erase cycles is when the device is at maximum temperature, take it down to room temperature, and the typical number of erase cycles goes up by an order of magnitude. Most computers have an internal temperature of over 40C when run in a normal environment,

      Your drive will fail, SSD or HD. You must be prepared for that.

      • by Luckyo ( 1726890 )

        In all honesty, this is badly wrong. Laptops running under heavy load may clock these numbers on the hard drive temp (not ambient inside the case but the temperature sensor on the hard drive, which essentially all modern hard drives have). Hard drives generate significantly more heat than SSDs due to mechanical issues.

        I'm typing this on a machine that has 4x3.5" hard drives stacked on top of each other, and openhardwaremonitor pretty much instantly tells me which drives are on the top and bottom and which ar

    • Iraq can get hotter than that outdoors (record temp of almost 53C was set in Basra 3 years ago). But an enclosed space in any inhabited part of the world can get much hotter if exposed to sunlight. Here in Cleveland, on the Canadian border, we rarely see 40C outdoors, and get very little sunlight most days; yet the inside of a car may well be 15-20C hotter than the outside, any time of year, if it is directly in the sun. In most months this means hot enough to melt plastic, or to cause water to evaporate
  • by StoneyMahoney ( 1488261 ) on Tuesday February 19, 2013 @09:19AM (#42944131)

    Does anyone know whether the failure count for cells picks up along a nice smooth curve or is like running into a cliff? Intel seem to be suggesting in their spec sheets that the 20% over-provisioning on some of their SSDs (I'm assuming for bad-block remapping when failure is detected) can increase the expected write volume of a drive by substantial amounts:

    http://www.intel.co.uk/content/www/us/en/solid-state-drives/solid-state-drives-710-series.html [intel.co.uk]

    This seems to go against the anecdotal evidence of sudden total SSD failures being attributed to cell wear - something else must be failing in those, most likely the normal expected allotment of mis-manufactured units.

    • Re:Curve or Cliff? (Score:4, Informative)

      by Luckyo ( 1726890 ) on Tuesday February 19, 2013 @09:49AM (#42944445)

      Sudden failures are controller failures. Especially budget controllers tend to fail before flash does.

      Flash failure is "usually" about not being able to write to the disk, but being able to read from the disk. Problem is that when you're getting it, that means you've gone through all the reserve flash and controller no longer has any flash to assign to use from reserve. I.e. drive has been failing for a while.

      Modern wear leveling also means that failure would likely cascade very quickly.

  • Story should have been entitled "Taking a Solid Look At SSD Write Endurance".

    Badabing! I'll be here all week.

  • by AdmV0rl0n ( 98366 ) on Tuesday February 19, 2013 @09:51AM (#42944469) Homepage Journal

    SSD here has been rejected on multiple and continuous failure rates. Now it only gets given to end users who provide a 'light' write environment - and that's the only place where consumer level 25 and sub level nm write cycle gear can be used sanely (ie, without having a plan for swap out/replacement and higher costs).

    I'm expecting a fairly severe level of failure on new equipment shipping today that uses SSD as cache.

    I frankly love the speed. But the claims about how long an 'average' user would take to wear out these disks have not held up where I work: the failure rates have been abysmal. Admittedly, our users are mid to heavy use cases, but the failure rates have been high, and the lifetime shorter than anyone would contemplate.

    Either the cost of the drives has to fall (which to be fair - it has been), or the reliability question and write limits needs to change substantially.

    I no longer consider SSD for front line heavy use. And I'd need serious work to be convinced on contemplating it again with lower nm flash. And SLC level gear is simply beyond the cost level we can attain.

    • For clarity, info published.

      We purchased intel drives. At the time these were being held up as 'good' to choose from.
      So they went into Dell D630 units, and were handed out to Windows Devs and Windows SCADA engineers. Not one drive survived more than 12 months. These drives were purchased in the US and used in the UK, and Intel refused to fulfill the warranty. Frankly D630 workloads are poor compared to more modern machinery - it's quite a poor show that D630 users took drives out at all.

      We have moved through other

  • I'm on the heavier end of the normal computer user and I still have a Vertex 1 drive still alive and kicking.
  • by skywire ( 469351 ) * on Tuesday February 19, 2013 @11:01AM (#42945225)

    > you'd still find yourself waiting several months or even years for that SSD to start dying on you

    How comforting!

  • Slashdot comments are always about how SSDs are unreliable and fail constantly, but manufacturers can't seem to crank out SSD equipped devices fast enough. Macbook Airs and the entire Ultrabook category are based around SSD storage. These complaints about endurance never seem to line up with real world experiences. I've been through three SSDs in desktops (Vertex 1, Vertex 2 and Vertex 4 running on both Windows and Linux) along with an SSD equipped Macbook Air (from 2010) and I've yet to have a single is
  • by m.dillon ( 147925 ) on Tuesday February 19, 2013 @02:42PM (#42947431) Homepage

    So far I see a lot of complaints from people who don't appear to even know how to run SMART tools to get write cycle and wear statistics from their SSDs... you know, so real actual numbers can be posted.

    So far none of my SSDs have failed, and I have almost 20 installed in various places. The one with the most wear is one of the first SSDs I purchased, an Intel 40G device:

    da0: Fixed Direct Access SCSI-4 device
    da0: Serial Number CVGB951600AC040GGN
    da0: supports TRIM

    Power on hours - 19127
    Power cycle count - 48
    Unsafe shutdown count - 32
    Host writes x 32MiB - 375697
    Workld media wear - 5120
    Available reserved - 99/99/10
    Media wearout - 91%

    Basically 12TB worth of writes on this 40G drive over the last 2.18 years. No failures. Media wearout indicator 99 -> 91. Estimated durability based on the wear indicator is around 132TB. Roughly comes to ~3300 cycles/cell. This vintage of SSD uses MLC flash whose cells are roughly spec'd at ~10000 cycles.
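
The arithmetic behind those figures can be reproduced from the SMART values above. This is a sketch under stated assumptions: it extrapolates linearly from the wear indicator and assumes the indicator started at 100, so the results come out slightly higher than the rounded numbers in the post (which work from "12TB" and a starting value of 99):

```python
HOST_WRITE_UNITS = 375_697        # "Host writes x 32MiB" from the SMART output
POWER_ON_HOURS = 19_127
WEAR_START, WEAR_NOW = 100, 91    # assumes the indicator started at 100
DRIVE_BYTES = 40e9                # 40 GB drive

bytes_written = HOST_WRITE_UNITS * 32 * 2**20
years = POWER_ON_HOURS / (24 * 365.25)
wear_fraction = (WEAR_START - WEAR_NOW) / 100

# Linear extrapolation: total writes when the indicator reaches zero.
durability_bytes = bytes_written / wear_fraction
cycles_per_cell = durability_bytes / DRIVE_BYTES

print(f"written: {bytes_written / 1e12:.1f} TB over {years:.2f} years")
print(f"estimated durability: {durability_bytes / 1e12:.0f} TB")
print(f"about {cycles_per_cell:.0f} cycles per cell")
```

Either way the estimate lands in the 130-140 TB range, or roughly a third of the 400 TB theoretical maximum at 10,000 cycles per cell.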

    While firmware issues are well documented for various SSD vendors over the last few years, and cell erase cycle life has gone down as the chips have gotten more dense, I would still expect the vast majority of failures to be due to wear-out.

    Lots of things can cause premature wear-out but probably the most common would be using the SSD for something really stupid, like to host a database doing a lot of random writes or with a high frequency of fsync()s, using the SSD for swap on a system which is paging heavily 24x7, using the SSD for WWW log files on a busy web server, formatting an unaligned filesystem on the SSD or a filesystem which uses too-small a block size, and any number of other things.

    Venerable but still mostly correct:

    http://leaf.dragonflybsd.org/cgi/web-man?command=swapcache [dragonflybsd.org]

    The only adjustment I would make is that as the Intel 40G continues running, the wear I'm getting on it is pointing closer to ~130TB of durability and not 400TB (400TB is the theoretical max at 10,000 cycles/cell). Still reasonable. Generally speaking, that's the older 34nm technology. The newer 24nm technology will get fewer cycles but devices tend to have more storage so, as I say in the manual, you could expect similar total wear out of a newer 120GB 310 series SSD whose flash cells have 1/3 the cycle life.

    -Matt
