Taking a Hard Look At SSD Write Endurance
New submitter jyujin writes "Ever wonder how long your SSD will last? It's funny how bad people are at estimating just how long '100,000 writes' are going to take when spread over a device that spans several thousand of those blocks over several gigabytes of memory. It obviously gets far worse with newer flash memory that is able to withstand a whopping million writes per cell. So yeah, let's crunch some numbers and fix that misconception. Spoiler: even at the maximum SATA 3.0 link speeds, you'd still find yourself waiting several months or even years for that SSD to start dying on you."
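For a sense of scale, here is a minimal back-of-the-envelope sketch in Python. The capacity, cycle rating, and workload figures are illustrative round numbers (not taken from the article or any datasheet), and write amplification is ignored:

    # How long until the rated cycles are used up, first writing flat-out at
    # the SATA 3.0 line rate, then at a more ordinary desktop workload.
    capacity_gb = 256        # assumed drive size
    pe_cycles   = 3000       # typical consumer MLC rating (SLC is ~100k)
    total_tb    = capacity_gb * pe_cycles / 1000.0   # total NAND write budget, TB

    sata3_mb_s  = 600        # SATA 3.0 ceiling
    print("flat-out at SATA 3: %.0f days" % (total_tb * 1e6 / sata3_mb_s / 86400))

    desktop_gb_day = 10      # a fairly heavy desktop user
    print("at 10 GB/day: %.0f years" % (total_tb * 1000 / desktop_gb_day / 365))

Even with consumer-grade 3,000-cycle MLC rather than the 100K/1M figures in the summary, saturating the SATA link takes a couple of weeks to burn through the rated cycles, and a realistic desktop write rate stretches that into centuries; as the comments below point out, it is the summary's cycle counts that are off.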
Holy idiocy batman (Score:4, Insightful)
100000 writes? 1M writes?
What the fuck is this submitter smoking?
Newer NAND flash can sustain maybe 3000 writes per cell, and if it's TLC NAND, maybe 500 to 1000 writes.
Re:Holy idiocy batman (Score:5, Informative)
Re:Holy idiocy batman (Score:5, Funny)
Obviously the TLC NAND is named for the Tender Loving Care you need to give it during use.
I think the Slack Lazy Careless stuff is more robust.
Re: (Score:3)
Obviously the TLC NAND is named for the Tender Loving Care you need to give it during use.
I think the Slack Lazy Careless stuff is more robust.
I will just stick with the My Little Crony NAND for now.
Re: (Score:2)
True, those are typical values for value-oriented parts. There's also high-endurance SLC at ~1M cycles and eMLC at ~30k cycles; the downside is a much higher $/GB, so it only makes sense to use them in environments where you know you'll have long periods of high write intensity (like a write cache for a SAN or the ZIL for a ZFS volume).
Re:Holy idiocy batman (Score:5, Informative)
http://lmgtfy.com/?q=NAND+write+cycles# [lmgtfy.com]
Re:Holy idiocy batman (Score:5, Informative)
Re: (Score:2)
What you've written is great if you are worried about putting SSDs in your server farm. A decent RAID of 256GB SSDs should be safe for even the most writiest of server loads, for the expected lifetime of a server.
It would be great to see an article that includes a spectrum of SSD write endurances for consumer-use NAND (MLC and TLC), but also with office/consumer write expectations, e.g., 8 hours a day, average 10MB/s writes (i.e., perfect torrenting on a 100Mbps internet link). Hell, do 100MB/s - that woul
Re: (Score:3, Insightful)
He referenced specific models. A hyperlink is not the only way to refer to a source. You were given enough information to find the source easily.
Re:Holy idiocy batman (Score:5, Insightful)
The AC is dead-on right. At 25nm the endurance for high-quality MLC cells is about 3,000 writes. That's a relatively conservative estimate, so you are pretty much guaranteed to get the 3K writes and likely somewhat more, but it's a far, far cry from the 100K writes you can get from the highly expensive SLC chips. Intel & Micron claimed that one of the big "improvements" in the 20nm process was hi-K gates, which are supposed to maintain the 3K write endurance at 20nm; otherwise it would have dropped even further from the 25nm node.
The author of the article went to all the time & trouble of doing his mathematical analysis without spending 10 minutes to look up the publicly available information about how real NAND in the real world actually performs....
Re: (Score:2)
The thing to keep in mind is that as write endurance decreases due to process shrinks, the capacity for a given area of chip increases, so the overall write endurance (as measured in block erases per unit area) remains about the same.
An average capacity SSD can be written to at maximum erase rates for weeks without wearing them out.
Re: (Score:2, Interesting)
A quick glance at wikipedia tells me that you're being rather pessimistic...
"Most commercially available flash products are guaranteed to withstand around 100,000 P/E cycles before the wear begins to deteriorate the integrity of the storage. Micron Technology and Sun Microsystems announced an SLC NAND flash memory chip rated for 1,000,000 P/E cycles on 17 December 2008."
http://en.wikipedia.org/wiki/Flash_memory#Memory_wear [wikipedia.org]
Re: (Score:2)
Micron Technology and Sun Microsystems announced an SLC NAND flash memory chip rated for 1,000,000 P/E cycles on 17 December 2008."
Only if you're using SLC NAND, which is the fast, expensive, long lasting stuff. The other kinds (MLC/TLC) wear out much quicker.
Re:Holy idiocy batman (Score:5, Funny)
17 December 2008.
5 years? Might as well write a white paper on the benefits of drum memory over mercury delay lines.
Re: (Score:2)
On the other hand, while you're right that they're an order of magnitude and a half off with that, they're also deliberately 3-4 orders of magnitude or more off with the rate at which you write data, so in reality the likely lifespans are actually much longer than those listed in the article.
Re:Holy idiocy batman (Score:5, Interesting)
Actually, NAND flash doesn't "die" when you try to do the N+1 erase-write cycle (and it's cycles, not writes: a cycle consists of flipping bits from 1 to 0 (aka write), and then from 0 to 1 (aka erase)). In practically all controllers, you do partial writes. With SLC NAND, it's fairly easy - you can write a page at a time, or even half pages. MLC lets you do a page at a time as well - given typical MLC "big block" NAND with 32 4K pages per block, a block can be written 32 times before it's erased (once per page - you cannot do less than a page at a time).
And... another dirty little secret: the quoted cycle life is a guaranteed minimum. It means your part will survive being written and erased 3000 times. Manufacturers are typically an order of magnitude more conservative than reality - so a 3000-cycle flash can really get you 30,000 with proper care and tolerance.
Of course, a really big problem with cheap SSDs is lame firmware, because what you need is a good flash translation layer (FTL) which does wear levelling, sector translations, etc. These things are VERY proprietary and HEAVILY patented. A dirt cheap crappy controller, the kind you might find on low end thumbdrives and memory cards, may not even DO translation or wear levelling. The other problem is that the flash translation table must be stored somewhere so the device can find your data (because of wear levelling, where your data is actually stored differs from where your PC thinks it is - again, the FTL handles this). For some things it's possible to just scan the entire array and generate the table live, but generally that's impractical at large scale because of the time the scan takes. So usually the table is stored in flash as well, which of course is not itself protected by the FTL. Depending on how things go, this part can easily corrupt itself, leading to an unmountable device or, basically, a dead SSD.
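As a purely illustrative sketch of what an FTL does (a toy model in Python, not how any particular controller actually works):

    # Toy flash translation layer: the host sees stable logical page numbers,
    # while every update goes to a fresh physical page and a mapping table
    # records where the data really lives.
    class ToyFTL:
        PAGES_PER_BLOCK = 32                       # e.g. 32 x 4K pages per block

        def __init__(self, num_blocks):
            self.l2p = {}                          # logical page -> physical page
            self.contents = {}                     # physical page -> data
            self.erase_count = [0] * num_blocks    # what wear levelling evens out
            self.free = [(b, p) for b in range(num_blocks)
                                for p in range(self.PAGES_PER_BLOCK)]

        def write(self, lpage, data):
            # A NAND page can only be programmed once between erases, so an
            # update goes to a new physical page; the old copy becomes stale
            # and is reclaimed later by garbage collection plus a block erase
            # (which is when erase_count for that block goes up).
            ppage = self.free.pop(0)
            self.contents[ppage] = data
            self.l2p[lpage] = ppage

        def read(self, lpage):
            return self.contents[self.l2p[lpage]]

The l2p table is exactly the structure the parent is talking about: it has to be persisted on the flash itself, and if that copy is corrupted the data is still physically there but the drive can no longer say where anything is.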
For some REAL analysis, some brave souls have been stressing cheap SSDs to their limits until failure - http://www.xtremesystems.org/forums/showthread.php?271063-SSD-Write-Endurance-25nm-Vs-34nm [xtremesystems.org]
Some of those SSDs are actually still going strong.
The best bet is to buy from people who know what they're doing - the likes of Samsung (VERY popular with the OEM crowd - Dell, Lenovo, Apple, etc.), Toshiba, and Intel - who all make NAND memory and thus actually do have experience in how best to balance speed and reliability. Everyone else is just working from the datasheet and assembling the parts like they would any other PC component.
Re:Holy idiocy batman (Score:5, Informative)
Re:Holy idiocy batman (Score:5, Interesting)
RAM disks are cool and all, but except on live CDs they're usually unnecessary. The kernel's buffer cache and directory-name-lookup cache (in RAM) can often outperform RAM disks on second reads and writes.
(Claimer: I worked on file systems for HP-UX, and we measured this when we considered adding our internal experimental RAM FS to the production OS.)
Re: (Score:3)
100,000? (Score:5, Informative)
100,000 is only for SLC NAND. MLC, which is what's in most current SSDs, is only rated for about 3,000, and TLC (found in USB drives, the Samsung 840, and probably more SSDs soon because it's cheaper) is only 1,000.
Is 1,000 fine for most people? Yes, but you should be aware of it. I have a fileserver that writes 200GB per day, which would kill a Samsung 840 in about 6-7 months.
http://www.anandtech.com/show/6459/samsung-ssd-840-testing-the-endurance-of-tlc-nand [anandtech.com]
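That 6-7 month figure is easy to sanity-check. A minimal sketch, where the 1,000-cycle rating comes from the post above and the write-amplification factor is an assumption (it depends heavily on the workload):

    # Rough lifetime of a small TLC drive under a 200GB/day write load.
    capacity_gb = 120      # 120/128GB-class drive
    pe_cycles   = 1000     # TLC rating quoted above
    host_gb_day = 200      # the fileserver workload described
    write_amp   = 3.0      # assumed write amplification

    nand_budget_gb = capacity_gb * pe_cycles
    days = nand_budget_gb / (host_gb_day * write_amp)
    print("%.0f days (about %.1f months)" % (days, days / 30))

With big sequential writes and write amplification near 1, the same drive would last closer to a year and a half at that rate.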
Re:100,000? (AWS?) (Score:2)
Which technology is Amazon using for their AWS instances? Their instance description page (http://aws.amazon.com/ec2/instance-types/) doesn't say one way or the other.
Re:100,000? (AWS?) (Score:4, Informative)
Almost certainly MLC. SLC is really only found in industrial SSDs these days. Enterprise and consumer SSDs are all MLC, with the exception of Samsung 840, the first SSD to use TLC.
Re: (Score:2)
There are plenty of enterprise SSDs that use SLC; both FusionIO and STEC offer SLC options, and since STEC has been replaced by Samsung in many applications I assume they do as well. I know HP also offers them as an option for their ProLiant servers.
Re: (Score:2)
There are plenty of enterprise SSDs that use SLC; both FusionIO and STEC offer SLC options
Yeah but who is using them?
These aren't used for massive server farms because regular drive failures are inevitable regardless of what you use. SLC flash mainly sees industrial and embedded use, where small amounts of space actually get used. If you have enormous amounts of storage then it would be very foolish to use expensive SLC.
Re: (Score:2)
Apparently our definitions of plenty are different. FusionIO has ONE product with SLC. STEC has a few, but they are marketed as industrial SSDs, just as I said in my original post... and they are priced like industrial SSDs too: about $1140 for 100GB. Industrial SSDs are typically $5-10 per GB... so that's right in line.
Re: (Score:2)
Re: (Score:2)
Heavy database caching kills MLC SSDs in a couple of months, max. TLC won't last more than a few weeks.
Re:100,000? (Score:5, Interesting)
Luckily, while he's about 30 times off on the write endurance (on the bad side), he's about 100-1000 times off on the speed at which you're likely to ever write to the things (on the good side), so in reality SSDs will last about 3-30 times longer than he indicates in the article. The fact that he's discussing continuous writes at max SATA 3 speed suggests that he's really concerned with big-ass databases that are writing continuously and use SLC NAND. The consumer case is in fact much better than that, even despite MLC/TLC.
Re:100,000? (Score:5, Informative)
I own two 840s... they are fine. If you're really concerned, Samsung has a tool that will let you adjust the spare space, so you can take a 256GB drive, set aside 20GB to use for spares as cells wear out, and use 236GB for your data.
If you read the article I linked to, an 840 128GB drive will last for about 272TB in writes... or about 11.7 years at 10GB/day.
It's much more likely that another part will wear out before the cells do.
Re: (Score:2)
but nothing like the 1M talked about in the article.
You are right... My guess is that he mixed up the cell write endurance with the MTBF of new SSDs. The MTBF for a Crucial m4 is 1.2M hours, for example.
If that isn't it, then I have no idea where he got that number from.
If SSD is nearly full? (Score:3)
Re:If SSD is nearly full? (Score:5, Interesting)
But if your SSD is nearly full with data that you never change, wouldn't all the writing happen in the small area that is left? This would significantly reduce lifetime.
I believe all the major brands actually move your data around periodically, which costs write cycles but is worth it to keep wear balanced.
Re: (Score:2)
Indeed. This is called "wear leveling" and is aimed at preventing a scenario where a chunk of data that is never moved or deleted takes up a lot of drive space, focusing all the wear on a small area which is worn out very quickly.
Re: (Score:2)
There's also spare capacity: say you have 1TB of raw flash, typically you'll only see 750GB of pre-formatted space available, with the remaining 25% set aside for housekeeping tasks like wear leveling, sparing, and most importantly pre-erase, which is the only way to get decent write performance on MLC.
Re: (Score:3, Informative)
Actually, they thought about that. Newer SSDs have special wear-leveling algorithms: if the drive notices that you write to some parts a lot while the remainder of the disk is static, it moves the static data onto the used-up cells and uses the underused (formerly static) part of the disk for the stuff that changes a lot. More or less, you can expect every cell to be used an equal number of times even if you only ever write to one 1MB file and the rest is static.
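A toy sketch of that static wear-leveling idea (illustration only, not any vendor's actual algorithm):

    # Data that never changes tends to sit on blocks with low erase counts.
    # Periodically park that static data on an already-worn but free block,
    # so the barely-used block can rejoin the free pool and soak up the
    # write-heavy traffic instead.
    class Block:
        def __init__(self):
            self.erase_count = 0
            self.data = None                 # None means free/erased

    def rebalance(blocks, threshold=100):
        free = [b for b in blocks if b.data is None]
        used = [b for b in blocks if b.data is not None]
        if not free or not used:
            return
        worn_free  = max(free, key=lambda b: b.erase_count)
        fresh_used = min(used, key=lambda b: b.erase_count)
        if worn_free.erase_count - fresh_used.erase_count > threshold:
            worn_free.data = fresh_used.data   # copy the static data over
            fresh_used.data = None             # erase the fresh block...
            fresh_used.erase_count += 1        # ...so it can take new writes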
Re: (Score:2)
If you use TRIM, then your drive will know which parts of the disk are empty and which are not. With wear leveling, the SSD will always write first to the free blocks with the most write cycles remaining, and it will just remap blocks in whatever order it wants (blocks don't need to be in linear order like on HDDs). I think they start moving data around once cells get near the end of their write cycles or the drive thinks it is full (no TRIM, or the drive actually is full).
Re: (Score:2)
I could be wrong and it probably depends on the SSD itself. A lot of SSDs these days have a reserved area that's used when cells start to die (which is why you'll see SSDs with, say, 120GB of storage instead of 128GB). They all attempt to write evenly over all of the cells as well, instead of just hammering a select few. Of course you're probably right about when the SSD itself is nearly full but as far as I'm aware, ultimately what starts to happen is either the space decreases slowly over time or the SSD j
Re: (Score:2)
SSDs with say 120GB of storage instead of 128GB
And then they use drive makers' gigabytes instead of regular ones, so people see a nice round number like 128 and assume they don't get cheated.
Oh, sorry, they sponsored some commission to redefine pi^Hkilobyte, so when they get sued, they can claim they don't falsely advertise.
Re: (Score:2)
Appropriate username is appropriate.
Re: (Score:2)
Oh, sorry, they sponsored some commission to redefine pi^Hkilobyte, so when they get sued, they can claim they don't falsely advertise.
As annoying as it may be to admit it, the drive manufacturers have a point.
The prefixes "kilo", "mega", "giga", etc., were defined quite explicitly, long before computers existed, to refer to powers of 10. Computer manufacturers later misused them to refer to powers of two, which was (and is) simply incorrect, no matter how comfortable computer people have become with it in the meantime. It's better to fix the problem now and have consistent, well-defined terms in the future than to live with ambiguity and
Re:If SSD is nearly full? (Score:5, Interesting)
SSDs should work at a maximum of 75% of their capacity... 50% or less is recommended
Some controllers try to move blocks around to rotate the writes and keep a lot of spare zones, so they can remap/use other sectors on write... but that is the problem: working a full SSD will shorten its life
Our first age-related failure was a 2008 drive. (Score:5, Interesting)
The Intel X25s in my PC, from 2009, are still humming along nicely, and my last benchmark produced the same results in 2012 as it did in 2010. But I've gone so far as to set environment variables to put user temp files on a mechanical drive, internet temp files on a RAM drive and system temp files on a RAM drive, taking some load off the wear leveling.
Re: (Score:2)
Aye, intel drives are known for two things: their reliability and their high prices.
If you tried budget vendors like OCZ, you'd likely have a very different story to tell us.
Re: (Score:2)
If you want a reliable drive, you have to buy a HD at this point, still. Intel SSD are reliable by SSD standards, but not by HD standards.
We still have a ways to go before SSD controllers get to the level of reliability of HD controllers, simply because of the level of complexity required in SSD controllers. The quick failures are typically controller failures. That's one part where Intel shines: it puts expensive and reliable controllers in its drives.
Re: (Score:3)
Re: (Score:3)
Two OCZ SSDs that have been running for 1 and 3 years respectively, both still going strong.
The plural of anecdote is not data yadda yadda.
Re: (Score:2)
I had an X25 from 2008 that died after about 8 months with exactly the same problem. I do some sqlite database stuff from time to time, but otherwise mostly just browsing and normal C development. The replacement lasted about the same time before running out of blocks.
It seems that certain workloads accelerate the ageing process massively.
Re: (Score:3)
Life is tricky for flash (Score:5, Interesting)
Meaningful life specs are tough to come by for flash. Yes, as noted above, SLC NAND has a rated life of 100k erases/page on the datasheet, but that's really a guaranteed spec under all rated conditions, so in reality it lasts quite a bit longer. If you were to write the same page once a second, you'd use it up in a bit more than a day.
However, in real life, the "failure" criterion is when a page written with a test pattern doesn't read back as "erased" in a single readback. Simple enough, except that flash has transient read errors: that is, you can read a page, get an error, read the exact same page again and not get the error. Eventually, it does return the same thing every time, but that point comes well after the "first error".
There's also a very strong non-linear temperature dependence on life. Both in terms of cycles and just in terms of remembering the contents. Get the package above 85C and it tends to lose its contents (I realize that the typical SSD won't be hot enough that the package gets to 85C, although, consider the SSD in a ToughBook in Iraq at 45C air temp..)
In actual life, with actual flash devices on a breadboard in the lab at "room temperature", I've cycled SLC NAND for well over a million cycles (hit it 10-20 times a second for days) without failure. This sort of behavior makes it difficult to design meaningful wear leveling (for all I know, different pages age differently) and life specs, without going to a conservative 100k/page uniform standard, which, in practice, grossly understates the actual life.
What you really need to do is buy a couple drives and beat the heck out of them with *realistic* usage patterns.
Re: (Score:2)
The temperature dependence is a very strong factor that does seem to be missing from the analysis. To add to what the AC parent said, my experience is that you see the minimum number of erase cycles when the device is at maximum temperature; take it down to room temperature and the typical number of erase cycles goes up by an order of magnitude. Most computers have an internal temperature of over 40C when run in a normal environment,
Your drive will fail, SSD or HD. You must be prepared for that.
Re: (Score:2)
In all honesty, this is badly wrong. Laptops running under heavy load may clock these numbers on the hard drive temp (not ambient inside the case but the temperature sensor on the hard drive, which essentially all modern hard drives have). Hard drives generate significantly more heat than SSDs due to mechanical issues.
I'm typing this on a machine that has 4x3.5" hard drives stacked on top of each other, and openhardwaremonitor pretty much instantly tells me which drives are on the top and bottom and which ar
Re: (Score:2)
Curve or Cliff? (Score:3)
Does anyone know whether the failure count for cells picks up along a nice smooth curve or is like running into a cliff? Intel seem to be suggesting in their spec sheets that the 20% over-provisioning on some of their SSDs (I'm assuming for bad-block remapping when failure is detected) can increase the expected write volume of a drive by substantial amounts:
http://www.intel.co.uk/content/www/us/en/solid-state-drives/solid-state-drives-710-series.html [intel.co.uk]
This seems to go against the anecdotal evidence of sudden total SSD failures being attributed to cell wear - something else must be failing in those, most likely the normal expected allotment of mis-manufactured units.
Re:Curve or Cliff? (Score:4, Informative)
Sudden failures are controller failures. Budget controllers especially tend to fail before the flash does.
Flash failure is "usually" about not being able to write to the disk while still being able to read from it. The problem is that by the time you see it, you've gone through all the reserve flash and the controller no longer has any spare flash to assign. I.e., the drive has been failing for a while.
Modern wear leveling also means that failure would likely cascade very quickly.
Hard or solid (Score:2)
Story should have been entitled "Taking a Solid Look At SSD Write Endurance".
Badabing! I'll be here all week.
Hmm (Score:3)
SSDs here have been rejected due to multiple and continuing failures. Now they only get given to end users who provide a 'light' write environment - and that's the only place where consumer-level 25nm-and-below write-cycle gear can be used sanely (i.e., without having a plan for swap-out/replacement and higher costs).
I'm expecting a fairly severe level of failure on new equipment shipping today that uses SSD as cache.
I frankly love the speed. But the claims about how long an 'average' user would take to wear out these disks have not held up where I work; the failures have been abysmal. Admittedly, our users are mid to heavy use cases, but the failure rates have been high, and the lifetimes shorter than anyone would contemplate.
Either the cost of the drives has to fall (which to be fair - it has been), or the reliability question and write limits needs to change substantially.
I no longer consider SSDs for front-line heavy use. And it would take serious work to convince me to contemplate them again with lower-nm flash. And SLC-level gear is simply beyond the cost level we can attain.
Re: (Score:3)
For clarity, info published.
We purchased Intel drives. At the time these were being held up as 'good' ones to choose from.
So they went into Dell D630 units, and were handed out to Windows devs and Windows SCADA engineers. Not one drive survived more than 12 months. These drives were purchased in the US and used in the UK, and Intel refused to fulfill the warranty. Frankly, D630 workloads are light compared to more modern machinery - it's quite a poor show that D630 users took drives out at all.
We have moved through other
Heavier user (Score:2)
Re: (Score:2)
Several Months to Live (Score:3)
> you'd still find yourself waiting several months or even years for that SSD to start dying on you
How comforting!
I don't get it (Score:2)
How about some real numbers (Score:4)
So far I see a lot of complaints from people who don't appear to even know how to run SMART tools to get write cycle and wear statistics from their SSDs... you know, so real actual numbers can be posted.
So far none of my SSDs have failed, and I have almost 20 installed in various places. The one with the most wear is one of the first SSDs I purchased, an Intel 40G device:
da0: Fixed Direct Access SCSI-4 device
da0: Serial Number CVGB951600AC040GGN
da0: supports TRIM
Power on hours - 19127
Power cycle count - 48
Unsafe shutdown count - 32
Host writes x 32MiB - 375697
Workld media wear - 5120
Available reserved - 99/99/10
Media wearout - 91%
Basically 12TB worth of writes on this 40G drive over the last 2.18 years. No failures. Media wearout indicator went 99 -> 91. Estimated durability based on the wear indicator is around 132TB. That roughly comes to ~3300 cycles/cell. This vintage of SSD uses MLC flash whose cells are roughly spec'd at ~10,000 cycles.
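For anyone who wants to reproduce that arithmetic from their own counters (smartmontools' "smartctl -A" exposes similar attributes, though names and units vary by vendor), a rough conversion looks like this; it ignores write amplification and spare area, so treat the output as ballpark only:

    # Ballpark the figures above from the SMART-style counters.
    host_writes_tb = 375697 * 32 * 2**20 / 1e12   # "Host writes x 32MiB" -> TB
    capacity_tb    = 40 / 1000.0                  # 40GB drive
    wear_consumed  = 100 - 91                     # media wearout points used

    print("written so far : %.1f TB" % host_writes_tb)
    print("cycles so far  : ~%.0f per cell" % (host_writes_tb / capacity_tb))
    projected_tb = host_writes_tb * 100.0 / wear_consumed
    print("projected life : ~%.0f TB (~%.0f cycles/cell)"
          % (projected_tb, projected_tb / capacity_tb))

Small differences in rounding and in how the wearout indicator is interpreted account for the gap between the ~140TB this prints and the ~132TB quoted above; either way it comes out to roughly 3,000-3,500 cycles per cell.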
While firmware issues are well documented for various SSD vendors over the last few years, and cell erase cycle life has gone down as the chips have gotten more dense, I would still expect the vast majority of failures to be due to wear-out.
Lots of things can cause premature wear-out, but probably the most common is using the SSD for something really stupid, like hosting a database doing a lot of random writes or with a high frequency of fsync()s, using the SSD for swap on a system which is paging heavily 24x7, using the SSD for WWW log files on a busy web server, formatting an unaligned filesystem on the SSD or one which uses too small a block size, and any number of other things.
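To make the alignment point concrete (a simplified worked example; real controllers complicate the picture): suppose the flash page size is 8K and the filesystem issues 4K writes that start 512 bytes off alignment. Every host write then straddles two pages, the controller has to read-modify-write both, and roughly 16K of flash gets programmed for each 4K the host sends: about 4x write amplification before the FTL adds its own overhead, cutting the effective endurance to roughly a quarter.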
Venerable but still mostly correct:
http://leaf.dragonflybsd.org/cgi/web-man?command=swapcache [dragonflybsd.org]
The only adjustment I would make is that as the Intel 40G continues running, the wear I'm getting on it is pointing closer to ~130TB of durability and not 400TB (400TB is the theoretical max at 10,000 cycles/cell). Still reasonable. Generally speaking, that's the older 34nm technology. The newer 24nm technology will get fewer cycles, but devices tend to have more storage, so, as I say in the manual, you could expect similar total wear out of a newer 120GB 310 series SSD whose flash cells have 1/3 the cycle life.
-Matt
Re: (Score:2, Insightful)
I have never had a laptop hard drive last more than two years, and only had one last more than eighteen months. Maybe your spinning-metal-one-micron-away-from-the-drive-head drives work well in a stationary, temperature-controlled environment, I guess.
Re:Tried It - Disappointed (Score:4, Insightful)
I have a very old laptop (I think I bought it circa 2004 or so; it has a Turion CPU). The display hinges failed, as did the cooling, so I can't play games on it anymore (discrete GPU).
The hard drive is trucking along fine.
Some hard drives obviously last less. However, if you have a systemic problem with hard drives lasting less than two years, it's time to take a look at the factor that remains the same between these hard drives: the user.
Poor little hard drives (Score:3)
I have never had a laptop hard drive last more than two years, and only had one last more than eighteen months.
Then I would have to wonder what the heck you are doing to the hard drives. I'm not sure I've ever had one last less than that long in a laptop. I've had laptop hard drives last for 7 years that were still going strong when I stopped using the machine. In fact I usually have some other component die long before the hard drive does. I have several hard drives that work just fine from laptops with burned out system boards, defective keyboards, borked video and other problems.
Some people are quite hard on t
Re: (Score:2)
For laptop batteries I have been told that they (the batteries) will not get a memory. I have yet to find a rechargeable battery that doesn't get a memory. With a laptop it is easy to determine. You charge the laptop battery until fully charged. Then when running the laptop on the battery, the low power warning pops up in 5-10 minutes (often less than 5 minutes). This is why I usually make a "drain battery" power setting plan. This power plan has no auto shut off. I can usually run the laptop with 0% battery l
Re: (Score:3)
You can overcharge it (I did on older batteries) if you leave them charging for too long.
This is fully a problem with every laptop manufacturer skimping out on the charge controller design. It's apparently cheaper to let your customers burn out their batteries by leaving them plugged in "too much" rather than designing a power supply that cuts off the charging current when the battery is full.
But they still sell the laptops as "desktop replacement" devices, which to me implies that they should be able to be plugged in all the time without damage. Also, they're in computers. They should be ab
Re: (Score:2)
"Using" you computer (Score:3)
When you dont use a computer. That happens. And the fact that you are happy with a 2007 dell means you really dont use your computer.
Curious theory. The fact that I run a multi-million dollar company heavily using a half dozen computers between 6 and 9 years old must really mess with your world view. We run ERP, product test, shipping, time card management, several databases, some very large spreadsheets, CAD and quite a bit more, but according to you we must not actually be using the computers for anything. Would a faster computer be nice? Sure, but the marginal improvement would be well into diminishing returns.
I wear the letters off of a keyboard in 12 months.
So stop buying crappy key
Re: (Score:3)
Re:Tried It - Disappointed (Score:5, Informative)
Had an SSD in my laptop for just over a year and a half now, no issues whatsoever. Daily use as well.
Re:Tried It - Disappointed (Score:4, Informative)
My desktop Intel X25 died after 8 months due to running out of spare blocks and an ADATA drive I had in my occasional use laptop lasted about a year and a half. My two anecdotes cancel out your anecdotes.
Re: (Score:2)
Re:Tried It - Disappointed (Score:5, Informative)
Obvious troll is obvious, but... while SSDs can & do fail (just like old hard drives can & do fail), real-world SSD failure is very rarely due to flash memory wear. Hint: If your flash drive suddenly stops working one day, that ain't due to flash wear, which would manifest as gradual failure over time.
Re: (Score:3)
The issue people point out is that "even if the controller is good enough to last you until wear-out, your SSD will fail much sooner than a hard drive".
The fact that controllers fail ridiculously often on budget drives doesn't improve SSD reliability. It is however somewhat understandable, as SSD controllers are significantly more complex than hard drive ones.
Re: (Score:2)
The problem is that people buy drives from companies that are known to have a terrible reputation for reliability (like OCZ) and then are surprised when they fail.
Generally, if you stick to the reputable manufacturers (Intel, Samsung, Crucial, etc) then you'll have a better chance. It doesn't mean it won't fail, it just means there's a lower chance.
Re: (Score:2)
The comparison here is very much in the red for SSD vs HD (original point) though.
Re: (Score:3)
You are right, they usually die of... wait for it... flash memory wear (most likely the firmware not being able to recognize a damaged cell and insisting on using it).
-electronic failure (power supply, rarely controller chip itself)
-firmware bug triggered by
Re:Tried It - Disappointed (Score:4, Interesting)
Actually, better SSD controllers sense that a page has reached its rewrite limit. The end effect of this is that the size of the overprovisioned space gets reduced by one page. (The controller stops ever writing to the used-up page.) The write performance of the SSD degrades until it goes below a certain amount of overprovisioned space, at which point it refuses to write any more. The disk is still entirely readable, so it's a binary failure mechanism, but a pretty safe one.
Gradual failure over time means either you have a crap controller or that your electronics are failing in ways other than running out of write cycles.
Re: (Score:3)
Cheaper? Maybe per GB but not for the IO.
How many platters am I going to have to raid to get even near what a single SSD can do? Am I ever going to be able to get random reads that high and fit it all in one WTX case?
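Rough numbers for scale (typical figures of the era, not from any datasheet): a 7,200 RPM disk delivers on the order of 100-150 random IOPS, while a mid-range consumer SSD of the time was quoted at tens of thousands. Matching, say, 40,000 random read IOPS with spinning disks would take a few hundred spindles, which is exactly why random-I/O workloads are where SSDs justify their $/GB premium.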
Re: Tried It - Disappointed (Score:2)
Reliability always trumps speed, for me anyway.
Re: Tried It - Disappointed (Score:5, Insightful)
So then you only use magnetic tape for storage?
How long does it take to boot from that?
I have backups, so I can always restore.
Re: Tried It - Disappointed (Score:5, Funny)
No, magnetic tape is too vulnerable to EMP. He boots from punch card.
Re: Tried It - Disappointed (Score:4, Funny)
Fire susceptible.
I've implemented a filesystem on top of OpenCV that uses a laser to read bits carved into granite slabs.
If the laser fails, various sun alignments will allow the passive CdS sensor to take over, at a performance penalty of several years (about one IOP per year).
Re: (Score:2)
Magnetic tape is more reliable and slower.
Lots of computers over the years booted from magnetic tape.
Re: (Score:2)
Maybe. Sorta. Kinda. Not really.
Some of the hybrid systems are a nice compromise. A Momentus XT 750 for $129 has worked great for me. No, it isn't as fast as a SSD for all situations. And I really wish it had more than 8GB of flash. But for boot and launching some applications, it's fantastic. Price and storage volume are decent.
Until price, capacity, and robustness of SSD matches spinning media, we're going to see more of these hybrid systems.
Re: (Score:2)
I also have never had a SSD last more than 24 months. Most last less than a spinning hard drive.
I use them for the speed, but anyone claiming they are reliable are smoking some strong peyote.
Re: (Score:3)
I use them for the speed, but anyone claiming they are reliable are smoking some strong peyote.
Yep, just yesterday I had four embedded boxes on my desk that needed the SSDs pulled for replacement and reinstall. All four had Kingston SSDNow drives in them and were 1-2 years old. We had much better luck back in the days of IDE CompactFlash adapters, and those were less expensive parts than SSDs.
I'm under the impression now that it's because those were 90nm devices and the newer stuff is just crap. MLC SSD'
Re: (Score:2)
And were any of those failed SSDs near the end of their write lifecycle when they failed? I'm betting it was something else that caused them to fail.
Good SSDs from reputable companies (for example, not OCZ) are generally pretty reliable.
Re: (Score:2)
Each time I've tried an SSD it's failed after a year.
Stop buying shitty ones.
Re: (Score:2)
Don't expect a cell phone to have swap I/O demands of a server. Also, since most people chuck their cell phones after a couple of years anyway I don't think it would be a problem.
Re:What about swap? (Score:4, Informative)
I don't expect most servers to swap at all. If your server is swapping, buy more ram. Cell phones are still ram starved enough to need to do that.
Re: (Score:2)
OMG, it's like living in the 70s again? I don't want to start a swapping war, and man these are futile arguments.
You tune for appropriate needs of the server and allow the O/S to manage that. There was an operating system called CP6, owned by Honeywell. It was originally known as CP5 on the Xerox Sigma systems from the 70s. Anyway, it had a philosophy at the time, no swapping. https://en.wikipedia.org/wiki/Time-sharing [wikipedia.org] Yeah, a great O/S on a DPS 8/70 Honeywell Mainframe that had 8 Megawords (36 bit wor
Re: (Score:2)
I am sorry I was not more clear, I did not literally mean no swapping ever.
What I really meant was I expect on average my phone with only 1GB of RAM to swap more than my servers which I add RAM to if I notice considerable swapping. Unless I am limited by cost of course. These days though 128GB of RAM is pretty cheap in the server world.
Re: (Score:2)
;-) Well, that depends on who you get your memory from and what server you're talking about, but yes, prices have come down a bit. I remember when 8MW on a DECsystem 20 was hella expensive: $50,400 for 256K Words.. http://bitsavers.trailing-edge.com/pdf/dec/pdp10/lcg_catalog/LCG_Price_List_Jan82.pdf [trailing-edge.com] That would come out to $1,612,800 for 32 modules, not including the expansion cabinets necessary to hold it all.
SO for a DL360-G6... (not too old.)
8GB module, $75... From here.
Same module $162... From here. [costcentral.com]
Re: (Score:3)
"It obviously gets far worse" is referring to "how bad people are at estimating", not the lifespan of the Flash Memory.
Re: (Score:2)
Any reason you ignore wear leveling that all modern SSDs do? The drive controller will move the superblock, if that is the most written block, around. It will remap it so the OS is none the wiser.
Re: (Score:2)
It stores the remap info in reserved blocks. Yes they could wear out, but since it is so little it is unlikely. Most of these drives have a lot of reserved blocks.
Try understanding how these wear leveling systems actually work.
Re:Number crunching != empirical evidence (Score:4, Interesting)
Which is why most SSD drives implement some kind of wear leveling. They will move the often written sectors around the physical storage space in an effort to keep the wear even.
Rotating media drives do similar things and can physically move "bad" sectors too, but this usually means you lose data. Many drives actually come from the factory with remapped sectors. You don't notice it because these sectors are already remapped onto the extra space the manufacturers build into the drive but don't let you see.
Reminds me of when I interviewed with Maxtor, years ago. They were telling me that the only difference between their current top-of-the-line storage (which was like 250G at the time) and their 40 gig OEM drive was the controller firmware configuration and the stickers. Both drives came off the same assembly line; only the final drive power-up configuration and test step was different, and then only in the values configured in the controller and which stickers got put on the drive. If you had the correct software, you could easily convert the OEM drive to the bigger capacity by writing the correct contents to the right physical location on the drive. The reason they did this was that it was cheaper than having to stop and retool the production line every time an OEM wanted 10,000 cheap drives.
I'm sure drive builders still do that sort of thing today. Set up a 3TB drive line, then just downsize the drives which are to be sold as 1TB drives.
Re: (Score:3)
Quite excellent testing by xtremesystems. I'm not doing anything nearly so formal, but my numbers for the two brands I use (Intel and Crucial) are roughly on track with their results. And it does give me more confidence in those two brands.
I have an OCZ as well which still works, but after all the negative issues came up I pulled it out of production boxes. And I only have the one... never bought another; every time I researched them they just weren't up to snuff.
It should also be noted that SSD firm