
Massive Storage Advances 279

pra9ma writes: "Scientists from Keele University, in England, have succeeded in creating a system that enables up to 10.8 terabytes of data to be stored in an area the size of a credit card, with no conventionally moving parts. This comes along with 3 other forms of memory which could revolutionize storage. The company said the system could be produced commercially within two years, and each unit should cost no more than $50 initially, with the price likely to drop later." I'm unconvinced about their compression algorithm, but if it works, this is gonna be amazing.
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    Great, just when napster is closing up shop. Now what do I use 10 terabytes for?
  • by Anonymous Coward
    I feel somewhat doubtful......
    C.n y.. re.d th.s C.ngr.t......ns! Ent..p. is .nl. an av...g., bas.d on t.x., .nd wil. n.t w.rk f.r r.nd.m info.......!
  • by Anonymous Coward
    to Slashdot.

    Stop posting ridiculous stories like this, and you will save terabytes in bandwidth and storage requirements for all the "you've been had" comments.

  • by Anonymous Coward
    lets say we have:

    11010111 10110100

    Those are our two bytes.
    How would you record the difference?

    The 'difference' would be:

    Now how does that take up any less space? Just a little food for thought.. no?
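The comment above leaves its 'difference' blank. One possible reading, purely an assumption here, is a bitwise XOR of the two bytes. A minimal sketch of that reading shows the point being made: the XOR of two bytes is still a full byte, so nothing is saved unless the differences happen to be small or repetitive. The helper name `xor_delta` is hypothetical.

```python
def xor_delta(previous: int, current: int) -> int:
    """Return the bitwise difference between two bytes (assumed definition: XOR)."""
    return previous ^ current


a = 0b11010111  # first byte from the comment
b = 0b10110100  # second byte from the comment

delta = xor_delta(a, b)
print(format(delta, "08b"))  # the delta still needs 8 bits to write down

# XOR is reversible: the second byte can be rebuilt from the first and the delta.
assert a ^ delta == b
```

For these two bytes the delta has four bits set, so it is no cheaper to store than the original byte.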

  • by Anonymous Coward
    Nothing like technology induced orgasms... yummy
  • What's with this obsession with text? Text is not at all significant in disk space usage. At a reading rate of 3 minutes/K, that means it takes you 10 000 years before you have finished your first Tb of text. Pah. (Oh, and I know scientists from Keele, the concept that they could produce something like this is laughable).
  • Go to AudioGalaxy []. Once you do, trust me, you'll just laugh that Napster is goin' down. I've converted many of my friends, all who say it's much better than Crapster.
  • Yeah, so that when they run out of oil, and have sapped us of all our money on oil, they'll bring that battery patent out of the basement, mass produce the things, and then suck money out of us using that.
  • They're talking about compressing text, which implies ordinary, lossless compression. MP3 isn't even in the same ballpark.

  • Just imagine what this would do to firms like StorageTek if true. It will wipe them out. Why spend £0.5M on a Powderhorn solution (~400TB) when you can spend £2000 on 40 credit cards that can hold the same?

  • Which is why your PDA currently can't store thousands of hours of music and hundreds of thousands of images. And HTML versions of all of the O'Reilly books and all the RFCs. And (display permitting) full-length movies. And a several-hundred-meg "working set" of documents you may need access to on the road. And a backup of your hard drive, just in case.

    Sure, these things aren't vital, but they're certainly not useless.
  • Yawn.


  • "No conventionally moving parts"? How does the little fiber-optic-thingie-in-goo read the surface? Assuming you are talking about a single layer of storage here... 10 terabytes in something like 50 square cm? Is that something like 200 quadrillion bits per square meter? 1 bit of information requires only .00001 nanometers? Riiigghtt. Is that molecularly possible? Oh yes, sorry, I forgot, they're running it through WinZip first.
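The density figures in the comment above can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: a standard ID-1 card face (85.6 mm x 53.98 mm), decimal terabytes, and a single storage layer. None of these are given in the article.

```python
CARD_WIDTH_MM = 85.6      # assumption: ISO/IEC 7810 ID-1 card width
CARD_HEIGHT_MM = 53.98    # assumption: ISO/IEC 7810 ID-1 card height
CAPACITY_BYTES = 10.8e12  # 10.8 TB, assuming decimal terabytes


def areal_density(capacity_bytes, width_mm, height_mm):
    """Return (bits per square metre, square nanometres per bit) for one layer."""
    bits = capacity_bytes * 8
    area_m2 = (width_mm / 1000) * (height_mm / 1000)
    bits_per_m2 = bits / area_m2
    nm2_per_bit = area_m2 * 1e18 / bits  # 1 m^2 = 1e18 nm^2
    return bits_per_m2, nm2_per_bit


bits_per_m2, nm2_per_bit = areal_density(CAPACITY_BYTES, CARD_WIDTH_MM, CARD_HEIGHT_MM)
print(f"{bits_per_m2:.3g} bits per square metre")
print(f"{nm2_per_bit:.1f} square nanometres per bit")
```

Under these assumptions the result is on the order of 10^16 bits per square metre and a few tens of square nanometres per bit. That is a much larger cell than the comment's figure, but still far denser than the drives of the time.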

  • Its all part of the bullet-proof spam-proofing, my friend.
  • heh heh...

    he said barney miller

    heh heh...
  • Blah blah blah huge hardrives possible yadda yadda blah any minute now blah blah blah revolutionize blah blah blah you'll never actually see this technology in use anywhere yadda yadda blah blah....

  • As you (and I) said, not useless, but hardly "required". That's all I'm sayin'. And my point about cellphones still holds true.

    Kevin Fox
  • I think the average slashdot thread would compress very well :)
  • I'm not quite as quick as most here to completely dismiss this stuff as impossible. They might be exaggerating a bit, but more likely than not they've got some pretty clever ways of storing a lot of data on a small area over at Keele. People have proposed gee-whiz storage stuff for years, gotten working prototypes in the lab, but have yet to produce something which has the combo of cheapness and speed of the trusty hard drive. If they have figured out a way to get storage anywhere near what they are talking about, at anywhere near $50, conventional wisdom leads me to guess that the read speed has got to be abysmal. I'm not ready to throw out my hard drives quite yet...
  • "no conventionally moving parts" - yeah, that's the best bit of confus-o-matic speech. Then they go on to say it means that every cm^2 has a moving part, but it's not "conventional". I guess if it were a fiber+gear, it would be conventional. As it stands, maybe they just randomly push the fiber around until it illuminates some of the data you want?

    So to repeat what every other poster has already said, which putz put this story up? I could point you to stories from The Onion that make more sense.

  • Imagine a plank of wood, 2" x 2" x 6'
    Its cross-section is credit card size, they
    didn't specify a height.
  • This sounds a little familiar. I invented an algorithm that allowed compressing data over and over and over. This was back in the old Amiga days. It took like 2 days to compress an already compressed 512K file 2 or 3 times to 100K though. I haven't got around to doing it on a PC of today's speed. It was pretty amazing really, but when it takes such a long time I never bothered to find out the limits it would go to.

    I'm interested to see if it's anything that I have already done. Does anyone have any details on the patents?

  • Data access time is around 100 Mb/sec.

    Don't expect it to replace hard drives any time soon, let alone RAM. 100Mb/sec is pretty slow, compared to, say Ultra-2 SCSI (640Mb/sec), or Ultra ATA/66 (528Mb/sec).
  • While we're all dreaming here, let's do it big:

    AV recording capacity, 33.34MB / minute, 10.8TB = 323,935 minutes of recording for one 16 layer PC Card storage device. Woo. Fucking! Hoo. !!
    Becoming a gargoyle (a la Stephenson) would be a practical reality

    One of the major ways to save bandwidth with client-server games is to have a large library of precomputed models, textures and animations. If the game ships on one of these cards, it would be possible to automatically pre-render every possible combination of every possible movement and texture of every model in your game to a near photo-realistic level of detail, letting you just transmit an index. And still have space left over for an abridged Library of Congress.

    Ever find yourself needing to look up something on the 'net, but lack access? Whip out a Net-on-Card and find what you need.

    I just hope it's not vapor. Please, PLEASE, don't let it be vapor.

  • Just in case someone accidentally clicks it in a drunken stupor. ;-)
  • I used to work for a company that made compression software. They used static (i.e. non-adaptive) compression models that were carefully constructed by hand in a high level computer language that was specially designed for writing compressors. You could build compressors for *specific* data that blew all general-purpose compressors out of the water. As a proof of concept we squeezed the 4.2MB text version of the King James bible into 800KB, so that you could carry it around on your Magic Link (the Magic Cap PDA). It was a hell of a job (we even provided fast random access and free text search capabilities) and I really don't think a generic model can do better than that.
  • A search on Google for "Ted Williams Keele University" returns pages on -- an unreachable site.

    But the cache once again comes through: here []. However, it's still light on details, though it does mention that the Prof is Professor Emeritus of Optoelectronics at KU, and that his "main focus over the last thirteen years has been the research and development of 3-dimensional magneto-optical recording systems."

    It appears that this has been in the news before, as early as September 1999, in The Register []. I can't say that I'm impressed with the other "scientific curiosities" they mention CMR promoting, like "Zodee," the "disposable toilet cleaning device which avoids the hygiene problems associated with conventional toilet brushes."

    And now that I look closer, it seems /. itself has posted about this before: back in August 1999 []. Nobody seemed to believe it then either.

    See also Unitel, Inc. [] They claim to be developing HOLO-1, "the first practical quantum-computing device, which can be economically manufactured and introduced into the current computer industry." The esteemed Prof. is listed in their subcontractors section complete with picture.

    -- gold23
  • It depends on what is in the source file.

    Your source file is probably generated by Microsoft. It is not unusual to see MS filesizes that are 100 to 1,000 times larger than the actual text. The padding can contain long strings of zeros, which can be highly compressed.

    Try compressing a text file generated in a plain ASCII editor. You might get different results.
  • I find that I get a tremendous compression ratio with ASCII text. For example, I compress the front page of the local paper with results nearly 1000:1. The compression scheme that I use is a LITTLE lossy, but still quite usable. Here for example is my compression of today's front page (and also of yesterday's front page):

    Unrest in the middle east;drug kingpin arrested;shooting in bad part of town;development protested;local colorful character is colorful;local sports team loses (or wins);50% chance of rain

  • "Was that sarcasm?"

    if he'd laid it on any thicker I would've suffocated. maybe the reason we have people posting crap about CmdrTaco raping animals is because 95% of the people here can't understand more subtle criticism.

    ...and now even ShoeBoy is no more....time to nuke the site from orbit, I think
  • Storage space per square cm always makes me think of the same thing: What happens when there is a nanoscratch on the surface of my 18 terabytes-in-twelve-centimeter storage medium?

    Tighter storage media also need to safeguard the data on them better. Heaven help us all when we back up all our word processor documents to a tenth of a millimeter and a fly sneezes on it.

  • Ok, ok, I read "two" as "ten". I guess hooked on phonics didn't work for me.

  • Ok, maybe not quite so vaporous, but the first thing that came to mind was the TCAP []:

    American Computer Company readying a new kind of semiconducting device which

    rivals the Transistor --the Transcapacitor: a 12-Teraherz Clock Speed Microprocessor
    & Storage "Building Block" Component which could Revolutionize Consumer Electronics
    and all forms of Computing and communications, by making low cost CPUs and Disk Drives run
    as much as 10,000 times faster, consume minute quantities of power and occupy 50 times less space.

    All that, and they packaged it in a Pentium II case! :)


  • Did anyone else notice about a dozen freaking user-tracking cookies were installed by the news website? Several cookies for every damned advertisement, plus more.

    Fortunately, I use Opera. It alerts me and lets me block 'em. :-)

  • NTFS is supposed to be able to handle a 16 exabyte partition. That shouldn't be difficult with a 64-bit block address. Do any of the common I/O interface schemes support such large block addresses? Another question is how well do the algorithms and data structures in the file system scale to really huge partition sizes.
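The 16 exabyte figure in the comment above lines up with what a 64-bit address can reach. A minimal arithmetic sketch follows; the 512-byte block size is an assumption for illustration, and nothing here describes how NTFS actually lays out its structures.

```python
EXBIBYTE = 2 ** 60  # bytes in one exbibyte (binary exabyte)


def addressable_bytes(address_bits: int, block_size: int = 1) -> int:
    """Total bytes reachable with the given address width and bytes per address."""
    return (2 ** address_bits) * block_size


# 64-bit byte addresses reach exactly 16 EiB.
byte_addressed = addressable_bytes(64) // EXBIBYTE

# 64-bit block addresses with an assumed 512-byte block reach far more.
block_addressed = addressable_bytes(64, block_size=512) // EXBIBYTE

print(byte_addressed, "EiB with 64-bit byte addresses")
print(block_addressed, "EiB with 64-bit addresses of 512-byte blocks")
```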
  • 10.8TB = 1064 DVD's (presuming 10.4GB per DVD)

    MPAA must be pissed off.

    = 17,400 CD's (presuming 650MB per CD)

    So is RIAA.

    ...I like this technology already.


  • Damn /if/ these things are real I want one now! Or at $50 a pop I'd grab several and have myself a nifty 100TB's of space to toy with. And I was happy w/ a mere 50GB. Well not really, I keep having to add new drives, but 50GB drives for less than $200 sounded good. Waiting is hard! :)

    However I agree that a lot of people, including Slashdotters, seem to think anything that isn't out now is vaporware and anything that isn't likely to be out in less than two years is sci fi. Maybe it'd be nice if having an idea made it suddenly materialize but unfortunately a lot of it takes work and that takes time. That doesn't mean the idea is vapor, just that it isn't ready to slap in plastic bubble paper and post mark to every Sam, Dick, and Mary that knows how to order from
  • Well that's pretty unremarkable. They've written a compression algorithm.

    Yeah, mp3s are pretty unremarkable too...


  • Just a thought here, folks....

    I think it might be important that we get copy protection/copyright issues resolved before these new storage technologies arrive.

    As more proprietary stuff is produced -- and if it has killer-app serious storage capabilities -- several things will happen:

    1) people will realize that they can store all the movies they want to watch and trade on their peer-to-peer networks
    2) The media bullies of america will realize this too, and rather than develop a new business model and adapt, will demand draconian restrictions.
    3) It will be easier to slip the "protection" mechanisms into the emerging proprietary technologies....

    Bottom line: we need to make sure the issue is resolved sooner rather than later.

  • From the article: "Possible applications for the memory include hand-held computers and mobile phones, which require large amounts of memory in a compact form."

    Funny, I don't think of PDAs and cellphones as requiring large amounts of memory. My PDA has 2 megs, not 10 terabytes. My phone has about 32K, not 32 trillion K. Yet both seem to do their jobs pretty well...

    Besides, cellphones, by definition, have wireless connectivity. What do they need gigs and terrs of storage for?

    Kevin Fox
  • OK. Physics analogy. We have a system (a text file) which has a set amount of entropy and a set amount of total energy. The free energy, therefore, is effectively going to be the total energy minus the entropy. (This isn't quite analogous to physics, because of dimensions, but here we're dimensionless. If you really care, assign k=1J/bit, and put us at a temperature of 1.) Now, for one byte, the total energy is 8, the total entropy is 1 (again, dimensionally energy is 8 J, entropy is dimensionless (temperature is energy) so there's technically a factor of k lying around, but I don't care), so the free energy is 8-1=7. We can extract 7 bits of 'energy' from the system. Thus, we can reduce a byte from 8 bits to 1 bit without changing the entropy: thus, we can do it reversibly.

    Hey. That sort of looks like a compression algorithm.

    I wish I had moderator points left over... pointing out that he's an idiot (besides the flaming profanity) is the best I can do.
  • P.S.: I'm preemptively repairing a mistake of mine - the {1,0,0,0} distribution function becoming a {0, inf, inf ,inf} information content is slightly wrong: that's technically surprise, not information. Getting a quarter doesn't surprise you at all, getting anything else surprises the hell out of you.

    Surprise and information are related somehow, but I can't remember how, and I think my text on this is at work right now. So, if someone could fix that, I would be quite grateful.

    How's that for a first on Slashdot? :)
  • No, it is not 'unrelated crap' at all - that's what information theory is about. "Pattern based" and "lookup based" compression is exactly identical to this style of compression, just using slightly longer 'objects', instead of individual bytes (chars).

    And, for your information, entropy means exactly the same thing in compression as it does in physics: it's the total information available in the system. You can't compress something past its entropic limit without the compression being lossy. (There's no physical analog to information loss without black holes being involved, and even that's questionable).

    What you're talking about is the fact that there are only "generic" compressors, rather than "format-specific" compressors - they treat all data as random byte-strings without any structure. I don't know if any 'format-specific' compressors exist: it seems that any useful program would have to include a whole lot of 'format-specific' types, and the main problem is that the things which store huge amounts of space *are* essentially random patterns of bytes.

    No one's really worried about their text files filling up their hard drive. Honestly, I think if you can find a kind of file which compresses poorly via standard methods and takes up huge amounts of space, I'd be surprised. Video? Already have video-specific compression. Images? Already have image-specific compression. Audio? Already have audio-specific compression.

    The problem with standard compressors nowadays has nothing to do with the method they use to compress: some extremely smart people are working on this, and they've found that this analog is exactly true - information theory works. Period.
  • Well, I don't have to address the problems in your argument so far, as other people have. But, as per the black hole example, yes, Hawking radiation might possibly be an anti-entropic process, which is very interesting. However, the point is hotly contended, as it's tied up in quantum effects, and you're dealing with a region where we don't really understand the quantum effects.

    It wasn't a bad example - it was a 'curious' example, because Hawking radiation is a 'curious' process. It is anti-entropic, at least to our understanding of it. Granted, we don't have a happy black hole to play with in the lab, but... who knows?

    Hey, after all this, someone actually might be able to figure out some way to generate a better-than-Carnot engine using a black hole. Its cycle time would probably be insanely high (of order billions of years).

    My personal guess, however, is that we're all smoking crack and there's quite a bit more to Hawking radiation than we think due to other effects. For instance... how exactly does the area of the black hole change upon emission of a Hawking quantum? Does it change perfectly radially? It can't - that would violate causality. It has to 'ripple' across the black hole - this will cause a gravitational wave as well. It's possible that this gravitational wave may contain additional information/entropy as well. I don't believe in perpetual motion machines.

    By the way, I will "make that argument", and if I use a word in a way that's counter-intuitive, sucks to be you. Change your intuition. But, as others have pointed out, I'm not using it counter-intuitively at all - entropy is information. Period. End of story.
  • Yes, but there's a way that they're related mathematically and I can't remember. There's a sum in there somewhere so you get a number rather than a distribution, and I can't remember. I want to say that the partition function is actually something like the sum of the probabilities in the distribution, but there'd have to be something multiplying the sum or you'd get 1. Quite unfortunately, I still can't find my text on this, so I can't check this.
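The relation being reached for in the comments above is the standard one from information theory: the surprise (surprisal) of an outcome with probability p is log2(1/p), and Shannon entropy is the probability-weighted sum of the surprisals, which gives a single number for the whole distribution. A short sketch, using the {1,0,0,0} distribution from the earlier comment plus a uniform one for contrast:

```python
import math


def surprisal(p: float) -> float:
    """Surprise of a single outcome, in bits; impossible outcomes are infinitely surprising."""
    return math.inf if p == 0 else math.log2(1 / p)


def entropy(distribution) -> float:
    """Shannon entropy in bits: the expected surprisal (zero-probability terms contribute 0)."""
    return sum(p * math.log2(1 / p) for p in distribution if p > 0)


certain = [1.0, 0.0, 0.0, 0.0]      # always the same outcome
uniform = [0.25, 0.25, 0.25, 0.25]  # four equally likely outcomes

print([surprisal(p) for p in certain])  # per-outcome surprise
print(entropy(certain))                 # no uncertainty at all
print(entropy(uniform))                 # two bits needed per outcome
```

The weighting by p is what keeps the sum from coming out as a trivial constant: each term is a probability times a surprisal, not a bare probability.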
  • Sounds like a litho-fab scanning tunneling microscope. Lots of people have been talking about them. It's about time someone talked about production.
  • This is a highly unconvincing attempt at hyping what is in all likelihood a nonexistent product.

    Sounds to me like a highly indexed hash table, with a large token space

    by comparing each word with its predecessor and recording only the differences between words

    Not enough details there, but 8:1 compression using a token/hash scheme sounds reasonable. I've heard that web search engines (AltaVista, Google, and their ilk) use a similar algorithm to obtain between 10:1 and 20:1 compression on web texts, since there is so much redundancy in web pages. Since most pages have identical lengthy string sequences (trashed slightly because I haven't the energy to figure out the /. html eater) similar to {HTML}{HEAD}{TITLE}foo{/TITLE}{/HEAD}, they can be reduced to much tinier tokens: those 34 common ASCII characters could be reduced to a 10-12 bit token, quite a savings.

    Since I work with a lot of already compressed data, I discount any media compression claims. I'd avoid any storage media which incorporated hardware level compression, because it would eventually lead to problems. Real databases maintain their own raw partitions on disks, since they can create a highly efficient file system for their own purposes. When the hardware starts returning varying free space results because compression isn't working, DBs either fall over hard (Sybase) or fill the logs with errors (Oracle).

    The magneto-optical-fluid disk sounds like they have a laboratory sized research project they hope to reduce to the footprint of a credit card, but they neglect to mention it towers 208 inches high :-)

    with no conventionally moving parts

    Whenever something sounds like a marketing press release, with modifying adjectives like conventionally, it pays to be skeptical, the forte of slashdot.

    the AC
  • Hasn't everyone been fooled so much by they just reroute it to in etc/hosts?
  • Hrm. I thought LZW was the sliding window approach, where compressed text was either verbatim text or a (go back n chars and copy c chars) tuple. The window is how far back you search for a good match. I'm sure there are wonderful dynamic programming algorithms to speed that search up. It sounds more like you are constructing a Huffman tree with a non-uniform depth.

    Or is that what LZW is? in which case what is the name of that sliding window approach?
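For reference, the sliding window scheme described above (verbatim text or a "go back n, copy c" tuple) is usually called LZ77; LZW instead builds an explicit dictionary of strings as it goes. Below is a deliberately naive sketch of the LZ77 idea, using a brute-force search and hypothetical names and parameters (`window`, `min_match`), not any production format.

```python
def lz77_compress(data: str, window: int = 32, min_match: int = 3):
    """Return tokens: a literal character, or an (offset, length) back-reference."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Brute-force search of the window for the longest match.
        for j in range(max(0, i - window), i):
            length = 0
            # A match may run past position i (overlapping copy).
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            tokens.append((best_off, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


def lz77_decompress(tokens) -> str:
    """Rebuild the text by replaying literals and back-references."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            offset, length = tok
            for _ in range(length):
                out.append(out[-offset])  # copy one char at a time so overlaps work
        else:
            out.append(tok)
    return "".join(out)


tokens = lz77_compress("abcabcabcabc")
print(tokens)
print(lz77_decompress(tokens))
```

Real implementations replace the brute-force search with hash chains or similar structures, and then entropy-code the tokens.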
  • The pigeon hole argument is a great way to get people to see the folly of this claim: Given all possible x byte long inputs, these should compress down to y bytes (y<x). The kicker is that there are fewer y byte messages than x byte ones, so

    EITHER some of the plaintexts must compress to the same shorter bit string (in which case how do you choose which one to decompress to) -- two pigeons in the same hole

    OR some of the plaintexts don't compress at all.

    So in either case, the press release is drivel. Caveat investor.
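The counting behind the pigeonhole argument above is small enough to check directly. A minimal sketch, counting bit strings rather than byte strings for simplicity:

```python
def count_strings_of_length(n_bits: int) -> int:
    """Number of distinct bit strings of exactly n_bits bits."""
    return 2 ** n_bits


def count_shorter_strings(n_bits: int) -> int:
    """Number of distinct bit strings strictly shorter than n_bits (including the empty string)."""
    return sum(2 ** k for k in range(n_bits))


n = 8
inputs = count_strings_of_length(n)
shorter_outputs = count_shorter_strings(n)
print(inputs, "possible inputs,", shorter_outputs, "possible shorter outputs")
```

There is always exactly one fewer shorter string than there are inputs, so no lossless scheme can shrink every input of a given length.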
  • What else is all over the planet, and in the same size everywhere?
  • This is not the first Earth-shattering memory technology announcement. Why is it that these things keep cropping up with all these promises of delivery in the short term and never materialize? Can we really expect anything more of this announcement? Won't it just fade away, never to be seen, like all the other ultra-dense, ultra-cheap memory promises of the past?
  • You are assuming that this card thing is persistent storage. Without power it's got nada.
  • From

    Keele University

    The University College of North Staffordshire was founded in 1949 to become the University of Keele in 1962.

    There was a deliberate aim to break away from the pattern of the specialized honours degree, avoiding as far as possible the divisions between different branches of study. Consequently, most students read four subjects in their degree course, two at honours level and two at subsidiary. At least one of these subjects must be from the arts or social sciences, and at least one from the natural sciences.

    Many students have taken a four-year course, beginning their studies with the Foundation Year, in which they follow a broad course covering the development of western civilization.

    Most students live on campus in halls of residence or in self-catering flats, and many staff also live on campus.

    The Keele Estate

    The University is situated on an estate of 650 acres, with extensive woods, lakes and parkland, formerly owned by the Sneyd family.

    The Sneyds can be traced back in north Staffordshire to the late 13th century, but they came into the possession of the Keele estate in the mid-16th century.

    The present hall was rebuilt in the 1850s for Ralph Sneyd (1793-1870) to the design of Antony Salvin at a cost of about £80,000. The grounds and gardens were magnificently laid out around it, and many interesting features survive today, such as the remarkable holly hedge, originally 199 yards long, 28 feet thick and 35 feet high.

    At the beginning of this century, the hall was let to the Grand Duke Michael of Russia, who entertained King Edward VII there. Later, however, it remained empty, and troops were stationed on the estate during the second world war.

    Publications are available which give the history of Keele in much greater detail:

    A book entitled The history of Keele (edited by C.J.Harrison) is now regrettably out of print, but copies may still be found in appropriate libraries.

    Pamphlets by J.M.Kolbert entitled The Sneyds; Squires of Keele, The Sneyds & Keele Hall, and Keele Hall; a Victorian country house are obtainable from Mrs D.Warrilow in Information Services (Tel. (01782) 583232).

    Off the record; a people's history of Keele by Angela Drakakis-Smith, published by Churnet Valley Books at £8.95 (ISBN 1-897949-21-9) should be obtainable through any bookshop.

    Keele; an introduction by Michael Paffard, obtainable from Chris Wain at the Alumni Office, Keele University, ST5 5BB, (Tel. (01782) 583370) for £3 (ISBN 0 9534157 0 8)

    Keele; the first fifty years by J.M.Kolbert, published by Melandrium Books, obtainable from 11, Highway Lane, Keele, ST5 5AN, for £16.95 + p&p. (ISBN 1 85856 238 4)

  • with no conventionally moving parts

    As opposed to, say, unconventionally-moving parts??
  • Every few months another one of these comes out. And somehow we never see anything come of them. We have yet to see the holographic cube storage or the one that uses alien technology or any of that crap. This sounds like another one of those.

    Now if IBM comes out and says they've found a way to squeeze 10 terabytes into the space the size of a credit card, I'll be impressed.

  • The blatant reposts are getting annoying.

    This technology will be ready for market in two years? Is that two years from now or two years from when an almost word-for-word identical article about this [] was posted a year and a half ago? =)

  • What they don't tell....
    is that they've gotten 8:1 compression on *unicode* english ASCII text.
  • Bzzt, wrong answer. You also don't have the terms right. Based on experimental results the entropy per character of the English language is about 1.25 bits (the range is somewhere between 1 and 1.5). This means that English has a redundancy of approximately 75%, implying that with an appropriate encoding, English text can compress to about 1/4 of its original length.

    (Note: character != byte. That is only true of ASCII characters. If all you wanted to do is represent the 26 English letters it would only take 5 bits per character. We're talking language here, irrespective of representation.)

    Go read a good cryptography book and straighten out your terms and definitions.
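The redundancy figure in the comment above depends on which representation it is measured against. A short sketch using the comment's 1.25 bits per character, compared with a 26-letter alphabet (log2(26) bits per letter) and with 8-bit bytes; the function name is hypothetical.

```python
import math


def redundancy(entropy_bits: float, bits_per_symbol: float) -> float:
    """Fraction of the representation that carries no information."""
    return 1 - entropy_bits / bits_per_symbol


ENGLISH_ENTROPY = 1.25  # bits per character, the figure quoted in the comment

vs_letters = redundancy(ENGLISH_ENTROPY, math.log2(26))  # against a bare 26-letter alphabet
vs_bytes = redundancy(ENGLISH_ENTROPY, 8)                # against 8-bit characters
best_ratio_vs_bytes = 8 / ENGLISH_ENTROPY                # ideal compression ratio for 8-bit text

print(f"redundancy vs 26 letters: {vs_letters:.1%}")
print(f"redundancy vs 8-bit bytes: {vs_bytes:.1%}")
print(f"ideal ratio for 8-bit text: {best_ratio_vs_bytes:.1f}:1")
```

Measured against the 26-letter alphabet this comes out close to the comment's roughly 75%. Measured against 8-bit bytes the ideal ratio is a little over 6:1, so the article's 8:1 claim sits near the optimistic end of the comment's own 1 to 1.5 bit range.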

  • You're compressing html - html is much more structured and redundant than english.
  • Well, first, 8:1 compression of English text isn't that big a deal, especially if the original is 8-bit bytes. Dictionary-based algorithms like LZW (i.e. "zip") often do that well on text.

    Using a liquid between the read/write head and the recording surface would help the optical coupling between the surface and head, but creates a whole new set of problems. Probably puts a ceiling on media speed, for example. A whole set of mechanical problems have to be overcome to turn that into a commercial technology. Whether it's worth the trouble remains to be seen. For a 4X improvement in MO drive densities, probably not.

    (There's a neat variation on this idea used for scanning photographic film, called a "wet-gate transfer". The film is immersed in a liquid with the same coefficient of refraction as the film base. This makes minor scratches disappear.)

  • Why the hell would this be useful in cell phones and PDAs? This has to be the most vaporous and hype-ridden article I have ever seen. This is worse than what the local news reports on Napster (you all know what I'm talking about, they know less than the average person who uses Napster). If this product was real, why the HELL would they give a SHIT about cell phones or PDAs? It could change so many things, not the least of which is an AVERAGE COMPUTER. They could easily sell it for more than $50 and large corporations, small corporations and computer illiterate people would still buy it in droves. If this were real they would already be rich from investor capital. They would have generated more press than this from the theoretical possibility.
  • Ok, the statement from the university [] was a lot better than the actual article, until I got to this:

    wristwatches could have vastly more power than today's PC Computers.

    Can nothing about technology go without being tainted with sensationalism? I am not even going to point out why this is wrong, as I am sure everyone realized just how stupid it is without me having to say it.
  • Way enough for all my pr0n and mp3z, or is it?
  • $50/10Terabytes = $5/T = .5cents/G. Current tech gives us .5cents/M, doubling every year or so. Commercialization in 2002 means a claim of an eight year technological leapfrogging.

    The specific claims are:

    • Claim 1: Better compression (1/8 on text).
      Impressive, but not impossible. This doesn't favor any specific hardware: any tech can use it. So now we are left with 6 years' worth of hardware advances.
    • Claim 2: Quad-density reads/writes on mostly conventional media.
      Huh? No details given. Two-year leapfrog from magic (coatings/software unchanged). 4 years left to account for.
    • Claim 3: 30-fold increase due to new coatings and materials.
      A five-year advance.
    • Claim 4: 10T on a credit-card-sized device.
      This is an implementation, not an invention. No credit.

    Three advances give us -1 years of technological leapfrogging: so the manufacturing process in 2002 should be about twice as expensive as current disk drive fab. All the major storage firms are demonstrating lab models with ultra-high bit/cm numbers. Now a minor university team has made major simultaneous advances in compression, r/w density, coatings/materials, packaging, and, above all, commercialization.

    Excuse me while I snort beer through my nose.
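The "years of leapfrogging" bookkeeping in the comment above amounts to counting doublings. Here is a rough check, assuming (as the comment does) one doubling of capacity per dollar each year, and assuming 1000 MB per GB. The prices and improvement factors are the comment's own.

```python
import math


def doublings(factor: float) -> float:
    """Number of doublings (here: years at one doubling per year) needed for a given factor."""
    return math.log2(factor)


# Price gap: 0.5 cents/MB today versus a claimed 0.5 cents/GB.
price_gap = 1000  # assumption: 1000 MB per GB
years_needed = doublings(price_gap)

# Improvement factors claimed for the three inventions.
claims = {"compression": 8, "quad density": 4, "new coatings": 30}
years_claimed = {name: doublings(factor) for name, factor in claims.items()}

print(f"gap to close: {years_needed:.1f} years")
for name, years in years_claimed.items():
    print(f"{name}: {years:.1f} years")
print(f"total claimed: {sum(years_claimed.values()):.1f} years")
```

With exact logarithms the gap is closer to ten doublings than eight, and the 8x compression claim is worth three doublings rather than two, so the comment's rounding differs a little from this sketch. The overall picture is the same: the three claims together roughly cover the whole gap.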

  • Score: -1 Redundant
    • The first invention is a method of compressing text stored in binary form, which expresses information as a series of noughts and ones, by comparing each word with its predecessor and recording only the differences between words. This compresses the data to an eighth of its normal size.

      The second invention involves a different way of recording and reading information, increasing four-fold the amount of data that can be held on magneto-optical disks, which are used for storing computerised data. The third invention provides new kinds of coatings and materials that can be used in disks, providing a 30-fold increase in capacity.

      The fourth and most interesting invention produces a memory system that enables up to 10.8 terabytes of data to be stored in an area the size of a credit card, with no conventionally moving parts.

    This is like a total technological troll.
  • You're right, gzip isn't that good:

    C:\WINDOWS\Desktop>bzip2 -k page.htm

    C:\WINDOWS\Desktop>dir page*

    Volume in drive C has no label
    Volume Serial Number is
    Directory of C:\WINDOWS\Desktop

    PAGE_F~1 02-13-01 12:17a page_files
    PAGE HTM 59,243 02-13-01 12:17a page.htm
    PAGEHT~1 BZ2 8,098 02-13-01 12:20a page.htm.bz2
    2 file(s) 67,341 bytes
    1 dir(s) 3,892.71 MB free


    That's a 7.32 ratio.
  • Riot! I love it, but it's not half as good as my multivariate transaxial parser generator. It can recompile the kernel in 0.2 seconds on my 386.

  • Here is some accurate information:
    In computer memory format, the system has a capacity per sq cm in excess of 86 Giga Bytes of re-writeable RAM data - this equates to a memory capacity of 3400 Giga Bytes(3.4 TB) within the surface area of a credit card! Data access time is around 100 MB/sec. A single unit with this capacity, but using the computer's processor, has a physical size of about 3 cm x 3 cm x 1.5 cms (high). An additional advantage over existing data storage systems is that only 20% of gross capacity needs to be allocated for error correction, which is significantly less than the 40% for hard disks and 30% for optical storage. Production costs are anticipated to be less than £30 [$50] for such a unit.
    (Taken from the link posted by comment #58.)

    So access times are much slower than for a conventional hard disk.

  • 8-fold compression by only storing the difference between words... could someone tell me how this is possible? Now, I know some amazing compression things have been done (.the .product []) but this is just text.. I don't understand.
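One possible reading of "recording only the differences between words" is front coding, where each word is stored as the number of leading characters it shares with the previous word plus the remaining suffix. The sketch below is only an illustration of that idea, not the Keele method (the article gives no details); the function names and sample word lists are made up for the example.

```python
def front_encode(words):
    """Encode each word as (shared_prefix_length, remaining_suffix)."""
    encoded = []
    previous = ""
    for word in words:
        # Count how many leading characters match the previous word.
        shared = 0
        limit = min(len(previous), len(word))
        while shared < limit and previous[shared] == word[shared]:
            shared += 1
        encoded.append((shared, word[shared:]))
        previous = word
    return encoded


def front_decode(encoded):
    """Rebuild the word list from (shared_prefix_length, suffix) pairs."""
    words = []
    previous = ""
    for shared, suffix in encoded:
        word = previous[:shared] + suffix
        words.append(word)
        previous = word
    return words


# Hypothetical inputs: a sorted dictionary fragment and a line of ordinary prose.
dictionary_words = ["compress", "compressed", "compression", "compressor"]
prose_words = "the quick brown fox jumps over the lazy dog".split()

dictionary_encoded = front_encode(dictionary_words)
prose_encoded = front_encode(prose_words)

print(dictionary_encoded)
print(prose_encoded)
```

On the sorted dictionary fragment most of each word is replaced by a small number, while on the prose sample no adjacent words share a prefix, so nothing is saved. That is consistent with the doubts raised elsewhere in the thread about getting 8:1 on ordinary text this way.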
  • This man had way too little Ritalin this morning.
  • You know what was ironic? For a lot of machines with slow hard drives, those compression tools speed up overall performance.

    I remember a 386-25 box I had, with a pair of 32 MB RLL hard drives. Those things were slow! But see, the time it took to read the file (much smaller, because of compression) and decompress was much smaller than the time to simply read the file at its entire normal size. This particular box was significantly faster when I was running my BBS off it, specifically because of the drive compression tools.

    Just my two cents.
  • I remember in a recent Information Theory course I did at Uni, we learnt that the information content of an ensemble with 26 different equally possible outcomes is 4.7 bits per symbol.

    That would be a very crude way to compress. LZW compression (and similar algorithms such as the one in gzip) find multiple-byte patterns, which are reduced to smaller and smaller bit representations as they occur more frequently. For example, if I had "ABCABCABCABCABCABCABCABC", it would figure out that "ABC" is being repeated and use a smaller number of bits to represent it.

    That's why English text can typically be reduced by 8-10:1 compression, because there is so much redundancy in words. Try doing a gzip on a log-style file with lots of redundancy and you'll often see 100:1 compressions.
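The two figures in this comment can be checked with a few lines of Python. This is a rough sketch: it uses the standard `zlib` module (DEFLATE, the same family as gzip) rather than LZW itself, and the sample input is the "ABC" pattern from the comment, repeated many more times.

```python
import math
import zlib

# Information content of one symbol drawn from 26 equally likely outcomes.
bits_per_letter = math.log2(26)
print(f"log2(26) = {bits_per_letter:.2f} bits per symbol")

# A highly repetitive input, in the spirit of "ABCABCABC...".
repetitive = b"ABC" * 1000
compressed = zlib.compress(repetitive, 9)
ratio = len(repetitive) / len(compressed)

print(f"{len(repetitive)} bytes -> {len(compressed)} bytes, about {ratio:.0f}:1")
```

The exact ratio depends on the input and the compressor settings, so treat the printed number as an illustration of why redundant data such as log files can compress far better than ordinary prose.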


  • Hmmm. So they say they could house 3.4 Tb on a unit the size of a credit card? Hmmm. Well, since they say it will have an access rate of 100 Mb/sec, I take it it would take 9 hours to read everything from a unit of that size? Hrm... Hot-syncing might take a little longer then, huh?
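The nine-hour estimate is easy to reproduce. The sketch assumes decimal units (3.4 TB = 3.4 x 10^12 bytes, 100 MB/s = 10^8 bytes per second) and reads the quoted figures as bytes rather than bits; neither assumption is confirmed by the source.

```python
capacity_bytes = 3_400_000_000_000   # 3.4 TB, decimal units assumed
rate_bytes_per_s = 100_000_000       # 100 MB/s, as quoted from the company page

seconds = capacity_bytes / rate_bytes_per_s
hours = seconds / 3600

print(f"{seconds:.0f} s, or about {hours:.1f} hours for a full read")
```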
  • Take a look at Microsoft's latest release - Outlook Mobile Manager []. The bit of interest is how they compress text [screenshot []] using their *new* technology Intellishrink.

    You can choose various levels of text compression from none, remove spaces/punctuation to remove vowels...


  • Let's put the two facts together.
    1. They claim to have come up with a new way of compressing text.
    2. They claim to have a memory system that enables up to 10.8 terabytes of data to be stored in an area the size of a credit card, with no conventionally moving parts

    Well, I just took out a business card, and wrote on the back "The letter 'a', repeated 10.8 x 2^40 times". Did I just store 10.8 terabytes of compressed data in an area the size of a credit card?

    Call the press!

    This may be impressive, even revolutionary, but we need more technical details.

  • This is almost as believable as the paper phones [] of a few weeks ago . . .
  • Yeah, uh, no. The absolute best text compression algorithm today (RK) achieves 1.42 bits per character. For some REAL STATS (!), see The Archive Comparison Test.

    Well, most people don't buy massive storage units so they can keep extra copies of the English translation of The Three Musketeers lying around. Data is often much more redundant than that. For example, in my job I routinely deal with very large log files (1GB+). These often compress to 1:100 or better due to the large amount of redundant content.

  • Every university has a few wack job Emeriti running around spewing garbage about something or other. Emeritus means "ok, you can still hang around, but stop bothering us."

  • It's not the technology, it's the human using it. Ever wonder why almost all paper is the same size (8.5x11, A4, legal 8.5x14)? It's due to ergonomics. There are certain sizes we can work with easily, and those are the sizes that are either targeted or selected by market forces. Good example, the handheld computers. Palm devices have done wonderfully. Why? They can fit in a pocket and can be held in one hand and used in the other. The other handheld computers did poorly because they were too large to fit into a pocket, and they needed a surface to be used comfortably. (Well actually, I would like to meet someone who could comfortably type on those keyboards, but that's another story).

    Back to the question: Why credit card sized? Simple. It's easy to hold either with the fingertips or in a fist, it's easy to carry, it would fit in a pocket or a wallet, and they are light enough not to be noticeable. Most people could carry one or two of these to work with no effort. I would like to see you do that with a standard hard drive, or a zip disk, or even a floppy disk.

  • " . . . 10.8 terabytes of data to be stored in an area the size of a credit card, with no conventionally moving parts... ...Each square centimetre of this memory system is a closed unit containing a metal oxide material on which data are recorded, and a reader made of a fibre optic tip suspended above the material in a lubricant."

    notice the language: no conventionally moving parts... plenty of unconventional movement, though. ;|

    Which brings me to my point: how can this invention be aimed at the mobile/palm markets if the read head is floating in lubricants?! here's to hoping they license some skip/shock technology from the walkman crowd...
  • The first invention is a method of compressing text stored in binary form ... by comparing each word with its predecessor and recording only the differences between words. This compresses the data to an eighth of its normal size.

    Really? Just working on the above quote, I do not see much in the way of compression, especially 1/8th in size. It might work for a dictionary, but actual useful text is going to be less similar.

    Another question I have: is this actually REWRITABLE? I mean, I am reading this and they talk about recording and reading. However, is this write-once/read-many technology (in which case, it would be useful for technical reference)? OR is it write-many/read-many, in which case I can upgrade my hard drive to 250x its current size for $50? I suspect it is the former, in which case, it is a nice idea but not as useful at first glance.

    Even if it is only write-once, the ability to have 10 terabytes for storage in say a cell phone (even if I cannot reuse the data space) is still impressive.

    what would one do with all that space? There isn't enough porn or music to actually download ... oh wait nevermind.
  • by KodaK ( 5477 ) <> on Monday February 12, 2001 @07:06PM (#436293) Homepage
    Personally, I would love to have one! Just think of all the Pron and l33t warez I could store :)

    Oh, get real. Both you and I know that by the time this technology (if it's real) makes it to market a standard OS install (take your pick, it won't matter) will be 5TB, using up half of it right off the bat. I, for one, will not be looking forward to buying Linux Kernel Internals -- 33rd printing, volumes 1-53.

    And, in ten years, I'll STILL be on a fucking 56k-when-hell-freezes-over-more-like-26.4 dialup while Suzy N'Syncempeethrees and Sammy Likestoforwardjokes III have blistering Ultra-DSL at 30Gbps. Grrr.

    Sorry for the rant.

  • I'm no compression expert but I regularly get 10:1 compression on text files using guess what? WinZip.
    As the other poster has pointed out, your text files are almost certainly *not* straight English ASCII text (they're HTML, Word files, C code, Unicode-encoded, or something else again). I'm sorry I wasn't more clear in my post to explain that I was referring to straight ASCII text, not anything else.

    As to my colleague, he'd read virtually all the published literature in the area, and he's a pretty smart cookie (he's now on a PhD scholarship at Princeton working with people like Tarjan). I think the thing I learned most from his efforts were that text compression is in a period of diminishing returns for improved algorithms - they're not likely to get much better.

  • I'm not at all surprised that they can get 8:1 compression of plain text.
    I am. One of my postgrad colleagues (back when I was a postgrad) did research into text compression. The best that he could get on the KJV Bible was a little over 1.8 bits per character (about 4.4:1 compression), and IIRC the best *anyone* has ever done with a general-purpose compression scheme is a bit over 1.7, and it turns out that the bible is generally a little more compressible than most other text ;) Generally, you'll struggle to get better than 4:1 on most text, and that's using compressors that are substantially slower than gzip or even bzip2.

    While it is correct that studies with humans have indicated that English text has about one bit of entropy per byte, suggesting a natural limit of about 8:1 compression, humans have the use of a whole lot of semantic information (they understand the meaning of the text and can therefore predict words based on that) that no compression algorithm I'm aware of has used.

    I'm taking this with a large grain of salt, thanks.
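The conversion between bits per character and compression ratio used above is just a division, assuming each character is stored in 8 bits (plain ASCII in one byte). A small sketch, with a helper name invented for the example:

```python
def ratio_from_bpc(bits_per_char, bits_per_stored_char=8):
    """Compression ratio implied by a bits-per-character figure."""
    return bits_per_stored_char / bits_per_char


# Figures quoted in this comment.
for label, bpc in [
    ("KJV Bible result", 1.8),
    ("best general-purpose scheme recalled", 1.7),
    ("human-prediction estimate", 1.0),
]:
    print(f"{label}: {bpc} bits/char is about {ratio_from_bpc(bpc):.1f}:1")
```

So an 8:1 claim corresponds to roughly 1 bit per character, which is the human-prediction estimate rather than anything the compressors mentioned here achieve.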

  • by barawn ( 25691 ) on Monday February 12, 2001 @09:21PM (#436296) Homepage
    Wow, a chemist using a term wrong... amazing. To be specific, entropy doesn't come from chemistry, it comes from physics. Granted, these two were identical fields at the time, or at least, closely related, but the term came from studying thermodynamics, not chemistry.

    OK. A little background on information theory for you - you know, from Shannon, back in the early 1900s, I believe, though correct me if I'm wrong. There is an object in information theory called the partition function of an experiment - it is essentially the chance of getting any result from that experiment. There is then an object called 'information', which is proportional to the log of the partition function. The lower a chance of getting a result, the higher the information content gleaned from that experiment. For instance, if you had a box full of quarters, and you randomly pulled out a coin, the partition function would be (quarters, dimes, nickels, pennies) {1,0,0,0}. The information content of that experiment is klogW: (0,inf,inf,inf)- you don't learn anything if you pull out a quarter. You knew there were only quarters to begin with. If you get any of the other ones, damn, you're surprised.

    What does all of this have to do with entropy? Well, in thermodynamics, which is, you know, where the term COMES from, entropy is klogZ, where Z is the partition function of the system: essentially the same thing. k is Boltzmann's constant - it comes from the Celsius temperature scale.

    So, here's the news flash: Entropy is information. Period. Therefore, he was using the term CORRECTLY, not INCORRECTLY. Entropy is USEFUL information, not USELESS information. Guess what? This is the same in chemistry, too. The universe doesn't care whether or not you can use energy for work, and entropy has nothing to do with 'randomness'. A 'random' distribution of matter in a universe will collapse to a 'nonrandom' sphere, thanks to gravity: if entropy was randomness, then the universe would have just violated the second law - it went from 'random' energy to 'nonrandom' energy. (mass=energy, so don't even try it)

    Entropy is information. Period. Hence the second law of thermodynamics- entropy increases because the information content of the universe is increasing. If you doubt me on this, here's a simple bit to convince you: you have a system which goes from state 1 to state 2, both of which have the same entropy. Therefore, there is a reversible process which connects the two states, which means that one can go from state 1 to state 2 and leave no tracks inside the system that the change had happened - i.e., the information content of the system is static.

    I'm really getting sick of having to explain this constantly - I wish they would never teach entropy as 'useless energy' or 'unusable energy' - like the universe cares whether or not something can be used for work.

    The link between information and entropy is entirely well known and extraordinarily important. For instance, if an object falls into a black hole, is there no record of its existence anymore? Is all the information that was inside that object lost? No - a black hole's area is related to its entropy, which increases with mass. Therefore, the 'information' (as far as the Universe cares) in that object is now somehow stored in the black hole's event horizon. Curiously enough, an object which falls into a black hole is, from the outside world, constantly getting infinitesimally closer to the event horizon. This is a weak argument, yes, and changing a few words could make it stronger, but this is offtopic, so I don't care.

    In closing - you're wrong. Entropy is useful information. 1 bit of entropy out of 8 means an 8:1 compression ratio. Here, you've 'extracted' 7 bits of 'work' out of the system. The remaining 1 bit of entropy cannot be removed from the system, as entropy can never decrease. (or in this case, cannot decrease without destroying the system)
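The coin example can be put in numbers using the standard information-theory definitions: the information (surprisal) of an outcome with probability p is log2(1/p) bits, and the Shannon entropy is the probability-weighted average of that. This sketch works in bits and drops the constant k, which is a choice made here and not something taken from the comment; the function names are invented.

```python
import math


def surprisal_bits(p):
    """Information gained, in bits, from seeing an outcome of probability p."""
    if p == 0:
        return math.inf  # an "impossible" outcome would be infinitely surprising
    return math.log2(1 / p)


def entropy_bits(probabilities):
    """Shannon entropy in bits; outcomes with p = 0 contribute nothing."""
    return sum(p * surprisal_bits(p) for p in probabilities if p > 0)


# (quarters, dimes, nickels, pennies) for the box that holds only quarters.
only_quarters = [1, 0, 0, 0]
print([surprisal_bits(p) for p in only_quarters])  # [0.0, inf, inf, inf]
print(entropy_bits(only_quarters))                 # 0.0

# For contrast, a hypothetical box with all four coins equally likely.
print(entropy_bits([0.25, 0.25, 0.25, 0.25]))      # 2.0
```

Pulling a quarter from the all-quarters box tells you nothing (0 bits), matching the (0,inf,inf,inf) list in the comment, while the evenly mixed box yields 2 bits per draw.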
  • by lizrd ( 69275 ) <adam@bu[ ]us ['mp.' in gap]> on Monday February 12, 2001 @07:58PM (#436297) Homepage
    I think that "no conventionally moving parts" means that they are using a Wankel rotary engine to move the parts rather than the conventional 4-stroke design. I must admit that it is a pretty clever hack to figure out how to use mechanical storage to get that kind of density and even more curious that they chose the rather oddball Wankel design over battery power which is usually used for small devices.
  • by ca1v1n ( 135902 ) <snook AT guanotronic DOT com> on Monday February 12, 2001 @06:45PM (#436298)
    I'm not at all surprised that they can get 8:1 compression of plain text. It is a rule of thumb for encryption that plain text, at least in English and with ASCII, has only about one bit of entropy per byte. While it is impressive that they've managed to get rid of almost all of the slack, it doesn't strike me as that hard to believe.
  • by nomadic ( 141991 ) <nomadicworld@gmai l . c om> on Monday February 12, 2001 @07:32PM (#436299) Homepage
    If you don't want to read about vaporware, then you should probably read instead of
  • by PineHall ( 206441 ) on Monday February 12, 2001 @08:08PM (#436300)
    Here is some more info I got from Google's cache for


    UPDATE - November 2000

    During 1999 Keele High Density Ltd. (KHD) announced that it had developed a very high density memory system capable of holding 2.3TB of memory in the space of a credit card. Further work since then has resulted in some significant upward changes to both the capacities previously stated and to the applications the KHD technology addresses. Some of this work is continuing, and there are further patent applications to be filed. The information available publicly is necessarily restricted until those patents have been filed.

    The very high data densities are achieved through a combination of many different factors - some relating to the physical properties of the recording media, and some to the way of processing and handling data. The physical memory system is a hybrid combination of magneto-optics and silicon. The KHD memory system is applicable to both rotating and fixed media, and is not dependent on the laser-based media-addressing system used.

    Following the work undertaken since last year, the following data capacities are achievable:

    a) For rotating media, at DVD size, a single-sided capacity of 245 GB using a red laser.

    b) For fixed media, a single-sided capacity of 45 GB/cm, giving a total capacity of 3.6 TB on the surface area of a credit card, double-sided and using a red laser. Using a violet laser (now being introduced), the capacity at credit card size will be 10.8 TB.

    In last year's announcement from KHD the primary focus was on the fixed media application, which with a novel form of laser addressing, could be described as 'near solid state' - involving no moving parts in the conventional sense. However, this aspect of the technology will require some further R&D work to bring it to a mass-production scale - although it is believed that this will not present insurmountable difficulties. These constraints do not apply to existing rotating media applications (for example, DVD), using conventional laser systems, and there are no reasons why the KHD technology cannot be implemented within a short timescale - measured literally in months.

    A major development arising out of KHD's work over recent months, is that the technology achieving these very high data density figures has application not just for memory systems, but will also produce significant enhancements for the transmission and processing of data generally. This means that KHD's technology can achieve an effective increase in bandwidth capacity, because the very high data density properties, which are in addition to those from conventional compression methods, allow so much more data to be transmitted over a given bandwidth. The same advantages are also felt in terms of processing speeds. Work on this aspect of KHD's technology is continuing, but the current calculations show that an effective eight-times increase in bandwidth capacity and processor speed can be achieved.

    KHD's development represents a fundamental advance in computing technology, with the benefits being felt across many industry areas. Following completion of the patenting position, KHD will be looking to license the technology to companies for mass-production, and for the ongoing R&D work needed to make the 'solid-state' memory commercially viable.

    The technology has been developed by Professor Ted Williams at Keele University, Staffordshire, England, over a period of thirteen years.

    PROFILE: Ted Williams is Professor Emeritus of Optoelectronics at Keele University, Staffs, England, and visiting Professor of Electronic Engineering at South Bank University, London. Professor Williams was Director of Research with Sir Godfrey Hounsfield, Nobel Prizewinner, working on the invention and creation of the first NMR Scanner at Hammersmith Hospital, London. He has also held directorships with major international companies. His main focus over the last thirteen years has been the research and development of 3-dimensional magneto-optical recording systems.

    KHD's licensing and funding arrangements are managed by Mike Downey, Managing Director of Cavendish Management Resources. CMR is a venture capital and executive management company, based in London. CMR has supported the development of this technology.

    Further information from: Mike Downey Managing Director CMR, 31 Harley Street, London W1N 1DA Tel: +44-(0)20-7636-1744 Fax: +44-(0)20-7636-5639 Email: [mailto] Web: []
  • by glyph42 ( 315631 ) on Monday February 12, 2001 @09:29PM (#436301) Homepage Journal
    Yeah, uh, no. The absolute best text compression algorithm today (RK) achieves 1.42 bits per character. For some REAL STATS (!), see The Archive Comparison Test []. Considering the amount of work that has gone into text compression in the past few years (going from 2.0 to 1.42!) and knowing the theory myself, I find it ludicrous that someone unknown in the compression scene would come up with such an algorithm, which by the sounds of the simple description would likely already be covered by many patents. It's definitely overstated, and likely far inferior to the current state-of-the-art in compression.
  • by freq ( 15128 ) on Monday February 12, 2001 @06:44PM (#436302) Homepage
    This article is pure crap. Professor soggybottoms invents ten fabulous new technologies that will instantaneously revolutionize the entire computer industry, all while fixing himself a ham sandwich...

    film at eleven...
  • by mduell ( 72367 ) on Monday February 12, 2001 @07:02PM (#436303)
    10.8TB = 1064 DVDs (presuming 10.4GB per DVD) = 17,400 CDs (presuming 650MB per CD) = 7,864,320 floppies (presuming 1.44MB per floppy) = 371,085,174,374 of those new MOT 256bit MRAM chips.

    Anyone want to come up with some other ratings ?

    Mark Duell
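The conversions above can be reproduced exactly if binary units are assumed throughout (1 TB = 2^40 bytes, 1 GB = 2^30, 1 MB = 2^20). That assumption is inferred from the floppy and MRAM figures, which only come out exact that way; the variable names are made up for the sketch, and `Fraction` is used to avoid floating-point rounding.

```python
import math
from fractions import Fraction

total_bytes = Fraction(108, 10) * 2**40      # 10.8 TB, binary units assumed

dvd_bytes = Fraction(104, 10) * 2**30        # 10.4 GB per DVD
cd_bytes = 650 * 2**20                       # 650 MB per CD
floppy_bytes = Fraction(144, 100) * 2**20    # 1.44 MB per floppy
mram_bytes = 256 // 8                        # one 256-bit MRAM chip

dvds = math.ceil(total_bytes / dvd_bytes)    # round up: a partial disc still counts
cds = math.floor(total_bytes / cd_bytes)
floppies = total_bytes / floppy_bytes        # comes out as a whole number
mram_chips = math.floor(total_bytes / mram_bytes)

print(f"DVDs:       {dvds:,}")
print(f"CDs:        {cds:,}")
print(f"Floppies:   {int(floppies):,}")
print(f"MRAM chips: {mram_chips:,}")
```

Under these assumptions the DVD, floppy, and MRAM counts match the post, and the CD count comes out slightly above the rounded 17,400.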
  • by sparrowjk ( 214769 ) on Tuesday February 13, 2001 @02:45AM (#436304)

    Let's see...

    $ nc 80
    HEAD /research/cmrkeele.htm HTTP/1.0

    HTTP/1.1 200 OK
    Date: Tue, 13 Feb 2001 11:39:15 GMT
    Server: Apache/1.3.12
    Last-Modified: Fri, 20 Aug 1999 12:16:30 GMT
    ETag: "239a2-f60-37bd471e"
    Accept-Ranges: bytes
    Content-Length: 3936
    Connection: close
    Content-Type: text/html

    Last modified 20 Aug 1999? Not what I'd call "breaking news"... If you don't believe the server date, try this corroborating evidence: eptember/000002.html []

    Why decided to post the story now, I couldn't say...

    == Sparrow

  • by ideut ( 240078 ) on Monday February 12, 2001 @06:52PM (#436305)
    This is a highly unconvincing attempt at hyping what is in all likelihood a non-existent product.
    The first invention is a method of compressing text stored in binary form, which expresses information as a series of noughts and ones, by comparing each word with its predecessor and recording only the differences between words

    Well that's pretty unremarkable. They've written a compression algorithm.

    Oh, by the way, they have also invented

    "a memory system that enables up to 10.8 terabytes of data to be stored in an area the size of a credit card, with no conventionally moving parts"

    If that were true, why are they bothering to even *think* about their text compression algorithm? Fifty dollars a go? Who wants compression? If these people are telling the truth, we are talking about a thousand-fold increase in gigabytes per dollar over the space of two years.

    The phrase "no conventionally moving parts" also brings to mind images of really whacky, non-linear moving parts flailing about. What the hell do they mean?

    Absolutely no technical detail is given in the article, and as far as I'm concerned, this is yet another false alarm on the long road to entirely solid-state computer systems.

  • by HomerJ ( 11142 ) on Monday February 12, 2001 @06:53PM (#436306)
    Why is it every piece of new tech is the size of a credit card? Can't be the size of a dollar bill? or what about a piece of sliced bread, considering all this new tech is the greatest thing since.

    I just want to know what every tech inventor's obsession is with everything being the size of a credit card. It's not like we are going to fit these in our wallets. "Sure Mr. Tanaka, I have my 20 terabyte database here in my wallet, care to swap?"

    I dunno, I just wish technology came in different sizes I guess.
  • by joshv ( 13017 ) on Monday February 12, 2001 @08:50PM (#436307)
    Man, I am so glad that I read slashdot. Without slashdot I would have to sift through tons and tons of bullshit every day just to find the new and amazing technological advances of the age. But no, I read slashdot, so I can come here and find the best of the best, such as this dandy invention.

    Wow 10.8 TB on a credit card, wahooo! What will they think of next? How do I send them guys my money? I couldn't find any address or nothing, but those english 'blokes' sure look like they is gunna go far with this invention - specially that text compression thingy - pretty damned original if I do say so myself. And then that storage mechanism 'no conventional moving parts' - I can't imagine how they got those conventional parts to stop movin, sound like quite a trick.

    Anyway, don't you slashdot guys let the criticism get you down. I am with you. Don't listen to them naddering nabobs of negativism. They always persecute the dreamers!

    I am looking forward to your next 'Light speed limit possibly violated' post with anticipation.

  • by Coward, Anonymous ( 55185 ) on Monday February 12, 2001 @08:20PM (#436308)
    what about a piece of sliced bread

    The size of bread slices varies widely from region to region, this prevents multinational corporations from referring to their products as the size of a piece of sliced bread. Although ANSI created a sliced bread standard in 1986 and updated their standard in 1992 to account for the coarseness of pumpernickel, this is an American standard which prevents any companies wishing to sell their product outside of the United States from using it and unfortunately the ISO has been dragging their heels on forming a sliced bread standard, so until the day when we get the ISO sliced bread standard you can expect many more credit card sized comparisons.
  • by stu72 ( 96650 ) on Monday February 12, 2001 @08:24PM (#436309)
    All right,

    lynx 4&mode=nested&threshold=-1 > slash.txt

    (no -source option because this is Slashdot, and as we all know too well, the content is much more redundant than repeating html tags, much, much more redundant)

    shelf:~$ ls -l slash.*
    -rw-r--r-- 1 stu users 20394 Feb 12 21:09 slash.bz2
    -rw-r--r-- 1 stu users 23750 Feb 12 21:09 slash.gz
    -rw-r--r-- 1 stu users 93867 Feb 12 21:09 slash.txt

    This gives a ratio of 0.22. Surprisingly, if you feed the same page to bzip2, but at +2, the ratio increases to 0.27, implying that there is more entropy and thus, more information, in higher scoring posts, which of course, we know to be false :)

    Perhaps with this firm mathematical footing, /. can proceed to a new chapter in moderation - moderation by bzip2. Articles which receive high compression ratios are marked down automatically. Of course, this would make it possible to earn a lot of karma, simply by posting random garbage. oh wait..
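For anyone who wants to repeat the experiment without lynx, here is a minimal sketch using Python's standard `bz2` and `zlib` modules. The ratio is compressed size divided by original size, as in the post; the sample text is a made-up stand-in for a page of redundant comments, not the actual thread.

```python
import bz2
import zlib


def size_ratio(data, compress):
    """Compressed size divided by original size (smaller means more redundant)."""
    return len(compress(data)) / len(data)


# Ratios implied by the file sizes in the ls listing above.
posted_bz2_ratio = 20394 / 93867
posted_gz_ratio = 23750 / 93867
print(f"posted bz2 ratio: {posted_bz2_ratio:.2f}")
print(f"posted gz ratio:  {posted_gz_ratio:.2f}")

# Hypothetical stand-in for a very redundant discussion page.
sample = ("First post! " * 200 + "You've been had. " * 100).encode("ascii")
bz2_ratio = size_ratio(sample, bz2.compress)
zlib_ratio = size_ratio(sample, zlib.compress)
print(f"sample bz2 ratio:  {bz2_ratio:.3f}")
print(f"sample zlib ratio: {zlib_ratio:.3f}")
```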

  • by Darkwraith ( 258716 ) on Monday February 12, 2001 @07:10PM (#436310)
    Here's a link from the university. [] It sounds like it's real to me.
