Technology

Turing Award Winner On The Future of Storage 227

weileong writes "Ars Technica highlights an interview at ACM Queue with Jim Gray, a winner of the ACM Turing award (among other things), by one of the pioneers of RAID (among other things). Many issues touched upon, including: "programmers have to start thinking of the disk as a sequential device rather than a random access device." "So disks are not random access any more?" "That's one of the things that more or less everybody is gravitating toward. The idea of a log-structured file system is much more attractive. There are many other architectural changes that we'll have to consider in disks with huge capacity and limited bandwidth." Actual interview has MUCH detail, definitely worth reading."
This discussion has been archived. No new comments can be posted.

  • dupe (Score:5, Informative)

    by Anonymous Coward on Wednesday September 17, 2003 @09:56AM (#6985207)
    dupe dupe dupe [slashdot.org]
    • Damn, timothy, when it says June on the article it just might be a dupe, ya know? But it's nice to know that the future of disk access hasn't changed since then.
    • dupe dupe dupe [slashdot.org]

      We know it's a dupe - but we're simulating stream storage. We can't use that old random access thingy to go back. That would be cheating.

    • by K-Man ( 4117 )
      Well, yes, but, according to Moore's law, we now have 36% more storage to store the dupe.
  • ...does anybody else think this sounds familiar?

    I must have read an article earlier about this same thing, probably by this same guy. Can anybody confirm that?
    • by WIAKywbfatw ( 307557 ) on Wednesday September 17, 2003 @10:02AM (#6985271) Journal
      ...does anybody else think this sounds familiar?

      I must have read an article earlier about this same thing, probably by this same guy. Can anybody confirm that?


      Thanks to my well-developed powers of telepathy, I can tell you that you have read a previous article on the topic by the same author. So I'm happy to confirm that for you.

      I can also tell you, thanks to my equally well-honed powers of clairvoyance, that this post will soon be modded up as funny.

      (Sheesh. And I thought that some recent "Ask Slashdot" questions were dumb.)
  • by caluml ( 551744 ) <slashdot@spamgoe ... minus herbivore> on Wednesday September 17, 2003 @09:57AM (#6985223) Homepage
    "programmers have to start thinking of the disk as a sequential device rather than a random access device."

    I think we'd all be better off when solid state, non-mechanical disks become commonplace.

    Is there any reason other than cost why we can't have 100GB solid-state drives yet?

    • Is cost not a good enough reason for you?

      HDD = a buck a gig, solid-state = 100 bucks a gig.

      Though supposedly magical MRAM will come along and revolutionize the world. OLED screens too. And oh yeah, Duke Nukem Forever.

      • Is cost not a good enough reason for you?

        I think someday cost will be less of an issue than convenience. Think of the state of monitors today: LCD sales are going well, and while they haven't replaced CRTs yet, they're on their way. Apple no longer sells CRTs at all. This is despite the fact that CRTs are cheaper for the same size screen, because LCDs have a significant edge in size, weight and power consumption.

        Flash memory may be around $100/GB right now, but if that drops low enough (say $20/GB), it'll
        • Flash memory may be around $100/GB right now, but if that drops low enough (say $20/GB), it'll be enough to replace HDDs...

          There's no way that people will pay $20/GB for primary storage. The cost of a HDD is around $1/GB and dropping fast. It would be exceedingly difficult to convince people to pay a 100% premium (2x the price) for solid state storage. $20/GB would be a 1900% price premium! Smaller size and lower energy consumption are all very nice and good, but $2000 for a 100GB drive seems a little steep
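The premium arithmetic in the comment above can be checked with a minimal sketch. The prices are the ones quoted in this thread ($1/GB for hard disks, a supposed $20/GB for flash); the function name is made up for illustration.

```python
def price_premium_percent(price_per_gb, baseline_per_gb):
    """Premium of one price over a baseline, as a percentage of the baseline."""
    return (price_per_gb - baseline_per_gb) / baseline_per_gb * 100

HDD_PER_GB = 1.0      # "around $1/GB", as quoted in the thread
FLASH_PER_GB = 20.0   # the supposed future flash price from the parent comment

premium = price_premium_percent(FLASH_PER_GB, HDD_PER_GB)
drive_cost = FLASH_PER_GB * 100   # a 100GB drive at the supposed flash price

print(f"premium: {premium:.0f}%, 100GB drive: ${drive_cost:.0f}")
```

Doubling the price is a 100% premium, so twenty times the price works out to 1900%, which matches the figure in the comment.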
          • But on the bright side, chances of mechanical failure would be greatly reduced, no? :)

            Tho' I still don't think that would be enough to justify it.

            I'm actually afraid to move to the higher Gig'ed drives, I don't back up enough now and larger drives will just let me put it off even longer.

            Having 30 gigs die would be bad. Having 100 snuffed out could kill me.
        • What "convenience in size"? Solid state storage is nowhere near achieving the densities of hard disks. I also doubt they provide an energy consumption advantage, especially if you use DRAM based solid state storage solutions (which, btw, are usually delivered in the form of large rack mounted boxes full of DIMMs with a tiny little corner occupied by a hard disk used to dump the data to if the box switches to UPS or for some other reason may soon lose power).

          Sure, they may eventually catch up, but the attract

      • That's just not fair! I'm positive OLED screens and MRAM will actually see the light of day. Comparing them to DNF is just plain mean.
    • by grub ( 11606 ) <slashdot@grub.net> on Wednesday September 17, 2003 @09:59AM (#6985247) Homepage Journal

      I think we'd all be better off when solid state, non-mechanical disks become commonplace.

      A company named SolidData [soliddata.com] sells solid state "drives".
    • To answer your question directly, cost is probably the biggest reason. But there's more...

      "programmers have to start thinking of the disk as a sequential device rather than a random access device."

      "I think we'd all be better off when solid state, non-mechanical disks become commonplace."

      Now that you mention it, I don't think so. In reality, there's a lot of time spent dissing the latency (and even bandwidth) of DRAM (any flavor). That's why caching is being elevated to a fine art form. That's why Intel is i
    • "programmers have to start thinking of the disk as a sequential device rather than a random access device." I think we'd all be better off when solid state, non-mechanical disks become commonplace.

      I believe that sometime in the future, we'll look back on our spinning disks and chuckle. I think we will eventually get to near-infinite storage, and sequential will be the way to go. There won't be any erasing necessary, you will just write to the next available space, move the pointer to it, and move on.
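The "write to the next available space, move the pointer, and move on" idea can be sketched as a toy append-only store. This is only an illustration of the comment's description, not the design from the interview or from any real log-structured file system; the class and method names are hypothetical.

```python
class AppendOnlyLog:
    """Toy log-structured store: writes only ever go to the end of the log."""

    def __init__(self):
        self._log = []     # sequential storage; existing entries are never overwritten
        self._index = {}   # key -> position of the newest entry for that key

    def put(self, key, value):
        # Write to the next available slot, then move the pointer to it.
        self._log.append((key, value))
        self._index[key] = len(self._log) - 1

    def get(self, key):
        # Follow the pointer to the most recent version of the key.
        return self._log[self._index[key]][1]

    def __len__(self):
        return len(self._log)


store = AppendOnlyLog()
store.put("report.txt", "draft 1")
store.put("report.txt", "draft 2")   # the old version stays in the log, unreferenced
print(store.get("report.txt"), len(store))
```

Old versions are never erased here, which only works if storage really is "near-infinite"; with finite disks something has to reclaim the dead entries.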

      • 30 years ago did you ever think a personal computer owner could say ".5 TB is not enough"? As long as it's physical, it's limited. And as long as it's limited some of us will reach those limits.
        • 30 years ago did you ever think a personal computer owner could say ".5 TB is not enough"?

          30 years ago there weren't any personal computers. :-)

          As long as it's physical, it's limited. And as long as it's limited some of us will reach those limits.

          Hence, my use of the term "near-infinite", and by that I mean "infinite for most uses". If storage goes to the organic molecular level, you would be talking about near-infinite storage, depending on the size of the media. But if we get to the point where w

  • by Do not eat ( 594588 ) on Wednesday September 17, 2003 @10:00AM (#6985253)
    This week: You can make a trade-off between latency and throughput!
    Next week: Cars that can haul less can be more fuel-efficient!
    The week after: Algorithms that use more memory, but are faster to execute!
  • Huge disks (Score:5, Insightful)

    by heironymouscoward ( 683461 ) <heironymouscowar ... .com minus punct> on Wednesday September 17, 2003 @10:00AM (#6985254) Journal
    If I look at the trends of the last decades, while disk sizes increase exponentially, the actual number of top-level objects I store on my systems increases only linearly, and quite slowly. True, I still store individual documents, but I also store AVIs, ISOs, entire photo albums that take gigabytes each.

    It's still random access: I can choose and access an object, even individual photos, without scanning through large amounts of unwanted data.
  • by Ratface ( 21117 ) on Wednesday September 17, 2003 @10:02AM (#6985266) Homepage Journal
    I love his comments about mailing disks to Europe and Asia...

    The biggest problem I have mailing disks is customs. If you mail a disk to Europe or Asia, you have to pay customs, which about doubles the shipping cost and introduces delays.

    Thereby adding a corollary to the old adage "Never underestimate the bandwidth of a vanload of tapes barrelling down the highway"...

    "Never underestimate the bottleneck caused by a far-Eastern customs inspector." .-D
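To put rough numbers on the adage and its corollary, here is a minimal sketch of the effective bandwidth of shipping disks. The payload size and transit times are hypothetical values chosen for illustration; they do not come from the interview.

```python
def shipping_bandwidth_mbps(payload_bytes, transit_hours):
    """Effective bandwidth, in megabits per second, of physically shipping data."""
    seconds = transit_hours * 3600
    return payload_bytes * 8 / seconds / 1_000_000

# Hypothetical numbers: a 1 TB box of disks delivered in 24 hours,
# versus the same box held at customs for another 24 hours.
ONE_TERABYTE = 10**12
overnight = shipping_bandwidth_mbps(ONE_TERABYTE, 24)
with_customs = shipping_bandwidth_mbps(ONE_TERABYTE, 48)

print(f"overnight: {overnight:.1f} Mbps, with customs delay: {with_customs:.1f} Mbps")
```

Under those assumptions the overnight box sustains roughly 93 Mbps, and every extra day in customs cuts the figure proportionally.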
    • I've seen this a couple times before, but Google seems to come up with nothing useful for it. It doesn't help that every crappy musician who has made a tape sells it out of their crappy van or that so many scientists have the old Prussian "van der something" in their names. But perhaps it's crappy musicians and these van der scientists who really control the highspeed data transfer.
      • ["Never underestimate the bandwidth of a vanload of tapes barrelling down the highway"]

        I've seen this a couple times before, but Google seems to come up with nothing useful for it.

        That's because the original is "station wagon" (or "stationwagon"). Another common variant is "a 747 full of...". See e.g. this story [bpfh.net]

        And no, it's certainly not Tanenbaum 1996; it was (IIRC) mentioned in Bentley's "Programming Pearls" CACM column/book in the 1980s. It's unclear that anything original can be attributed t

        • Certainly 1980s, probably circa 1983 or 1984 at the latest. I came up with the phrase (which may well have been independently coined before me, at the time I was unaware of it) when we were setting up NETNORTH, the Canadian counterpart to BITNET (networks of typically college campus mainframes, not directly part of ARPANET). There was discussion about setting up the HQ at University of Guelph (where I worked at the time - west of Toronto) or Waterloo University.

          The highway in question (as in station wago
  • by Anonymous Coward on Wednesday September 17, 2003 @10:06AM (#6985304)
    Check out Jim Gray's info page on Microsoft Research [microsoft.com]. He's done research on many diverse and interesting technologies such as distributed computing and sequential I/O performance. There are some nifty sites he has taken part in creating, such as a browsable photo of Earth [terraservice.net], and a map of the Universe [ctrl-c.liu.se].
    • ...Gray, head of Microsoft's Bay Area Research Center...

      And here he is singing the praises of open source software, MySQL, Linux, PostgreSQL, Oracle, IBM etc! He'll most likely be getting a visit from Ballmer in person, I think. Obviously the brainwashing didn't work on this guy.
  • Network speed (Score:4, Interesting)

    by CausticWindow ( 632215 ) on Wednesday September 17, 2003 @10:07AM (#6985311)

    they are part of Internet 2, Virtual Business Networks (VBNs), and the Next Generation Internet (NGI). Even so, it takes them a long time to copy a gigabyte. Copy a terabyte? It takes them a very, very long time across the networks they have

    Is this really true? Wasn't there a recent Slashdot story where researchers transferred a gigabyte of data, in fourteen seconds or so, on Internet 2 from California to the Netherlands?

    I suppose that disk access times will be the limiting factor at both ends if you were to read and write the data from/to a disk.

    • Tweaked (Score:5, Informative)

      by Flamesplash ( 469287 ) on Wednesday September 17, 2003 @10:16AM (#6985381) Homepage Journal
      My prof talked about this in my networking class. Apparently they tweaked the hell out of the data link layer to do this, so it was not a generic data transfer at all.
    • Re:Network speed (Score:4, Informative)

      by CausticWindow ( 632215 ) on Wednesday September 17, 2003 @10:17AM (#6985393)

      Couldn't find the article with the Slashdot search, but Google produced it. Here it is. [slashdot.org]

      The real numbers were 8,609 Mbps, which translates roughly into a DVD transferred every five seconds. Btw., it was Switzerland, not the Netherlands.

      Also, I don't understand the part where he mentions bandwidth costs of $1 per gigabyte. Maybe you have to pay that much on the Internet 2, but my DSL cost is somewhere in the region of $0.05 per gigabyte, I figure. Maybe I'm just spoilt.
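The "DVD every five seconds" estimate can be sanity-checked with a short sketch. The 8,609 Mbps rate is the figure quoted above; the 4.7 GB single-layer DVD size and the decimal gigabyte and terabyte are assumptions made for the calculation.

```python
RATE_MBPS = 8609  # record transfer rate quoted in the comment above

def transfer_seconds(size_bytes, rate_mbps=RATE_MBPS):
    """Seconds needed to move size_bytes at rate_mbps (megabits per second)."""
    return size_bytes * 8 / (rate_mbps * 1_000_000)

DVD_BYTES = 4.7e9    # assumption: single-layer DVD, decimal gigabytes
GIGABYTE = 1e9       # assumption: decimal units
TERABYTE = 1e12

dvd_s = transfer_seconds(DVD_BYTES)
gb_s = transfer_seconds(GIGABYTE)
tb_min = transfer_seconds(TERABYTE) / 60

print(f"DVD: {dvd_s:.1f} s, 1 GB: {gb_s:.2f} s, 1 TB: {tb_min:.1f} min")
```

At that rate a DVD takes a little over four seconds and a terabyte about a quarter of an hour, ignoring disk speed at either end.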

      • The thing is that while the internet2 was supposed to be a research tool, it is actually being used for high bandwidth data transfers from academic units. This happens so much that it actually hinders networking research to an extent.
      • I don't quite understand network costs either. I work at a Fortune 100 company, and supposedly they pay 4 cents per megabyte that goes to/from the internet, or $41 a gig. Certainly there are firewall / antivirus / constantly-on-call-net-admin costs included in there, but I've always been puzzled at the difference between my cable modem costs and my workplace's costs.

        Now certainly, the broadband companies don't expect you to be downloading at maximum speed constantly, and if you were you'd be in the top

        • $41 !?!?!?!?!?

          You can easily buy bandwidth at under $1 per gigabyte when buying bandwidth per gigabyte transferred to a colo. If you need to lease physical lines to your office, the cost may end up a bit higher. $41 per GB sounds like someone has been smoking crack.

          The reason there's a gap for your DSL, though, is exactly as you mention - that most users only utilize a fraction, so you're only paying for the average amount transferred per user plus some contingency.

          • Like I said, being a rather large company, we have a large restrictive firewall (only telnet and http/s go out, and even then only with a password) and spam/virus detection and such, and we need 100% uptime connectivity to our other sites around the world. Even a single ethernet drop at a desk costs each department $30 / month. That has to be administrative overhead, so I'm pretty sure that's where the $41 / gig comes from. Though a 41 times increase seems a bit much, eh?
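The per-gigabyte figures traded in this sub-thread follow from simple unit conversion, sketched below. The 4 cents per megabyte, $1 per gigabyte, and $0.05 per gigabyte prices are the ones quoted by the posters; treating a gigabyte as 1024 megabytes is an assumption that happens to reproduce the "$41 a gig" figure.

```python
MB_PER_GB = 1024  # assumption: binary gigabyte, which reproduces "$41 a gig"

def cost_per_gb(cost_per_mb, mb_per_gb=MB_PER_GB):
    """Convert a per-megabyte price into a per-gigabyte price."""
    return cost_per_mb * mb_per_gb

corporate = cost_per_gb(0.04)   # "4 cents per megabyte"
colo = 1.00                     # "sub $1 per gigabyte" colo bandwidth
dsl = 0.05                      # the DSL estimate given earlier in the thread

print(f"corporate: ${corporate:.2f}/GB, "
      f"{corporate / colo:.0f}x colo, {corporate / dsl:.0f}x DSL")
```

With decimal gigabytes the corporate figure would be $40 rather than $41, so the "41 times" gap is the same order of magnitude either way.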
      • I've seen the presentation by one of the people involved.

        They basically got alpha versions of 10Gig ether cards from Intel, and got the ISP to give them a light path directly between two points. I think they got one of the underwater cables entirely to themselves for a few hours or something like that. The cost for that was what makes it so expensive.

        Yes, they did tweak the heck out of that software (by the way, they did use Linux - don't remember which distro).
  • Ouch... (Score:5, Insightful)

    by cybermace5 ( 446439 ) <g.ryan@macetech.com> on Wednesday September 17, 2003 @10:09AM (#6985329) Homepage Journal
    Frankly the interview was painful every time Dave Patterson said something. How many times does he have to ask questions about the concept of mailing a computer? "We mail computers because transferring over the Internet is too slow for these massive data transfers." "Are they computers?" "Yes." "Do you mail them?" "Yes." "It's like a movie." "Uhh ok." "Is it a whole computer that you mail?" "Yes, it is a computer full of hard drives." "Why don't you just use the Internet?" "Because it is too slow."
    • Or:

      I hadn't thought about it the way you explained it. It isn't that the access times have been improving too slowly; it's that the capacity has been improving too quickly.

      He seems to be suggesting that rather than try to make access quicker, we should stop making hard drives bigger. ?!?!

  • pr0n (Score:5, Funny)

    by leomekenkamp ( 566309 ) on Wednesday September 17, 2003 @10:10AM (#6985336)
    We have a dozen doing TeraServer work; we have about eight in our lab for video archives, backups, and so on.

    That's a good excuse to use on my wife: "No honey, those are my ..., uhhm..., video archives."
  • by m1kesm1th ( 305697 ) on Wednesday September 17, 2003 @10:15AM (#6985372)
    Does that mean he managed to convince someone he was a computer?
    • by jc42 ( 318812 ) on Wednesday September 17, 2003 @12:02PM (#6986279) Homepage Journal
      Does that mean he managed to convince someone he was a computer?

      My wife likes to tell people that her first job, back in the late 70's, was with a Civil Engineering firm in New York, where her job title was "Computer". She did the calculations (and error checking ;-) for their engineering drawings. She used machines to do this, of course, but those machines were called "calculators".

      They've since changed the job title.

      Funny how quickly such terminology can change.

  • 2 quotes... (Score:3, Interesting)

    by leomekenkamp ( 566309 ) on Wednesday September 17, 2003 @10:18AM (#6985398)
    Two quotes from the article (emphasis mine):

    Gray, head of Microsoft's Bay Area Research Center, sits down with Queue and tells us (...)

    JG: If it is business as usual, then a petabyte store needs 1,000 storage admins. Our chore is to figure out how to waste storage space to save administration.

    MS bashers will have a field day on this one...

    • JG: If it is business as usual, then a petabyte store needs 1,000 storage admins.

      Tell that to the high energy physics community; they use petabyte size stores as local caches.
  • by panurge ( 573432 ) on Wednesday September 17, 2003 @10:19AM (#6985407)
    ...semi-seriously. Look at all the stuff about MySQL and Linux in the middle. It's as if a Microsoft Marketoid had suddenly taken over the interview. Or someone who didn't understand the difference between many thousands of developers working on Linux and the smaller number that work on MySQL.

    Apart from speculating as to whether this attempt at FUD was the real payload of the article, did it really say anything that most of us haven't already noticed? Whether Flash or fast SCSI, we could do with an intermediate layer of backing store, with faster random access than current IDE HDDs. And we are fast heading for removable IDE drives to be a better and cheaper tape replacement. And the Internet has limited bandwidth. I'm sorry, but you don't need a Turing prize to work any of that out.

    • Defending Jim Gray (Score:4, Insightful)

      by chrisd ( 1457 ) * <chrisd@dibona.com> on Wednesday September 17, 2003 @10:48AM (#6985628) Homepage
      I didn't really read that as FUD or even invalid criticism of MySQL. Maybe I'm biased because of my previous work with Queue and since I have met Jim, but if you get the impression that Jim doesn't like MySQL (which I did not) then I would actually assume it is because he felt that way, not because of Microsoft. Jim is one of those guys that will never be looking for a job; his early work on databases was pivotal to the development of transactions and his work on fault tolerant systems is legendary. He really is beyond reproach.

      Chrisd

    • I'm going to confess that I have probably misunderstood the point. The precise bit of the article I was referring to was:

      The challenge is similar to the challenge we see in the OS space. My buddies are being killed by supporting all the Linux variants. It is hard to build a product on top of Linux because every other user compiles his own kernel and there are many different species. The main hope for Oracle, DB2, and SQLserver is that the open-source community will continue to fragment. Human nature bein

    • Personally, I find it to be rather telling evidence that a small team is more efficient than a large mob. With a small team of people, decisions can be made much more rapidly - they may later turn out to be wrong decisions, but it takes less time to disseminate information and corrections to the other team members.

      As for FUD about MySQL, I don't see it in the article. MySQL is lacking some features that keep it from competing in the same spaces as Oracle, but that's a decision on the part of the MySQL team
    • Look at all the stuff about MySQL and Linux in the middle. It's as if a Microsoft Marketoid had suddenly taken over the interview. Or someone who didn't understand the difference between many thousands of developers working on Linux and the smaller number that work on MySQL.

      His point was simply that DB2 and Oracle, and, to a lesser extent, SQL Server, are mature database products, and MySQL is just a baby. "Real" databases are optimized like crazy for each OS version/CPU combination. MySQL isn't there y
    • > Look at all the stuff about MySQL and Linux in the middle. It's as if a Microsoft Marketoid had suddenly taken over the interview. Or someone who didn't understand the difference between many thousands of developers working on Linux and the smaller number that work on MySQL.

      He's correct as far as he goes.

      MySQL and MS SQL Server actually have the same problem, and it is called SQL; both even go downhill from there.

      SQL is simply too complex to implement properly, and it only gets worse when you sta

  • LSFS (Score:4, Informative)

    by smd4985 ( 203677 ) on Wednesday September 17, 2003 @10:21AM (#6985420) Homepage
    For more info on (very-cool) Log-Structured File Systems, check out Mendel's original paper at:

    http://citeseer.nj.nec.com/rosenblum91design.html
    • Same paper, directly in PostScript format. [stanford.edu]

      Really good idea. The canonical criticisms, as described by OS teachers I had (hint: one of them WAS Mendel...):

      • Unnecessary - Unix FFS improved (a few years after LSFS came out) by adding clusters and cylinder clusters, reaching almost the same performance.
      • CPU-intensive. Requires a background daemon to reclaim disk space (~10% of disk access was this daemon, IIRC). Being Slashdotters who hate even the CPU cycles Winmodems consume...
      • Poor performance in commo
  • So I guess the disk algorithms from Knuth's TAOCP are still useful after all those years?

  • by master_p ( 608214 ) on Wednesday September 17, 2003 @10:33AM (#6985503)

    One final thing that is even more speculative is what my co-workers at Microsoft are doing. They are replacing the file system with an object store, and using schematized storage to organize information. Gordon Bell calls that project MyLifeBits. It is speculative--a shot at implementing Vannevar Bush's memex [http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm]. If they pull it off, it will be a revolution in the way we use storage

    I've talked about it before [slashdot.org]. This guy thinks what Microsoft is doing is revolutionary. Come on, all you people, can't you see the problem with today's file systems? The problem is that the type information is lost!!! We need objects, and we need type information to be stored along with those objects!!! This is the only way lots of problems will go away.

  • MRAM saves the day (Score:4, Interesting)

    by Markus Registrada ( 642224 ) on Wednesday September 17, 2003 @10:39AM (#6985560)
    All the tradeoffs will change radically when MRAM hits the streets. It's potentially denser than disk and DRAM, as fast as static RAM, nonvolatile, doesn't use power when it's not used, and can be made on regular silicon process machinery. Expect it first in cell phones next year, and then everywhere.

    This doesn't just affect file storage and virtual memory. It also changes the economics of cache and main memory, and makes deployment of 64-bit CPUs more urgent. It also makes system crashes much less tolerable, because turning the computer off and on doesn't involve long shutdown and boot procedures any more.

    • All the tradeoffs will change radically when MRAM hits the streets. It's potentially denser than disk and DRAM, as fast as static RAM, ...

      Yup. And Duke Nukem Forever will eat Half-Life 2's panties.

  • by abulafia ( 7826 ) on Wednesday September 17, 2003 @10:47AM (#6985617)
    JG Twenty-megabyte disks were considered giant. I believe that the first time I asked anybody, about 1970, disk storage rented for a dollar per megabyte a month. IBM leased rather than sold storage at the time. Each disk was the size of a washing machine and cost around $20,000.

    So, one could rent a $20K device for $240/year? Those must have been the days...

    That can't be right.
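The objection above can be worked through with a quick sketch. It uses only the figures from the quoted passage ($1 per megabyte per month, a 20-megabyte disk, a $20,000 unit) and assumes, as the commenter does, that the "giant" 20 MB disk is the same device as the $20,000 one.

```python
RENT_PER_MB_MONTH = 1.0   # "a dollar per megabyte a month"
DISK_MB = 20              # "Twenty-megabyte disks were considered giant"
DISK_PRICE = 20_000       # "cost around $20,000"

# Assumption: the 20 MB disk and the $20,000 device are the same unit.
annual_rent = RENT_PER_MB_MONTH * DISK_MB * 12
years_to_recoup = DISK_PRICE / annual_rent

print(f"annual rent: ${annual_rent:.0f}, years to recoup price: {years_to_recoup:.1f}")
```

Taken together the numbers give $240 a year in rent and more than 80 years to recover the purchase price, so at least one of the remembered figures is probably off or refers to a different device.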

    • Interesting Idea... (Score:3, Interesting)

      by polyp2000 ( 444682 )
      An interesting thought popped up when I read your post:
      there is a current trend towards cramming as much storage as possible into something the size of a 3in hard drive.

      I wonder why they don't make larger hard drives in the physical sense? A hard drive the size of a washing machine using today's technology would store a phenomenal amount of stuff, but what about something more reasonable, like a hard drive merely twice the physical size of today's? How much more storage could you get just by scaling up the platters? anyone here g
      • Imagine all the energy stored by huge disks turning at 10k rpm.... Imagine if one of those disks disintegrated.
  • by heironymouscoward ( 683461 ) <heironymouscowar ... .com minus punct> on Wednesday September 17, 2003 @10:50AM (#6985646) Journal
    Take this choice quote from the article:

    My buddies are being killed by supporting all the Linux variants. It is hard to build a product on top of Linux because every other user compiles his own kernel and there are many different species.

    Ain't it sweet? I count five lies:

    (1) people being killed by supporting (gasp) operating systems... gosh, horror and violence, not nice at all!

    (2) all the Linux "variants", are in fact pretty much one standard, LSB, with several skins

    (3) "hard to build a product on top of Linux", rather than, hmmm, Windows? Linux is incredibly easy to build for. I suspect the fact that it's very standard helps.

    (4) "every other user compiles his kernel"... maybe at Microsoft. I suspect less than 1 in 20 Linux users ever compiled a kernel.

    (5) compiling a kernel means you can't support it... WTF? The kernel is incredibly stable, since most changes are in external modules. And I can't remember a single case where a kernel change broke one of my apps.

    (6) (sorry, I was not counting well), "many different species"... well, AFAICS the only difference between the Linux distributions is that they have different packaging methods, different timelines as to their versions, and different UI tools for hardware detection, configuration, etc. Nothing at all that makes life hard.

    Look: I just installed Xandros, which is Debian with a nice face. On two different types of machine, and it installed without asking a single question about my hardware except whether the mouse was left or right-handed. Check my journal...

    Windows never worked this nicely. Where is the support issue?

    In the writing industry we call this "to condemn with faint praise".

    Yeah, Windows kinda works, I mean, it'll run Office without crashing too often, but it's just killing my buddies to have to maintain Win2K, WinXP, and even some older Win98 machines, not to mention we have a whole cupboard simply filled with driver CDs for every PC we have.
    • From the Devil's Dictionary:

      FUD: The sound made by someone attempting to wish away inconvenient facts.

      http://www.eod.com/devil/archive/fud.html
    • by brunes69 ( 86786 ) <[slashdot] [at] [keirstead.org]> on Wednesday September 17, 2003 @11:56AM (#6986218)
      His basic idea is 100% correct, but the reason is all wrong. It *IS* much harder to develop an app for Linux's myriad flavours, not because of the kernel, but because every distro has its own versions of libraries. I work for a company that makes Linux software, and we only support RedHat, and even then only certain versions of RedHat. While our product would probably compile against any number of distros, and even the BSDs, we just don't have the time and manpower required to build, test, debug, package, and maintain 15 different releases for every sub-release or patchlevel we have in the product. With Windows products, at least, (unless you are doing some lower-level stuff) if you build something you can be reasonably assured it will run on Windows 2000, or Windows XP, or Windows 2003. Not the same if you build something with RedHat 9 and try to run it on Debian or SuSE, etc. And before you go on about "release a source package", not all companies release everything GPL; some want to keep their IP theirs, since they like to put some money on the table at night. It's definitely not FUD to say it is much more effort to develop and release cross-platform binaries on Linux than on Windows.
      • First, DLL hell on Windows, shared libraries on Linux, same headaches. Consider static linking: larger binaries but fewer headaches.

        Second, it is trivial and cheap to build packages for RedHat, Debian, and SuSE as you need them, we do this automatically. See, when the OS is free, it costs you little to set-up development systems. If you're tight for hardware, use UML.

        Third, there are serious arguments against delivering binary-only packages, and in favour of building from source, and these arguments ar
      • What about statically linked releases? Joe
      • If they don't want to play the game and instead hoard IP, they can get left behind with the rest of the dinosaurs. No sympathy.
    • Well, I just responded to another troll, so I might as well respond to you too. If you knew anything about enterprise class databases, you'd know that they have many, many, many years of maturity behind them that MySQL doesn't have, and because of that, are optimized per each OS version & CPU combination that they're run on. MySQL is just a generic, primitive DB now, that hasn't been tweaked out, because nobody's going to buy a refrigerator-sized Sun array to run MySQL on. You do that with Oracle, and
      • Oh, please, keep those coming.

        I just love your sense of humour. I remember when we switched an ISAM application to Oracle in the mid 1990's, on a Unix box. A single record access by primary key was 20,000 times faster with the ISAM system than under Oracle.

        I retested this with later versions of Oracle and found that the performance was worse, not better.

        Now, I have a nice server under a desk here, and we reloaded an Oracle 9 database on it, it took something like 8 hours to rebuild. Since we make port
        • Oh, please, keep those (trolls) coming.

          Because, of course, performance is the sole indicator of a product's worth. Using that reasoning, everyone should be driving an Indy car instead of a station wagon.

          Sure, the Volvo ($enterprise_DBMS) may not be as quick off the line as a Ferrari (MySQL) but I'd rather be in the Volvo when something goes wrong.
          • No, of course performance is not the only indicator of a product's worth.

            Let me list my criteria for, e.g. a database product:

            1. accuracy
            2. performance
            3. ease of administration
            4. ease of installation
            5. price

            Not in any specific order. I've used Oracle databases for about 12 years, and on every single one of these counts, MySQL wins. Every single one, without exception.

            Oracle wins on a number of other criteria:

            1. profitability
            2. complexity
            3. need for expensive DBAs
            4. consumption of excess time
            5. image
            6
            • Wow, you've been on Oracle since version 6? What a beast that was!

              You have not used Oracle's database (unless you write low-level driver calls to the actual data itself) - you use Oracle's DBMS product.

              At any rate, I find it difficult to believe that you can say that MySQL is more 'accurate' than Oracle (and by extension PostgreSQL, or MS SQL Server, or Sybase ASE or ...).

              The constraint handling is poor at best (you can only have very minimal constraints). You have no such thing as triggers or views. The datatype
    • This is a real issue, not FUD, especially for drivers, which RAID type people would have to deal with. Basically every distro I've used has modified the kernel, and with the checksum name mangling in the 2.4 kernel, this means that a kernel module (i.e. driver) compiled for a RedHat kernel will not work for a Mandrake kernel. In fact the Mandrake 9.0 and 9.1 kernels are incompatible in this respect! The only solution is to provide dozens of binaries, or go with some elaborate scheme like NVIDIA does and c
  • by polyp2000 ( 444682 ) on Wednesday September 17, 2003 @10:51AM (#6985653) Homepage Journal
    Anyone know what happened to that bloke at Keele who invented a way of cramming 3 terabytes onto a credit card? Apparently it would have cost about 35 pounds to manufacture. This was a couple of years ago, so why hasn't it happened yet?

    Surely something like this is the real future of storage ?

    Terabyte on a credit card [cmruk.com]

  • by computerlady ( 707043 ) on Wednesday September 17, 2003 @10:52AM (#6985660) Journal

    "Sneaker net" was when you used your sneakers to transport data?

    Oh my. How old I feel when someone has to ask what "sneaker net" was. And someone has to answer...

  • AMAZING!!! (Score:5, Funny)

    by X86Daddy ( 446356 ) on Wednesday September 17, 2003 @10:54AM (#6985687) Journal
    This is a *MAJOR* breakthrough! Most Turing Test contestants don't even win, but this one can eloquently discuss topics and give complex answers, rather than just turning back the question, Eliza-style.

    Can we download a copy of this "Jim Gray" yet?
  • >programmers have to start thinking of the disk as a sequential device rather than a random access device

    This is already partially true for classic UNIX userspace behavior. You pipe the data from the input file(s) through a filter and generate the output, sequentially.

    A completely different model from the FS drivers or a SQL database.
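
    A rough sketch of that pipe-and-filter style, written with Python generators rather than shell pipes. The function names (read_lines, grep, upper, run_pipeline) are made up for illustration. The point is only that every stage reads its input front to back and never seeks:

```python
import io
from typing import Iterable, Iterator


def read_lines(stream: io.TextIOBase) -> Iterator[str]:
    """Yield lines one at a time, front to back, without ever seeking."""
    for line in stream:
        yield line.rstrip("\n")


def grep(lines: Iterable[str], needle: str) -> Iterator[str]:
    """Pass through only the lines that contain `needle`."""
    return (line for line in lines if needle in line)


def upper(lines: Iterable[str]) -> Iterator[str]:
    """Transform each line as it flows past."""
    return (line.upper() for line in lines)


def run_pipeline(source: io.TextIOBase, sink: io.TextIOBase, needle: str) -> int:
    """Stream source -> grep -> upper -> sink and return the lines written."""
    count = 0
    for line in upper(grep(read_lines(source), needle)):
        sink.write(line + "\n")
        count += 1
    return count
```

    Because the stages are lazy, only one line is in flight at a time, which is the access pattern a sequential device likes.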
  • New File System (Score:3, Interesting)

    by Archangel Michael ( 180766 ) on Wednesday September 17, 2003 @10:56AM (#6985710) Journal
    What current file systems need is metadata in them. That is, the file system itself should store the metadata about the file. Think about the Mac file system, with the metadata contained in the file itself, as the "resource fork". Now imagine a systemized, extensible meta file system that organized files by what the metadata said about them.

    Imagine, media files stored in such a way that both random and sequential access was optimized, where the file structure was automagically defragmented and organized behind the scenes.

    Imagine a computer that watched what files were used at bootup, and organized them so that the hard drive streamed the bootup data sequentially, straight into memory.

    Imagine being able to start PRELOADING applications before you even finish the second of your double clicks on the datafile.

    Imagine Database files that were automagically indexed as part of the file system.

    Imagine Security and encryption being built into the filesystem beyond today's capabilities, where the security and encryption does not rely upon a master controller or centralized security policies, but rather has the ability to follow the file, seamlessly.

    I am sure that I haven't even begun to tap the possibilities.
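
    To make the "organized by what the metadata said" part concrete, here is a toy in-memory sketch. MetadataIndex and its methods are hypothetical names, not any real file system's API, and it assumes metadata values are hashable. A real implementation would live inside the file system and persist the index on disk:

```python
from collections import defaultdict


class MetadataIndex:
    """Toy inverted index from (key, value) metadata pairs to file paths."""

    def __init__(self):
        self._by_tag = defaultdict(set)  # (key, value) -> set of paths
        self._tags_of = {}               # path -> metadata dict

    def add(self, path, **metadata):
        """Register a file, replacing any metadata stored for it before."""
        self.remove(path)
        self._tags_of[path] = dict(metadata)
        for item in metadata.items():
            self._by_tag[item].add(path)

    def remove(self, path):
        """Forget a file; unknown paths are ignored."""
        for item in self._tags_of.pop(path, {}).items():
            self._by_tag[item].discard(path)

    def find(self, **metadata):
        """Return sorted paths whose metadata matches every key=value given."""
        if not metadata:
            return sorted(self._tags_of)
        sets = [self._by_tag.get(item, set()) for item in metadata.items()]
        return sorted(set.intersection(*sets))
```

    With something like this, "all photos from 2003" becomes find(kind="photo", year=2003) instead of a walk through directories.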
  • IDE replaces DVD (Score:5, Interesting)

    by G4from128k ( 686170 ) on Wednesday September 17, 2003 @11:02AM (#6985761)
    With an ever growing collection of digital photos, I've come to the same conclusion as Jim Gray. Hard disks are superior for backups.

    I currently have about 100 GB of images and it takes more than 20 4.7 GB DVD-R discs to create a full backup. Although DVD media is still slightly cheaper than new large capacity IDE drives, the added time and hassle factor of burning 20 discs far outweighs any minor cost savings. Moreover a 3.5" drive in a padded anti-static bag takes up less room in the safe deposit box than 20 DVDs (especially if you have the DVDs in protective jewel cases). And if HD-based backup lets me avoid some future artists' tax on burnable media, so much the better.

    A Firewire enclosure and a rotating collection of IDE drives is the way to go.
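
    The disc count above is easy to check. A small sketch, assuming the 100 GB and the 4.7 GB nominal disc capacity are measured in the same units and that files pack onto discs with no wasted space (discs_needed is just an illustrative helper):

```python
import math


def discs_needed(data_gb: float, disc_gb: float = 4.7) -> int:
    """Smallest whole number of discs that can hold data_gb of data."""
    return math.ceil(data_gb / disc_gb)


print(discs_needed(100))  # 100 / 4.7 is about 21.3, so 22 discs
```

    In practice files do not split cleanly across discs, so the real number can only be higher.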
    • I came to that conclusion 3 years ago. What's taking you guys so long?

      CD and DVD media is great as a replacement for the floppy disk. But hard drives have been the only affordable transportable mass-storage media for years. DVD has never been an option for me, but I suspect it will replace CDRW media within a couple years.

      $1/GB or less since 2001.
  • Does that mean Jim Gray proved to someone over a computer terminal that he was human?

  • What Gray is (mostly) talking about is what we used to call the "Roadmaster Scenario." When I worked for [a major electronics company], we had a data center in Dallas and a redundant site about 30 miles away in Lewisville. Every Sunday the entire IMS database was archived to mag tape and shipped to the other data center for a second level of redundancy. This raised the question, why not just copy them over the T1 lines (this was 1980) to the other site's tape drives directly? The answer, of course, was that
  • by leandrod ( 17766 ) <{gro.sartud} {ta} {l}> on Wednesday September 17, 2003 @12:01PM (#6986260) Homepage Journal

    > To some extent you can think of Codd's relational algebra as an algebra of punched cards. Every card is a record. Every machine is an operator.

    Interesting how the guy literally wrote the book on transactions, yet grossly misrepresents Codd's work, which BTW wasn't simply the relational algebra, but even higher level: the relational model of database management, including the relational calculus.

    While the algebra is somewhat procedural, the calculus is set-oriented, and they are fully equivalent. The idea is exactly not looking at records and operators, but describe what you want -- just leave the relational system set the procedures to get that in the most efficient way it can.

    Incidentally this has a big impact on all Gray is discussing -- without a fairly simple and powerful data model, so much data is basically a waste. He's thinking too low level, including the object stuff he touts, but we will only find use for so much data the day we get proper relational implementations, and this excludes SQL in general and MySQL in particular.
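
    A small sketch of the algebra/calculus contrast, using Python sets as stand-in relations. The Employee relation, its rows, and the helper names are invented for illustration, and a set comprehension only loosely resembles the relational calculus, but it shows the same result reached step by step with operators and stated in one go as a description of the wanted set:

```python
from collections import namedtuple

Employee = namedtuple("Employee", ["name", "dept", "salary"])

# Hypothetical sample relation: a set of tuples, so no duplicates and no order.
EMPLOYEES = {
    Employee("Ann", "db", 120),
    Employee("Bob", "db", 90),
    Employee("Cid", "os", 130),
}


def select(relation, predicate):
    """Algebra operator: keep the tuples that satisfy the predicate."""
    return {row for row in relation if predicate(row)}


def project(relation, *attrs):
    """Algebra operator: keep only the named attributes, dropping duplicates."""
    return {tuple(getattr(row, a) for a in attrs) for row in relation}


def algebra_query(employees):
    """Procedural style: apply one operator after another."""
    in_db = select(employees, lambda e: e.dept == "db")
    well_paid = select(in_db, lambda e: e.salary > 100)
    return project(well_paid, "name")


def calculus_query(employees):
    """Declarative style: describe the set of names wanted."""
    return {(e.name,) for e in employees if e.dept == "db" and e.salary > 100}
```

    Both queries return the same set. In a relational system the choice of which operators to run, and in what order, would be the optimizer's job rather than the user's.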

    • He isn't grossly misrepresenting Codd's work.

      You said it yourself:

      While the algebra is somewhat procedural, the calculus is set-oriented, and they are fully equivalent.

      and, uncoincidentally, the isomorphism extends further to machines that manipulate physical punch cards. You go on to say:

      The idea is exactly not looking at records and operators, but describe what you want -- just leave the relational system set the procedures to get that in the most efficient way it can.

      Right. And what Gray ha
  • "The algorithms are simple enough so most implementers can understand them, and they are complicated enough so most people who can't understand them will want somebody else to figure it out for them. It has this nice property of being both elegant and relevant."

    Or what happens when the concept of taking complexity (made up of simpler things) and automating it such that it is easy to use and reuse, by the user, is applied here?
  • Sit down, turn off your cellphone, and prepare to be fascinated. Clear your schedule, because once you've started reading this interview, you won't be able to put it down until you've finished it.

    Actually I got about halfway through and decided to skip the rest of it.
