
XFS 1.0 is Released

Isldeur was the first of many to note that SGI's now open-source journaling file system, XFS, has reached version 1.0. It, Reiser, and the new ext format continue to be an area of debate, but regardless, journaling file systems are nice: they eliminate those slow fsck boot-ups, and they protect all your pr0n when you lose power and realize that you plugged the UPS into your stereo by mistake (not that I've done that. No sir.)

  • by Anonymous Coward
    SGI has a modified version of the anaconda installer from Red Hat 7.x that allows you to install a 100% XFS system. I have three servers running all-XFS filesystems and I love it. Full LILO support, no problems. It would be nice to have XFS added to the stock kernel. Maybe someday.
  • by Anonymous Coward
    A lot of servers have been running ReiserFS for many months. Not a bad proving ground.
  • by Anonymous Coward
    There really isn't any reason for your concern. You may remember that EXT2 was not Linux's first and only filesystem. There have always been competitors and predecessors: minixfs, xiafs, ext. If you care to flesh your worries out and explain SPECIFICALLY why and in what situations you think having more than one excellent filesystem for Linux is going to be a problem, it could then be discussed further - otherwise you're either posting mindless drivel or deliberately trolling. Many people I know use ReiserFS, or EXT2 in combination with ReiserFS, and so far there has yet to be any compatibility problem at all between them. ReiserFS has incompatibilities with certain software RAID setups and with NFS, so if those things are more important to you than Reiser's journalling, you use EXT2. XFS is a truly excellent fs, as is IBM's JFS. EXT3 may provide people with EXT2 systems a painless conversion path to a journalling system. They are all better at different things.

    This filesystem agnosticism is a wonderful and fairly unique feature of Linux, not a problem, and it's not a new thing - just the luxury of many journalling systems is new.

  • by Anonymous Coward
    The kernel can only ask the hard drive to flush the data to disk. The disk need not comply, despite returning a "yes I did" result. And as large drives have 5 and 10MB caches now, how can the consumer really know what the drive decides to do? It may do write caching so marketroids can boost performance specs. This stuff is not documented on the box the hard drive comes in, nor on the manufacturer's web site.
  • by Anonymous Coward
    Hmmm... so how do you repair the filesystem if the drive is going bad, your IDE controller got UDMA133 turned on accidentally, or you ran a buggy version of the kernel?

    ext3, JFS, and ReiserFS all have "real" fsck programs

  • by Anonymous Coward
    Can we please /please/ have an "Uninformed" or "Ignorant" or "Just Plain Wrong" moderation option for posts like the above? That crap got modded /up/ because the moderators don't have a clue either, but it's just /full/ of misinformation.
  • by Anonymous Coward
    Could you PLEASE come to the realization that "microsoft or any other company" CANNOT "close" open source BSDL'd software? Software is not physical object. When someone takes a copy, regardless of what they do to it, the original is STILL THERE.
  • by Anonymous Coward on Tuesday May 01, 2001 @08:43AM (#253452)
    Yes, XFS is proven on IRIX, XFS is not proven on Linux, Reiserfs is proven on Linux (shipping with SuSE for almost two years now).
  • by Anonymous Coward on Tuesday May 01, 2001 @08:44AM (#253453)
    Had they placed it under a BSD license perhaps it might actually get some use. But oh well.

    Had they placed it under a BSD license, effort that has been put into producing an open and free filesystem could be closed by a company such as Microsoft. Why should I let them profit if they don't contribute or at least acknowledge my work?

    As it is, xfs is under the GNU GPL and is thus protected from being made proprietary. The GPL protects the rights of free software authors. Myself, and thousands of other free software developers worldwide, wouldn't have it any other way.

  • by Anonymous Coward on Tuesday May 01, 2001 @08:52AM (#253454)
    I just upgraded my home network to 100baseT. Now, how the hell am I going to saturate a 100Mbps link with a shitty IDE hard drive, or even a faster SCSI?

    # hdparm -Tt /dev/hda

    Timing buffer-cache reads: 128 MB in 0.89 seconds = 143.82 MB/sec
    Timing buffered disk reads: 64 MB in 2.94 seconds = 21.77 MB/sec

    Observe the result of the buffered disk read test - 21.77 MB/sec. This is on my year-old Athlon system. AFAIK 100 Mbit networks don't tend to transfer data at speeds faster than 10 MB/sec. Perhaps you meant that you upgraded your home network to gigabit. Or not.

    Use hdparm to ensure that your hard drive is set to use DMA.
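    A quick sanity check of the numbers above (a sketch; the 10 MB/sec figure for 100BaseT is the poster's rule of thumb, the rest is arithmetic):

```python
# Can the IDE disk above saturate a 100BaseT link?
# Figures taken from the hdparm output in the post.

link_mbps = 100          # 100BaseT line rate, in megabits per second
disk_mb_s = 21.77        # buffered disk read rate, in MB/sec

# Theoretical ceiling of the link, ignoring protocol overhead:
link_mb_s = link_mbps / 8   # 12.5 MB/sec

# The disk reads faster than the wire can carry, so the network,
# not the drive, is the bottleneck in this setup.
print(disk_mb_s > link_mb_s)   # True
```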

  • by Anonymous Coward on Tuesday May 01, 2001 @08:42AM (#253455)
    For those of you looking for comparisons, why not check this page [], which appears to have links to information on a variety of filesystems (most of the journalled FSs under Linux) and even NTFS.
  • by Anonymous Coward on Tuesday May 01, 2001 @08:57AM (#253456)
    SGI is going to put Linux on their Big Systems(tm) when the Itanium-class CPUs start shipping. They've been planning this for a while now. The current generation of Onyx/Origin boxen are designed with multiple CPU architectures in mind -- e.g. you will be able to have a MIPS system or an IA-64 system just by swapping a single brick.

    The eventual plan is to have Linux for the Intel servers and IRIX on the MIPS ones, with IRIX being phased out over a long period of time so as to keep the old customers from getting paranoid. There's even rumors internally about servers with *BOTH* intel and MIPS processors in them running Linux. If you watch SGI's Linux pages, you'll notice that more and more support is made available for running Linux on R10K, R12K and other heavy-duty processors, not to mention SGI's memory architectures (e.g. ccNUMA).

    My own theory is that the now-EOLed 320/540 workstations were an experiment to see how SGI's customer base would react to non-MIPS/IRIX workstations and to get everyone warm to the idea of SGI branching out.

    SGI is a company to watch over the next few years, and releasing things like open-sourced XFS for Linux are just teasers of what's to come.
  • Someone once told me SGI has a smart disk controller backed up with a battery,...

    That's not unique to SGI. Any quality hardware RAID controller has an onboard battery backup. Even the ones meant for PeeCees.

  • So one can only patch a stock Linus kernel with XFS? What about the Red Hat 7.1 kernel? Would be nice if there was a patch for that (although I doubt I'd use it now unless there was an easy way to convert existing partitions).
  • Sigs don't appear during the metamoderation. If you moderate #1 up, you will likely be MM'd as unfair.
  • by Wakko Warner ( 324 ) on Tuesday May 01, 2001 @08:24AM (#253461) Homepage Journal
    For more about XFS, SGI's official page is here [].

    - A.P.

    Forget Napster. Why not really break the law?

  • by gjohnson ( 1557 ) <> on Tuesday May 01, 2001 @10:11AM (#253465)
    Well, I found its webpage. []

  • by gjohnson ( 1557 ) <> on Tuesday May 01, 2001 @09:00AM (#253466)
    Speaking of journaling filesystems, whatever happened to tux2? Was any code ever released?

    tux2 looked really good. Supposed to be faster than traditional journaling, and preserves file data as well as metadata.

  • by jd ( 1658 ) <> on Tuesday May 01, 2001 @09:19AM (#253467) Homepage Journal
    There are a LOT of journalling filesystems for Linux. Excluding extensions (which effectively double the number of unique systems), there are five "genuine" journalling filesystems for Linux.

    (I don't count NTFS, because that is hard-pushed enough to be called a genuine filesystem, never mind a journalling one.)

    Feel free to reply to this, adding any that I've missed.

    The Logging filesystem does much the same thing as Ext3 - it is an extension to Ext2 - but it looks like it would be a lot more useful than Ext3. IMHO, it'd be much better if neither of them were so FS-specific and could be used as a generic wrapper. SnapFS does exactly this, for example.

    Anyway, on with the list of journalling filesystems...

    ... -IN- the main kernel tree:

    ... at a stable release:

    • XFS []

    ... at a developmental release:

    ... currently abandoned:

    ... extensions for:

  • So I find myself suspicious of some of the claims in the parent post, and do a google search to see if I can find some benchmarks. What do I find instead? A nearly identical post [], written by a different user, from last February. Would someone please moderate AntiBasic into deserved oblivion? Thank you.
  • test-2, test-3, and 1.0 have been released since then, and that includes an acl fix. Without knowing for sure what your problem is, hard to say for sure if your specific problem has been resolved...
  • Can you convert a drive you've already got data on? Could I simply point at my disk drive and say, "turn that into an XFS drive," edit a few boot params, and be done?

    No, sorry. :) that'd be quite an undertaking. You can dump/restore between filesystems, or just copy over, but there is no magic "ext2 to xfs converter."
  • XFS features extended attributes, so you could use this for mime types, I suppose. Any application would need to be aware of this, though, and would need to support it as well. Interesting idea though...
  • by Booker ( 6173 ) on Tuesday May 01, 2001 @08:57AM (#253476) Homepage
    At this point, SGI has only provided an unsupported Red Hat system installer for XFS. However, there are a couple people in the Linux community who have been working on Debian packaging & installers, and also someone working on slackware. Check the xfs mailing list archives for more info...
  • by Booker ( 6173 ) on Tuesday May 01, 2001 @09:43AM (#253477) Homepage
    FWIW, GRIO and realtime subvolumes (err... partitions in the Linux case) are not yet implemented in Linux.
  • by Booker ( 6173 ) on Tuesday May 01, 2001 @11:51AM (#253478) Homepage
    So one can only patch a stock Linus kernel with XFS?

    You can patch whatever you want, it's a question of how many conflicts you need to resolve. :) As with any patch, you will have varying success patching source trees that differ from that which was used to generate the patch.

    What about the Red Hat 7.1 kernel? Would be nice if there was a patch for that.

    .../release-1.0/patches/RHlinux-2.4.2-core-xfs-1.0.patch.gz
  • by Booker ( 6173 ) on Tuesday May 01, 2001 @09:08AM (#253479) Homepage
    The kernel & userspace utils are packaged several different ways - cvs, patches, tarballs, rpms etc.

    Go to for the info.
  • by Booker ( 6173 ) on Tuesday May 01, 2001 @09:34AM (#253480) Homepage
    We have a system installer that works with Red Hat 7.1 to do exactly what you're asking about. Grab our iso, burn a cd, boot from it, and you're on your way. You'll need the Red Hat 7.1 cds as well.

    The other option, of course, is to have lots of extra space, install your distro, boot an XFS capable kernel, make some XFS filesystems, and copy everything over.
  • by Booker ( 6173 ) on Tuesday May 01, 2001 @08:44AM (#253481) Homepage
    Lilo in the MBR works just fine with XFS. There are no issues, I have 3 machines that boot that way.
  • by Booker ( 6173 ) on Tuesday May 01, 2001 @09:21AM (#253482) Homepage
    A lot of people have been complaining that there is no 2.4.4 patch - but bear in mind that 2.4.4 is only 3 days old. We'd be a bit nuts to release 1.0 on a kernel as untested as that.

    On the other hand, the devel cvs tree is usually updated within a few days of a new kernel release. As soon as the kinks get out of XFS+2.4.4, it'll be in the devel cvs tree.

    The majority of our 1.0 testing has been done on 2.4.2, so we have the most confidence in XFS there. We also have a 2.4.3 patch which should be fine, although it has not had as much direct testing.

    We realize that there are issues with 2.4.2 (loop device, anyone?). If you're concerned about fix-ups, and you run an RPM-based system, you might take a look at the Red Hat kernel RPMs we offer - those include a ton of patches from Red Hat - essentially the same kernel as shipped with 7.1, with XFS added.

    If you're concerned about netfilter, just get the patch - I would be very surprised if it conflicted in any way with an XFS-enabled kernel.
  • by Booker ( 6173 ) on Tuesday May 01, 2001 @11:03AM (#253483) Homepage
    So when will you submit it to Linus?

    It's certainly a priority, but I can't really give you a timeline - we're working on it.

    XFS is so big and touches so much

    XFS is big, no getting around that, but we're making efforts to keep our modifications in the kernel to a minimum (actually, a lot of that work is done already). The patch is also now split into 2 pieces, one for "core linux" changes and one for the filesystem itself. That was done for a couple of reasons, but a nice side effect is that it's a little easier to see how much is XFS itself, and how much is linux changes.
  • by Mr. Frilly ( 6570 ) on Tuesday May 01, 2001 @11:33AM (#253484)
    Yeah, but they can make their version of a BSDL'd software incompatible with what's in the open. And if they control 90% of the market, guess who's getting screwed?

    For a different scenario, imagine a BSD licensed unix. Now imagine several large corporations taking that great technology and using it. Sounds great, right? Now fast forward a couple of years, and you find that every one of these corporations has expanded on the original BSD licensed unix in a proprietary fashion in attempts to maintain and expand their customer base. Even if most of the corporations would have preferred to maintain their software as BSD licensed, their hands are forced when the first of the corporations starts spitting out proprietary, incompatibly feature enhanced versions. Admins find themselves trapped, having to either understand and maintain several incompatible systems, or going with one vendor and getting gouged for prices. Suddenly, WinNT 3.51 pops up, and although much worse technology, it runs on cheap hardware, costs less, and is far easier to administer.

    Sure, you can say the customer should have used the original, still BSD licensed software, but in reality, most customers can't code, and are going to go with the commercially supported software, because the added features and/or lower administrative costs of the commercial software are (at least initially) cheaper than going with the BSD stuff.
  • Right. That was worth a waste of your breath (and Slashdot's bandwidth).

    Umm, you asked, little brain. I responded. Yes, it was worth a breath to explain that an open standard filesystem has more value than a proprietary filesystem on a shitOS that's a fair to middlin' product when configured "optimally". I wouldn't expect a naive schoolgirl to understand that.
  • You forgot to mention the Global File System [] (GFS). Not only is it a journalling file system, but it also allows speedy access to network drives. It's NFS on 'roids.

    I hope we shall crush in its birth the aristocracy of our monied corporations ...
  • by geoffeg ( 15786 ) <> on Tuesday May 01, 2001 @10:24AM (#253493) Homepage
    I keep hearing little tidbits about Tux2 (Tux2: The Filesystem That Would Be King []). I can't find Daniel Phillips's website anymore, nor have I seen any more information about it in the last few months. What I have read and heard about it sounded very interesting (I would say it sounds promising, but I don't know enough about filesystems to know how good an idea it is realistically).

    Would this be under "currently abandoned" or "abducted by aliens"?


  • I have not used Reiser, so I can't speak to its performance.

    However, I can tell you that XFS is a really great filesystem. We have over 100 Irix/XFS systems deployed at television stations around the world. These systems are very I/O intensive, and I have never had a corrupt filesystem. Performance is also very good.

    I would really like to see SGI copy more features from Irix into Linux, such as fine control over process priorities, something standard linux distributions are severely lacking, imo.

  • Since some filesystems fit some purposes better than other filesystems, and other filesystems fit other purposes better than some filesystems, what criteria do you have to consider when selecting one filesystem over another?
  • All this moderation is getting bad posts modded up waaay too much...

    As it stands right now the majority of their hardware run Linux

    Umm, not really []. And the hardware that does run Linux runs it with beta-type quality.

    the last version of Irix released was to mainly fix bugs.

    No shit. They release quarterly maintenance releases - in the 6.5.X series - up to 6.5.11 now.

    They will probably drop IRIX someday down the road. But since the workstation market imploded on them, SGI is trying to make money off of servers. I don't think that Linux is going to be running (release quality) 256-way Origin servers any time in the "near future".

  • 11 quarters = 2.75 years. IIRC, NT 4.0 was released in 1996 or so, and 6+ service packs later we have Win2K :)

    IRIX 6.5 was a major release - a lot changed from 6.2, which was their previous major release.

    I'm not sure if SGI is planning any major new releases for IRIX in the immediate future - you can read about their roadmap here [] (pdf). My nick is irix because it is the first UNIX I used back when. I'd change it now if I could ;)

  • GRUB is great. I love GRUB. However, GRUB has this problem _even more_ than LILO. LILO is FS agnostic. It tries to load the kernel from a specified _disk block_.

    GRUB, on the other hand, actually reads the file system, like the BSD bootloaders, and at this point, it doesn't know how to read XFS.
  • It works fine, but it is a little unhappy because it does not recognize the filesystem and spits out information asking if locking is OK or not. Kind of sucks, but other than that it works.
  • by Spyffe ( 32976 ) on Tuesday May 01, 2001 @08:37AM (#253512) Homepage
    It seems like one of the nastiest problems when you want to promote a new filesystem is getting LILO, SILO, MILO... to load a kernel image off of the filesystem. What are the issues involved here? Do these loaders really only support ext2fs? If so, this would prevent a user from having a completely journalled system, right? Perhaps there are ways of fixing this (like backups of the /boot partition on a journalled fs), but it would be cool (I think) to have a mini-fsck run on the boot partition before the kernel boots. There may be issues here; perhaps an MD5 sum or something of the sort might be better to check that the boot partition is uncorrupted. The sum would be checked against... what? This post is as much an RFC as anything else. Go at it!
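    The MD5 idea above can be sketched in a few lines (hypothetical paths and file names; a real check would have to run from the boot loader or an initrd, and where the reference sum lives - and how it is itself protected - is exactly the open question in the post):

```python
import hashlib

def md5_of(path: str) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical check: compare the kernel image against a sum
# recorded at install time.
def boot_image_ok(image="/boot/vmlinuz", sumfile="/boot/vmlinuz.md5"):
    with open(sumfile) as f:
        expected = f.read().split()[0]
    return md5_of(image) == expected
```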
  • So does anyone know (benchmark wise that is) how XFS performs compared to reiser, ext3, etc.?

  • Wouldn't you just be mirroring if you wrote user data to the log?

    Not quite. The log/journal is structurally different than the main data areas, with different synchronization and performance characteristics. Writing once to the log and once to the main data area is quite different than writing twice to the main data area.

    However, an observation very similar to yours is behind log-structured filesystems. In other words, if you're going to write all the data to the log in a highly robust etc. way, why not just make the log the authoritative copy of the data? There's a whole lot of gunk that has to be worked out after that, such as how you find data and how you reclaim log space, but it all flows pretty cleanly from that initial idea. The result is pretty nifty for some kinds of workloads, but in general changing OS structures and their effects on I/O patterns have sort of left log-structured filesystems behind.

    If you're interested in exploring further, the seminal papers in this area are The Design and Implementation of a Log-Structured File System [] by Rosenblum et al, and (IMO even better) An Implementation of a Log-Structured File System for UNIX [] by Seltzer et al. Enjoy!
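    For a feel of the "make the log the authoritative copy" idea, here is a toy in-memory sketch (illustrative only; real log-structured filesystems add segment cleaning, checkpoints, and crash recovery):

```python
class TinyLogFS:
    """Toy log-structured store: the append-only log IS the data;
    an in-memory index maps each key to its latest log location."""

    def __init__(self):
        self.log = bytearray()   # the one authoritative copy
        self.index = {}          # key -> (offset, length)

    def write(self, key, data: bytes):
        off = len(self.log)
        self.log += data         # never overwrite: always append
        self.index[key] = (off, len(data))  # old copy becomes garbage

    def read(self, key) -> bytes:
        off, n = self.index[key]
        return bytes(self.log[off:off + n])

    def garbage(self) -> int:
        """Log space no longer referenced - what a cleaner reclaims."""
        live = sum(n for _, n in self.index.values())
        return len(self.log) - live

fs = TinyLogFS()
fs.write("a", b"hello")
fs.write("a", b"world!")        # supersedes the first copy
print(fs.read("a"))             # b'world!'
print(fs.garbage())             # 5 bytes of dead log space
```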

  • Someone once told me SGI has a smart disk controller backed up with a battery, so in the event of a blackout, the controller would keep for some hours the data still not written on the disk, flushing it on the disk on the next power up.

    Interesting. I dunno about the SGI product, but the EMC Symmetrix takes a different approach. It has enough reserve power so that if it detects loss of external power it will immediately flush its cache to special areas on disk. Then, the first thing it does when it comes back up is slurp all that data back into cache - which not only ensures data stability but preloads the cache for you as well. Cool. I've heard that in a simulated blackout in a big data center everything would get eerily quiet *except* for the Symmetrix, which would actually get extra-loud as it does the flush.

    Disclaimer: I work for EMC. I don't speak for them, they don't speak for me, yadda yadda yadda.

  • The kernel can only ask the hard drive to flush the data to disk. The disk need not comply, despite returning a "yes I did" result.

    That's an important issue. I'll try to provide a couple of answers.

    how can the consumer really know what the drive decides to do?

    Well, there are at least two ways:

    1. Turn off write caching.
    2. Set the "Force Unit Access" (FUA) bit on the Write command, if it's a SCSI/FC disk.

    SCSI gives you other options as well. For example, if you're using tagged command queuing, you can set FUA only on the last command of a sequence (e.g. a transaction). That way, you can allow the disk or storage subsystem to do appropriate reordering, combining, etc. and you'll still be sure that by the time that last command completes all the commands logically ahead of it (as specified by the tags) have completed as well. It's tres cool, and it's one of SCSI's biggest benefits compared to IDE.

    Tagged command queuing also comes in handy if you have to force write caching off - which BTW is common and not particularly difficult on either SCSI or IDE drives. Since you're now forced to deal with full rotational latency, the importance of overlapping unrelated operations (by putting them on different queues) becomes even greater.
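    To make the FUA discussion concrete, here is a sketch of where the bit sits in a WRITE(10) CDB (layout per the SCSI block command set: opcode 0x2A, FUA at bit 3 of byte 1; fields simplified, no protection/DPO flags):

```python
import struct

WRITE_10 = 0x2A
FUA      = 0x08   # Force Unit Access: bit 3 of byte 1

def write10_cdb(lba: int, blocks: int, fua: bool = False) -> bytes:
    """Build a 10-byte WRITE(10) CDB: opcode, flag byte,
    4-byte big-endian LBA, group, 2-byte transfer length, control."""
    flags = FUA if fua else 0
    return struct.pack(">BBIBHB", WRITE_10, flags, lba, 0, blocks, 0)

# Last command of a transaction gets FUA; earlier ones may be cached.
cdb = write10_cdb(lba=4096, blocks=8, fua=True)
print(cdb[1] & FUA != 0)   # True: the drive must not ack from cache
```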

    This stuff is not documented on the box the hard drive comes in, nor on the manufacturer's web site.

    Tsk tsk, that's a shame. It's pretty common knowledge among storage types, but still far from universal. Go look on and you'll see a recurring pattern of people finding this out for the first time and sparking a brief flurry of posts by asking about it.

    The problem with having the drive notify the host that a write has been fully destaged is that target-initiated communication (aside from reconnecting to service an earlier request) is poorly supported even in SCSI. Hell, it's even hard to talk about it without tripping over the "initiator" (host) vs. "target" (disk) terminology. Most devices lack the capability to make requests in that direction, and most host adapters (not to mention drivers) lack support for receiving them. AEN was the least-implemented feature in SCSI.

    There's also a performance issue. Certainly you don't want to be generating interrupts by having the disk call back for *every* request, but only for selected requests of particular interest. So now you need to add a flag to the CDB to indicate that a callback is required. You need to go through the whole nasty SCSI standards process to determine where the flag goes, how requests are identified in the callback, etc. Then you need every OS, driver, adapter, controller, etc. to add support for propagating the flag and handling the callback. Ugh.

    It's a great idea, really it is. It's The Right Way(tm). But it's just never going to happen in the IDE world, and it's almost as unlikely in the SCSI/FC world. 1394 seems a little more amenable to this, but I have no idea whether it's actually done (I doubt it) because even though I know they exist I've never actually seen a 1394 drive close up.

    I hope all this helps shed some light on the subject.

  • The difference is that Reiser is NOT a journaling filesystem (well, not any more than, say, NT or BSD UFS filesystems are), since it only journals the metadata

    So does XFS. From one of SGI's own presentations []:

    5.6. Supporting Fast Crash Recovery
    ...To avoid these problems, XFS uses a write ahead logging scheme that enables atomic updates of the file system. This scheme is very similar to the one described very thoroughly in [Hisgen93].
    XFS logs all structural updates to the file system metadata. This includes inodes, directory blocks, free extent tree blocks, inode allocation tree blocks, file extent map blocks, AG header blocks, and the superblock. XFS does not log user data.

    [emphasis added]

    This is *normal* for a journaling filesystem. Very very few actually log or otherwise protect file data, because of the cost. Maintaining a metadata-only log is already a significant performance limiter, and journaling data as well would just be prohibitively expensive. Most users wouldn't even want it, if they had to pay the performance cost.
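    A minimal sketch of the write-ahead scheme the quote describes (toy code, nothing like XFS's actual log format): log each intended metadata update, apply it, then mark it committed; recovery replays only committed records.

```python
# Toy write-ahead log for metadata updates (illustrative only;
# XFS's real log format and transaction machinery are far richer).

metadata = {}      # the "on-disk" metadata structures
log = []           # the journal: records with a committed flag

def apply_update(txid, key, value):
    log.append({"tx": txid, "key": key, "val": value, "done": False})
    metadata[key] = value          # update logged before completion
    log[-1]["done"] = True         # mark the transaction committed

def recover():
    """After a crash, replay committed records; discard torn ones."""
    recovered = {}
    for rec in log:
        if rec["done"]:
            recovered[rec["key"]] = rec["val"]
    return recovered

apply_update(1, "inode:42:size", 4096)
log.append({"tx": 2, "key": "inode:42:size", "val": 8192, "done": False})
# ^ simulated crash mid-transaction: logged but never committed
print(recover())   # {'inode:42:size': 4096} - the torn update is dropped
```

    Note that only the metadata structures are covered, just as the quoted passage says: user data never enters the log.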

  • by ttfkam ( 37064 ) on Tuesday May 01, 2001 @09:09AM (#253520) Homepage Journal
    I have been using XFS on my home machines since v0.9. The installer has had a couple of glitches in the past (0.9 left me without access to the network and my cdrom drive by default). The recent beta fixed a lot of problems and was based on RedHat 7.1 (as opposed to 7.1 betas from earlier releases).

    I haven't tried the 1.0 release yet. There's only so many hours in the day. On the other hand, the last install I did with the beta, after installing everything I wanted, I fired up a dozen programs such as Mozilla, GIMP, Nautilus, etc. While the drive was churning, I hit the power switch. For those of you who have used ReiserFS, I'm sure this is no big deal. ;)

    It should be noted that on my Athlon 800MHz w/ 128MB of RAM and a 27GB hard drive, I almost missed the filesystem check as it scrolled by on bootup. That had me sold forever on journaling filesystems.

    I haven't seen any visible performance difference though. There may be, but so much has changed on my system that any subjective comparisons are almost impossible/meaningless. For example, devfs is enabled by default, there's a more up-to-date kernel, and the drive has a different partition layout. Who could tell what the FS performance difference may be? I definitely don't need to go back to ext2 just to see if my switchover was justified. Any more info will just be icing.

    If someone wants to post "real" benchmarks (lies, damn lies, and all that) I'd love to see them too.
  • Does anyone have a good comparison of XFS, Ext3 and other journaling filesystems? I know Ext3 is being used on [], also known as [], and it works very well there. I'd like to give XFS or Ext3 a whirl but I don't have time to search out all the gotchas myself.


  • I've not looked at the docs, but this question is kind of applicable to all the journalled file systems and something I've been curious about:

    How do you get them to work with, say, RedHat, from an installation standpoint? (I imagine it's relatively easy to convert an extra disk attached to an already installed Linux box, but what about making your whole system with the new FS?)

    At no point does RedHat ask me which filesystem I'd like to install, so that option is out (except for Mandrake and Suse?).

    Can you convert a drive you've already got data on? Could I simply point at my disk drive and say, "turn that into an XFS drive," edit a few boot params, and be done?

    Surely, it's more complicated.

    Have any of you done something similar?

    Any recommendations on how to get it working with the least amount of hassle?

    Just curious.
  • Wow. This is MS-quality FUD.

    I wonder if folks over at SGI plan on dropping Irix in the near future for Linux entirely. As it stands right now the majority of their hardware run Linux, and the last version of Irix released was to mainly fix bugs.

    The only SGI systems that run Linux right now are the low-end Intel workstations (230, 330, and 550) and the Intel rack-mount servers (1100, 1200, 1400, 1450) - certainly not a "majority of [our] hardware."

    IRIX on MIPS is not going anywhere. Take a look at SGI's IRIX/MIPS roadmap. []

  • by chrysalis ( 50680 ) on Tuesday May 01, 2001 @09:05AM (#253528) Homepage
    XFS is still an external patch; it's not included in the official kernel. And there seems to be a delay between a new kernel release and a new XFS version for that kernel.
    XFS 1.0 is against kernel 2.4.2. Or 2.4.3, but SGI says it may be unstable with that version.
    But the current kernel is 2.4.4 (or 2.4.4-ac2).
    And 2.4.4 fixes important problems that previous kernels had. For instance, it fixes serious security flaws in Netfilter.
    So, today, you can either run XFS or get a fixed kernel. Not both.
    This is why I'll stay with ReiserFS until XFS gets officially included in the kernel.
  • by NetJunkie ( 56134 ) <jason DOT nash AT gmail DOT com> on Tuesday May 01, 2001 @08:38AM (#253530)
    I used it on my company's web servers at my last place. We had millions of tiny files, and EXT2 wasn't cutting it. ReiserFS worked great.

    And if you want a bigger example... SourceForge's FTP site is half on ReiserFS. So it works for them.
  • by NetJunkie ( 56134 ) <jason DOT nash AT gmail DOT com> on Tuesday May 01, 2001 @08:49AM (#253531)
    I used some of the install disks someone made to install Debian to a 100% ReiserFS system. Does anyone know of any disks to do this for XFS?

    The Debian disks are on Freshmeat and work GREAT.
  • I really don't know how much of an issue this is. Normally, I don't ever bother using the reboot command in BeOS, I simply hit the reboot key on my computer. I've been doing this for more than a year now, and I have yet to lose any data.
  • IIRC, SGI had to hack the VM a little bit to allow it to notify the filesystem when a page was actually flushed to disk.
  • Don't forget attributes.
  • There are some known problems with earlier versions of XFS (I did try Test-1 myself, and it did not work :)

    So, did anyone try it yet? Does it work?
  • Avoiding any possible Microsoft-jokes, I think this [] qualifies... Although I'm not 100% sure it's bigger, documentation-wise. Who knows, your favorite might be implemented as a symlink to this classic? ;^)
  • Now, if only RedHat (or SGI) would integrate support for installing to an XFS filesystem on an LVM volume.

    For those of you who don't know what LVM is: with LVM, you make a disk into a pile of storage blocks, then you put those blocks into a pool. That pool is a block device, and you can create a file system on it.

    The nice thing is that you can add blocks from many different drives into the pool, so a volume can span multiple physical disks. Need more space on root? Just pop a new disk into the system, add it to the pool, and expand the XFS filesystem into the new space. All without losing any data (indeed, even without downing the system...)

    This is different from RAID in that with RAID, you cannot add disks to the array and get more storage. You can add more disks to serve as hot spares, but you don't get any more space without rebuilding the array (and losing your data).

    Of course, the best thing is RAID-->LVM-->XFS, but....
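The pooling idea described above can be sketched as a toy model (Python; the class, disk names, and sizes are all made up for illustration - real LVM maps physical extents inside the kernel, not objects in a script). The point is that new capacity is appended to the pool without disturbing where existing data lives:

```python
# Toy model of LVM-style pooling: a logical block device assembled
# from slices of physical disks. Adding a disk appends capacity and
# never remaps existing logical blocks, which is why growing is
# non-destructive.

class Pool:
    def __init__(self):
        self.map = []  # logical block number -> (disk name, physical block)

    def add_disk(self, name, num_blocks):
        # Existing logical->physical mappings are untouched.
        self.map.extend((name, i) for i in range(num_blocks))

    @property
    def capacity(self):
        return len(self.map)

pool = Pool()
pool.add_disk("sda", 100)
before = list(pool.map)
pool.add_disk("sdb", 50)          # "pop a new disk into the system"
assert pool.map[:100] == before   # old data placement is unchanged
print(pool.capacity)              # 150 logical blocks, spanning two disks
```

After a step like this, a grow-capable filesystem (XFS via xfs_growfs, for instance) can expand into the new logical blocks while mounted.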
  • I was running 2.4.3 with reiser and knfsd patches for a week or so with NO problems. I first tried NFS over reiser 6 months ago (or so) and this is the FIRST TIME that I've had no problems.

    I also, just today, compiled 2.4.4/w knfsd patches; seems fine too.
  • Anonymous Coward writes: upmoderating generic technical-sounding hooha delivered with a URL that you do not check is BAD.

    Uhm, I just tried the link. It's not broken. It works. Maybe you should restart your proxy or something.
  • hahaha

    Seriously, anyone who is so uneducated to be confused or bewildered by having to choose from Gnome and KDE will not be fiddling around with filesystems anyway. They'll use whatever mandrake gives them on "beginner" or "automatic" install.
  • by barneyfoo ( 80862 ) on Tuesday May 01, 2001 @09:10AM (#253543)
    Unlike Reiser, it currently works with NFS.

Yes this was an issue with Reiser, but they have had patches for it since 2.4.2 to work with NFS, and I believe that full NFS support might be in 2.4.4 (not sure).
  • by barneyfoo ( 80862 ) on Tuesday May 01, 2001 @09:34AM (#253544)
Minor quibble: I checked the reference, and that is ReiserFS 3.5, not 3.6 (the difference is that 3.5 is the Linux 2.2 Reiser and 3.6 is for Linux 2.4). Looking at the 3.6 results, they appear marginally better, but your point still holds.

    Create____203.88 / 171.95 = 1.19
    Copy______411.67 / 384.59 = 1.07
    Slinks____3.23 / 2.81 = 1.15
    Read______1165.61 / 1291.76 = 0.90
    Stats_____1.49 / 1.17 = 1.27
    Rename____1.81 / 1.32 = 1.37
    Delete____14.46 / 3.95 = 3.66

As an aside, it's pretty hard to get much faster than ext2 for this statistic: reading bulk files larger than 10k and smaller than 100k. You need to weigh what ReiserFS gives you against what it could slow down. ReiserFS has truly awesome small-file speed, and very nice tail packing.
  • by barneyfoo ( 80862 ) on Tuesday May 01, 2001 @08:32AM (#253545)
I don't know how it compares to XFS. But go here [] to see how ReiserFS compares to Ext2 and Ext3. (Hint: it kicks their ass.) Add in journaling and you have a killer combo. XFS is a little more industrial strength as opposed to general purpose. If you're streaming gigabyte files and processing them on the fly, I imagine XFS is the way to go.
  • The future, IMHO, is a log structured file system with NO journaling and atomic updates. This creature already exists, and it is called FFS with Soft Updates, from the FreeBSD developers. Here is the breakdown.

    Journalling is tricky, as it requires lots of intervention at other places in the kernel. You need to keep something synchronous - journalling just makes that something very small. Atomic updates avoid synchronous issues altogether. Instead, they structure the file system in groups of data and metadata. In each group, there is an atomic bit. When set, it means the group is intact. So, upon looking through the groups, you can immediately determine which ones are intact and which are incomplete. Recovery is REALLY fast after a power outage, in theory even faster than a journal recovery.

ReiserFS and XFS are also really great; these have log structure (or btrees) and journalling. However, ReiserFS is constantly broken with NFS, and that is a BIG problem. Not to mention the version in 2.4.x is incompatible with the version in the 2.2.x tree. Don't let the XFS 1.0 version number fool you. Ever see the fallout when Alexander Viro (kernel VFS hacker) takes a newly merged filesystem to task? It is not pretty.

Tux2 is still vaporware. But it will be great when it comes out. Ext3 has some advantages. It has been running stably for a long time now under development. It is journaled, and has a small code base. It also only exists for the 2.2 kernel series. Phillips is also making a judgment call: he wants to build on ext2 with Tux2. Ext2 is not log structured, which is why ReiserFS can beat it in well-structured benchmark tests run by Hans.

And the future for Linux file systems? I don't know; it is always interesting to see where things will head. The world is clamoring for easy crash recovery, and ext2's days are numbered. I think most people would be quite happy to simply add journaling to ext2. Or atomic updates. So I predict, after consulting the crystal ball, that Tux2 develops a large following after release, and that Phillips then adds btree searches and log structuring, making it the first Linux file system with all that. That would then bring the state-of-the-art file systems for Linux up to par with those of FreeBSD. Of course, in Linux at that time you can also use JFS, XFS, ReiserFS, or ext3 journaled file systems. But journaling is worse than atomic updates, both for complexity and speed. Soft updates are more flexible than journaling, and - with a filesystem whose basic structures are designed to take advantage - perform better than journaling. I find it just slightly weird that there's so much focus on journaling when a superior alternative is known.

  • Software and hardware RAID support is already in the kernel and in use. LVM supports data striping if you don't want to go the full RAID route.

    I don't see how things can possibly get much better.
  • by sheckard ( 91376 ) on Tuesday May 01, 2001 @08:44AM (#253550) Homepage
ext3 is a hack to add journaling to ext2. An ext3 partition is backwards-compatible with ext2, so in a worst-case scenario you could just mount it as ext2 and lose nothing but journaling. However, the support right now is 2.2 only, and personally, I don't think it's such a great idea to maintain backwards compatibility when so many underlying things change. This will only lock us into any bad compromises that were made in the design of ext2/3.
  • by sheckard ( 91376 ) on Tuesday May 01, 2001 @08:39AM (#253551) Homepage
Well, the biggest difference is that XFS is proven and Reiser isn't yet. XFS has been the IRIX filesystem for something like 6 years now, and the on-disk filesystem format does not change between revisions, even during the development stage. You can even mount an IRIX disk under Linux and read and write normally. The only things in development in XFS were the userland and kernel-space tools. Compare that to Reiser, where things tend to change a fair bit.
  • Um, 100baseT is 100 megabits per second. That's 12.5 megabytes per second. Even a "shitty ide hard drive" can saturate that.


  • Very fast?

    I don't know what world you're living in, but I get 20mb/sec throughput on my ibm 75gxp. And that's on a plain old ATA33 controller with dual celerons 366.

    5400rpm drives haven't been "very fast" for at least 5 years.


Well, that's not _entirely_ true :) There was one absolutely devastating patch in the IRIX 6.2 time frame where 6.2 boot media couldn't boot machines that had the XFS patch applied.

    That sucked :)

But I agree with you in general: XFS rocks. We were one of the first XFS customers on IRIX 5.3, and it didn't rock quite as much back then, but by the time IRIX 6.2 shipped it was pretty fantastic :)

    I remember reading a post in comp.sys.sgi.{something} from one of the SGI guys... to the effect of "we have XFS doing sustained write performance of 2gb / second here in the lab"

    That rules. :)
  • by bmajik ( 96670 ) <> on Tuesday May 01, 2001 @09:40AM (#253556) Homepage Journal
It's meant to be a high-performance filesystem, from the start.

    I mentioned this in another post, but SGI claimed internally to have it sustaining 2gb/sec of _write_ performance across a suitably large number of spindles.

Also, one thing I don't see people mentioning is XFS's support for GRIO (guaranteed rate I/O). No Linux filesystem has that, and the Linux kernel plumbing to support it is, I think, SGI contributed (I can't recall if it's in XFS for Linux yet).

The idea of GRIO is that an app says ahead of time "I need this much disk performance - figure it out", and the OS will say "yes, I can hook you up" or "sorry, throw more money at the problem".
  • by dbrower ( 114953 ) on Tuesday May 01, 2001 @10:28AM (#253557) Journal
    And the sistina GFS [], which also supports clusters of machines having concurrent access to SAN storage.


  • Could you PLEASE come to the realization that "microsoft or any other company" CANNOT "close" open source BSDL'd software?

    What he means is that they could do an M$-kerberos on it. Take it, tweak it to be incompatible to external parties, and then M$ effectively gets a free, better performing proprietary filesystem, for close to zero research dollars.

  • Any decent RAID card'll have memory with a battery backup. The PERC 2 cards in my Dell servers have 128 megs of RAM, and a three day battery.
  • by rgmoore ( 133276 ) <> on Tuesday May 01, 2001 @09:30AM (#253560) Homepage
    ReiserFS really shines with lots of small files. (your mp3 collection for example)

Since when do mp3s count as small files? Most of the interesting ones are several MB in size - much larger than a typical block size - so the extent to which ReiserFS would actually help is minimal. Where something like ReiserFS would be really helpful is in dealing with directories like /etc and /dev that are full of a large number of very small files. It might also be useful for /var, as ISTR that it has some anti-fragmentation aspects to it that would help with the rapidly changing data there.

  • by Srin Tuar ( 147269 ) <> on Tuesday May 01, 2001 @08:48AM (#253562)
It'll work with LILO installed in the MBR. But if you want LILO in your root partition it won't work. Mostly this is due to the fact that SGI wishes XFS to be disk-compatible across systems.

It's probably still a way to go until it's well integrated with the distributions, but I think this FS has potential. Unlike Reiser, it currently works with NFS.

I guess it's a race to see which of these will ultimately become the common-denominator FS for Linux. Reiser currently has the lead, due to SuSE and being in the kernel.

  • IRIX 6.5 can best be described by the neat poster SGI made about a year ago... it features a train chugging along past 5.3 all the way to 6.5 and through the 6.5.X updates. The front of the locomotive sports the designation "6.5.oo" (infinity). While 6.5 most likely won't be around forever, there are NO current plans to release a 6.6 or a 7.0.


Because 6.5 is the best thing SGI has ever done. There is an update released quarterly, on time, along two different streams: [M]aintenance and [F]eature. No patch dependency hell (though urgent patches are posted between 6.5.X updates). The Feature release gets new goodies every quarter, some of which are slowly rolled into Maintenance. Both streams are HEAVILY tested, tuned, reviewed, and compiled using the latest available MIPSpro compilers. In fact, SGI likes to stay a release or two ahead of their users, using it on most of their production systems to ensure a rock solid release.

    The (large) IRIX 6.5 team has been plodding away ever since 6.5.0. When asked "where's 6.6" they will usually respond: "you didn't ask us to break application compatibility, so we aren't working on a 6.6".

    I wouldn't want it any other way. Aside from a few small networking issues early on, 6.5 has been rock solid for me over the years. Each quarterly update has been surprise-free and without incident.

SGI MIPS/IRIX Roadmap: mips.html

    The Mandate of Application Compatibility in SGI IRIX 6.5
(An excellent whitepaper on the goals and future of IRIX 6.5, written by an IRIX 6.5 engineer) se.cgi?coll=0650&db=bks&cmd=toc&pth=/SGI_Developer/mandate_IRIX

  • by The Night Watchman ( 170430 ) <smarotta@gmail . c om> on Tuesday May 01, 2001 @08:30AM (#253567)
    Yes, finally! Vince McMahon has come through to give us XFS: the eXtreme File System! Complete with new rules and directory structures, this will appeal to even the most hard-core file system fans. Under the new rules, all crosslinked files WILL be deleted on the spot, multiple programs attempting to write to the same block will be penalized for thirty million clock cycles, and all deletions are FINAL. And just check out the i-nodes on the cheerleaders. I think you'll agree this will be the new pop phenomenon.

    Oh. Wait. Journaling file system? oops... never mind.

    /* Steve */
  • by Sir_Real ( 179104 ) on Tuesday May 01, 2001 @08:21AM (#253571)
    What's the big diff (pun intended) between Reiser and XFS? Which is better? (I realize that this may start a holywar, but I want the brief synopsis and analysis since I'm not a sysadmin.)
  • by Alien54 ( 180860 ) on Tuesday May 01, 2001 @09:11AM (#253573) Journal
Since some file systems fit some purposes better than other file systems, and other file systems fit other purposes better, what criteria do you have to consider when selecting one file system over another?

    Some basic info and a couple of links for folks:

• file system - basic definition - the general name given to the logical structures and software routines used to control access to the storage on a hard disk system. Operating systems use different ways of organizing and controlling access to data on the hard disk, and this choice is basically independent of the specific hardware being used--the same hard disk can be arranged in many different ways, and even multiple ways in different areas of the same disk.

    • Journaled file system - Basic definition (as seen here [])

      A file system in which the hard disk maintains data integrity in the event of a system crash or if the system is otherwise halted abnormally. The journaled file system (JFS) maintains a log, or journal, of what activity has taken place in the main data areas of the disk; if a crash occurs, any lost data can be recreated because updates to the metadata in directories and bit maps have been written to a serial log. The JFS not only returns the data to the pre-crash configuration but also recovers unsaved data and stores it in the location it would have been stored in if the system had not been unexpectedly interrupted.

• IBM's JFS webpage [] on their system, along with links for downloads and tutorials online, etc.

There is an awful lot of info at the SGI site []. Just poke around.

As far as the question about how to choose file systems, that is often a matter of what the OS will let you get away with, and your needs. Using FAT16 is recommended if you need to maintain compatibility with MSDOS, for example. Usually this comes up in a multi-boot scenario, where it matters which OSes can mount which partitions. MS is notoriously picky in this regard, with a "My way or the Highway" approach. For example, if you have a single hard drive hooked up to your computer for configuration purposes, you cannot just create an extended partition unless that drive is a slave with another master. If you want to create just an extended partition, it will not permit it, and will tell you that you can only create a primary DOS partition instead.

    So you Live and you Learn
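The journaled-filesystem definition quoted above boils down to write-ahead logging, which can be sketched in a few lines (Python; a deliberate toy - real journals log metadata block images to a reserved region of the disk, not tuples to a list):

```python
# Toy write-ahead journal: every update is appended to a serial log
# *before* the main data area is touched, so a crash can lose at most
# uncommitted work, and recovery is a fast log replay rather than a
# full fsck scan of the disk.

journal = []    # the serial log (always written first)
main_area = {}  # the main metadata area (written lazily, maybe never)

def commit(key, value):
    journal.append((key, value))  # log the intent; the update is now durable
    # ...the in-place write to main_area may happen much later.

def replay():
    for key, value in journal:    # after a crash, re-apply the whole log
        main_area[key] = value

commit("inode 7 size", 4096)
commit("bitmap block 3", 0b1011)
# -- simulated power loss: main_area was never written --
replay()
print(main_area)  # both committed updates recovered, no disk-wide scan
```

Replay cost is proportional to the (small) log, not the size of the disk, which is why journaled systems reboot quickly after a crash.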

    Check out the Vinny the Vampire [] comic strip

  • by fatphil ( 181876 ) on Tuesday May 01, 2001 @08:56AM (#253575) Homepage
    Be wary of statistics...

    For example, picked pretty much at random from the mongo results, Linux-2.4.2 Ext2 vs. ReiserFS-3.6:


    Create 203.88 / 187.01 = 1.09
    Copy 411.67 / 411.28 = 1.00
    Slinks 3.23 / 2.99 = 1.08
    Read 1165.61 / 1325.27 = 0.88
    Stats 1.49 / 1.48 = 1.01
    Rename 1.81 / 1.30 = 1.39
    Delete 14.46 / 5.64 = 2.56

    So the total time of the test is 1802.15 / 1934.97
    = 0.93. (i.e. Reiser is 7% slower performing the whole test.)

    I don't care if they make the thing that takes a tenth of a millisecond twice as fast, it's the reading of the bulk of the file that takes the most time, and for that part, Reiser is slower.

    However, each individual has got to look at what is most important for them, for me it's 99% file read time on medium to large files (30K source code, 200K log files, that kind of thing), and judge accordingly.
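For anyone who wants to check the parent's arithmetic, here is the same table as a quick script (Python; the times are copied from the post above, ext2 first, ReiserFS second):

```python
# Ext2 vs ReiserFS mongo times quoted above (seconds: ext2, reiser).
times = {
    "Create": (203.88, 187.01),
    "Copy":   (411.67, 411.28),
    "Slinks": (3.23,   2.99),
    "Read":   (1165.61, 1325.27),
    "Stats":  (1.49,   1.48),
    "Rename": (1.81,   1.30),
    "Delete": (14.46,  5.64),
}

ext2_total   = sum(e for e, r in times.values())
reiser_total = sum(r for e, r in times.values())

print(round(ext2_total, 2), round(reiser_total, 2))  # 1802.15 1934.97
print(round(ext2_total / reiser_total, 2))           # 0.93

# The big Delete win (14.46 / 5.64 = 2.56x) barely moves the total,
# because Read dominates the wall-clock time -- the parent's point.
```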

  • by Thax ( 189063 ) on Tuesday May 01, 2001 @08:49AM (#253576)
One of the big plusses that XFS has over Reiser and ext3 is that it is time tested. It has been used on IRIX machines for a long time and is rock solid on those platforms; it's been running under Linux for a long time in a beta stage, and now SGI appears to believe that it's as robust and usable in production environments.

    So what is one of its strongest strengths over the other journaling fs's?

    Time tested reliability.

  • by SuperBug ( 200913 ) on Tuesday May 01, 2001 @08:55AM (#253579) Homepage Journal
XFS has a few things that Reiser doesn't. The biggest I can think of are synchronous/asynchronous IO, modifiable journal sizes (at create time), and an entirely different journal design. So, while Reiser is WAY kewl, XFS offers more in the way of the capabilities Veritas offers. This I like because it's more like having a FREE version of Veritas. Most people don't use all the capabilities of that slow bloated beast anyway, so something which still journals like a champ, and is a bit faster overall than ReiserFS, is ok by me. I'd been using ReiserFS for several months, and am running it on my Oracle server. I recently reinstalled with the Red Hat XFS install CD that SGI put out and I must say it is definitely faster in many respects than Reiser. Also, the boot-time log checks after an error are almost un-noticeable. This I feel is a much needed change compared to that of ReiserFS or ext3. I still await Tux2, however, to hopefully be an inline replacement for ext2. But for now, my systems will use both ReiserFS and XFS for sure! Nothing but goodness so far!
  • Well.. as long as it's not affiliated with the XFL I guess it will be okay! Seriously, how does this file system compare to ext3? Should I just wait? :)
  • by Cardhore ( 216574 ) on Tuesday May 01, 2001 @05:09PM (#253591) Homepage Journal
I've tried ReiserFS since Linux 2.4.1 . . . big mistake. 2.4.1 corrupted the filesystem. 2.4.2 corrupted every one of my Reiser filesystems beyond recovery. Then I switched to Mandrake 8.0 (which caused an extremely large amount of corruption). It corrupted my EXT2 filesystems as well (which were fixable) due to IDE problems. 2.4.3 finally fixed these problems, but it had a new set of problems with threaded processes that got hung in the 'D' state. (A process in the 'D' state cannot be killed; it is necessary to reboot the machine. Processes like Mozilla were especially susceptible.) Also, if there was a problem with the filesystem, it was near impossible to fix with the ReiserFS tools.

Finally, 2.4.4 was released, and it is fixed: it's the first "stable" kernel in the new series.

    I never read a single bad review of ReiserFS until I actually used it--it worked "flawlessly" for everyone who had tried it. I didn't find out that it had these problems, and that it doesn't work over NFS, until it was too late.

    The thing I learned is that when things--especially filesystems--claim stability, the user still has to test things out for himself.

    ReiserFS is a good filesystem; don't get me wrong, but it may not be the best for you. (In fact, Red Hat does not plan to use ReiserFS in its distribution, because in the event of filesystem failure, it is near impossible to recover the filesystem with standard tools.)

    I have used XFS in the past on Irix machines and have been very happy with it. But be careful before you deploy this filesystem--even on your home machine--without thoroughly testing it. And not simply creating two files and saying, "Hey! They're still there! I guess it's stable." I fell into that trap.

I would highly recommend anyone running the 2.4 kernel to upgrade to at least 2.4.4, especially if he uses IDE or ReiserFS.

    If you're going to use XFS, test it first.

    By the way, does anyone know what's going on with moderation? I've had mod points three times this week, and there are a huge amount of +5 comments.

  • by mojo-raisin ( 223411 ) on Tuesday May 01, 2001 @09:14AM (#253592)
    My favorite utility [] of the xfs distribution. Where else could you find so much joy about a program that does nothing?
  • by Xibby ( 232218 ) <> on Tuesday May 01, 2001 @08:59AM (#253594) Homepage Journal
You can boot off of ReiserFS just fine with LILO. Earlier versions of LILO and Reiser required that you use the --notail option when mounting the root partition. Since this negates the usefulness of Reiser, it was recommended that you lop off a /boot partition and mount that with the notail option. I believe newer versions of LILO don't need the notail option, but the ReiserFS docs haven't been updated.

    ReiserFS really shines with lots of small files. (your mp3 collection for example) You'll generally reclaim some space on your drive when you go from ext2 or vfat to reiser.

    XFS is good when high performance is needed when dealing with large file systems (terabytes) and large files (1,2 gb files.) For a standard home user, it's overkill. :) But many slashdot readers like overkill...
  • by Daniel Phillips ( 238627 ) on Tuesday May 01, 2001 @02:13PM (#253596)
I keep hearing little tidbits about Tux2 (Tux2: The Filesystem That Would Be King). I can't find Daniel Phillips's website anymore, nor have I seen any more information about it in the last few months. What I have read and heard about it sounded very interesting (I would say it sounds promising, but I don't know enough about file systems to know how good an idea it is realistically). Would this be under "currently abandoned" or "abducted by aliens"?

    I took a several-month detour to build a new directory indexing system (Htree) for Ext2, something it needs badly. Now it's back to Tux2, keep tuned. The new homepage for the project is:

[]

    The mailing list is still hosted by innominate, but I am not with innominate any more. When I get time I'll move the list and resubscribe everybody.

  • by cmowire ( 254489 ) on Tuesday May 01, 2001 @09:31AM (#253599) Homepage
    If you watch SGI's strategy, they seem to be moving towards that direction. They are keeping Irix around primarily because Linux isn't ready and there aren't any good 64-bit processors out that fit with their business model other than a MIPS, right now.

    I mean, think about it. Why bother writing all of the kernels and utilities when you can have the hackers of the world pick up the slack? SGI can't put as many developers on Irix as MS can put on Windoze. So they are developing Irix only for the MIPS machines and keeping Linux for their Intel machines.

    And the strategy is pretty evident. They have been very supportive of good OpenGL under Linux. They have XFS, clustering software, etc. All of the Irix advantages are getting ported over.

The problem is that they haven't been able to move over to the Intel platform properly. Their first attempt was a fiasco. The Onyx 3000 series was designed to be a transitional system. It can work with either a MIPS processor or an Itanium. But the Itanium delays are making that hard. And, unlike the desktop workstations, you can't stuff a Pentium 4 in an Onyx because you need 64-bit addressing to make their NUMA architecture work -- each processor gets a piece of the address space. With a 512-processor Onyx 3000 and only 32 address bits, that makes 8 megs of RAM per processor. So Intel is holding up SGI's full migration to Linux.
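The 8 MB figure is just the address space divided evenly among the processors (a sketch; the 512-CPU count comes from the post above):

```python
# Why a 32-bit CPU can't drive a large partitioned NUMA address space:
# each processor owns an equal slice of the addressable range.
cpus = 512

slice_32bit = 2**32 // cpus   # 4 GB total split 512 ways
print(slice_32bit // 2**20)   # 8 -> only 8 MB of addressable RAM per CPU

slice_64bit = 2**64 // cpus   # the same split with 64-bit addressing
print(slice_64bit // 2**40)   # 32768 -> 32768 TB per CPU; no longer the limit
```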

Now, as far as the stability of SGI, I'm not entirely sure. They are still bleeding money, and at a faster rate than last year, too. Given the downturn in the tech economy, they are going to be hit with it, too. It's very shaky.
  • Avoiding any possible M$ jokes? You must be kidding. M$ is capable of producing software that will do nothing, unsuccessfully, a la "An unknown program has terminated in an unknown module with an unknown error. Please reboot your system."
  • by deran9ed ( 300694 ) on Tuesday May 01, 2001 @08:42AM (#253603) Homepage

I wonder if folks over at SGI plan on dropping IRIX in the near future for Linux entirely. As it stands right now, the majority of their hardware runs Linux, and the last version of IRIX released was mainly to fix bugs.

It's a shame that SGI has done pretty poorly the past few years, when they make such kick-ass machines; personally I think they should kick the marketing team's asses.

I know previously they've used a customized version of Windows exclusively on their 320/540 servers; I guess they changed em all around to avoid fireselling them at crackhead prices. Maybe someday I'll see a BSD running on an IRIX machine to see how it would run in comparison to Linux (don't bother to troll this post, this is not an OS-war-penis-envious-linux-vs-bsd post) as far as benchmarking is concerned. As for XFS support, I thought it was supported for reading and not writing? Oh well, I don't use Linux anymore ;)

In theory, yes (i.e. if you could stream data straight off the HD and out onto the wire). In practice, no. The data must first be read by an OS, is seldom contiguous, and has additional FS overhead to be read. There are additional delays in the network protocols: in the protocol layering (SMB/TCP/IP/802.3), in acknowledgements (for a reliable protocol like TCP, every byte that gets sent must be acknowledged), and in processing interrupts; the computer must also handle other things (GUI, mouse, etc.) and perform scheduling. Multiply all that overhead by 2, because the other computer needs to be *reading* that data, and usually also saving it to a hard disk. Add to that network collisions - on a shared 100 Mbit 802.3 LAN, the high number of collisions that start to occur as the LAN approaches >70% usage seriously degrades the network (there will be collisions even with only 2 PCs on the LAN copying between each other, because, remember, you have additional protocol overhead for ACKs etc).

So in practice it is nearly impossible to copy files from one PC to another over a LAN at anywhere near 10 MB/s. If you manage between 3 and 5 MB/s, then you are doing quite well. If you don't mind unreliable transport and you are streaming the data in one direction (e.g. with UDP), then you may be able to do a bit better than that, but in my experience it is *very difficult* for one computer to even approach saturating 100 Mbit, even with data unicast over UDP that isn't being read off a hard disk. You can try this yourself: write a simple UDP sockets app that just sends data on one end and receives it on the other, then output some stats on the number of bytes sent/received.

See Tanenbaum's "Computer Networks" (3rd Ed); he has a useful section on the performance of networks, basically discussing why most of the latency is in software, in the protocols etc.
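A quick sanity check of those numbers (Python; the frame sizes are standard Ethernet and TCP/IP constants): protocol headers alone only shave 100BaseT from 12.5 MB/s down to about 11.9 MB/s, so the drop to the 3-5 MB/s seen in practice is mostly the software costs described above.

```python
# 100BaseT raw rate vs. best-case TCP goodput.
link_bits_per_sec = 100e6
raw_mb_per_sec = link_bits_per_sec / 8 / 1e6
print(raw_mb_per_sec)  # 12.5

# Per 1500-byte Ethernet MTU: 40 bytes of TCP/IP headers inside the
# frame, plus 38 bytes on the wire around it (preamble 8, Ethernet
# header 14, FCS 4, inter-frame gap 12).
payload   = 1500 - 40
wire_cost = 1500 + 38

goodput = raw_mb_per_sec * payload / wire_cost
print(round(goodput, 1))  # ~11.9 -> headers are not where the 3-5 MB/s goes
```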

  • by janpod66 ( 323734 ) on Tuesday May 01, 2001 @02:57PM (#253612)
    As far as I'm concerned, journalling file systems are pretty useless. They only protect against a limited set of failures, so you still need backups or RAID for important data. They do save the overhead of running fsck at startup, but you spend more time in aggregate overhead than the occasional fsck. And journalling file systems generally only journal file system structure, not content, which means that partially written files may contain garbled content. Some file systems that claim they are journalling (NTFS, for example) don't even guarantee that operations are carried out (and journalled) in order, so they make essentially no meaningful guarantees compared to a non-journalling file system other than an unproven belief that they may be more robust.

    In addition to the overhead, you also have to deal with the risks to your data from the fact that both the file system code itself is more complex and that utility programs and administrative tools may do the wrong thing with journalling file systems.

    Altogether, I think you are better off with a RAID and a UPS; unless you have some serious failure, that will pretty much avoid the need for running fsck. If you have really critical needs, you will want a hot backup system that you can switch to if your primary system goes down anyway; that takes care of a lot of other problems and also lets you spend however much time you need on fsck.

    (As an aside, fast reboots can't have been a driving factor for JFS on AIX: while JFS may have spared people the time for an fsck on reboot, many AIX server machines spent minutes or hours (!) scanning their SCSI buses on each reboot. I think many people who use journalling file systems don't do it because they need it but because it sounds "safer".)

  • This becomes a serious question if you consider things like gigabit ethernet. We're also starting to see some hardware out there that makes IDE raid systems a little more reasonable.

    I looked into this a little bit for my research group at UIUC. We were wanting to buy some more disk space, somewhere between 400GB and 1TB. There were two options I considered.

    • Buy a bunch of SCSI disks and put them in our existing SCSI controller which has some free space. We would get a set of 6 drives, either 70GB or 160GB each. One would be redundant.
    • Buy an IDE raid server and run Linux on it. We could connect 6 80GB IDE drives to a 3ware IDE SCSI card or some such thing. Since IDE drives are about 1/4 the cost of SCSI drives, and 6 80GB drives cost about the same as the computer to support them, this ends up being half as expensive per GB. Some collaborators at Vanderbilt did this [] a year ago.

    In our situation we wanted to be able to process data as fast as possible. We have a growing collection of dual-PIII "compute servers" and divide our data amongst the computers. Typical jobs will run on a dozen of these computers (24 CPUs) and rip through data in either minutes, hours, or even months depending on the job. We are often I/O-bound.

    We went with the SCSI disks for a few reasons:

    • SCSI disks have their own internal cache and can read or write chunks out of sequence to minimize head travel. We were only guessing that this could be a big deal since we were often reading and writing a few dozen data streams at once, saturating the server. But we haven't done any tests so I don't know how big a factor this is in reality.
    • SCSI disks were hot-swappable - no downtime.
    • This solution is more scalable and convenient. One doesn't want to manage several disk servers if it's not necessary.
    • Our Sys. Admin. insisted, for these reasons and more.

    Of course without the infrastructure of our existing RAID box, the economy would slant much more toward the IDE RAID solution. And for a home environment I think smaller-scale things like the ABIT KT7A-RAID card might also become very handy. Last I heard, the RAID controller it used wasn't fully supported in Linux, but that information is probably out of date by now.

    We are currently using OSF1 for our server instead of Linux primarily because of the advanced filesystem: a 64-bit filesystem, ACLs, partitions that span multiple disks, and so on. It's good to hear that most of these advantages are now available to Linux, and XFS looks extremely promising. Keep up the good work, everyone!

  • by pat_1729 ( 416011 ) on Tuesday May 01, 2001 @08:50AM (#253614)

    XFS has ACLs, unlike ReiserFS and ext3 (last time I checked).

    Also, XFS comes with xfsdump and xfsrestore, which can back up and restore the ACLs. I believe it is the only ACL-enabled file system for Linux which has such utilities (unless you count AFS).

    So for a production environment where you want ACLs, XFS is the only choice right now. And it seems likely to remain so for a while.

    Also, Samba 2.2 has built-in integration with XFS ACLs, making Linux+XFS+Samba a very interesting option for replacing NT file servers.

  • by Professor Calculus ( 447783 ) on Tuesday May 01, 2001 @10:00AM (#253617)
ReiserFS is a unique (AFAIK) new design intended to boost performance for small-file accesses. It is highly optimized not only to reduce the number of seeks necessary to get to a small file but also to continually re-arrange the data for faster access using very advanced (and kinda slow) algorithms. Its approach is to use spare disk and CPU time to decrease file access time. It is not focused on large files and does little optimization for them.

XFS, on the other hand, was designed with today's multimedia systems in mind. It supports filesystems in the millions-of-terabytes range, and has highly scalable, optimized data structures for metadata and journaling. Thus it is able to run on multi-CPU and RAID servers, with the intention of actually serving multiple raw video or other high-bandwidth streams. It has a special subsystem that handles Guaranteed Rate I/O, with both soft (error-checked) and hard (will deliver bad data if needed) rate guarantees. I'm not sure if the GRIO stuff is fully supported in Linux right now, but it requires special system calls which Linux software obviously doesn't have support for yet anyway.

ReiserFS and XFS are both VERY cool, but they address completely different areas of file system optimization: desktops and multimedia servers. Use the right tool for the right task, etc., which means that XFS is the only choice for high-bandwidth Linux data servers at this point. That's why this is a big deal.

"We don't care. We don't have to. We're the Phone Company."