XFS 1.0 is Released
Isldeur was the first of many to note that SGI's now open source journaling file system "XFS" has reached version 1.0. XFS, Reiser, and the new ext format continue to be an area of debate, but regardless, journaling file systems are nice for eliminating those slow fsck boot-ups, and for protecting all your pr0n when you lose power and realize that you plugged the UPS into your stereo by mistake (not that I've done that. No sir.)
Re:Booting (Score:1)
Re:okay okay.... I'm not informed... (Score:1)
Re:Standardization (Score:1)
This filesystem agnosticism is a wonderful and fairly unique feature of Linux, not a problem, and it's not a new thing - just the luxury of many journalling systems is new.
Real issue is HARD DRIVE CACHEs. (Score:1)
Re:do nothing, successfully (Score:1)
ext3, JFS, and ReiserFS all have "real" fsck programs
Re:okay okay.... I'm not informed... (Score:2)
Re:Nice FS, shame about the license (Score:2)
Re:okay okay.... I'm not informed... (Score:3)
Re:Nice FS, shame about the license (Score:3)
Had they placed it under a BSD license, effort that has been put into producing an open and free filesystem could be closed by a company such as Microsoft. Why should I let them profit if they don't contribute or at least acknowledge my work?
As it is, xfs is under the GNU GPL and is thus protected from being made proprietary. The GPL protects the rights of free software authors. Myself, and thousands of other free software developers worldwide, wouldn't have it any other way.
Re:journaling is nice, but how about a better RAID (Score:3)
# hdparm -Tt /dev/hda
/dev/hda:
Timing buffer-cache reads: 128 MB in 0.89 seconds =143.82 MB/sec
Timing buffered disk reads: 64 MB in 2.94 seconds = 21.77 MB/sec
Observe the time taken during the buffered disk read test - 21.77 MB/sec. This is on my year-old Athlon system. AFAIK 100mbit networks don't tend to transfer data at speeds faster than 10 MB/sec. Perhaps you meant that you upgraded your home network to gigabit. Or not.
Use hdparm to ensure that your hard drive is set to use DMA.
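For anyone who wants to check the arithmetic, here is a rough sketch of the comparison. It only redoes the division on the hdparm figures quoted above and converts the nominal 100 Mbit link speed to bytes; the function names are made up for illustration, and real-world network throughput will be lower than the ceiling once protocol overhead is counted.

```python
# Back-of-the-envelope comparison of the two rates discussed above.
# Assumption: the network figure is the raw 100 Mbit/s line rate with no overhead.

def mbit_to_mbyte_per_sec(mbit_per_sec: float) -> float:
    """Convert a link speed in megabits/s to megabytes/s (8 bits per byte)."""
    return mbit_per_sec / 8.0


def transfer_rate(megabytes: float, seconds: float) -> float:
    """Average rate in MB/sec, the same division hdparm reports."""
    return megabytes / seconds


fast_ethernet_ceiling = mbit_to_mbyte_per_sec(100)  # theoretical upper bound
disk_rate = transfer_rate(64, 2.94)                 # buffered disk read test

print(f"100 Mbit ceiling: {fast_ethernet_ceiling:.2f} MB/sec")
print(f"disk read rate:   {disk_rate:.2f} MB/sec")
print("disk outruns the network:", disk_rate > fast_ethernet_ceiling)
```

So even a year-old IDE drive can outrun a 100 Mbit link, which is the point being made.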
Filesystem info (Score:5)
Re:okay okay.... I'm not informed... (Score:5)
The eventual plan is to have Linux for the Intel servers and IRIX on the MIPS ones, with IRIX being phased out over a long period of time so as to keep the old customers from getting paranoid. There's even rumors internally about servers with *BOTH* intel and MIPS processors in them running Linux. If you watch SGI's Linux pages, you'll notice that more and more support is made available for running Linux on R10K, R12K and other heavy-duty processors, not to mention SGI's memory architectures (e.g. ccNUMA).
My own theory is that the now-EOLed 320/540 workstations were an experiment to see how SGI's customer base would react to non-MIPS/IRIX workstations and to get everyone warm to the idea of SGI branching out.
SGI is a company to watch over the next few years, and releasing things like open-sourced XFS for Linux are just teasers of what's to come.
Re:okay okay.... I'm not informed... (Score:1)
That's not unique to SGI. Any quality hardware RAID controller has an onboard battery backup. Even the ones meant for PeeCees.
Re:2.4.4 is 3 days old. (Score:1)
Careful moderators!!!!! (Score:2)
SGI's XFS page: (Score:4)
- A.P.
--
Forget Napster. Why not really break the law?
Re:What happened to Tux2? (link) (Score:3)
http://www.kernelnewbies.org/~phillips/ [kernelnewbies.org]
What happened to Tux2? (Score:4)
tux2 looked really good. Supposed to be faster than traditional journaling, and preserves file data as well as metadata.
Anyone?
The Current Tally... (Score:5)
(I don't count NTFS, because that is hard-pushed enough to be called a genuine filesystem, never mind a journalling one.)
Feel free to reply to this, adding any that I've missed.
The Logging filesystem does much the same thing as Ext3 - it is an extension to Ext2 - but it looks like it would be a lot more useful than Ext3. IMHO, it'd be much better if neither of them were so FS-specific and could be used as a generic wrapper. SnapFS does exactly this, for example.
Anyway, on with the list of Journalling Filing Systems...
-1, Plagiarist karma whore (Score:2)
Re:works with samba 2.2 acl's ? (Score:2)
Oh, and as for fs conversion... (Score:2)
No, sorry.
Re:Mime types built into the file system? (Score:2)
Debian Install Disks? (Score:3)
GRIO / Realtime (Score:3)
Re:2.4.4 is 3 days old. (Score:3)
You can patch whatever you want, it's a question of how many conflicts you need to resolve.
What about the Red Hat 7.1 kernel? Would be nice if there was a patch for that.
ftp://linux-xfs.sgi.com/projects/xfs/download/R
Re:Tar Ball, etc (Score:4)
Go to http://oss.sgi.com/projects/xfs for the info.
Get SGI's installer (Score:4)
The other option, of course, is to have lots of extra space, install your distro, boot an XFS capable kernel, make some XFS filesystems, and copy everything over.
Re:Booting (Score:5)
2.4.4 is 3 days old. (Score:5)
On the other hand, the devel cvs tree is usually updated within a few days of a new kernel release. As soon as the kinks get out of XFS+2.4.4, it'll be in the devel cvs tree.
The majority of our 1.0 testing has been done on 2.4.2, so we have the most confidence in XFS there. We also have a 2.4.3 patch which should be fine, although it has not had as much direct testing.
We realize that there are issues with 2.4.2 (loop device, anyone?) If you're concerned about fix-ups, and you run an RPM-based system, you might take a look at the Red Hat kernel RPMs we offer - those include a ton of patches from Red Hat - essentially the same kernel as shipped with 7.1, with XFS added.
If you're concerned about netfilter, just get the patch - I would be very surprised if it conflicted in any way with an XFS-enabled kernel.
Re:2.4.4 is 3 days old. (Score:5)
It's certainly a priority, but I can't really give you a timeline - we're working on it.
XFS is so big and touches so much
XFS is big, no getting around that, but we're making efforts to keep our modifications in the kernel to a minimum (actually, a lot of that work is done already). The patch is also now split into 2 pieces, one for "core linux" changes and one for the filesystem itself. That was done for a couple of reasons, but a nice side effect is that it's a little easier to see how much is XFS itself, and how much is linux changes.
Re:Nice FS, shame about the license (Score:3)
For a different scenario, imagine a BSD licensed unix. Now imagine several large corporations taking that great technology and using it. Sounds great, right? Now fast forward a couple of years, and you find that every one of these corporations has expanded on the original BSD licensed unix in a proprietary fashion in attempts to maintain and expand their customer base. Even if most of the corporations would have preferred to maintain their software as BSD licensed, their hands are forced when the first of the corporations starts spitting out proprietary, incompatibly feature enhanced versions. Admins find themselves trapped, having to either understand and maintain several incompatible systems, or going with one vendor and getting gouged for prices. Suddenly, WinNT 3.51 pops up, and although much worse technology, it runs on cheap hardware, costs less, and is far easier to administer.
Sure, you can say the customer should have used the original, still BSD licensed software, but in reality, most customers can't code, and are going to go with the commercially supported software, because the added features and/or lower administrative costs make the commercial software (at least initially) cheaper than going with the BSD stuff.
Re:NT (Score:2)
Umm, you asked, little brain. I responded. Yes, it was worth a breath to explain that an open standard filesystem has more value than a proprietary filesystem on a shitOS that's a fair to middlin' product when configured "optimally". I wouldn't expect a naive schoolgirl to understand that.
Don't forget GFS (Score:2)
--
I hope we shall crush in its birth the aristocracy of our monied corporations
Tux2 (Score:4)
Would this be under "currently abandoned" or "abducted by aliens"?
GeoffEG
Re:okay okay.... I'm not informed... (Score:2)
However, I can tell you that XFS is a really great filesystem. We have over 100 Irix/XFS systems deployed at television stations around the world. These systems are very I/O intensive, and I have never had a corrupt filesystem. Performance is also very good.
I would really like to see SGI copy more features from Irix into Linux, such as fine control over process priorities, something standard linux distributions are severely lacking, imo.
Criteria how to choose file systems? (Score:2)
Re:pondering (Score:2)
All this moderation is getting bad posts modded up waaay too much...
As it stands right now the majority of their hardware run Linux
Umm, not really [sgi.com]. And the hardware that does run Linux runs it with beta-type quality.
the last version of Irix released was to mainly fix bugs.
No shit. They release quarterly maintenance releases - in the 6.5.X series - up to 6.5.11 now.
They will probably drop IRIX someday down the road. But since the workstation market imploded on them, SGI is trying to make money off of servers. I don't think that Linux is going to be running (release quality) 256-way Origin servers any time in the "near future".
Re:IRIX 6.5.11? (Score:2)
11 quarters = 2.75 years. IIRC, NT 4.0 was released in 1996 or so, and 6+ service packs later we have Win2K :)
IRIX 6.5 was a major release - a lot changed from 6.2, which was their previous major release.
I'm not sure if SGI is planning any major new releases for IRIX in the immediate future - you can read about their roadmap here [sgi.com] (pdf). My nick is irix because it is the first UNIX I used back when. I'd change it now if I could ;)
Re:Booting (Score:2)
GRUB, on the other hand, actually reads the file system, like the BSD bootloaders, and at this point, it doesn't know how to read XFS.
Re:VMWare (Score:2)
Booting (Score:5)
Performance (Score:2)
Re:okay okay.... I'm not informed... (Score:3)
Not quite. The log/journal is structurally different than the main data areas, with different synchronization and performance characteristics. Writing once to the log and once to the main data area is quite different than writing twice to the main data area.
However, an observation very similar to yours is behind log-structured filesystems. In other words, if you're going to write all the data to the log in a highly robust etc. way, why not just make the log the authoritative copy of the data? There's a whole lot of gunk that has to be worked out after that, such as how you find data and how you reclaim log space, but it all flows pretty cleanly from that initial idea. The result is pretty nifty for some kinds of workloads, but in general changing OS structures and their effects on I/O patterns have sort of left log-structured filesystems behind.
If you're interested in exploring further, the seminal papers in this area are The Design and Implementation of a Log-Structured File System [nec.com] by Rosenblum et al, and (IMO even better) An Implementation of a LogStructured File System for UNIX [nec.com] by Seltzer et al. Enjoy!
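To make the "log is the authoritative copy" idea a bit more concrete, here is a toy sketch. It is not taken from either paper: the class and method names are invented, and a real log-structured filesystem deals with segments, inodes, and on-disk layout that this ignores. It only shows the two bits of "gunk" mentioned above, finding data (an index into the log) and reclaiming log space (a cleaner that copies live records forward).

```python
class LogStructuredStore:
    """Toy model: an append-only log is the only copy of the data."""

    def __init__(self):
        self.log = []    # append-only list of (key, value) records
        self.index = {}  # key -> position of the newest record for that key

    def write(self, key, value):
        # Every write is an append; older records for the key become garbage.
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1

    def read(self, key):
        # Finding data means consulting the index, then reading the log.
        return self.log[self.index[key]][1]

    def clean(self):
        # Reclaim space by copying only the live records into a fresh log.
        live_positions = sorted(self.index.values())
        new_log = [self.log[pos] for pos in live_positions]
        self.index = {key: i for i, (key, _) in enumerate(new_log)}
        self.log = new_log


store = LogStructuredStore()
store.write("a", 1)
store.write("b", 2)
store.write("a", 3)          # supersedes the first record for "a"
records_before = len(store.log)
store.clean()
records_after = len(store.log)
```

After the cleaner runs, the stale record for "a" is gone but reads still return the newest values.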
Re:okay okay.... I'm not informed... (Score:3)
Interesting. I dunno about the SGI product, but the EMC Symmetrix takes a different approach. It has enough reserve power so that if it detects loss of external power it will immediately flush its cache to special areas on disk. Then, the first thing it does when it comes back up is slurp all that data back into cache - which not only ensures data stability but preloads the cache for you as well. Cool. I've heard that in a simulated blackout in a big data center everything would get eerily quiet *except* for the Symmetrix, which would actually get extra-loud as it does the flush.
Disclaimer: I work for EMC. I don't speak for them, they don't speak for me, yadda yadda yadda.
Re:Real issue is HARD DRIVE CACHEs. (Score:4)
That's an important issue. I'll try to provide a couple of answers.
Well, there are at least two ways:
SCSI gives you other options as well. For example, if you're using tagged command queuing, you can set FUA only on the last command of a sequence (e.g. a transaction). That way, you can allow the disk or storage subsystem to do appropriate reordering, combining, etc. and you'll still be sure that by the time that last command completes all the commands logically ahead of it (as specified by the tags) have completed as well. It's tres cool, and it's one of SCSI's biggest benefits compared to IDE.
Tagged command queuing also comes in handy if you have to force write caching off - which BTW is common and not particularly difficult on either SCSI or IDE drives. Since you're now forced to deal with full rotational latency, the importance of overlapping unrelated operations (by putting them on different queues) becomes even greater.
Tsk tsk, that's a shame. It's pretty common knowledge among storage types, but still far from universal. Go look on comp.arch.storage and you'll see a recurring pattern of people finding this out for the first time and sparking a brief flurry of posts by asking about it.
The problem with having the drive notify the host that a write has been fully destaged is that target-initiated communication (aside from reconnecting to service an earlier request) is poorly supported even in SCSI. Hell, it's even hard to talk about it without tripping over the "initiator" (host) vs. "target" (disk) terminology. Most devices lack the capability to make requests in that direction, and most host adapters (not to mention drivers) lack support for receiving them. AEN was the least-implemented feature in SCSI.
There's also a performance issue. Certainly you don't want to be generating interrupts by having the disk call back for *every* request, but only for selected requests of particular interest. So now you need to add a flag to the CDB to indicate that a callback is required. You need to go through the whole nasty SCSI standards process to determine where the flag goes, how requests are identified in the callback, etc. Then you need every OS, driver, adapter, controller, etc. to add support for propagating the flag and handling the callback. Ugh.
It's a great idea, really it is. It's The Right Way(tm). But it's just never going to happen in the IDE world, and it's almost as unlikely in the SCSI/FC world. 1394 seems a little more amenable to this, but I have no idea whether it's actually done (I doubt it) because even though I know they exist I've never actually seen a 1394 drive close up.
I hope all this helps shed some light on the subject.
Re:okay okay.... I'm not informed... (Score:5)
So does XFS. From one of SGI's own presentations [sgi.com]:
[emphasis added]
This is *normal* for a journaling filesystem. Very very few actually log or otherwise protect file data, because of the cost. Maintaining a metadata-only log is already a significant performance limiter, and journaling data as well would just be prohibitively expensive. Most users wouldn't even want it, if they had to pay the performance cost.
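A toy illustration of what "metadata-only" means in practice. This is a sketch under heavy assumptions, not how XFS or any real filesystem is implemented: the class, its fields, and the crash flag are all hypothetical. The point is only that replaying the log brings the directory structure back to a consistent state, while the file contents that never reached the disk are not recovered.

```python
class MetadataJournalFS:
    """Toy model of metadata-only journaling (hypothetical, for illustration)."""

    def __init__(self):
        self.journal = []    # durable log of metadata operations
        self.directory = {}  # "on-disk" metadata: file name -> size in bytes
        self.data = {}       # "on-disk" file contents

    def create(self, name, contents, crash_before_data=False):
        # The metadata change is logged first (write-ahead).
        self.journal.append(("create", name, len(contents)))
        if crash_before_data:
            return  # simulated power loss: nothing else reaches the disk
        self.directory[name] = len(contents)
        self.data[name] = contents

    def recover(self):
        # Replay the journal so the metadata is consistent again.
        for op, name, size in self.journal:
            if op == "create":
                self.directory[name] = size
        self.journal.clear()


fs = MetadataJournalFS()
fs.create("a.txt", b"hello", crash_before_data=True)
fs.recover()
metadata_ok = fs.directory.get("a.txt") == 5
data_survived = "a.txt" in fs.data
```

Here the file exists with the right size after recovery, but its contents are not there, which is the trade-off described above.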
XFS installer (Score:5)
I haven't tried the 1.0 release yet. There's only so many hours in the day. On the other hand, the last install I did with the beta, after installing everything I wanted, I fired up a dozen programs such as Mozilla, GIMP, Nautilus, etc. While the drive was churning, I hit the power switch. For those of you who have used ReiserFS, I'm sure this is no big deal.
It should be noted that on my Athlon 800MHz w/ 128MB of RAM and a 27GB hard drive, I almost missed the filesystem check as it scrolled by on bootup. That had me sold forever on journaling filesystems.
I haven't seen any visible performance difference though. There may be, but so much has changed on my system that any subjective comparisons are almost impossible/meaningless. For example, devfs is enabled by default, there's a more up-to-date kernel and the drive has a different partition layout. Who could tell what the FS performance difference may be. I definitely don't need to go back to ext2 just to see if my switchover was justified. Any more info will just be icing.
If someone wants to post "real" benchmarks (lies, damn lies, and all that) I'd love to see them too.
Ext3 (Score:2)
But how do you USE it? (Score:2)
How do you get them to work with, say, RedHat, from an installation standpoint? (I imagine it's relatively easy to convert an extra disk attached to an already installed Linux box, but what about making your whole system with the new FS?)
At no point does RedHat ask me which filesystem I'd like to install, so that option is out (except for Mandrake and Suse?).
Can you convert a drive you've already got data on? Could I simply point at my disk drive and say, "turn that into an XFS drive," edit a few boot params, and be done?
Surely, it's more complicated.
Have any of you done something similar?
Any recommendations on how to get it working with the least amount of hassle?
Just curious.
Re:pondering (Score:2)
I wonder if folks over at SGI plan on dropping Irix in the near future for Linux entirely. As it stands right now the majority of their hardware run Linux, and the last version of Irix released was to mainly fix bugs.
The only SGI systems that run Linux right now are the low-end Intel workstations (230, 330, and 550) and the Intel rack-mount servers (1100, 1200, 1400, 1450) - certainly not a "majority of [our] hardware".
IRIX on MIPS is not going anywhere. Take a look at SGI's IRIX/MIPS roadmap. [sgi.com]
Unofficial patch (Score:4)
XFS 1.0 is against kernel 2.4.2. Or 2.4.3, but SGI says it may be unstable with this version.
But the current kernel is 2.4.4 (or 2.4.4-ac2).
And 2.4.4 fixes important problems that previous kernels had. For instance, it fixes serious security flaws in Netfilter.
So, today, you can either run XFS, or get a fixed kernel. Not both.
This is why I'll stay with ReiserFS, until XFS gets officially included in the kernel.
Yes. (Score:3)
And if you want someone better... SourceForge's FTP site is half on ReiserFS. So it works for them.
Install Disks? (Score:3)
The Debian disks are on Freshmeat and work GREAT.
Re:Real issue is HARD DRIVE CACHEs. (Score:2)
Re:okay okay.... I'm not informed... (Score:2)
Re:okay okay.... I'm not informed... (Score:2)
works with samba 2.2 acl's ? (Score:2)
So did anyone try it yet? Does it work?
Re:do nothing, successfully (Score:2)
RedHat + XFS + LVM (Score:2)
For those of you who don't know what LVM is: with LVM, you make a disk into a pile of storage blocks, then you put those blocks into a pool. That pool is a block device, and you can create a file system on it.
The nice thing is that you can add blocks from many different drives into the pool, so a volume can span multiple physical disks. Need more space on root? Just pop a new disk into the system, add it to the pool, and expand the XFS filesystem into the new space. All without losing any data (indeed, even without downing the system...)
This is different from RAID in that with RAID, you cannot add disks to the array and get more storage. You can add more disks to serve as hot spares, but you don't get any more space without rebuilding the array (and losing your data).
Of course, the best thing is RAID-->LVM-->XFS, but....
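If the pool idea is hard to picture, here is a small model of it. This is not the LVM tooling or its API, just a sketch with invented class names: disks contribute fixed-size extents to a pool, and a volume grows by taking extents from whichever disks have them free.

```python
class VolumeGroup:
    """Toy pool of storage extents contributed by one or more disks."""

    def __init__(self):
        self.free_extents = []  # list of (disk_name, extent_number)

    def add_disk(self, name, extents):
        # Carve the disk into extents and throw them all into the pool.
        self.free_extents.extend((name, i) for i in range(extents))


class LogicalVolume:
    """Toy volume that can span extents from several physical disks."""

    def __init__(self, group):
        self.group = group
        self.extents = []

    def extend(self, count):
        if count > len(self.group.free_extents):
            raise ValueError("not enough free extents in the pool")
        for _ in range(count):
            self.extents.append(self.group.free_extents.pop(0))

    def disks_used(self):
        return sorted({disk for disk, _ in self.extents})


pool = VolumeGroup()
pool.add_disk("hda", 4)
root = LogicalVolume(pool)
root.extend(4)             # root fills the first disk
pool.add_disk("hdb", 4)    # pop in a new disk
root.extend(2)             # grow root into the new space; old extents untouched
```

The volume ends up spanning both disks, and the extents it already had are never moved, which is why no data is lost when it grows. The filesystem on top still has to be told to expand into the new space.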
Re:Booting is tough (Score:2)
I also, just today, compiled 2.4.4 w/ knfsd patches; seems fine too.
Re:Performance (Score:2)
Uhm, I just tried the link. It's not broken. It works. Maybe you should restart your proxy or something.
Re:Standardization (Score:2)
Seriously, anyone who is so uneducated as to be confused or bewildered by having to choose from Gnome and KDE will not be fiddling around with filesystems anyway. They'll use whatever Mandrake gives them on "beginner" or "automatic" install.
Re:Booting is tough (Score:3)
Yes this was an issue with Reiser, but they have had patches for it since 2.4.2 to work with NFS, and I believe that full NFS support might be in 2.4.4 (not sure).
Re:Performance (Score:4)
Create 203.88 / 171.95 = 1.19
Copy 411.67 / 384.59 = 1.07
Slinks 3.23 / 2.81 = 1.15
Read 1165.61 / 1291.76 = 0.90
Stats 1.49 / 1.17 = 1.27
Rename 1.81 / 1.32 = 1.37
Delete 14.46 / 3.95 = 3.66
As an aside, it's pretty hard to get much faster than ext2 for this statistic, reading of bulk files greater than 10k less than 100k. You need to weigh what reiserfs gives you against what it could slow down. Reiserfs has truly awesome small file speed, and very nice tail packing.
Re:Performance (Score:5)
Don't bother with journalling! (Score:2)
Journalling is tricky, as it requires lots of intervention at other places in the kernel. You need to keep something synchronous - journalling just makes that something very small. Atomic updates avoid synchronous issues altogether. Instead, they structure the file system in groups of data and metadata. In each group, there is an atomic bit. When set, it means the group is intact. So, upon looking through the groups, you can immediately determine which ones are intact and which are incomplete. Recovery is REALLY fast after a power outage, in theory even faster than a journal recovery.
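Here is one way to read that description as code. It is only a sketch of the idea as stated in this comment, with made-up names, and it should not be taken as the actual Tux2 design: each group carries a flag that is cleared before an update and set again as the very last step, so recovery only has to look at the flags.

```python
class Group:
    """A group of data and metadata blocks guarded by one 'intact' flag."""

    def __init__(self, blocks=()):
        self.blocks = list(blocks)
        self.intact = True

    def update(self, new_blocks, crash_midway=False):
        self.intact = False  # mark the group as in flux
        if crash_midway:
            # Only part of the update reaches the disk; the flag stays clear.
            self.blocks = list(new_blocks[: len(new_blocks) // 2])
            return
        self.blocks = list(new_blocks)
        self.intact = True   # the flag is flipped last


def incomplete_groups(groups):
    # Recovery is a scan of the flags, with no log to replay.
    return [i for i, group in enumerate(groups) if not group.intact]


groups = [Group(["a"]), Group(["b"]), Group(["c"])]
groups[0].update(["a1", "a2"])
groups[1].update(["b1", "b2"], crash_midway=True)
damaged = incomplete_groups(groups)
```

Only the group that was interrupted shows up as damaged, and what to do with it (roll back to an older copy, discard it) is a policy decision the sketch leaves out.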
ReiserFS and XFS are also really great, so these have log structure (or btree) and journalling. However, ReiserFS is broken with NFS constantly, and that is a BIG problem. Not to mention the version in 2.4.x is incompatible with the version in the 2.2.x tree. Don't let the XFS 1.0 version fool you. Ever see the fallout when Alexander Viro (kernel VFS hacker) takes a newly merged filesystem to task ?? It is not pretty.
Tux2 is still vaporware. But it will be great when it comes out. Ext3 has some advantages. It has been running stably for a long time now under development. It is journaled, and has a small code base. It also only exists for the 2.2 kernel series. Phillips is also making a judgment call. He wants to build on ext2 with tux2. Ext2 is not log structured, which is why ReiserFS can beat it in well-structured benchmark tests run by Hans.
And the future for linux file systems?? I don't know, it is always interesting to see where things will head. The world is clamoring for easy crash recovery, and ext2's days are numbered. I think most people would be quite happy to simply add journaling to ext2. Or atomic updates. So I predict, after consulting the crystal ball, that tux2 develops a large following after release, and that Phillips then adds btree searches and log structuring, making it the first linux file system with all that. That would then bring the state of the art file systems for linux up to par with those of FreeBSD. Of course, in linux at that time you can also use JFS, XFS, ReiserFS, or ext3 journaled file systems. But journaling is worse than atomic updates, both for complexity and speed. Soft updates are more flexible than journaling, and - with a filesystem whose basic structures are designed to take advantage - perform better than journaling. I find it just slightly weird that there's so much focus on journaling when a superior alternative is known.
Re:journaling is nice, but how about a better RAID (Score:2)
I don't see how things can possibly get much better.
Re:XFS (Score:4)
Re:okay okay.... I'm not informed... (Score:5)
Re:journaling is nice, but how about a better RAID (Score:2)
Jeremy
Re:journaling is nice, but how about a better RAID (Score:2)
I don't know what world you're living in, but I get 20mb/sec throughput on my ibm 75gxp. And that's on a plain old ATA33 controller with dual celerons 366.
5400rpm drives haven't been "very fast" for at least 5 years.
Jeremy
Re:okay okay.... I'm not informed... (Score:3)
That sucked
But i agree with you in general: XFS rocks. We were one of the first XFS customers on irix 5.3, and it didn't rock quite as much back then, but by the time irix 6.2 shipped it was pretty fantastic
I remember reading a post in comp.sys.sgi.{something} from one of the SGI guys... to the effect of "we have XFS doing sustained write performance of 2gb / second here in the lab"
That rules.
Re:Performance (Score:3)
I mentioned this in another post, but SGI claimed internally to have it sustaining 2gb/sec of _write_ performance across a suitably large number of spindles.
Also, one thing i don't see people mentioning is XFS's support for GRIO (guaranteed rate I/O). No linux filesystem has that, and the linux kernel plumbing to support it i think is SGI contributed (if it's on xfs for linux yet, i can't recall).
The idea of grio is an app says ahead of time "i need this much disk performance - figure it out", and the OS will say "yes, i can hook you up" or "sorry, throw more money at the problem".
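That yes/no conversation is basically admission control. A toy version, with the loud caveat that this is not the GRIO API and every name and number here is invented: the system knows how much bandwidth it can promise, and a reservation is granted only if it still fits.

```python
class RateReservations:
    """Toy admission control for guaranteed-rate I/O (hypothetical names)."""

    def __init__(self, capacity_mb_per_sec):
        self.capacity = capacity_mb_per_sec
        self.reserved = {}  # app name -> MB/sec already promised

    def request(self, app, mb_per_sec):
        # Grant the reservation only if the total promised rate still fits.
        if sum(self.reserved.values()) + mb_per_sec > self.capacity:
            return False    # "sorry, throw more money at the problem"
        self.reserved[app] = self.reserved.get(app, 0) + mb_per_sec
        return True         # "yes, i can hook you up"


disk = RateReservations(capacity_mb_per_sec=40)
first = disk.request("video_capture", 25)
second = disk.request("playback", 20)   # would exceed 40 MB/sec in total
third = disk.request("audio", 10)
```

The hard part in a real system is knowing the capacity and then actually scheduling I/O so the promises are kept; the sketch covers only the bookkeeping.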
Sistina/UMN GFS too (Score:3)
-dB
Re:Nice FS, shame about the license (Score:2)
What he means is that they could do an M$-kerberos on it. Take it, tweak it to be incompatible to external parties, and then M$ effectively gets a free, better performing proprietary filesystem, for close to zero research dollars.
Re:okay okay.... I'm not informed... (Score:2)
Re:Booting (Score:4)
Since when do mp3s count as small files? Most of the interesting ones are several MB in size- much larger than a typical block size- so the extent to which ReiserFS would actually help is minimal. Where something like ReiserFS would be really helpful is in dealing with directories like /etc and /dev that are full of a large number of very small files. It might also be useful for /var, as ISTR that it has some anti-fragmentation aspects to it that would help with the rapidly changing data there.
Re:Booting is tough (Score:4)
It's probably still a way to go until it's well integrated with the distributions, but I think this FS has potential. Unlike Reiser, it currently works with NFS.
I guess it's a race to see which of these will ultimately become the common denominator FS for linux. Reiser currently has the lead, due to Suse and being in the kernel.
Re:IRIX 6.5.11? (Score:2)
Why?
Because 6.5 is the best thing SGI has ever done. There is an update released quarterly, on time, along two different streams: [M]aintenance and [F]eature. No patch dependency hell (though urgent patches are posted between 6.5.X updates). The Feature release gets new goodies every quarter, some of which are slowly rolled into Maintenance. Both streams are HEAVILY tested, tuned, reviewed, and compiled using the latest available MIPSpro compilers. In fact, SGI likes to stay a release or two ahead of their users, using it on most of their production systems to ensure a rock solid release.
The (large) IRIX 6.5 team has been plodding away ever since 6.5.0. When asked "where's 6.6" they will usually respond: "you didn't ask us to break application compatibility, so we aren't working on a 6.6".
I wouldn't want it any other way. Aside from a few small networking issues early on, 6.5 has been rock solid for me over the years. Each quarterly update has been surprise-free and without incident.
SGI MIPS/IRIX Roadmap:
http://www.sgi.com/developers/feature/2000/irix
The Mandate of Application Compatibility in SGI IRIX 6.5
(An excellent whitepaper on the goals and future of IRIX 6.5, written by an IRIX 6.5 engineer)
http://techpubs.sgi.com/library/tpl/cgi-bin/bro
A new pop craze? (Score:4)
Oh. Wait. Journaling file system? oops... never mind.
/* Steve */
okay okay.... I'm not informed... (Score:3)
Thanks
Re:Criteria how to choose file systems? (Score:5)
Some basic info and a couple of links for folks:
A file system in which the hard disk maintains data integrity in the event of a system crash or if the system is otherwise halted abnormally. The journaled file system (JFS) maintains a log, or journal, of what activity has taken place in the main data areas of the disk; if a crash occurs, any lost data can be recreated because updates to the metadata in directories and bit maps have been written to a serial log. The JFS not only returns the data to the pre-crash configuration but also recovers unsaved data and stores it in the location it would have been stored in if the system had not been unexpectedly interrupted.
As far as the question about how to choose file systems, that is often a matter of what the OS will let you get away with, and your needs. Using FAT 16 is recommended if you need to maintain compatibility with MSDOS, for example. Usually, this is something like if you have a multi boot scenario, and which OSen can mount which partitions. MS is notoriously picky in this regard, with a "My way or the Highway" approach. For example, if you have a single hard drive hooked up to your computer for configuration purposes, you cannot just create an extended partition unless that drive is a slave with another master. If you try to create just an extended partition it will not permit it, and will tell you that you can only create a primary DOS partition instead.
So you Live and you Learn
Check out the Vinny the Vampire [eplugz.com] comic strip
Re:Performance (Score:4)
For example, picked pretty much at random from the mongo results, Linux-2.4.2 Ext2 vs. ReiserFS-3.6:
parameters:
files=15168
base_size=10000 bytes
dirs=86
Create 203.88 / 187.01 = 1.09
Copy 411.67 / 411.28 = 1.00
Slinks 3.23 / 2.99 = 1.08
Read 1165.61 / 1325.27 = 0.88
Stats 1.49 / 1.48 = 1.01
Rename 1.81 / 1.30 = 1.39
Delete 14.46 / 5.64 = 2.56
So the total time of the test is 1802.15 / 1934.97 = 0.93 (i.e. Reiser is 7% slower performing the whole test).
I don't care if they make the thing that takes a tenth of a millisecond twice as fast, it's the reading of the bulk of the file that takes the most time, and for that part, Reiser is slower.
However, each individual has got to look at what is most important for them, for me it's 99% file read time on medium to large files (30K source code, 200K log files, that kind of thing), and judge accordingly.
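For anyone who wants to redo the sums, this snippet just adds up the per-phase times from the table above (Ext2 first, ReiserFS second) and takes the ratio. The dictionary layout and variable names are mine; the numbers are copied from the mongo results as posted.

```python
# Per-phase times copied from the mongo table above.
ext2 = {
    "Create": 203.88, "Copy": 411.67, "Slinks": 3.23, "Read": 1165.61,
    "Stats": 1.49, "Rename": 1.81, "Delete": 14.46,
}
reiserfs = {
    "Create": 187.01, "Copy": 411.28, "Slinks": 2.99, "Read": 1325.27,
    "Stats": 1.48, "Rename": 1.30, "Delete": 5.64,
}

total_ext2 = sum(ext2.values())
total_reiserfs = sum(reiserfs.values())
overall_ratio = total_ext2 / total_reiserfs

# Fraction of the whole ReiserFS run spent in the Read phase.
read_share = reiserfs["Read"] / total_reiserfs

print(f"Ext2 total:     {total_ext2:.2f}")
print(f"ReiserFS total: {total_reiserfs:.2f}")
print(f"ratio:          {overall_ratio:.2f}")
print(f"read share:     {read_share:.0%}")
```

The Read phase dominates the total, which is why the big wins on Rename and Delete barely move the overall ratio.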
FatPhil
Time tested (Score:4)
So what is one of its strongest strengths over the other journaling fs's?
Time tested reliability.
Re:okay okay.... I'm not informed... (Score:5)
yes, XFS and RAID are ok (Score:2)
http://linux-xfs.sgi.com/projects/xfs/faq.html [sgi.com]
XFS (Score:2)
Re:my exp with reiser (Score:3)
Finally, 2.4.4 was released, and it is fixed: it's the first "stable" kernel in the new series.
I never read a single bad review of ReiserFS until I actually used it--it worked "flawlessly" for everyone who had tried it. I didn't find out that it had these problems, and that it doesn't work over NFS, until it was too late.
The thing I learned is that when things--especially filesystems--claim stability, the user still has to test things out for himself.
ReiserFS is a good filesystem; don't get me wrong, but it may not be the best for you. (In fact, Red Hat does not plan to use ReiserFS in its distribution, because in the event of filesystem failure, it is near impossible to recover the filesystem with standard tools.)
I have used XFS in the past on Irix machines and have been very happy with it. But be careful before you deploy this filesystem--even on your home machine--without thoroughly testing it. And not simply creating two files and saying, "Hey! They're still there! I guess it's stable." I fell into that trap.
I would highly recommend anyone running the 2.4 kernel to upgrade to at least 2.4.4, especially if he uses IDE or ReiserFS.
If you're going to use XFS, test it first.
By the way, does anyone know what's going on with moderation? I've had mod points three times this week, and there are a huge amount of +5 comments.
do nothing, successfully (Score:5)
Re:Booting (Score:4)
ReiserFS really shines with lots of small files. (your mp3 collection for example) You'll generally reclaim some space on your drive when you go from ext2 or vfat to reiser.
XFS is good when high performance is needed when dealing with large file systems (terabytes) and large files (1-2 GB files). For a standard home user, it's overkill.
Re:Tux2 (Score:4)
I took a several-month detour to build a new directory indexing system (Htree) for Ext2, something it needs badly. Now it's back to Tux2, keep tuned. The new homepage for the project is:
http://nl.linux.org/~phillips/tux2 [linux.org]
The mailing list is still hosted by innominate, but I am not with innominate any more. When I get time I'll move the list and resubscribe everybody.
--
Re:pondering (Score:5)
I mean, think about it. Why bother writing all of the kernels and utilities when you can have the hackers of the world pick up the slack? SGI can't put as many developers on Irix as MS can put on Windoze. So they are developing Irix only for the MIPS machines and keeping Linux for their Intel machines.
And the strategy is pretty evident. They have been very supportive of good OpenGL under Linux. They have XFS, clustering software, etc. All of the Irix advantages are getting ported over.
The problem is that they haven't been able to move over to the Intel platform properly. Their first attempt was a fiasco. The Onyx 3000 series was designed to be a transitional system. It can work with either a MIPS processor or an Itanium. But the Itanium delays are making that hard. And, unlike the desktop workstations, you can't stuff a Pentium 4 in an Onyx because you need 64 bit addressing to make their NUMA architecture work -- each processor gets a piece of the address space. With 32 bit addressing on a 512 processor Onyx 3000, that makes 8 megs of RAM per processor. So Intel is holding up SGI's full migration to Linux.
Now, as far as the stability of SGI, I'm not entirely sure. They are still bleeding money, and at a faster rate than last year, too. Given the downturn in the tech economy, they are going to be hit with it, too. It's very shaky.
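The 8 megs figure above appears to come from splitting a 32 bit address space evenly across 512 processors; that is my reading of the arithmetic, not something the post spells out. A minimal sketch of the calculation (the function name is made up for illustration):

```python
MIB = 2 ** 20
TIB = 2 ** 40


def address_space_per_cpu(address_bits, cpu_count):
    """Bytes of address space each CPU gets if the space is split evenly.

    Assumption: one flat address space shared equally by all CPUs.
    """
    return (2 ** address_bits) // cpu_count


# 32 bit addressing, 512 processors
print(address_space_per_cpu(32, 512) // MIB, "MiB per CPU")  # 8 MiB per CPU

# 64 bit addressing, 512 processors
print(address_space_per_cpu(64, 512) // TIB, "TiB per CPU")  # 32768 TiB per CPU
```

With 64 bits the per-processor slice stops being the limiting factor, which is the point the post is making about waiting for Itanium.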
Re:do nothing, successfully (Score:2)
pondering (Score:4)
I wonder if folks over at SGI plan on dropping Irix in the near future for Linux entirely. As it stands right now the majority of their hardware runs Linux, and the last version of Irix released was mainly to fix bugs.
It's a shame that SGI has done pretty poorly the past few years, when they're such kick ass machines, and personally I think they should kick the marketing team's asses.
I know previously they've used a customized version of Windows exclusively on their 320/540 servers, I guess they changed em all around to avoid fireselling them at crackhead prices. Maybe someday I'll see a BSD running on an Irix machine to see how it would run in comparison to Linux (don't bother to troll this post, this is not an OS war-penis-envious-linux-vs-bsd-post) as far as benchmarking is concerned. As for XFS support I thought it was supported for reading and not writing? Oh well I don't use Linux anymore.
Re:journaling is nice, but how about a better RAID (Score:2)
In theory, yes (i.e. if you could stream data straight off the HD and out onto the wire). In practice, no. The data must first be read by an OS, is seldom contiguous, and has additional FS overhead to be read. There are additional delays in the network protocols (e.g. in the protocol layering, SMB/TCP/IP/802.3; also, for a reliable protocol like TCP, every byte that gets sent must be acknowledged) and in processing interrupts, and the computer must also process other things (GUI, mouse etc.) and perform scheduling. Also, multiply all the overhead by 2, because the other computer needs to be *reading* that data, and usually also saving it to a hard disk. Add to that network collisions: on an 802.3 100 Mbit LAN, the high number of collisions that start to occur as the LAN approaches > 70% usage starts to seriously degrade the network (there will be collisions even with only 2 PCs on the LAN copying between each other, because remember, you have additional protocol overhead for ACKs etc.).
So in practice it is nearly impossible to copy files from one PC to another over a LAN at anywhere near 10 MB/s. If you manage between 3 and 5 MB/s, then you are doing quite well. If you don't mind unreliable and you are streaming the data in one direction (e.g. with UDP), then you may be able to do a bit better than that, but in my experience I've found that it is *very difficult* for one computer to even approach saturation of a 100 Mbit link, even with data unicast with UDP that isn't being read off a hard disk. You can try this yourself: write a simple UDP sockets app that just sends data on one end, and receives data on another, then output some stats on the number of bytes sent/received.
See Tanenbaum's "Computer Networks" (3rd Ed), he has a useful section on the performance of networks, basically discussing why most of the latency is in software, in the protocols etc.
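For anyone who wants to try the experiment, here is a rough sketch of such a UDP test in Python. It is only an illustration: the function name, the 1 KB datagram size and the 4 MB default are my own choices, and as written it runs sender and receiver on the loopback interface of one machine, so it measures software overhead rather than a real LAN. To test an actual network you would split the two halves across two hosts.

```python
import socket
import threading
import time


def measure_udp_throughput(total_bytes=4 * 1024 * 1024, chunk_size=1024,
                           host="127.0.0.1"):
    """Blast total_bytes over UDP and report how much was sent and received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind((host, 0))      # port 0: let the OS pick a free port
    receiver.settimeout(0.5)      # give up once the sender has gone quiet
    port = receiver.getsockname()[1]
    received = [0]                # list so the thread can update it in place

    def receive_loop():
        while True:
            try:
                data = receiver.recv(65535)
            except socket.timeout:
                break
            received[0] += len(data)

    thread = threading.Thread(target=receive_loop)
    thread.start()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"x" * chunk_size
    sent = 0
    start = time.perf_counter()
    while sent < total_bytes:
        sent += sender.sendto(payload, (host, port))
    elapsed = time.perf_counter() - start
    sender.close()

    thread.join()                 # wait for the receiver to time out
    receiver.close()
    return {
        "sent": sent,
        "received": received[0],
        "seconds": elapsed,
        "send_mb_per_s": sent / elapsed / 1e6,
    }


if __name__ == "__main__":
    stats = measure_udp_throughput()
    print(stats)
```

UDP gives no delivery guarantee, so "received" can come out lower than "sent" even on loopback; the gap is part of what the test is meant to show.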
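To put a number on the header part of that overhead, here is a back-of-the-envelope sketch. It assumes standard Ethernet framing with a 1500 byte MTU and plain 20 byte IP and TCP headers with no options; none of these figures come from the post, and the function name is made up.

```python
# Per-frame Ethernet cost in bytes: preamble + start delimiter (8),
# header (14), frame check sequence (4), inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12
MTU = 1500                    # largest IP packet per frame (assumed)
IP_TCP_HEADERS = 20 + 20      # IPv4 + TCP headers, no options (assumed)


def max_tcp_goodput(link_bits_per_s):
    """Upper bound on TCP payload bytes per second from headers alone."""
    payload = MTU - IP_TCP_HEADERS        # 1460 bytes of user data
    on_wire = MTU + ETHERNET_OVERHEAD     # 1538 bytes occupying the link
    return link_bits_per_s / 8 * payload / on_wire


print(round(max_tcp_goodput(100e6) / 1e6, 2), "MB/s")
```

On a 100 Mbit link this comes out at roughly 11.9 MB/s against a raw 12.5 MB/s, so headers alone cost only about 5%. If real transfers manage 3 to 5 MB/s, the rest of the loss has to come from somewhere else, which fits the point about software and protocol processing.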
journalling file systems are pretty useless (Score:3)
In addition to the overhead, you also have to deal with the risks to your data: the file system code itself is more complex, and utility programs and administrative tools may do the wrong thing with journalling file systems.
Altogether, I think you are better off with a RAID and a UPS; unless you have some serious failure, that will pretty much avoid the need for running fsck. If you have really critical needs, you will want a hot backup system that you can switch to if your primary system goes down anyway; that takes care of a lot of other problems and also lets you spend however much time you need on fsck.
(As an aside, fast reboots can't have been a driving factor for JFS on AIX: while JFS may have spared people the time for an fsck on reboot, many AIX server machines spent minutes or hours (!) scanning their SCSI buses on each reboot. I think many people who use journalling file systems don't do it because they need it but because it sounds "safer".)
Investigating RAID solutions (Score:2)
I looked into this a little bit for my research group at UIUC. We were wanting to buy some more disk space, somewhere between 400GB and 1TB. There were two options I considered.
In our situation we wanted to be able to process data as fast as possible. We have a growing collection of dual-PIII "compute servers" and divide our data amongst the computers. Typical jobs will run on a dozen of these computers (24 CPUs) and rip through data in either minutes, hours, or even months depending on the job. We are often I/O-bound.
We went with the SCSI disks for a few reasons:
Of course without the infrastructure of our existing RAID box, the economy would slant much more toward the IDE RAID solution. And for a home environment I think smaller-scale things like the ABIT KT7A-RAID card might also become very handy. Last I heard, the RAID controller it used wasn't fully supported in Linux, but that information is probably out of date by now.
We are currently using OSF1 for our server instead of Linux primarily because of the advanced filesystem: a 64-bit filesystem, ACLs, partitions that span multiple disks, and so on. It's good to hear that most of these advantages are now available to Linux, and XFS looks extremely promising. Keep up the good work, everyone!
Re:okay okay.... I'm not informed... (Score:5)
XFS has ACLs, unlike ReiserFS and ext3 (last time I checked).
Also, XFS comes with xfsdump and xfsrestore, which can back up and restore the ACLs. I believe it is the only ACL-enabled file system for Linux which has such utilities (unless you count AFS).
So for a production environment where you want ACLs, XFS is the only choice right now. And it seems likely to remain so for a while.
Also, Samba 2.2 has built-in integration with XFS ACLs, making Linux+XFS+Samba a very interesting option for replacing NT file servers.
Re:okay okay.... I'm not informed... (Score:3)
XFS, on the other hand, was designed with today's multimedia systems in mind. It supports systems in the millions of terabytes range, and has highly scalable, optimized data structures for metadata and journaling. Thus it is able to run on multiple CPU and RAID servers, with the intention to actually be serving multiple raw video or other high bandwidth streams. It has a special subsystem that handles Guaranteed Rate I/O, with both soft (error checked) and hard (will deliver bad data if needed) data rate guarantees. I'm not sure if the GRIO stuff is fully supported in linux right now, but it requires special system calls which linux software obviously doesn't have support for yet anyway.
ReiserFS and XFS are both VERY cool, but they address completely different areas of file system optimization: desktops and multimedia servers. Use the right tool for the right task, etc., which means that XFS is the only choice for high bandwidth linux data servers at this point. That's why this is a big deal.