Large File Problems in Modern Unices
david-currie writes "Freshmeat is running an article on the problems with large-file support under some operating systems, and possible ways of dealing with them. It's an interesting look at some of the less obvious problems that distro-compilers have to face."
Re:Why large files (Score:5, Insightful)
And compressing video on the fly isn't feasible if you're going to be tweaking it, so that's why people use raw video (see the back-of-envelope sketch below for how fast that adds up).
-Mark
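For scale, here's a quick back-of-envelope program; the frame geometry is an assumption (roughly DV-class 720x480 at 30fps, 16 bits per pixel; real capture formats vary):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* assumed raw capture format: 720x480, 16 bits/pixel, 30 fps */
        const uint64_t width = 720, height = 480, bytes_per_px = 2, fps = 30;
        const uint64_t rate  = width * height * bytes_per_px * fps;
        const uint64_t limit = (1ULL << 31) - 1;   /* the 2GB file cap */

        printf("raw data rate: %llu bytes/sec\n", (unsigned long long)rate);
        printf("2GB cap hit after %llu seconds of footage\n",
               (unsigned long long)(limit / rate));
        return 0;
    }

At roughly 20 MB/s of raw data, the 2GB ceiling arrives in well under two minutes of footage.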
data warehouse, and any database for that matter (Score:5, Insightful)
the production database that drives the sites is like 100GB
welcome to last week. 2GB is tiny.
It's funny how some lamers don't listen... (Score:3, Insightful)
I can just laugh at them now...
Re:data warehouse, and any database for that matte (Score:2, Insightful)
I am not agreeing (or disagreeing) with the original post, but having a database > 2 GB has nothing to do with having a single file over 2 GB: most big databases spread their storage across many datafiles or raw partitions. A DB != a file system (except perhaps for MySQL, which stores each table in its own files).
Re:Why large files (Score:3, Insightful)
I can think of some:
And that's just off the top of my head... there are probably many more reasons why people would want files >2 GB.
Re:Wrong point of view. (Score:5, Insightful)
Video Editing
Daniel
Umm, scientific computing (Score:1, Insightful)
Think beyond the little toy that you use. These projects are using Unix (Solaris, Linux, BSD and even MacOSX) on clusters of hundreds or thousands of nodes.
Re:Wrong point of view. (Score:1, Insightful)
As opposed to a million 4k files that are each 1k of header?
Re:Why large files (Score:4, Insightful)
Who modded that as Insightful? Sure, if you're using a filesystem designed for floppy disks, it might not handle 2GB files well. In the old days, when all the metadata could fit in 5KB, a linked list of disk blocks was acceptable. But any modern filesystem uses tree structures, which make seeking within a large file faster than opening another file would be. Such a tree isn't complicated; even the Minix filesystem has one.
If you're still using FAT... bad luck for you. AFAIK Microsoft was stupid enough to keep using linked lists in FAT32, which certainly did not improve seek times. The sketch below contrasts the two approaches.
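Here's a minimal sketch of that difference; this is not any real filesystem's source, and the block size, field names, and the read_block()/fat_next() helpers are all hypothetical (bounds checks omitted for brevity):

    #include <stdint.h>

    #define BLOCK_SIZE   4096
    #define PTRS_PER_BLK (BLOCK_SIZE / sizeof(uint32_t))   /* 1024 */
    #define NDIRECT      12

    struct inode {
        uint32_t direct[NDIRECT];   /* first 12 blocks: found instantly   */
        uint32_t indirect;          /* a block holding 1024 more pointers */
        uint32_t double_indirect;   /* a block of pointers to indirects   */
    };

    extern void read_block(uint32_t blkno, uint32_t out[]);  /* one disk read */
    extern uint32_t fat_next(uint32_t blkno);                /* one FAT hop   */

    /* Tree-structured map (ext2/Minix style): any seek costs at most
     * two extra disk reads, no matter how big the file is. */
    uint32_t tree_block_for_offset(const struct inode *ip, uint64_t offset)
    {
        uint64_t n = offset / BLOCK_SIZE;
        uint32_t ptrs[PTRS_PER_BLK];

        if (n < NDIRECT)
            return ip->direct[n];
        n -= NDIRECT;
        if (n < PTRS_PER_BLK) {
            read_block(ip->indirect, ptrs);
            return ptrs[n];
        }
        n -= PTRS_PER_BLK;
        read_block(ip->double_indirect, ptrs);     /* extra read #1 */
        read_block(ptrs[n / PTRS_PER_BLK], ptrs);  /* extra read #2 */
        return ptrs[n % PTRS_PER_BLK];
    }

    /* FAT-style linked list: one table hop per block, so seeking to the
     * end of a 2GB file walks ~524,288 entries instead of 2-3 pointers. */
    uint32_t fat_block_for_offset(uint32_t first_blk, uint64_t offset)
    {
        uint32_t blk = first_blk;
        for (uint64_t n = offset / BLOCK_SIZE; n > 0; n--)
            blk = fat_next(blk);
        return blk;
    }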
Re:Wrong point of view. (Score:5, Insightful)
True, it looks like the optimal solution is lower-level partitioning (sketched below) rather than expanding the index to 64 bits (tests showed the latter is slower), but that still means the practical limit of 1.5-1.7 GB per file (you have to leave some safety margin) is far too constraining. I know of installations that could have 200GB files tomorrow if the tech were there (which it isn't, even with large file support).
I am also guessing that numerical simulations and bioinformatics apps can produce output files (which then need to be crunched down to something more meaningful to mere humans) in the TB range.
Computing power will never be enough: there will always be problems that are only just feasible with today's tech and that will keep getting easier with better, faster technology.
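A rough sketch of what that lower-level partitioning can look like, assuming nothing about any real product: present one logical stream, but store it as numbered chunk files each capped comfortably below 2GB (the 1.5GB margin and the data.NNNN naming are invented for illustration):

    #include <stdio.h>
    #include <stdint.h>

    #define CHUNK_LIMIT (1500ULL * 1024 * 1024)  /* stay well below 2GB */

    /* map a logical offset to (chunk file, offset within chunk) */
    static void locate(uint64_t logical, unsigned *chunk, uint64_t *off)
    {
        *chunk = (unsigned)(logical / CHUNK_LIMIT);
        *off   = logical % CHUNK_LIMIT;
    }

    int main(void)
    {
        unsigned chunk;
        uint64_t off;
        char name[64];

        /* where does the last byte of a 200GB logical file live? */
        locate(200ULL * 1024 * 1024 * 1024 - 1, &chunk, &off);
        snprintf(name, sizeof name, "data.%04u", chunk);  /* hypothetical */
        printf("logical byte lives in %s at offset %llu\n",
               name, (unsigned long long)off);
        return 0;
    }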
Re:Have you ever seen some people's email? (Score:3, Insightful)
They probably don't write emails but instead write Word documents and attach them to empty emails.
Re:Funny...in AIX... (Score:3, Insightful)
That's just so lame. We have XFS and JFS; you can keep your AIX and your expensive hardware.
Thanks.
It's all about efficiency. (Score:3, Insightful)
Programmers avoid using a bigger data type than the problem seems to need, because it either
A) Wastes Memory Space
B) Wastes Code Space
C) Wastes Pointer Space
D) Or violates some other tenet the programmer believes in
So, when they go out and create a file structure, or something similar, they see no reason to exceed whatever restriction is 'built in' to their way of thinking.
And usually, at the time, the limit is such a big number that the programmer can't think of any application that would exceed it.
Then one comes along and blows right through it (the little program below shows exactly where the 2GB number comes from).
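To make the 2GB number concrete, here's a minimal sketch of where it comes from and how the standard glibc large-file switch lifts it; _FILE_OFFSET_BITS is the real LFS mechanism, but treat the program as an illustration rather than any distro's actual build recipe:

    #define _FILE_OFFSET_BITS 64   /* ask glibc for a 64-bit off_t */
    #include <stdio.h>
    #include <stdint.h>
    #include <sys/types.h>

    int main(void)
    {
        /* a signed 32-bit offset ends at 2^31 - 1 = 2,147,483,647 bytes */
        printf("32-bit off_t max: %d bytes (the 2GB wall)\n", INT32_MAX);
        printf("64-bit off_t max: %lld bytes\n", (long long)INT64_MAX);
        printf("off_t on this build: %zu bytes wide\n", sizeof(off_t));
        return 0;
    }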
I've been amused by all the people jumping on the 'it don't need to be that big' bandwagon. I can think of many applications that would need ext3 or whatever to support big files. They include:
A) Database Servers
B) Video Streaming Servers
C) Video Editing Workstations
D) Photo Editing Workstations
E) Next Big Thing (tm) that hasn't come out yet.
Re:Error Prevention (Score:1, Insightful)
The 2 GByte limit came from a time when 14-inch disks held 30 MByte, and disk space and RAM were too precious to waste an extra 32 bits that would always be all zero for the foreseeable future.
The concept of a hard drive as large as 2 GByte was just silly back then; it would have filled the whole computer room. And in any case this is a limit on each file, not on the file system.
Re:It's all about efficiency. (Score:3, Insightful)
We have code for infinite-precision integers. The problem is, if it were used in filesystem code, you still couldn't do real-time video or DVD burning, because the computer would spend too long on the infinite-precision arithmetic.
As long as you're careful, picking a "really huge" fixed limit and raising it when you finally hit it is usually good enough (the toy benchmark below illustrates the cost gap).
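A toy micro-benchmark sketch of that cost gap, assuming GMP is installed (compile with -lgmp); the absolute timings are machine-dependent and purely illustrative:

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>
    #include <gmp.h>

    #define ITERS 10000000

    int main(void)
    {
        clock_t t0 = clock();
        volatile uint64_t off = 0;
        for (int i = 0; i < ITERS; i++)
            off += 4096;                      /* advance one block */
        double fixed = (double)(clock() - t0) / CLOCKS_PER_SEC;

        mpz_t big;
        mpz_init_set_ui(big, 0);
        t0 = clock();
        for (int i = 0; i < ITERS; i++)
            mpz_add_ui(big, big, 4096);       /* same work, bignum style */
        double bignum = (double)(clock() - t0) / CLOCKS_PER_SEC;
        mpz_clear(big);

        printf("uint64_t: %.3fs  gmp mpz: %.3fs\n", fixed, bignum);
        return 0;
    }

Fixed-width addition typically wins by a wide margin, which is why offset arithmetic in a filesystem's hot path sticks to plain machine integers.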