Examining Mac OS X 10.4's Spotlight

Posted by michael on Mon Nov 08, 2004 05:04 AM
from the seek-and-ye-shall-find dept.
Ton writes "Apple has published a discussion of Spotlight, the radical systemwide search technology that will be part of Mac OS X 10.4 'Tiger'. The really interesting part is that metadata will be playing a big role in Spotlight while just a few years ago people were afraid metadata in Mac OS X was going the way of the dodo."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Reiser (Score:5, Interesting)

    by Anonymous Coward on Monday November 08 2004, @05:24AM (#10753010)
    From reading the article, I think Hans Reiser has been right about the need for reiser4 on mainstream linux.

    He saw all this stuff coming from way back. If you read the LKML, you will remember that he warned us.

    It's a pity no one listens to him.
  • by siliconjunkie (413706) on Monday November 08 2004, @05:27AM (#10753025)
    The post links to the Apple Spotlight page that has been there for months. Is THIS [apple.com] the "discussion" that is being referred to in the post?
  • by Anonymous Coward on Monday November 08 2004, @05:30AM (#10753033)
    >>> "Apple has published a discussion of Spotlight, the radical systemwide search technology that will be part of Mac OS X 10.4 'Tiger'.

    What's really funny is that there's no link to the actual published discussion... but anyway...

    http://developer.apple.com/macosx/tiger/spotlight.html [apple.com]
  • by tuite (802608) on Monday November 08 2004, @05:31AM (#10753040) Homepage
    I read about Beagle for Linux; it seems to be very similar in functionality. http://www.gnome.org/projects/beagle/ [gnome.org]
  • FYI... (Score:5, Informative)

    by jonr (1130) on Monday November 08 2004, @06:38AM (#10753221) Homepage Journal
    Just a small bit of info. The brain behind Spotlight is Dominic Giampaolo [nobius.org], the same guru who wrote the fantastic BeFS for BeOS.
    • Re:FYI... (Score:4, Informative)

      by pchan- (118053) on Monday November 08 2004, @08:21AM (#10753579) Journal
      more accurately, Giampaolo was the guy that re-wrote the BeFS, after a filesystem based on a database proved to be too slow. his book (Practical Filesystem Design) is very enlightening for people interested in these types of things, and is now a free download pdf on his website.

      for non-beos users, here's what you need to know about befs (note that it was pretty much complete by 1995):
      1) FAST. super fast. seriously.
      2) 64 bit, with support for giant volumes and files (10 years ago!)
      3) journaled filesystem. no fsck, no corruption on crash (trust me, my daily use system had bad ram for a while and crashed hourly).
      4) metadata built in and instantly accessed. change the name of a file or any other metadata, and all your "live queries" would reflect the changes.

      how long must my linux desktop wait for what beos had 10 years ago?
      • Which explains why it's tied to the filesystem rather than using a general hook at the vnode layer to allow the same functionality to be implemented regardless of the filesystem in use.

        Wow. Check it out. Everything you said here is completely 100% wrong.

        Spotlight is filesystem-independent. It runs as a set of daemons and stores its metadata database in a hidden directory called ".Metadata" at the root level of the volume.

        All your "could be" talk is basically a summary of how Spotlight works.
  • by Richard_at_work (517087) <richardprice@ g m ail.com> on Monday November 08 2004, @06:56AM (#10753268)
    because I'm a potential switcher. I purchased a B&W 350MHz PowerMac last week to see if MacOSX was really as good as it's made out to be here on Slashdot. The system is intended to let me try out OSX and a few other apps, so the speed isn't really an issue, and I've chucked a GB of RAM in there anyway.

    Coming from a WindowsXP background, some things I've noticed so far:
    • Clicking the 'X' doesn't actually close the application. This annoyed me to start with, but I've slowly gotten used to it.
    • Having to select the application window before I can quit it using the application menu. Or I have to right click on the dock icon to quit. Annoying still.
    • Love the dock. It's just ..... right.
    • Most of the file system is hidden from you, which I like. Put my data where I want it and ignore the rest.
    • The ability to access the underlying BSD OS easily. Love it.
    • Everything looks and feels 'polished'. That's what I always hated about KDE/Gnome when I tried them: the features were there, but no one had taken the time to step back and polish the entire thing off so it all looks and feels together.
    • Every time I boot the Mac, my TFT display is 'wavy' until I have the monitor do an autoadjust. Don't really know whose fault this is, though it's fine under Windows and Linux.
    So, final conclusion? I love it, so much that I have already placed an order for a G5 iMac. And in the meantime, I've purchased a G4 upgrade for this little baby, just to help it along :) If you are wondering what OSX is like, go grab a cheap Mac off of eBay and try it out. 233 iMac for £99? [ebay.co.uk], 333 iMac for £110? [ebay.co.uk] (both the same person, which isn't me, I have no affiliation with this person at all. - notice added for the pedantic slashdotters who hate to see someone else profit)
    • by Anonymous Coward on Monday November 08 2004, @07:05AM (#10753299)
      Clicking the 'X' doesn't actually close the application. This annoyed me to start with, but I've slowly gotten used to it.

      If you want to quickly quit a load of apps or switch application, hit cmd-Tab, and then cycle through the apps with the tab key.

      However you have one gig of RAM on the system. You have no need to quit the programs when switching between them. They'll be paged out to disk as necessary if you manage to fill the available RAM. Multi-tasking works very well as processes aren't in general allowed to hog the processor.

      I think this is a common thing amongst people who're used to windows - the windows in OS X represent documents, not applications, so that's why they can be closed without quitting the application. You will find Apple managed to balls this up by being inconsistent though - some applications DO quit on closing the window, but in theory they're applications which only have one window, and are utilities, like the Address Book.

      Be sure to try expose as well, though I doubt it'd work well on that older system.
      http://www.apple.com/macosx/features/expose/
    • by Mark Hood (1630) on Monday November 08 2004, @08:02AM (#10753478) Homepage
      Having to select the application window before I can quit it using the application menu. Or I have to right click on the dock icon to quit. Annoying still.

      OK, use Splat-Tab (Apple/Command/Cloverleaf, call it what you will) to switch between apps. When you get to the one you want, hold down Splat and press Q. It quits the application. Press H instead and it Hides it. There's more of these... [macosxhints.com]

      Hope this helps.. It seems this is OS X 10.3 only, so you might want to check out LiteSwitch X [proteron.com] which does the same thing.

      Mark

    • by Unxmaal (231) on Monday November 08 2004, @08:23AM (#10753598) Homepage
      Clicking the 'X' doesn't actually close the application. This annoyed me to start with, but I've slowly gotten used to it.

      As someone replied earlier, this is a new paradigm in app management: the top menu controls the application, and the window menu controls the window. More importantly, OSX apps are designed to be left open -- keep them open, close or hide their windows, and they'll use virtually no resources, but will start significantly faster the next time you use them.

      Having to select the application window before I can quit it using the application menu. Or I have to right click on the dock icon to quit. Annoying still.

      Learn your keyboard shortcuts. Take the ten minutes to learn them, and you'll regain hours of your time. Cmd-Q is the shortcut for quit, for example. If you're used to Windows machines, you can switch the cmd key with the Windows key.

      Love the dock. It's just ..... right.

      Check out Quicksilver, from http://quicksilver.blacktree.com . Once you get used to it [and once it gets used to you], it's phenomenally faster than the Dock.

      The ability to access the underlying BSD OS easily. Love it.

      iTerm, from http://iterm.sourceforge.net , is a great OSX terminal app.

      Here [unxmaal.com]'s a list of favorite OSX apps I posted a while back. Most are free/OSS, and they're all some of the best apps for any platform.

  • by DarthBobo (152187) on Monday November 08 2004, @07:23AM (#10753355)
    Unless you used BeOS in the past!

    This really is a big deal, much bigger than Microsoft's feeble attempts at full text search, or Google's desktop search. In many ways this is much, much more useful than full-text search, especially for developers.

    At home I have about 6,000 MP3s, 1,000 photos, 500 scientific articles in PDF format and hundreds of Word files that I need to juggle. Each one has its own metadata database, and none of them are updated in real time.

    Databases:
    MP3 - WinAmp & AudioTron
    Photos - Photoshop
    PDFs - Acrobat Indexer
    Word files - MS Indexer

    That doesn't include any of the other data that is stored completely in databases and would have been easier to store in the file system - like email, guitar tab files and god knows what else.

    A properly implemented global meta-data store (that works at the filesystem level, not as an iterative service) profoundly changes how one uses the system, making sorting and finding data actually almost pleasurable.

  • Quicksilver (Score:5, Informative)

    by smartin (942) on Monday November 08 2004, @07:28AM (#10753375)
    This has already been done to some extent in Quicksilver.

    http://quicksilver.blacktree.com/

    It's an app that indexes parts of your file system and supports plugins to index application data. The best part is that it is keyboard based. For example, type command-space "slash" enter and it fires off Safari opening /.

    I'm not sure how Apple will improve on this.
  • by Zedrick (764028) on Monday November 08 2004, @07:31AM (#10753383)
    What's up with apple and German tanks? First the Panther (http://www.achtungpanzer.com/pz4.htm#panther [achtungpanzer.com]) and now the Tiger (http://www.achtungpanzer.com/tigerp.htm [achtungpanzer.com]). What's next, the Leopard? When apple releases Mac OS 1x.x Leopard II, then I'm buying a Macintosh!
  • disk space (Score:5, Interesting)

    by devonbowen (231626) on Monday November 08 2004, @07:41AM (#10753415) Homepage
    Anyone have an educated guess of how much disk space this is going to use? I mean both for the meta-data db and the full-content db.

    Devon

  • by TVC15 (518429) on Monday November 08 2004, @08:45AM (#10753735)

    I've tried Spotlight and suggest that when it comes out, every time you step away from your computer make sure to lock your screen. All someone has to do is type 'porn' into the little search toolbar and within seconds it's all nicely listed.

    Perhaps Apple needs to add a feature to turn off indexing for certain directories. ;-)
  • Plug-Ins (Score:4, Interesting)

    by Feneric (765069) on Monday November 08 2004, @09:25AM (#10753990) Homepage

    How well this system works will in part depend upon how many data format plug-ins are provided. For example, take something like the SID audio format. It's relatively unknown, but has an officially registered MIME type with IANA [iana.org] giving it a status above many other file format types, and it is used to provide background sounds on some web sites. Will it make the cut?

    This is just one file format chosen at random. There are thousands out there, some of which are used pretty heavily for documentation in certain circles. How about all of the OpenOffice file formats, or the AbiWord format?

    I can see this feature being hugely useful if Apple does a good job of providing plug-ins, and making it easy for third-parties to add more.

  • The really interesting part is that metadata will be playing a big role in Spotlight while just a few years ago people were afraid metadata in Mac OS X was going the way of the dodo.

    The kind of metadata that was almost deprecated by Apple isn't quite the same thing as the "modern" concept of metadata. The classical HFS metadata covered concepts like file type, file creator, and "Finder bits" that aren't handled at the file system level in other OSes. This, combined with the Mac OS's historical use of resource forks for storing developer-defined data records, made preserving such data difficult or impossible in heterogeneous environments like the Internet. It's really a shame; I've always thought this concept was the most elegant attempt to solve the problem of "rich data" associated with data files without requiring the data in the file itself to have some form of universal container format.

    The metadata concept used by Spotlight is going to be based in part on a plug-in system that allows the Mac OS to reconstruct metadata information from the data within files themselves, rather than just using the metadata facilities provided by HFS and Mac OS resource forks. That means that each different kind of file, from Word documents to PDFs to Postscript jobs, needs its own special kind of processing to read its own format of storing such data. It's less elegant and more processor intensive than just using the historical HFS system, but it's more likely to be useful for extracting metadata from files provided by Windows and other Unix variant users.
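
    To make the per-file-type idea concrete, here is a minimal Python sketch of a plug-in registry. It is not Apple's importer API; the names `IMPORTERS`, `importer` and `extract_metadata` are made up for illustration, and the two toy importers only stand in for real format parsers.

```python
import os

# Hypothetical registry: file extension -> importer function.
IMPORTERS = {}

def importer(*extensions):
    """Register a function as the metadata importer for the given extensions."""
    def register(func):
        for ext in extensions:
            IMPORTERS[ext.lower()] = func
        return func
    return register

@importer(".txt")
def import_text(data: bytes) -> dict:
    # Toy importer: derive a word count from plain text.
    text = data.decode("utf-8", errors="replace")
    return {"kind": "text", "word_count": len(text.split())}

@importer(".csv")
def import_csv(data: bytes) -> dict:
    # Toy importer: read column names from the first line.
    lines = data.decode("utf-8", errors="replace").splitlines()
    header = lines[0].split(",") if lines else []
    return {"kind": "table", "columns": header, "rows": max(len(lines) - 1, 0)}

def extract_metadata(filename: str, data: bytes) -> dict:
    """Dispatch to the importer for this file type, if one is registered."""
    ext = os.path.splitext(filename)[1].lower()
    meta = {"name": filename, "size": len(data)}  # generic attributes
    handler = IMPORTERS.get(ext)
    if handler is not None:
        meta.update(handler(data))  # type-specific attributes
    return meta
```

    A file type with no registered importer still gets the generic attributes, which is roughly the trade-off described above: coverage depends on how many formats someone bothers to write a plug-in for.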

  • Still needs work (Score:4, Interesting)

    by xnot (824277) on Monday November 08 2004, @10:03AM (#10754323)
    I'm not convinced yet apple is going to get Spotlight right, i.e. truly revolutionary. It has potential (smart Finder folders is on the right path) but at the moment, it seems they are more interested in simply trying to duplicate Quicksilver/Launchbar technology, which is the wrong way to do this.

    I'm tired of apple ripping off ideas from developers without (A) Giving them credit or (B) developing something equivalent so the new is at least as full-featured as the old. Based on apple's history, the first version of Spotlight will likely be a horribly dumbed down version of Launchbar in terms of tech, since apple is obsessed with "ease of use": i.e. a three year old has to be able to work it.

    Rant aside, there are a few key pieces I think apple is missing:

    (1) User-created metadata. I should be able to tag anything I want with any metadata I want so the organization system follows ME and MY preferences, instead of the system determining it for me. Apple should be thinking about taking the insanely wonderful metadata system they created in iTunes and applying that to the finder. It is essential you be able to tag metadata in, because you don't always access the same objects for the same purposes.

    (2) Flexible file system. This is a concept I've developed which basically says that the file system should be dynamic and adaptable to match the thought flow of the user (only possible with a good metadata file system). If you've ever seen this app on the PC, think: "The Brain". What that means is that if apple does #(2) right, it should be easy as hell to tag things, and then basically I can create relationships which let me "flow" through my files by navigating CONCEPTS instead of folder hierarchy. A good app that does this is Devonthink. Devonthink will grab the contents out of your files, and when you do a search, you can not only see your search term but "related" search terms. Click on a new search term and you get a new listing. So as you come up with ideas about what you want to do, you can easily and naturally branch off into other parts of your file system. This methodology models the way the human brain actually works: thinking in concepts and spatial organization, rather than structure. (The "flexible" comes because the system takes your tags and adapts the search around them, allowing you to change how the "flow" works, depending upon what topics are most important to you.)

    (3) The next level after metadata search is a new way of visually interpreting the metadata and the relationships between them. Which means a NEW FINDER. I can't believe Steve actually threw this comment out after demoing Spotlight: "With this, you probably won't even need to use the finder any more." Well then why even have the Finder at all, Steve?! There IS a reason for the finder, which is why it's stayed around all these years, and that is that people think SPATIALLY. People are creatures of habit, and one way we remember where things are is if we know where to look for it and it's always in the same place. Which means there needs to be a visual grounding to the above dynamic file system, to give people a sure footing to all of this. I'm talking about things like a window that always stays in the same spot and always performs the same task, like showing you what new files have been added to the system, or actively updating your list of word documents wherever they are. Right now in the finder, a window is a window is a window. That shouldn't be. If a search is applied to a window, then that window isn't just showing you files, it's performing an active function. The finder needs to evolve to take on the new roles and responsibilities it should have in the context of a metadata file system. Spotlight shouldn't replace the finder: the two should work together seamlessly.

    The good news is that Spotlight is built into the system, so even if apple screws up the implementation (likely), the next generation of 3rd party apps will hopefully be able to fill in the gaps.
    • Re:Radical (Score:5, Informative)

      by dJOEK (66178) on Monday November 08 2004, @05:11AM (#10752973)
      Spotlight is basically a SQLite db that holds data about documents and files on your system. Metadata is gathered by a sort of 'plug-in' for each different file type.

      A typical use will be making queries such as: Show me everything agent dero sent me between Tuesday and Thursday last week. Mails, IM transferred images, you name it... Best of all, since this is metadata based, it's supposed to be lightning fast
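
      As a rough illustration of that kind of query, here is a sketch using Python's standard sqlite3 module. The table layout, column names and sample rows are all assumptions made up for the example; nothing here reflects Spotlight's real schema.

```python
import sqlite3

# In-memory database standing in for a metadata store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE items (
        path     TEXT PRIMARY KEY,
        kind     TEXT,
        sender   TEXT,
        received TEXT   -- ISO 8601 date, so string comparison sorts by date
    )
""")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        ("mail/1.eml", "mail", "agent dero", "2004-11-02"),
        ("chat/cat.png", "image", "agent dero", "2004-11-04"),
        ("mail/2.eml", "mail", "someone else", "2004-11-03"),
        ("mail/3.eml", "mail", "agent dero", "2004-11-06"),
    ],
)

def sent_between(sender, start, end):
    """Paths of everything from `sender` received in [start, end], oldest first."""
    rows = conn.execute(
        "SELECT path FROM items "
        "WHERE sender = ? AND received BETWEEN ? AND ? "
        "ORDER BY received",
        (sender, start, end),
    )
    return [path for (path,) in rows]
```

      Calling `sent_between("agent dero", "2004-11-02", "2004-11-04")` covers a Tuesday-to-Thursday range and matches mail and images alike, because the query only looks at metadata columns and never opens the files.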

      You could envision a plugin that would Spotlightify slashdot threads you read, in theory, and apply the power of a database to it.

      but really, you should RTFA
      • Re:Radical (Score:5, Interesting)

        by jcr (53032) <jcr&mac,com> on Monday November 08 2004, @08:32AM (#10753657) Journal
        Metadata is gathered by a sort of 'plug-in' for each different file type.

        Apple has had a few developer kitchens on writing Spotlight importers. The idea is that any given app developer might have his own ideas as to what constitutes the interesting searching criteria for his file types. Apple has importers for common image formats, plain text, rich text, mail messages, etc.

        If you were a photographer, for example, and you have a fancy camera that puts a lot of info into the EXIF tags of the image files it generates, you could search for "all images I made using this particular lens with a f-stop setting between 2.5 and 3", or if you're looking through files from a music notation program, you could search for "all files in 5/8 time in the key of G minor".

        -jcr
        • Re:Radical (Score:5, Informative)

          by CountBrass (590228) on Monday November 08 2004, @05:32AM (#10753044)

          The radical difference is that Spotlight generates the metadata itself rather than you having to tag stuff yourself. It has content handlers to intelligently tag all kinds of different "stuff" so it "knows" what a Word document is and what a web page is and what a .png file is etc etc.

          • Re:Radical (Score:5, Informative)

            by shotfeel (235240) on Monday November 08 2004, @12:58PM (#10756361)
            What's radical is that it does all the above, plus some. The way I remember Jobs introducing it is something like this.

            You have a program called iTunes that creates a database of your music so you can search for a song by any one of a number of tags, including genre, play time, title, author, etc plus any of the keywords the user adds and how they rated it.

            You have another program called iPhoto that does the same for image files because iPhoto understands the internal tags in a jpg (or other image) file.

            You have another program called Finder that indexes based on file data. It knows what size the mp3 is, but not how long the song is -which iTunes does know.

            You have all these separate programs for dealing with different kinds of files because they all contain different kinds of metadata and internal tags.

            Spotlight puts all these kinds of searches in one place, and allows you to combine them. So with the appropriate plug-in filter, it can search any file type and take advantage of any internal tags in the file to speed up the search. It's much faster and more accurate than searching based on the entire contents of the file.

            So Spotlight combines metadata it generates itself (file content), with basic file metadata (file size, creation date...) and file type specific metadata (image dimensions or song duration).

            Then, IIRC, you can save your search and the results will be updated in real time as files are added or deleted.
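
            A saved search that stays current can be sketched in a few lines of Python. This is only one plausible design, assuming the store notifies each saved query on every change; the class and method names (`LiveQuery`, `MetadataStore`, `save_query`) are invented for the example.

```python
class LiveQuery:
    """A saved search whose result set is kept current as items change."""

    def __init__(self, predicate):
        self.predicate = predicate  # function: metadata dict -> bool
        self.results = set()

    def on_change(self, path, meta):
        # meta is None when the file has been deleted.
        if meta is not None and self.predicate(meta):
            self.results.add(path)
        else:
            self.results.discard(path)


class MetadataStore:
    """Holds metadata per path and pushes every change to saved queries."""

    def __init__(self):
        self.items = {}
        self.queries = []

    def save_query(self, predicate):
        query = LiveQuery(predicate)
        for path, meta in self.items.items():  # seed with existing items
            query.on_change(path, meta)
        self.queries.append(query)
        return query

    def put(self, path, meta):
        self.items[path] = meta
        for query in self.queries:
            query.on_change(path, meta)

    def delete(self, path):
        self.items.pop(path, None)
        for query in self.queries:
            query.on_change(path, None)
```

            The point of the design is that a change touches only the one item involved, so keeping results fresh costs one predicate check per saved query rather than a fresh search.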
        • Re:Radical (Score:5, Informative)

          by Carthag (643047) on Monday November 08 2004, @05:36AM (#10753060) Homepage
          As mentioned, I think it's the plugin architecture that makes it special. That makes it possible to search for anything that you can imagine. For example, you could write plugins for your logfiles, movie subtitles, internet cache, etc. It's basically your imagination that sets the limit.

          To my knowledge, other metadata-based search systems have not had a similar degree of extensibility. Please correct me if I'm wrong.
          • Re:Radical (Score:5, Informative)

            by TheRaven64 (641858) on Monday November 08 2004, @05:57AM (#10753108) Homepage Journal
            Actually, the plug-in architecture was also present in BeOS. BeOS R5 shipped with a plug in that would convert ID3 tags to filesystem metadata. The only novel things about Spotlight are the fact that the plug-ins are invoked automatically in the background (in BeOS they had to be explicitly invoked, usually from the Tracker - BeOS's finder) and the full content indexing.

            I still have to be convinced that full-content indexing is a good idea. I very rarely need to search for something in the contents of a group of files, and when I do it's usually such a small group that the time saved would not outweigh the disk space used by such large indexes. On the other hand, this problem should get better over time, since the largest files are usually video, and have little indexable content, meaning that the index is likely to get relatively smaller over time (until someone writes a plug-in that can interpret objects in images, and applies this to every frame in a movie. Fortunately, I think this is still a long way off).
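
            For anyone wondering what a full-content index actually stores, here is a minimal inverted-index sketch in Python. It is a textbook structure, not a description of what Spotlight does internally, and the function names are made up.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")

def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in WORD.findall(text.lower()):
            index[word].add(name)
    return index

def search(index, query):
    """Return the documents that contain every word of the query."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

            In this sketch the index holds one entry per distinct word per file, so its size tracks the amount of distinct text rather than raw file size, which fits the point above that large video files add little to it.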

          • Re:Radical (Score:5, Informative)

            by jtrascap (526135) <bitbucket@med3.1 ... laza.nl minus pi> on Monday November 08 2004, @06:22AM (#10753179)
            Okay - I'll bite

            * Desktop-metaphor based GUI for a personal computer
            * WYSIWYG publishing with a laser printer
            * PDAs via Newton
            * AppleLink (err, AOL now)
            * QuickTime (movies, QTVR, 3D, etc)

            We could go on and on. Give Apple props where due, huh?
            And please consider modding the troll down...
              • Re:Radical (Score:5, Insightful)

                by the quick brown fox (681969) on Monday November 08 2004, @08:26AM (#10753615)
                Don't get me wrong, Apple does cool stuff but their strongsuit is marketing, not "invention".

                You've got to give them credit for product design as well. Nobody makes more desirable-looking software and hardware. Is it any wonder that Apple's fiercest supporters are graphic designers?

              • Re:Radical (Score:5, Insightful)

                by BlowChunx (168122) on Monday November 08 2004, @08:55AM (#10753785)
                QuickTime player may not have been ground breaking, but the entirety of the framework was.

                Name one other multimedia framework that has been around as long as Quicktime. And don't mention Video for Windows [wikipedia.org]...I'll take your response off the air.
          • by Jayfar (630313) on Monday November 08 2004, @10:28AM (#10754599)
            I'm very fuzzy on the details, but I know that Apple played a leadership role, back in the mid-90s, in lobbying the FCC for the radio spectrum allocations for what we now call WiFi.
    • Re:Radical (Score:5, Informative)

      by Professor S. Brown (780963) on Monday November 08 2004, @05:15AM (#10752981)
      The linked article is shit.

      http://developer.apple.com/macosx/tiger/spotlight.html [apple.com] You want this one instead; it's got loads more info on what it does and how it works, plus some code examples for the gimps.
      • Re:Radical (Score:5, Interesting)

        by tliet (167733) on Monday November 08 2004, @09:16AM (#10753936)
        I am the poster and the link was included in the post. Actually, the whole post was about the specific link to the Apple Developer site. Why the editors removed that link is absolutely beyond me...
    • Re:Radical (Score:5, Insightful)

      by Meredeth (821492) on Monday November 08 2004, @06:23AM (#10753184)
      MetaData is not new. It's not radical. But MS apparently can't make it work. So Apple gets to use it first, 5 percent of the computer population go wow! 95 percent ask why can't we have this, and Longhorn SP1 will get it and proclaim it as a great new radical technology.
    • Re:Radical (Score:4, Insightful)

      by mabinogi (74033) on Monday November 08 2004, @06:45AM (#10753237) Homepage
      Nowhere does anyone say that Spotlight using metadata is radical, or that metadata itself is radical.

      The metadata part was noteworthy because MacOS has always had metadata, but Apple looked like it was abandoning, or at least deprecating the concept in OS X. The fact that Spotlight will use it shows that metadata on MacOS still has a future.
      • Re:Radical (Score:5, Informative)

        by jcr (53032) <jcr&mac,com> on Monday November 08 2004, @08:36AM (#10753682) Journal
        Apple looked like it was abandoning, or at least deprecating the concept in OS X.

        Well, Apple *did* deprecate the old file type and creator tags, and resource forks in files that the Mac file system had always had since 1984. There were a lot of problems with the metadata in the original MFS, not the least of which was that each file on the Mac was actually two files.

        -jcr
    • Re:um.. (Score:4, Informative)

      by BenjyD (316700) on Monday November 08 2004, @05:14AM (#10752979)
      You must have a different version of locate to me. I can't get mine to index my emails, it has no idea about the metadata entries in common document types and can't tell the difference between an image and a movie file.

      Could you send me the source for the version you have installed that does that?
    • by Angostura (703910) on Monday November 08 2004, @05:22AM (#10753006)
      Anyone who has used the instantly updated searches in Mail.app or iTunes will have a feel for how useful a system-wide approach could be. However I too am concerned about resource usage. I think I'll wait and see how big the metadata index tends to get and how big the CPU/memory hit is.

      I believe though that the indexing is done during saves, so you'll not notice a general system slow down. What you will notice is a slow down on file saves.
      • by catwh0re (540371) on Monday November 08 2004, @05:41AM (#10753069)
        Apple are well known for optimising their software to be significantly faster with each pre-release build. Having had the opportunity to test the developer tester of 10.4 with spotlight on a 12" powerbook (which was bogged down with various applications at the time) I can assure you that spotlight remained snappy, and definitely true to the 'instant' claim (I've noticed apple are quite careful on not over advertising their products, as it cause more problems than sales and a bad image). After using microsoft products we become very used to how slow a process can be. Apple's advantage is clear, they know their target hardware, like video-card driver writers they can optimise any part of their OS to fit their hardware for optimum speed. Additionally the g4/g5 chipsets have some quite useful registers for performing these sorts of searches (think sort of like MMX for x86, except with developers actually utilising them outside of games)
    • by Professor S. Brown (780963) on Monday November 08 2004, @05:24AM (#10753009)
      People who have used it report no performance degradation. And no, it's nothing like Windows search, which Mac OS has also had since System 8 or earlier.

      For one, it doesn't take half an hour, it shows you the results as you type, instantaneously.

      Secondly, via plugins it can understand *any* file, such as an image metadata importer that uses OCR so you can search for words, or a Flesh-tone detector so you can search for all your porn that way.
    • by catwh0re (540371) on Monday November 08 2004, @05:34AM (#10753057)
      Actually it's quite different from the index search.

      Already the differences in Fat32/NTFS versus HFS+ (the mac filesystem) yield significantly faster searches before spotlight is introduced. Sit down on an OSX apple and notice that an entire search of the HD is actually a fast operation, not the waiting many-minute exercise that it is on windows.

      Now since spotlight is built into the core of the system, and isn't just a tack-on service like the windows indexer is, there are significant speed advantages, updating the SQL database when files are modified, added, etc is incredibly light on the CPU, and is equivalent to doing something like changing the file name.

      What spotlight isn't, and this might be where you are getting confused, spotlight isn't a spider that crawls from folder to folder cataloguing information about each file, which is what the windows indexer was doing, hence why it was resource intensive, as it was busy checking files and folders that you have possibly not made any changes to.
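
      The difference between crawling and updating on change can be shown with a toy Python model. The `Index` class and its method names are hypothetical; a real system would hook into file system notifications rather than being called by hand.

```python
class Index:
    """Toy metadata index that counts how much extraction work it performs."""

    def __init__(self, extract):
        self.extract = extract    # function: (path, content) -> metadata dict
        self.entries = {}
        self.extract_calls = 0    # how many files have been examined so far

    def _index_one(self, path, content):
        self.entries[path] = self.extract(path, content)
        self.extract_calls += 1

    def crawl(self, files):
        """Crawler style: re-examine every file, changed or not."""
        for path, content in files.items():
            self._index_one(path, content)

    def on_file_saved(self, path, content):
        """Event style: the save path reports exactly one changed file."""
        self._index_one(path, content)
```

      With 100 files, one crawl costs 100 extractions, while a single save costs one; that gap is the whole argument for tying index updates to file changes.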

      As a counter to the 'Filesystem metadata is great, but "instantly" updated search indexes sounds like a solution to a problem that doesn't really exist.' Microsoft, google and apple would disagree. Having an up-to-date catalogue without the CPU strain is a must have, go figure MS have been trying to implement it since NT4.0.

    • by guet (525509) on Monday November 08 2004, @06:43AM (#10753233)
      uhm. No. It is not continually indexing the data, if you read the article you'll see it only updates the meta-data for items when they're saved - you can write custom plug-ins for new data types, or just go with the bundled ones for standard file types like images, text etc.

      Filesystem metadata is great, but "instantly" updated search indexes sounds like a solution to a problem that doesn't really exist.

      On the contrary, this is a *better* solution to a very basic problem that has plagued computers since they were invented.

      The problem :
      How do I organise and access the data I use every day (emails, letters, images, music etc)?

      The old solution :
      You can put your files in folders (one per file). You can name the files with a short description, ending with a cryptic 3 letter code to denote the file type. Files *must* be in one category/folder only at a time. Limited meta-data (date modified, file-type etc) may be stored.

      The new solution :
      You add meta-data to files (often automatically) saying who created them, what project they're under, whether they're 'to do' or 'unfinished' or whatever. You'd do this in a save dialog for the application, as you saved the file. All other applications which use Spotlight will update their view of this stuff for free, in real time.

      When you want to work on a project, you click on the live project folder, and immediately you see all the files, emails, images etc for that project, no more, no less, regardless of where they are on the disk and what other projects they're shared with.

      Want to see all the stuff to do with John, 5 months ago? On this project? Containing the word gizmo? That sort of query will be easy to make.
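
      As a rough sketch of what such a compound query amounts to, here it is in plain Python over a handful of made-up records. The field names (`people`, `project`, `modified`, `text`) and the sample data are hypothetical, chosen only to mirror the example above; they are not real Spotlight attributes.

```python
from datetime import date

# Hypothetical records; the field names are invented for this sketch.
FILES = [
    {"path": "mail/0412.eml", "people": {"John"}, "project": "project-1",
     "modified": date(2004, 6, 10), "text": "the gizmo spec is attached"},
    {"path": "docs/notes.txt", "people": {"John"}, "project": "project-2",
     "modified": date(2004, 6, 12), "text": "gizmo pricing"},
    {"path": "img/paris.jpg", "people": {"Anne"}, "project": "project-1",
     "modified": date(2002, 8, 3), "text": ""},
    {"path": "mail/0901.eml", "people": {"John"}, "project": "project-1",
     "modified": date(2004, 9, 1), "text": "gizmo shipped"},
]

def search(files, person=None, project=None, after=None, before=None, word=None):
    """Return the paths of records matching every criterion that was given."""
    hits = []
    for f in files:
        if person is not None and person not in f["people"]:
            continue
        if project is not None and f["project"] != project:
            continue
        if after is not None and f["modified"] < after:
            continue
        if before is not None and f["modified"] > before:
            continue
        if word is not None and word.lower() not in f["text"].lower().split():
            continue
        hits.append(f["path"])
    return hits
```

      Calling `search(FILES, person="John", project="project-1", after=date(2004, 6, 1), before=date(2004, 6, 30), word="gizmo")` narrows the four records down to the one June mail. A real system would answer this from an index rather than a linear scan, but the query shape is the same.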

      If you have an image editing application, it can show you all the images taken in Paris in 2002, without having to build a database application into it. This makes adding this kind of feature to applications trivial.

      Ideally adding meta-data tags like 'project-1', and 'To do' should be as easy as choosing them in the save dialog or applying them like a label in the Finder. It's not quite at that stage yet, but that should come later. Some of these ideas are quite old (Be), but they are long overdue in a desktop operating system.
    • by Queer Boy (451309) <dragon,76&mac,com> on Monday November 08 2004, @08:06AM (#10753500)
      Only the initial index is lengthy, depending on the system and how many files you have. New files are indexed as they are created; this is PART of the file system now, not an add-on.

      Apple has had this type of search engine before, they called it V Twin and it was a basic part of Copland. This is what Sherlock used in Classic and why it was so fast. The idea is even older, it's from a conceptual computer interface Apple dubbed the Knowledge Navigator. All this appears to be is V Twin running on SQLite instead of a proprietary method.

      The interesting part to me is the focus on metadata. I loved this feature in BFS that metadata was king. This is going to lead the way to better file management. Hopefully the Finder will integrate it.

    • by BasilBrush (643681) on Monday November 08 2004, @10:05AM (#10754346)
      The difference? Spotlight works - it does find data in a large variety of files and emails, and my bet is that it won't eat up the huge amount of resources that you say Windows does.

      Filesystem metadata is great, but "instantly" updated search indexes sounds like a solution to a problem that doesn't really exist.

      Doesn't exist *for you* perhaps. Perhaps you don't have a lot of user data, or you have taken time to sort it into useful folders. I'd say it's about as useful as the incremental search in iTunes is. Sure, I could remember what artist did a track, and access a track by scrolling down to that artist, then finding the track. Or I could scroll down the list of thousands of track names, remembering my alphabet ordering, and locate the track that way, assuming I've remembered the exact wording of the track name. But I've always found it easier to type whatever word comes to mind first from artist or track into the search box.

      And so it is with documents. Even if I do remember the file name and folder that a particular piece of information is stored in, I still need to navigate there. Most times it will be quicker just to type in whatever it is you remember about the data you want into a search box - even if you know where the data is stored.

      • by LiquidCoooled (634315) on Monday November 08 2004, @07:00AM (#10753280) Homepage Journal
        The reason Windows XP does not do full text search correctly is because it uses a specific registry handler entry for each type of file (*.txt, *.rtf etc). It uses a different handler for different types of files.

        However it only comes with a few configured filetypes settings, and no way to set a default "When no searchFilter available, treat as plain text" setting.

        I stressed and strained about this when XP came out initially. The only way I found to get the expected results was to build myself a scanner.
        It searches through a drive and identifies EVERY file extension.
        It then looks through the registry to see which extensions have linked handlers.
        It generates a reg file containing stub links for every unmatched filetype.

        It's a bit shotgun, but it allowed me to continue using the text search for XP.

        Microsoft have released their own shotgun registry pack, for more info see here:
        http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q309173 [microsoft.com]

        (I have since moved myself into using my own full search tool, but at least the XP search doesn't miss files which are clearly within visible range).

        [Now for the science part..]

        Take a file, something like "PunchTheMonkey.asp".

        Make sure you have it open in notepad, and make sure there is a certain text string - for instance "spyware".

        Open the windows XP search in that folder, tell it to search *.ASP, and give it the phrase "spyware".

        Windows XP will NOT find this file.

        -----

        The Windows .TXT flat text handler is identified by using a registry key:

        [HKEY_CLASSES_ROOT\.txt\PersistentHandler]
        @="{5e941d80-bf96-11cd-b579-08002b30bfeb}"

        Adding an entry like the one above for each required filetype will restore the full text search functionality.

        So, I add the following entry into the correct .ASP place

        [HKEY_CLASSES_ROOT\.ASP\PersistentHandler]
        @="{5e941d80-bf96-11cd-b579-08002b30bfeb}"

        After I have logged off/rebooted, I try the same again, and XP will now identify the file.
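
        For anyone who wants to try the same shotgun approach, a scanner along the lines described above might look roughly like this in Python. It is a sketch, not the original tool: the function names are made up, and the set of extensions that already have a handler is assumed to be supplied by the caller (on Windows it could be gathered with the standard `winreg` module) instead of being read from the registry here. The CLSID is the plain-text handler value quoted in this comment.

```python
import os

# Plain-text PersistentHandler CLSID, as quoted in the comment above.
PLAIN_TEXT_HANDLER = "{5e941d80-bf96-11cd-b579-08002b30bfeb}"

def collect_extensions(root):
    """Walk a directory tree and return every file extension seen (lowercased)."""
    found = set()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext:
                found.add(ext)
    return found

def build_reg_file(extensions, already_handled):
    """Return .reg text with a stub PersistentHandler for each unhandled extension.

    `already_handled` is an assumption of this sketch: the caller passes in
    the extensions that already have a handler registered.
    """
    lines = ["Windows Registry Editor Version 5.00", ""]
    for ext in sorted(set(extensions) - set(already_handled)):
        lines.append(f"[HKEY_CLASSES_ROOT\\{ext}\\PersistentHandler]")
        lines.append(f'@="{PLAIN_TEXT_HANDLER}"')
        lines.append("")
    return "\n".join(lines)
```

        Something like `build_reg_file(collect_extensions("C:\\"), {".txt", ".rtf"})` would then give the text of a reg file to review before importing. Treating every unknown type as plain text is exactly as blunt as it sounds, so check the output before merging it.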
    • Try to pretend that you're managing 2 or 3 or more major projects that can change or be passed along to someone else every few months with mails, im's, files, reports you don't look at, media submitted by other people in different countries, to-do lists and other project management data...

      Now imagine someone asks you, the project manager (or just the last person still around) on a project from 3 years ago, what the initial proposal from that guy in Japan who did the Flash files was, versus what we paid him and what the VPs said about that....

      People *will* have copies of these files still floating around *somewhere* in e-mail or im history, at least. You may not, I may not, but that's where this will come in handy.

      A few years ago, HD space was not large enough to think that you'd keep all that data around, but Gmail's new 1 GB of e-mail storage just showcases that there's no need to dump all that crap off your media if you can just organize it well, and who needs even that when you can keyword search, anyway?

    • by Anonymous Coward on Monday November 08 2004, @09:19AM (#10753955)
      Hmmm. This sounds a little dangerous. It will make it much too easy for the wife to find your porn collection, your AOL-IM sessions with that weird Goth chick, the draft of your divorce papers, etc. I AM NOT UPGRADING TO TIGER.
      • by dr00g911 (531736) on Monday November 08 2004, @02:35PM (#10757523)
        Speaking as someone who married a girl geek, I've had to find workarounds for this set of annoying situations already. She's crafty and won't fall for the 'put the stuff in /etc/' trick so that my hypothetical goth and Asian schoolgirl porn won't show via a normal search.

        Solution?

        Save the porn / super personal stuff on an encrypted disk image saved somewhere inconspicuous, and set cronned (or logout) scripts to scrub your various histories and recent items. Make sure that the machine logs you out after no more than 10 minutes of inactivity.

        Hypothetically, that is.

        Hi sweety!
    • by UncleRage (515550) on Monday November 08 2004, @08:45AM (#10753737)
      And without a proper search tool, how is it, exactly, that we're supposed to keep track of our fellow insurgents, plans for sneaky attacks, plots to undermine the "powers that be" and means of crippling the status quo?

      A disorganized revolution is just a waste of time.

      Keeping one's data organized is a priority, bucko. ;)
    • by BasilBrush (643681) on Monday November 08 2004, @10:31AM (#10754626)
      Summary:

      Spotlight can support arbitrary file types, entirely dependent on what an application developer decides to supply, and you decide to install. Google is limited to the file types Google implements.

      WinFS is an overly complicated pile of steaming pooh, that Microsoft are having trouble delivering.

    • by K-Man (4117) on Monday November 08 2004, @01:05PM (#10756425)
      AFAIK they use some kind of search on the lexicon for the inverted index. For instance, the string "nut" is matched to "nutmeg", "donut", etc., and the document lists for those terms are merged together. Phrase search would also be done using all matching words, eg "nut hol" would expand to phrase searches like "donut hole", "peanut holder", etc.

      The exact method for matching the search string to the lexicon isn't clear. It could be a suffix tree, but it may be as simple as grep-like scanning of the words, since there aren't that many relative to the text size.

      Looking at mail.app it seems to do this process on each keystroke. It's not terribly fast, but it gets the job done.