Google Businesses The Internet Operating Systems Software Windows

Google Experiments With Local Filesystem Search

Teoti writes "No, Puffin is not the next name of your favorite email client but, according to the New York Times (NSA reg. req.), the project codename for a new Google search application coming directly to your desktop that will let you search your local filesystem efficiently. This is different from, but complementary to, the Google DeskBar that already lets you search the Web. The article also gives a few words on the end of the standalone browser in Longhorn."

  • by JessLeah ( 625838 ) * on Wednesday May 19, 2004 @02:09PM (#9196962)
    I certainly hope this isn't a Windows-only thing.
  • privacy (Score:5, Interesting)

    by Councilor Hart ( 673770 ) on Wednesday May 19, 2004 @02:11PM (#9196993)
    So, will I get ads based on my data?
  • by lukewarmfusion ( 726141 ) on Wednesday May 19, 2004 @02:12PM (#9197009) Homepage Journal
    I recently searched several hundred thousand files on my work machine. It took nearly 90 minutes to complete the search. I expect Google will be able to significantly improve upon that. They're one of the few companies that I really trust to do the right thing.
  • DO NO EVIL? (Score:1, Interesting)

    by way2trivial ( 601132 ) on Wednesday May 19, 2004 @02:14PM (#9197034) Homepage Journal
    what if any, 'aggregate' data will this pass back?

    will we see 'adsense' words based on which file we are searching for?

    there must be a motive for this, some sort of expected gain, or why?

    for the most part, google's actions are benign, I believe the claim that gmail scans are automated and innocuous.

    but what's the benefit to google for this one?

  • by Anonymous Coward on Wednesday May 19, 2004 @02:18PM (#9197069)
    to go thru the wiki, jpg filenames+exif data, home directories, SQL database, etc. A Google type interface is what I'm looking for.

    For those infants out there, Lotus Magellan was the greatest, it was Windows Explorer as it should have been done, it searched any spreadsheet, database, or word processor file.

    Gawd, Linux needs this. I would pay ~$250.00 for an industrial strength business version.
  • Re:interesting (Score:5, Interesting)

    by Kircle ( 564389 ) on Wednesday May 19, 2004 @02:21PM (#9197107)
    [Google] going to reach a point where they stretch their resources too thin?

    Google researchers are allotted 20% of their working time to do outside projects or to follow personal interests. Google News and Gmail were both results of work done during this "20%" time. So in short, no, I don't think Google has really stretched their resources any more so than before.
  • by joabj ( 91819 ) on Wednesday May 19, 2004 @02:23PM (#9197133) Homepage

    I remember Alta Vista offered this sort of search-your-own-computer software back in *1998*. This seems to be the most recent version: http://siliconvalley.internet.com/news/article.php/968131

  • Re:privacy (Score:5, Interesting)

    by Deitheres ( 98368 ) <brutalentropy@nOSPAm.gmail.com> on Wednesday May 19, 2004 @02:24PM (#9197149)
    I don't foresee Google adding ads to a local search function... there are no ads on the Google toolbar, nor are there any ads on the Google Deskbar (save the ones that appear in the mini browser, but those are merely Google.com ads).

    Google seems to be as anti-ad as most people on Slashdot. I personally hate ads, but I feel that most of Google's ads are non-invasive and in good taste.
  • by The Lynxpro ( 657990 ) <[moc.liamg] [ta] [orpxnyl]> on Wednesday May 19, 2004 @02:29PM (#9197217)
    Since Microsoft considers Google a major competitor and has its target set on Google with Longhorn's capabilities, I think it would be a great idea if Google started distributing their own version of the Mozilla web browser. With Google's reputation, there would definitely be more people making the switch to Mozilla based browsers if Google were to do this. After all, Netscape is considered a failure now by the public and Mozilla to a casual observer lacks credibility no matter how great the product is.

  • by JWW ( 79176 ) on Wednesday May 19, 2004 @02:30PM (#9197221)
    You know, when I read the line about dispensing with the web browser as we know it in their next release, I find myself thinking.... there will never be tabbed browsing in any Microsoft "browser".

    I can't imagine not having this feature and it floors me that Microsoft can't imagine anyone ever needing it.
  • by Anonymous Coward on Wednesday May 19, 2004 @02:31PM (#9197231)
    Don't forget: google developers are Linux-centric. The cool ideas come from their developers. Making something Windows only would probably require *more* work than letting it live in as platform agnostic a form as possible. Thus, the other side of the coin is consistent with your guess being wrong.
  • by Snork Asaurus ( 595692 ) on Wednesday May 19, 2004 @02:33PM (#9197247) Journal
    Altavista put out a Windows search app based on their engine technology around 1998 (during their part-of-DEC, better-than-most-search-engines of the time phase). It indexed all documents and provided keyword searches that included Word docs, PDF's and more. It was free and a little buggy but showed promise. Then it just kind of disappeared.

    Perhaps Google can fill this void in the pathetic Windows power tool-set ("Windows power tool-set" being close to an oxymoron).

    But, despite my love for Google, in these more Orwellian times, I'm glad that I have the tools (not from MS) to monitor port activity.

  • Re:Security... (Score:3, Interesting)

    by Tenebrious1 ( 530949 ) on Wednesday May 19, 2004 @02:34PM (#9197256) Homepage
    This is a good idea, so long as it doesn't allow others to search my filesystem.

    But what if they could? If google cached, online, the location of MP3s and MPEGs loaded on your system, then allowed others access (with your permission of course). Hmm... sounds like a P2P file sharing system...

  • by BrookHarty ( 9119 ) on Wednesday May 19, 2004 @02:36PM (#9197273) Journal
    Normally I just need to know file names, so I do something simple like du -ak / > /var/tmp/all so "all" is a catalog of all files.

    If I need to do a text search, I have a little sh for-loop script that will look for a prefix in /var/tmp/all for the files I need and do quick egreps. Saves me time when I need .conf files that have the line I need, or .hidden files that I need to source or read.

    If I don't need to hit the FS for finding files, a catalog already speeds this up. I've started doing this in cygwin to speed up searches also. (Gotta love having unix tools under windows) :)
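The catalog approach described above can be sketched in Python. This is a minimal illustration rather than the poster's actual script: the function names `build_catalog`, `search_catalog`, and `grep_files` are hypothetical, and the catalog simply stores one path per line, much like the output of `du -ak` without the sizes.

```python
import os
import re


def build_catalog(root, catalog_path):
    """Walk `root` once and write every file path to `catalog_path`, one per line."""
    count = 0
    with open(catalog_path, "w", encoding="utf-8", errors="surrogateescape") as out:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                out.write(os.path.join(dirpath, name) + "\n")
                count += 1
    return count


def search_catalog(catalog_path, pattern):
    """Return catalogued paths matching the regex `pattern` without walking the disk."""
    regex = re.compile(pattern)
    with open(catalog_path, encoding="utf-8", errors="surrogateescape") as catalog:
        return [line.rstrip("\n") for line in catalog if regex.search(line)]


def grep_files(paths, text):
    """Return the subset of `paths` whose contents contain `text` (a quick egrep stand-in)."""
    hits = []
    for path in paths:
        try:
            with open(path, encoding="utf-8", errors="ignore") as handle:
                if text in handle.read():
                    hits.append(path)
        except OSError:
            continue  # unreadable, or deleted since the catalog was built
    return hits
```

A typical use would be to rebuild the catalog occasionally, narrow it with something like `search_catalog(path, r"\.conf$")`, and only then open the few matching files with `grep_files`. Keeping the catalog file outside the tree being walked avoids listing the catalog in itself.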

  • by blueZ3 ( 744446 ) on Wednesday May 19, 2004 @02:45PM (#9197341) Homepage
    Call me crazy, but I actually just keep logically structured directories and make sure to save items into the appropriate location... It's much simpler to take 10 seconds to place a file in the appropriate directory at the start than to hunt for it later.

    Even when a file crosses multiple logical groups (picture, jpg, family, nephews, 2004), if my information categories are sensible, and I use a hierarchy that makes sense to me, I don't need search that often. In fact, I can't recall the last time I had to do a search of my drive to find a file. (I should probably mention that my work requires a lot of information mapping, so creating and maintaining such a structure is trivial for me.)

    Of course, since Windows search is so inefficient and (sometimes) problematic, I learned long ago not to rely on it.

    bluez3
  • What about linux? (Score:2, Interesting)

    by neves ( 324086 ) on Wednesday May 19, 2004 @03:01PM (#9197456) Homepage
    Since Google's toolbar and deskbar don't work in Linux, this software probably won't either. What do you use for searching the contents of the files in your filesystem in Linux?
  • Good for the goose? (Score:4, Interesting)

    by Rick Zeman ( 15628 ) on Wednesday May 19, 2004 @03:02PM (#9197476)
    Google, well aware of this threat, hired a Microsoft product manager last year to oversee the Puffin project as part of its strategy to compete with Microsoft's incursion into its territory.

    That's the first time that I've ever read of it going in a direction away from Microsoft. Usually, it's the other way around, Redmond sucking up the managers and staff if they can't buy or steal the technology.
  • by david_reese ( 460043 ) on Wednesday May 19, 2004 @03:17PM (#9197588)
    Wouldn't the speed of the search be influenced mostly by the capabilities of your own computer?

    Actually, the speed of a search is usually influenced most by the speed of the algorithm. You can take a pretty basic full file/text search (i.e., Windows search) and run it on a 2GHz+ dual-Opteron beast with a superfast HD, and it will still lose to a 500MHz laptop doing the search with a proper index and metadata lookup.

    Add in AI stuff like predictive/speculative lookup and search/result caching, and the difference becomes night and day.
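The difference between scanning and indexing can be illustrated with a small Python sketch. It assumes documents are already loaded as a dictionary of name to text, and the names `build_index`, `scan_search`, and `index_search` are made up for this example. It says nothing about how any real desktop search product is built.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return TOKEN.findall(text.lower())


def build_index(documents):
    """Pay the cost once: map each word to the set of documents containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def scan_search(documents, word):
    """Brute force: re-read and re-tokenize every document on every query."""
    word = word.lower()
    return {name for name, text in documents.items() if word in tokenize(text)}


def index_search(index, word):
    """Indexed: a single dictionary lookup per query."""
    return set(index.get(word.lower(), ()))
```

Both functions return the same answer, but `scan_search` does work proportional to the total size of all documents on each query, while `index_search` does one lookup after the index has been built. That is why a slow machine with an index can beat a fast machine without one.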

  • by Anonymous Coward on Wednesday May 19, 2004 @03:21PM (#9197615)

    They do, in a way [google.com].

    Mozilla.org and firefox are the top 2 results if you search for web browser [mozilla.org]. Interestingly, the top links are: Mozilla, Firefox, Opera (twice), Safari, Netscape (twice), Galeon, evolt.org's legacy browser archive, and webstandards.org, in that order. The first page doesn't mention MSIE at all. MSIE is listed 5th on the 2nd page, after lynx, anybrowser.org, amaya, and Konqueror.

    It seems people who talk about browsers don't like to mention MSIE.

  • by SilentChris ( 452960 ) on Wednesday May 19, 2004 @03:34PM (#9197724) Homepage
    Wow. You really need to turn on indexing. That doesn't sound right at all.

    On my XP machine I have in the neighborhood of 300,000 files, and a full-text search takes 1 minute, tops. On my Mac it's closer to 150,000, and a full text search takes about 25 seconds. 90 minutes sounds like something is seriously wrong.
  • by Afrosheen ( 42464 ) on Wednesday May 19, 2004 @03:36PM (#9197737)
    That's why 'slocate -u' should be in everyone's cron daily file.
  • by K-Man ( 4117 ) on Wednesday May 19, 2004 @03:37PM (#9197752)
    After the Google appliance, this seems like an expected move. The desktop is certainly key from a marketing sense.

    However I don't see a lot of overlap with web search. The major pieces won't work the same:

    Crawling: People want fresh information, eg that marketing report that just went out five minutes ago. Many web sites are happy to be crawled once a month. Keeping up with user edits on a filesystem is going to be a lot harder, and users will probably not be happy with heavy reindexing cycles. The ultimate would be heavily integrated with the filesystem, keeping an eye on all file activity, and refreshing the index appropriately. I believe Longhorn's delays are related to this problem.

    Indexing: Desktops have a lot of file types, and strange crypts like the Outlook. Certainly Google has some support in this area, but more may be needed. There are also other document units like email messages instead of files, or even database records.

    Fetching: Granted, a simple search toolbar will work, but I've been more impressed with, for example, Apple's Sherlock protocol, which allows multiple search "channels", eg Web, News, Stocks, etc., some from third party providers. IIRC this is what Firefox uses.

    Ranking: Pagerank is definitely not going to work, although that may not be such a handicap when hit counts are in the one or two-digit range. Still, it's not a competitive advantage.
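The freshness problem raised under "Crawling" can be sketched as a cheap incremental pass: compare file modification times against the previous pass and re-index only what changed. This is a polling sketch under that assumption, with hypothetical names `snapshot` and `diff_snapshots`; it is not the filesystem-integrated approach the comment speculates about.

```python
import os


def snapshot(root):
    """Return {path: modification time in nanoseconds} for every file under root."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime_ns
            except OSError:
                continue  # file vanished between listing and stat
    return state


def diff_snapshots(old, new):
    """Return (added_or_changed, removed) paths between two snapshots."""
    changed = sorted(path for path, mtime in new.items() if old.get(path) != mtime)
    removed = sorted(path for path in old if path not in new)
    return changed, removed
```

Only the paths in `changed` need their contents re-read, and the paths in `removed` can be dropped from the index. The walk itself still touches every directory, which is the cost that tighter filesystem integration would avoid.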

  • by Mechanik ( 104328 ) on Wednesday May 19, 2004 @03:54PM (#9197886) Homepage
    I want Google search on my .pst files from Outlook. Searching for a keyword through 2+ years of email takes FOREVER with the built-in search feature in Outlook. We're talking 5 or 10 minutes here.

    And if I had a nickel for every time I had to resend something to a co-worker because they were too goddamned lazy to just search their email for the message I sent them THE FIRST TIME, well, Google wouldn't need an IPO because I'd just buy them outright!

    That being said, filesystem searches with Google would be damn nice too. :-P


    Mechanik
  • It's a lot easier to write a cross-platform website than it is to write cross-platform applications.

    Having done quite a bit of both in the past several years, I'd highly disagree. There are plenty of off the shelf products or methods to create cross-platform applications and very very few (and generally poor in quality) tools or even documentation to write cross-platform websites (modern ones, with dhtml and heavy usage of DOM).

    But a lot of the code (particularly for interacting with the file system and the GUI bits) will be platform-specific.

    Nope, that's pretty much been standardized, assuming you're writing from scratch. Now porting an application written platform specific is a completely different story. But this example is an application written from scratch.

    And as for filesystems, well... nowadays filesystems are much more consistent than, say, SysV versus VMS versus the dozen variants of CP/M. Subdirectories and pretty consistent meta information (date created, date modified, date accessed, etc.) on every file is the accepted standard. They may do things differently under the hood, but (at this time) they are all pretty much POSIX.

    --
    Evan

  • by telstar ( 236404 ) on Wednesday May 19, 2004 @04:44PM (#9198486)
    1. Netscape conquers the browser market...
    2. Netscape IPOs and climbs to some insanely high value...
    3. Microsoft integrates browser into OS...
    4. Netscape crumbles...

    - - - - fast forward - - - -

    1. Google conquers the search market...
    2. Google IPOs and climbs to some insanely high value... (coming soon)
    3. Microsoft integrates search into OS... [Longhorn] (coming eventually)

    Where do you think the rest of this goes?
  • by Anonymous Coward on Wednesday May 19, 2004 @05:15PM (#9198900)
    Back in the late 1990s, I used the AltaVista Desktop personal search software. I used it on my Windows 98 computer back then. It was great: if I needed to find something on my computer I would use keywords and it would find all the matching documents instantly. It seemed to already have everything on my computer indexed, so it instantly knew where everything was.

    Unfortunately, what I downloaded was only a demo version of the program that was only good for 90 days or something like that. When I decided to purchase the software I discovered that there really was not a reasonably priced version available for individual users. All that was available was extremely expensive versions intended for large companies. They did not even make an attempt to market it for users of home computers or small businesses. So even though I loved the software I had to stop using it. If I remember correctly it indexed not only text files but also MS Word documents, HTML, and my e-mail.

    When searching for documents on my computer I always used the advanced search feature and did a boolean search using terms such as AND, OR, NOT and NEAR. It was very efficient. I now use Linux instead and have occasionally used grep, egrep, sed, awk and find, but I would also love to have the option to use a search engine on my home computer. I hope whatever Google comes up with will be available for Linux or at least will run under WINE or CrossOver Office. Of course, I would only use it if it is implemented in a way that does not invade my personal privacy. By the way, when searching the Internet, AltaVista does not seem to be using the same powerful search engine with boolean operators that they once used, so I recently switched to Google instead.
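Boolean operators of the kind described above map naturally onto set operations over an index, and NEAR can be answered if the index also records word positions. The Python sketch below is a generic illustration with hypothetical names (`build_positional_index`, `docs_with`, `near`). The default NEAR window of 10 words is an assumption chosen for the example, not a documented value of any product.

```python
import re
from collections import defaultdict


def build_positional_index(documents):
    """Map word -> {document name -> list of word positions in that document}."""
    index = defaultdict(dict)
    for name, text in documents.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            index[word].setdefault(name, []).append(position)
    return index


def docs_with(index, word):
    """Set of documents containing `word`; combine with &, | and - for AND, OR, NOT."""
    return set(index.get(word.lower(), {}))


def near(index, left, right, distance=10):
    """Documents where `left` and `right` occur within `distance` words of each other."""
    left_hits = index.get(left.lower(), {})
    right_hits = index.get(right.lower(), {})
    result = set()
    for name in left_hits.keys() & right_hits.keys():
        if any(abs(a - b) <= distance
               for a in left_hits[name] for b in right_hits[name]):
            result.add(name)
    return result
```

With this in place, `docs_with(idx, "budget") & docs_with(idx, "report")` is an AND query, `|` gives OR, and `-` gives NOT, while `near(idx, "budget", "report")` restricts the AND result to documents where the two words appear close together.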

    I also wonder how all this will compare with the new search engine that Microsoft is developing for WinFS under Longhorn. I hope that by then Linux will be offering equally good search capabilities. I seem to recall hearing that Hans Reiser is in some way working on upgrading the ability of the Linux ReiserFS file system to be searched. Is that correct?
  • Why on earth would you trust the statistic on some static web page?

    I would be more inclined to believe Google's statistics [google.com] on the popularity of web browsers. (Look for the section marked "Web Browsers Used to Access Google" or follow this link [google.com] if you are really helpless.)

    Considering that the most clueless Windows users are probably using the address bar in Explorer to automatically use MSN, the Google figure for all non-IE browsers may actually be higher due to the self-selection of Google users.
  • Re:About time (Score:3, Interesting)

    by afidel ( 530433 ) on Wednesday May 19, 2004 @06:23PM (#9199694)
    Way back in the day Altavista had a personal search engine. It ran under win9x and basically brought the features of the search engine to your personal docs. It could index almost all office type docs (no, not just MS Office but all three of the major suites), email (Outlook and any mbox application), etc. I kept it running under win2k by doing an in-place upgrade, but unfortunately it would not install under 2k or above, so when it came time to reformat I lost the ability to use it. The indexer ran on a schedule or could be run manually; it would not only index local files but also one or more websites, so before RSS you could use it as a news aggregator. Overall it was very cool and I can't wait to see how Google implements the idea. Frankly it makes such a large productivity boost in your workflow that it's almost as big of an upgrade as from win9x->2k+ is.
  • by Czmyt ( 689032 ) <steve@czmyt.com> on Wednesday May 19, 2004 @08:26PM (#9200518) Homepage
    slocate is a great little program to speed up the process of finding files on your *nix computer system, but it's not a full-text indexer. Finding the names of files like slocate does is not the same as finding words that appear within those files. It is a great replacement for "find / | grep $PATTERN" though.
  • Locate for windows (Score:3, Interesting)

    by Fuzzle ( 590327 ) on Wednesday May 19, 2004 @08:52PM (#9200664) Homepage Journal
    Locate32 [www.uku.fi] is a program that can replace your built in Windows FIND function, including indexed searches.
  • by DoraLives ( 622001 ) on Wednesday May 19, 2004 @09:11PM (#9200748)
    functional equivalent to updatedb to run at regular intervals

    Assuming people shut their machines down by telling the software to shut down (as opposed to just killing the power), why won't the following work?

    Run the update as part of the shutdown process and save that. The machine takes longer to finish turning itself off. So what? Load what you saved at the next startup and merely append changes to it for as long as the machine runs, saving as you go. Repeat every time the machine is turned off. For folks who don't turn the machine off, give them an autoupdate option to run at 3am or some equally convenient hour for the user. Or am I missing something vital here and am too dazed to appreciate it?
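The save-at-shutdown, load-at-startup idea above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the catalog is a plain set of paths stored as JSON, and the names `load_catalog`, `record_change`, and `save_catalog` are hypothetical. The harder part, learning about each file change while the machine runs, is assumed to be supplied by some external hook and is not shown.

```python
import json
import os


def load_catalog(path):
    """Load the catalog saved at the last shutdown, or start empty on first run."""
    try:
        with open(path, encoding="utf-8") as handle:
            return set(json.load(handle))
    except FileNotFoundError:
        return set()


def record_change(catalog, file_path, exists):
    """Apply one reported file-system change to the in-memory catalog."""
    if exists:
        catalog.add(file_path)
    else:
        catalog.discard(file_path)


def save_catalog(catalog, path):
    """Write to a temp file, then rename, so a crash cannot leave a half-written catalog."""
    temp_path = path + ".tmp"
    with open(temp_path, "w", encoding="utf-8") as handle:
        json.dump(sorted(catalog), handle)
    os.replace(temp_path, path)
```

The catch with relying on shutdown alone is that a power cut skips `save_catalog`, so the saved copy can miss changes made since the last clean save. Saving periodically while running, as the comment suggests, narrows that window.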
