Google Experiments With Local Filesystem Search
Teoti writes "No, Puffin is not the next name of your favorite email client, but, according to the New York Times (NSA reg. req.), the project codename for a new Google search application coming directly to your desktop that will let you search your local filesystem efficiently. This is different from, but complementary to, the Google DeskBar that already lets you search the Web. The article also gives a few words on the end of the standalone browser in Longhorn."
What operating systems does it work on? (Score:4, Interesting)
privacy (Score:5, Interesting)
I can't frickin' wait (Score:5, Interesting)
DO NO EVIL? (Score:1, Interesting)
will we see 'adsense' words based on which file we are searching for?
there must be a motive for this, some sort of expected gain, or why?
for the most part, google's actions are benign, I believe the claim that gmail scans are automated and innocuous.
but what's the benefit to google for this one?
Lotus Magellan for my linux server (Score:1, Interesting)
For those infants out there, Lotus Magellan was the greatest, it was Windows Explorer as it should have been done, it searched any spreadsheet, database, or word processor file.
Gawd, Linux needs this. I would pay ~$250.00 for an industrial strength business version.
Re:interesting (Score:5, Interesting)
Google researchers are allotted 20% of their working time to do outside projects or to follow personal interests. Google News and Gmail were both results of work done during this "20%" time. So in short, no, I don't think Google has really stretched their resources any more so than before.
Everything Old is New Again (Score:3, Interesting)
I remember Alta Vista offered this sort of search-your-own-computer software back in *1998*. This seems to be the most recent version: http://siliconvalley.internet.com/news/article.ph
Re:privacy (Score:5, Interesting)
Google seems to be as anti-ad as most people on Slashdot. I personally hate ads, but I feel that most of Google's ads are non-invasive and in good taste.
Google should distribute Mozilla (Score:5, Interesting)
Re:Also on CNET... No NYT Registration (Score:3, Interesting)
I can't imagine not having this feature and it floors me that Microsoft can't imagine anyone ever needing it.
Re:What operating systems does it work on? (Score:1, Interesting)
Altavista did it 6 years ago (Score:5, Interesting)
Perhaps Google can fill this void in the pathetic Windows power tool-set ("Windows power tool-set" being close to an oxymoron).
But, despite my love for Google, in these more Orwellian times, I'm glad that I have the tools (not from MS) to monitor port activity.
Re:Security... (Score:3, Interesting)
But what if they could? If google cached, online, the location of MP3s and MPEGs loaded on your system, then allowed others access (with your permission of course). Hmm... sounds like a P2P file sharing system...
Re:File searchs are slow. (Score:2, Interesting)
If I need to do text search, I have a little for sh script that will look for a prefix in
If I don't need to hit the FS for finding files, a catalog already speeds this up. I've started doing this in cygwin to speed up searches also. (Gotta love having unix tools under windows)
Isn't it better just to be organized? (Score:5, Interesting)
Even when a file crosses multiple logical groups (picture, jpg, family, nephews, 2004), if my information categories are sensible, and I use a hierarchy that makes sense to me, I don't need search that often. In fact, I can't recall the last time I had to do a search of my drive to find a file. (I should probably mention that my work requires a lot of information mapping, so creating and maintaining such a structure is trivial for me.)
Of course, since Windows search is so inefficient and (sometimes) problematic, I learned long ago not to rely on it.
bluez3
What about linux? (Score:2, Interesting)
Good for the goose? (Score:4, Interesting)
That's the first time that I've ever read of it going in a direction away from Microsoft. Usually, it's the other way around, Redmond sucking up the managers and staff if they can't buy or steal the technology.
Re:I can't frickin' wait (Score:2, Interesting)
Actually, the speed of a search is usually determined by the algorithm. You can take a pretty basic full file/text search (i.e., Windows search) and run it on a 2 GHz+ dual-Opteron beast with a superfast HD, and it will still lose to a 500 MHz laptop doing search with a proper index and metadata lookup.
Add in AI stuff like predictive/speculative lookup and search/result caching, and the difference becomes night and day.
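To make the index-versus-scan point concrete, here is a minimal sketch of an inverted index next to a brute-force scan. Everything in it (the function names, the sample "files") is made up for illustration; it is not how Windows search or any Google tool actually works, just the general technique.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index

def search_index(index, query):
    """Return documents containing every query word (set intersection)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

def search_scan(docs, query):
    """Brute-force baseline: re-read every document for every query."""
    words = tokenize(query)
    if not words:
        return set()
    return {name for name, text in docs.items()
            if all(w in tokenize(text) for w in words)}

# Hypothetical in-memory "files"; a real tool would read them from disk.
docs = {
    "report.txt": "Quarterly marketing report for the sales team",
    "notes.txt": "Meeting notes about the marketing budget",
    "todo.txt": "Buy milk and call the bank",
}
index = build_index(docs)
print(sorted(search_index(index, "marketing report")))  # ['report.txt']
```

The scan touches every document on every query, while the index pays that cost once up front and then answers with a couple of dictionary lookups, which is why the slow laptop with an index can beat the fast box without one.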
Re:Google should distribute Mozilla (Score:4, Interesting)
They do, in a way [google.com].
Mozilla.org and firefox are the top 2 results if you search for web browser [mozilla.org]. Interestingly, the top links are: Mozilla, Firefox, Opera (twice), Safari, Netscape (twice), Galeon, evolt.org's legacy browser archive, and webstandards.org, in that order. The first page doesn't mention MSIE at all. MSIE is listed 5th on the 2nd page, after lynx, anybrowser.org, amaya, and Konqueror.
It seems people who talk about browsers don't like to mention MSIE.
Re:I can't frickin' wait (Score:3, Interesting)
On my XP machine I have in the neighborhood of 300,000 files, and a full-text search takes 1 minute, tops. On my Mac it's closer to 150,000, and a full text search takes about 25 seconds. 90 minutes sounds like something is seriously wrong.
Re:What operating systems does it work on? (Score:5, Interesting)
An Expected Move, but Complex (Score:3, Interesting)
However I don't see a lot of overlap with web search. The major pieces won't work the same:
Crawling: People want fresh information, eg that marketing report that just went out five minutes ago. Many web sites are happy to be crawled once a month. Keeping up with user edits on a filesystem is going to be a lot harder, and users will probably not be happy with heavy reindexing cycles. The ultimate would be heavily integrated with the filesystem, keeping an eye on all file activity, and refreshing the index appropriately. I believe Longhorn's delays are related to this problem.
Indexing: Desktops have a lot of file types, and strange crypts like Outlook's. Certainly Google has some support in this area, but more may be needed. There are also other document units like email messages instead of files, or even database records.
Fetching: Granted, a simple search toolbar will work, but I've been more impressed with, for example, Apple's Sherlock protocol, which allows multiple search "channels", eg Web, News, Stocks, etc., some from third party providers. IIRC this is what Firefox uses.
Ranking: Pagerank is definitely not going to work, although that may not be such a handicap when hit counts are in the one or two-digit range. Still, it's not a competitive advantage.
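For the crawling problem, the cheap fallback when you can't hook into the filesystem is to compare modification times between passes and reindex only what moved. A rough sketch, assuming a polling approach (the function names are hypothetical, and a tightly integrated solution would listen for change notifications instead of walking the tree):

```python
import os

def snapshot(root):
    """Walk root and record each file's last-modified time by path."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime_ns
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return state

def diff_snapshots(old, new):
    """Return (added, changed, removed) path lists between two snapshots."""
    added = sorted(p for p in new if p not in old)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    removed = sorted(p for p in old if p not in new)
    return added, changed, removed
```

Only the added and changed paths need to be re-read, and the removed ones dropped from the index. The walk itself still costs something on a big disk, which is the "heavy reindexing cycle" complaint in miniature.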
Screw filesystem searches... (Score:3, Interesting)
And if I had a nickel for every time I had to resend something to a co-worker because they were too goddamned lazy to just search their email for the message I sent them THE FIRST TIME, well, Google wouldn't need an IPO because I'd just buy them outright!
That being said, filesystem searches with Google would be damn nice too.
Mechanik
Re:What operating systems does it work on? (Score:5, Interesting)
Having done quite a bit of both in the past several years, I'd highly disagree. There are plenty of off the shelf products or methods to create cross-platform applications and very very few (and generally poor in quality) tools or even documentation to write cross-platform websites (modern ones, with dhtml and heavy usage of DOM).
But a lot of the code (particularly for interacting with the file system and the GUI bits) will be platform-specific.
Nope, that's pretty much been standardized, assuming you're writing from scratch. Now porting an application written platform specific is a completely different story. But this example is an application written from scratch.
And as for filesystems, well... nowadays filesystems are much more consistent than, say, SysV versus VMS versus the dozen variants of CP/M. Subdirectories and pretty consistent meta information (date created, date modified, date accessed, etc) on every file is the accepted standard. They may do things different under the hood, but (at this time) they are all pretty much POSIX.
--
Evan
Don't you all see the trend? (Score:4, Interesting)
2. Netscape IPOs and climbs to some insanely high value...
3. Microsoft integrates browser into OS...
4. Netscape crumbles...
- - - - fast forward - - - -
1. Google conquers the search market...
2. Google IPOs and climbs to some insanely high value... (coming soon)
3. Microsoft integrates search into OS... [Longhorn] (coming eventually)
Where do you think the rest of this goes?
I loved the AltaVista personal search software (Score:1, Interesting)
Unfortunately, what I downloaded was only a demo version of the program that was only good for 90 days or something like that. When I decided to purchase the software I discovered that there really was not a reasonably priced version available for individual users. All that was available was extremely expensive versions intended for large companies. They did not even make an attempt to market it for users of home computers or small businesses. So even though I loved the software I had to stop using it. If I remember correctly it indexed not only text files but also MS Word documents, HTML, and my e-mail.
When searching for documents on my computer I always used the advanced search feature and did a boolean search using terms such as AND, OR, NOT and NEAR. It was very efficient. I now use Linux instead and have occasionally used grep, egrep, sed, awk and find, but would also love to have the option to use a search engine on my home computer. I hope whatever Google comes up with will be available for Linux or at least will run under WINE or CrossOver Office. Of course, I would only use it if it is implemented in a way that does not invade my personal privacy. By the way, when searching the Internet, AltaVista does not seem to be using the same powerful search engine with boolean operators that they once used, so I recently switched to Google instead.
I also wonder how all this will compare with the new search engine that Microsoft is developing for WinFS under Longhorn. I hope that by then Linux will be offering equally good search capabilities. I seem to recall hearing that Hans Reiser is in some way working on upgrading the ability of the Linux ReiserFS file system to be searched. Is that correct?
Re:What operating systems does it work on? (Score:2, Interesting)
I would be more inclined to believe Google's statistics [google.com] on the popularity of web browsers. (Look for the section marked "Web Browsers Used to Access Google" or follow this link [google.com] if you are really helpless.)
Considering that the most clueless Windows users are probably using the address bar in Explorer to automatically use MSN, the Google figure for all non-IE browsers may actually be higher due to the self-selection of Google users.
Re:About time (Score:3, Interesting)
slocate not quite Google (Score:2, Interesting)
Locate for windows (Score:3, Interesting)
Re:What operating systems does it work on? (Score:3, Interesting)
Assuming people shut their machines down by telling the software to shut down (as opposed to just killing the power), why won't the following work?
Run the update as part of the shutdown process and save that. The machine takes longer to finish turning itself off. So what? Load what you saved at the next startup and merely append changes to it for as long as the machine runs, saving as you go. Repeat every time the machine is turned off. For folks who don't turn the machine off, give them an autoupdate option to run at 3am or some equally convenient hour for the user. Or am I missing something vital here and am too dazed to appreciate it?
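A minimal sketch of the save-at-shutdown, load-at-startup idea, assuming the catalog is just a mapping of path to modification time stored as JSON. The function names and the file format are made up for illustration, and hooking this into an actual OS shutdown sequence is left out.

```python
import json
import os

def load_catalog(path):
    """Load the catalog saved at the last shutdown, or start empty."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}

def record_change(catalog, file_path, mtime):
    """Apply one change while running; mtime=None means the file was deleted."""
    if mtime is None:
        catalog.pop(file_path, None)
    else:
        catalog[file_path] = mtime

def save_catalog(catalog, path):
    """Write to a temp file, then rename it over the old catalog."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(catalog, f)
    os.replace(tmp, path)
```

The catch is the case the question already sets aside: if someone kills the power, whatever was recorded since the last save is gone, so a real tool would probably also save periodically or fall back to a full rescan when the saved catalog looks stale.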