Metadata in Vista Could Be Too Helpful
linumax writes "Windows Vista will improve search functionality on a PC by letting users tag files with metadata, but those tags could cause unwanted and embarrassing information disclosure, Gartner analysts have warned. Search and organization capabilities are among the primary features of Windows Vista, the successor to Windows XP due out late in 2006. While building those features, Microsoft is not paying enough attention to managing the descriptive information, or metadata, that users can add to files to make it easier to find and organize data on a PC, according to Gartner. 'This opens up the possibility of the inadvertent disclosure of this metadata to other users inside and outside of your organization,' Gartner analysts Michael Silver and Neil MacDonald wrote in a research note published on Thursday."
Oblig. Nelson (Score:5, Funny)
Ha-ha! You're using Windows!
Re:Oblig. Nelson (Score:2, Funny)
/windows has no root
/storms off
Non-Oblig. Homer (Score:4, Funny)
Bart: Isn't that just the wrong way?
Homer: Yeah, but faster!
Not just windows, Mac's too (Score:5, Interesting)
As a result I no longer have Spotlight index my e-mails. And of course that's a pain in the ass since it means Mail.app's search feature is busted. While I can figure out how to work around that (e.g. don't use Mail.app, which would be a pity), the story does not end there. Unfortunately, Spotlight indexes my backup volumes too, and it can blunder across old mail there and index it.
Now you might think I could also turn off indexing the backup volumes, but there's the rub. First, I might not want to. Second, you can't always do it. Spotlight has some bugs in how it handles logical partitions on disks, and in particular it sometimes ignores being told not to index a volume if another partition is being indexed.
Anyhow, eventually there will be more fine grained control on privacy, but then the interface will become more kludgy too. In fact that may just kill the whole fine grained control effort, since most folks don't worry about this sort of thing and would prefer simplicity.
It's perhaps worth noting that Windows dropped making the filesystem a database (for now). That might be a smart move, since making it a wrapper like Spotlight means they are less locked into a single search design. Problems like this will emerge slowly and flexibility to plug problems will be needed.
other Automatic meta data generation issues (Score:5, Interesting)
Which of course means automated meta-data scraping. This leads to the problem of confidential info disclosure. That's obvious. But it also leads to another problem that's annoying. When do you update the meta data? When the file is created or modified? After a small lag? Or in batch overnight?
On Macs you can force a batch overnight search. But the default is for instant updates. If you add a search term to a document WHILE a search is being performed in another window it will find it! Amazing, and very useful too. And it assures things like computers that sleep at night and detachable drives stay indexed.
But it's also amazingly annoying when you stop doing conventional desktop activities and start doing more unix like things. Take for example untarring a 30 GB archive with twenty thousand small files in it, or something that is generating transient files in a rapid fire fashion. Well, you start untarring and for the first few files it zips along. Then suddenly throughput nose dives. Why? You look at your processes and you see MDL, the indexing program, is chewing up your disk access.
You can work around this if you can control the file names and make sure they are ones it will not index. But that's not assured or always possible, and it will vary from computer to computer.
So anyhow there's lots of fine tuning needed on these ubiquitous metadata systems. Fine grained privacy control and fine grained operation modes so it's live in desktop application mode and lags in unix/high performance modes.
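A rough Python sketch of the "lags in high performance mode" idea described above. The class name `DeferredIndexer`, the `quiet_seconds` parameter and the `index_fn` callback are all made up for illustration; this is not how Spotlight or any shipping indexer is known to work. Changed paths are queued and only handed to the indexer once no new change has arrived for a quiet period, so a bulk untar produces one batch instead of thousands of immediate index jobs.

```python
class DeferredIndexer:
    """Hypothetical indexer front end that batches file-change events."""

    def __init__(self, index_fn, quiet_seconds=5.0):
        self.index_fn = index_fn          # called with a sorted list of paths
        self.quiet_seconds = quiet_seconds
        self.pending = set()              # paths changed since the last flush
        self.last_event = None            # timestamp of the most recent change

    def file_changed(self, path, now):
        """Record a change; 'now' is a timestamp in seconds."""
        self.pending.add(path)
        self.last_event = now

    def tick(self, now):
        """Flush the queue only if the disk has been quiet long enough."""
        if not self.pending:
            return False
        if now - self.last_event < self.quiet_seconds:
            return False                  # still busy: keep deferring
        batch = sorted(self.pending)
        self.pending.clear()
        self.index_fn(batch)
        return True


batches = []
indexer = DeferredIndexer(batches.append, quiet_seconds=5.0)

# Simulate three files being untarred in rapid succession.
for i, t in enumerate([0.0, 0.1, 0.2]):
    indexer.file_changed(f"archive/file{i}.txt", now=t)

flushed_early = indexer.tick(now=1.0)   # too soon after the last change
flushed_late = indexer.tick(now=6.0)    # quiet for more than 5 seconds
```

With `quiet_seconds=0` the same class behaves like the "live" desktop mode, so the two modes differ only in one setting.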
Re:Not just windows, Mac's too (Score:4, Insightful)
It doesn't sound like a metadata related problem to me. It sounds more like a furniture placement issue.
But seriously, de-selecting 'Mail' in the Spotlight pref pane should stop Spotlight from displaying results in its window, while retaining the full indexing facilities within Mail.app itself.
Re:Not just windows, Mac's too (Score:4, Insightful)
Re:Oblig. Nelson (Score:4, Funny)
Windows Insecure??? (Score:2, Funny)
Say it ain't so.....
Re:Windows Insecure??? (Score:5, Insightful)
Re:Windows Insecure??? (Score:3, Funny)
Microsoft (Office)
Adobe (PDFs)
If you can think of any other companies that keep turning up, you let me know.
Re:Windows Insecure??? (Score:4, Insightful)
I think you're seeing a conspiracy where none exists. If, for instance, AppleWorks suddenly overnight became the most popular word processor ever, and people were passing AppleWorks bills to the local senator over email... well, you'd have the same problem, because AppleWorks (and most, if not all, word processors) keep the same meta-data as Word and PDF do.
Re:Windows Insecure??? (Score:2, Interesting)
Re:Windows Insecure??? (Score:2)
You COULD make it so that all metadata is accessible to all users.
Or, you could make it so that if you don't have access to the file, you don't have access to the metadata for that file, either.
So it *IS* implementation specific. Sorry.
Any bets on which approach Microsoft took?
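The second approach (no access to the file means no access to its metadata) can be sketched in a few lines of Python. Everything here is hypothetical: the `visible_hits` function, the in-memory `index` dictionary and the example paths are invented for illustration and say nothing about how Vista's search is actually built. The point is only that the permission check happens at query time, before a hit is returned.

```python
import os


def visible_hits(index, term, can_read=None):
    """Return paths tagged with `term`, skipping files the caller cannot read.

    `index` maps a file path to a set of tags. `can_read` is a predicate on a
    path; by default it asks the OS whether the current user may read the file.
    """
    if can_read is None:
        can_read = lambda path: os.access(path, os.R_OK)
    return sorted(
        path for path, tags in index.items()
        if term in tags and can_read(path)
    )


# A shared index covering two users' files (made-up paths and tags).
index = {
    "/home/alice/budget.xls": {"finance", "layoffs"},
    "/home/bob/notes.txt": {"finance"},
}

# Simulate bob, who can only read files under his own home directory.
bob_can_read = lambda path: path.startswith("/home/bob/")

bob_hits = visible_hits(index, "finance", can_read=bob_can_read)
bob_leak = visible_hits(index, "layoffs", can_read=bob_can_read)
```

Here bob's search for "finance" returns only his own file, and his search for "layoffs" returns nothing, so he cannot even learn that alice has a file carrying that tag.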
Re:Windows Insecure??? (Score:2)
I'll just disregard that as a troll until then.
I'm shocked (Score:3, Funny)
I'm shocked, shocked to see Microsoft prioritizing features over security.
</Claude Rains>
Re:Windows Insecure??? (Score:4, Insightful)
according to a compilation by Workshare, a maker of software that strips metadata out of files.
You wouldn't think that they have some vested financial interest in getting the public to overreact about the dangers of metadata
Am I being reverse paranoid?
Easy solution (Score:5, Insightful)
Don't fill out the metadata fields!
Re:Easy solution (Score:4, Insightful)
It has everything to do with human behavior and nothing to do with computer security. As it is, desktop search tools are opening up whole avenues to quickly find the secret smut on your desktop. Do you have a Google account AND search history enabled? Go to google.com and do a Search History and see what stuff you've been searching on that Google knows about. You shouldn't have done a search on "merkin".
Re:Easy solution (Score:2)
Re:Well that would be great, but... (Score:2)
If a Vista user tags a jpg with "Family Vacation", will Mac and Linux users be able to view these tags?
Re:Easy solution (Score:2, Informative)
I've been on both sides of this problem with current Windows/Office implementations - receiving sales or RFP information that included "hidden" revision or comment information intended for another client, or catching similar information in documents heading out the door.
Within Offi
Re:Easy solution (Score:2)
we enforced RHDtool where i work, and it's really a must-have thing... i've mentioned this story in other comments, but it's so illustrative i'll share again:
this summer, we received some documents from DOJ that were meant to be put on our website... they included revision history data that showed different information about different drafts of the agreement we'd been working on... of course, not every schmo who looks at a document on our website is going to *find* this stuff at all, but it's s
I don't get it.. (Score:5, Interesting)
Re:I don't get it.. (Score:4, Insightful)
Like Big Bird says, remember to put your infants in the back seat, so the "safety" devices don't kill them.
Re:I don't get it.. (Score:5, Insightful)
Turning to the metadata: Having lots of metadata to search can be a very good thing. But, if used improperly (e.g., having the index not properly secured from outside access or malicious software) they can be a bad thing (read: security risk).
So, as the grandparent said: "Like Big Bird says, remember to put your infants in the back seat, so the "safety" devices don't kill them."
Re:I don't get it.. (Score:5, Insightful)
Otherwise, you'd be able to search for the meta data in the private files of other users.
Re:I don't get it.. (Score:2)
Re:I don't get it.. (Score:3, Informative)
I did RTFA. The "problem" is you may deliberately send a file, eg a spreadsheet, but along with the file, Windows will have your indexing info, which may give away more than you want ("generic fuck off message", etc). Of course, this information comes courtesy of a company that has a "metadata cleaning" system they want to sell you. Everyone seems to be think
The problem is giving away metadata with the files (Score:4, Insightful)
For example, several years ago Microsoft reportedly [computerbytesman.com] posted its annual report as a Word document, which contained evidence that it was composed on a Macintosh.
That example is good for a chuckle (OK, maybe a belly laugh for us Mac fanboys), but suppose someone sent a document to a customer that showed it was filed in a folder named "Correspondence with Idiot Customers" without the sender realizing it...
Re:I don't get it.. (Score:2)
Silly as it sounds, it's possible to be TOO friendly. This is one reason it's fortunate that little children can cry when someone they don't know picks them up and takes them away from their parents.
Re:I don't get it.. (Score:2)
Isn't this like saying Airbags are too safe? I thought whole point of metadata is to make it easier to search and find data? How can it be *too* helpful?
It is possible for something to be helpful in some instances and harmful in others. Airbags can cause accidents if they go off when something hits the bumper, but would not otherwise have caused a crash. Most likely there are more crashes because of airbags, but fewer serious injuries.
In this particular case, metadata can be great for finding things but
Surprise? (Score:2, Insightful)
Google desktop is a little scary... (Score:4, Insightful)
Of course, we don't have it on our main office machines, because they are running Slackware. Our machines that are locked into Windows for hardware interface reasons had to have Desktop removed from them after a couple of almost-incidents.
YMMV
Re:Google desktop is a little scary... (Score:2, Interesting)
How is that scary? It's just indexing data that is already on your computer. The fact that a file is "hidden" in a subdirectory 10 levels deep in an odd file format doesn't make it any more secure, just harder to find. Security by obscurity doesn't work. If a hacker has access to your machine, he can just as easily index your files from the out
Security by obscurity (Score:3, Informative)
Medicine is different, though. HIPAA basically requires that you use this kind of security (obscurity). Let me give you an example. If I have your (HIPAA protected) chart in the office on my desk, that's OK. If I leave it in the waiting room, it's not. Information does not have to be hidden from a determined (and illegal!) search, because, well, that's illegal, an
Re:Google desktop is a little scary... (Score:2)
Guess they left out the <humour> meta-data tags.
Oh Great (Score:5, Insightful)
Re:Oh Great (Score:2)
Stop. I presume that some of Gartner's employees have actually done some programming - otherwise why would anyone pay attention to anything Gartner said?
Surely not ? (Score:4, Funny)
Surely Microsoft aren't adding a feature to Windows without giving thorough consideration as to how the feature will work in a multi user, internet connected, environment ?
After all they've shown time and time again how much they care about these things
That reminds me... (Score:5, Funny)
Re:That reminds me... (Score:3, Informative)
In your colleague's case it sounds like he may have been able to prevent it, but that is not always so [abanet.org] with metadata that that vendor includes in your documents.
Re:That reminds me... (Score:2)
News? (Score:2)
But I suppose that for the protection of the unwashed, we should inform them of new flaws in MS products.
Hahaha, must have opened porn.... (Score:5, Insightful)
Re:Hahaha, must have opened porn.... (Score:3, Insightful)
Re:Hahaha, must have opened porn.... (Score:3, Insightful)
Not necessarily. Even in the healthiest of relationships one often becomes unreasonably annoyed with one's partner, and sometimes that annoyance gets vented to others. There's nothing wrong with (say) griping to a friend over IM that your GF is driving you up the wall because "she just won't fucking shut up about how her clothes don't fit right,
Re:Hahaha, must have opened porn.... (Score:5, Funny)
Stupidity 101 ? (Score:5, Insightful)
After 10 years of M$ Word disclosing secret information, you'd have guessed that "a removal tool" as mentioned in the article is obvious to anyone with half a brain as not good enough.
Storing the meta-data in a separate file, or how about with the other metadata (i.e. with the inode) isn't so hard, is it? And it is quite obviously the right thing. There's even a big, red hint right there in your face: It's called meta-data. Might want to treat it different from the actual data, you know?
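A minimal Python sketch of the "separate file" option, under loud assumptions: the `SidecarTags` class, the `tags.json` store and the `payload_to_send` helper are invented for this example and do not correspond to any real Windows or Unix facility. Tags live in a side store keyed by path, so whatever reads the document's bytes for mailing never sees them. A store keyed by path does not follow a file that is moved or renamed, which is the part a filesystem-level design would have to solve.

```python
import json
import os
import tempfile


class SidecarTags:
    """Hypothetical tag store kept in its own JSON file, outside the documents."""

    def __init__(self, store_path):
        self.store_path = store_path

    def _load(self):
        if not os.path.exists(self.store_path):
            return {}
        with open(self.store_path, "r", encoding="utf-8") as f:
            return json.load(f)

    def tag(self, doc_path, *tags):
        data = self._load()
        key = os.path.abspath(doc_path)
        data[key] = sorted(set(data.get(key, [])) | set(tags))
        with open(self.store_path, "w", encoding="utf-8") as f:
            json.dump(data, f)

    def tags_for(self, doc_path):
        return self._load().get(os.path.abspath(doc_path), [])


def payload_to_send(doc_path):
    """What a mail or FTP client would transmit: the file's bytes, nothing else."""
    with open(doc_path, "rb") as f:
        return f.read()


workdir = tempfile.mkdtemp()
doc = os.path.join(workdir, "invoice.txt")
with open(doc, "wb") as f:
    f.write(b"Invoice #1\n")

store = SidecarTags(os.path.join(workdir, "tags.json"))
store.tag(doc, "idiot customer", "overdue")

payload = payload_to_send(doc)   # contains the invoice text only
```

The tags stay searchable locally through `store.tags_for(doc)`, while the outgoing payload is byte-for-byte the document and nothing more.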
Re:Stupidity 101 ? (Score:5, Insightful)
Re:Stupidity 101 ? (Score:2)
If you want to guard against stupid l0sers who will only send/save/copy/move the
Re:Stupidity 101 ? (Score:2)
Re:Stupidity 101 ? (Score:3, Insightful)
Associate metadata with the file in the filesystem in such a way that it follows the file around. In other words: Put it in the inode or whatever the Windows equivalent is. That way, metadata stays associated, no matter where you move the file to.
But when you send the file out by mail, FTP or whatever, only the file contents ar
Re:Stupidity 101 ? (Score:2)
Train those users (Score:5, Funny)
Re:Stupidity 101 ? (Score:2)
Re:Stupidity 101 ? (Score:2)
Re:Stupidity 101 ? (Score:3, Interesting)
Unix stores what little metadata it natively supports in the inode, not in the file data blocks.
Special files have nothing to do with metadata, but with the Unix philosophy of "everything is a file", which works great and allows you to reduce the number of necessary system calls considerably.
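To make the inode point concrete, here is a small Python sketch for a Unix-like system. The temporary file and the `inode_fields` dictionary are just scaffolding for the example. `os.stat` returns what the filesystem records about the file (size, permissions, timestamps, inode number), while reading the file returns only the bytes that were written to it.

```python
import os
import tempfile

# Create a small file to inspect.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"hello")

info = os.stat(path)              # filesystem metadata about the file
with open(path, "rb") as f:
    contents = f.read()           # the data itself, with no metadata mixed in

inode_fields = {
    "size": info.st_size,                  # length in bytes
    "mode": oct(info.st_mode & 0o777),     # permission bits
    "mtime": info.st_mtime,                # last modification time
    "inode": info.st_ino,                  # inode number on Unix filesystems
}

os.remove(path)
```

None of the values in `inode_fields` appear in `contents`, which is the separation the comment describes.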
I know no director
This is a BETA, Right? (Score:5, Insightful)
Re:This is a BETA, Right? (Score:2, Troll)
Re:This is a BETA, Right? (Score:2)
Re:This is a BETA, Right? (Score:2)
Re:This is a BETA, Right? (Score:2)
A few high-profile incidents of this could destroy the 15,000 strong company I work for, depending on what is revealed.
The thing is, the business security implications of this are minor at worst, and none at best
Of those 15,000 around 10,000 routinely have contact email with clients, most of those every day. You can't prevent mistakes on that scale without being absolutely fascistic
Re:This is a BETA, Right? (Score:2)
A beta release is (or at least is supposed to be) essentially a release where the important features are pretty much done, and where the "only" work that's left to do is shake out bugs, tweak minor things, fix documentation and so on.
It is NOT a release where you put in all sorts of crazy features that you don't actually plan to have in the final product - that would be rather stupid on pretty much every level I can think of, especially the economic one
The 2008 Toyota Prius (Score:4, Insightful)
Oh, sorry... I just figured that we're talking about products that are still a few years down the pipe that haven't been anywhere close to finalized yet.
I don't know about anybody else, but we not only don't evaluate software years before it's released, but we generally wait until the software has been out for at least a year before even looking at it. I don't know what the point is of reviewing a product this early. The only thing that I can figure out is that it's a way to get a few more pageviews.
Re:The 2008 Toyota Prius (Score:2)
MS has committed to an August 31, 2006 date, so it better be damn close to finalized.
Now, chances are they won't make that date, but they've publicly said they would.
Re:The 2008 Toyota Prius (Score:2)
"embarrassing"? (Score:4, Funny)
All Microsoft has to do (Score:3, Interesting)
is to make the metadata attached to document files viewable only on the Vista installation it was created on. Perhaps it would be possible to have the operating system strip the data off the files that are being copied or moved to other network locations as a precursor to each respective process. In this case, they would also have to work some kind of functionality into the next iteration of Outlook, so that the problem could be stemmed from the email side of things.
What 3rd party vendors would do to accommodate this is anyone's guess.
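As a loose analogy for "strip the data off on copy", Python's standard library already distinguishes a contents-only copy from a copy that carries metadata along: `shutil.copyfile` copies just the data, while `shutil.copy2` also tries to preserve timestamps and other file metadata. The sketch below uses the modification time as a stand-in for richer tags; the file names and the `old_time` value are arbitrary, and none of this is a Vista API.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "report.txt")
with open(src, "wb") as f:
    f.write(b"quarterly numbers\n")

# Give the source an old modification time so the difference is visible.
old_time = 1_000_000_000  # a timestamp in September 2001, chosen arbitrarily
os.utime(src, (old_time, old_time))

with_meta = os.path.join(workdir, "with_meta.txt")
stripped = os.path.join(workdir, "stripped.txt")

shutil.copy2(src, with_meta)     # contents plus timestamps where supported
shutil.copyfile(src, stripped)   # contents only

kept_mtime = int(os.stat(with_meta).st_mtime)
fresh_mtime = int(os.stat(stripped).st_mtime)

with open(stripped, "rb") as f:
    stripped_contents = f.read()
```

An "export for outside use" path in a mail client or file manager could default to the contents-only behaviour and require an explicit choice to carry metadata along.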
Re:All Microsoft has to do (Score:4, Insightful)
This is just another example of disclosures from the past where change log information was left in documents released to public forums. Very interesting info disclosed in some of those word documents. Must be standard procedure now for lawyers to check the change log info on documents they are sent.
And if people don't fill out the meta data info the fancy new search capabilities won't be as useful so why have them?
Re:All Microsoft has to do (Score:2, Insightful)
Yawn, non-story (Score:5, Insightful)
How is this different than naming your file "Invoice for Asshole Larry.doc" and mailing it to the client? Simple solution: don't put potentially embarrassing stuff in the metadata fields.
Do people really need an analysis to tell them this?
Re:Yawn, non-story (Score:2)
I'm not sure how companies ever get out of the stupidity loop, but somehow they get to #4. Companies are constantly hiring high-priced consultants to tell them things that may in hindsight seem obvious, but it really is a matter of experience.
A company that has never been burnt by poorly managed meta-data won't really give a damn unless they have someone thinking ahead.
Word: "Properties" and Track Changes (Score:3, Insightful)
The more data a computer saves (especially if hidden from plain sight), the greater the chance of embarrassment and unintended leakage of sensitive info.
Re:Word: "Properties" and Track Changes (Score:4, Informative)
Re:Word: "Properties" and Track Changes (Score:2)
PDF Annoying? (Score:2)
Seriously, what the hell bugs people so much about PDF?
Re:Word: "Properties" and Track Changes (Score:2)
Now *there's* a solution I can get behind!
From this point on, I am exporting all my shared .doc files as giant GIFs. No harm no foul.
(If it seems like I'm joking... I am. But only sorta.)
Send... as in external... as in not shared. (Score:2)
I mean, you might as well say "How dare you send me this bottle of Chateau D'Yquem. I mean, wine in a bottle? Geez, now I need a corkscrew. Why couldn't you send me a box-o-wine so I wouldn't have to go to all this trouble?" Uhm, yeah, you need an extra t
Usefulness of metadata (Score:4, Insightful)
Having something like "post-it notes" that do not stick to the file, but instead are part of the directory entry for that file, might be more useful and safer. If someone sends me a file, I don't want that person's metadata to pollute my classification of files.
That's somewhat like what happens with e-mail - I receive plenty of mails that the sender marked as "high priority", but that are low priority to me. Metadata on the file should be objective; subjective information should be stored somewhere else and not be transmitted together with the file.
Re:Usefulness of metadata (Score:5, Funny)
In the interestation of securitization, the catalogation of the nation's datation should not be left to the ineptitudination of incompetentation corporatizations with a historicalization of not giving full thoughtfulination to securitization.
Re:Usefulness of metadata (Score:2)
What makes you say that? MP3 files, and their ID3 tags, don't seem to be an issue really?
I like the concept of metadata in the filesystem because it moves beyond the 'folder barrie
Re:Usefulness of metadata (Score:2)
Ok - I still don't see the problem. MP3 tags are user-defined (although often fed by CDDB or the like). So really
Summary (Score:2)
Allchin said those enhancements--along with a reduction in the number of times customers have to reboot their machines and other features--will mean that companies that move to Longhorn will be able to cut their operating costs. Of course, he added, "that's up to us to prove."
Got that? To cut your operating costs, pay Microsoft some more money for some Longhorns.
Company policy. (Score:5, Interesting)
But this will just be an extension to that policy to check for any meta data.
Re:Company policy. (Score:2)
Looking back I wonder if there is still a chance private data could be leaked, that somehow PDF layers the hidden stuff underneath and if someone were to peel back the top.
For the most part, no. PDF files do, however, support the concept of layers (which must be explicitly created by the authoring program). The only security issue I've seen with this is where people layer black boxes over text to censor it, not realizing the information under the boxes is still there and readable. This has caused se
Re:Company policy. (Score:2)
Re:Company policy. (Score:2)
Sounds like the XSLT stylesheet will have to be modified for every type of document?
Re:Company policy. (Score:4, Informative)
The places you need to worry about metadata exposure are the document-aware "export" functionality, because rather than simply printing from primitives, these work with full knowledge of the document and its structure.
This is bull (Score:2, Interesting)
as this type of technology comes to the mainstream it's to be expected the early stuff may have a bug or two. (see: google desktop)
and here they are slamming microsoft for a new feature people are asking for. and telling them how to do it, when they have no idea on how hard this kind of thing is to do from a software engineering perspective.
I mean sheesh The product is in BETA, make a bug report to microsoft as a beta tester if you find a
Terms of Embarrassment (Score:3, Insightful)
Oh, you mean more embarrassing than finding cookies and cached images from pr0n sites and the like? Unless you're considering self comments like "he's so hawt! I'd so tap that!" Note that most people's surfing already involuntarily discloses their personal data like a sieve.
I'd be more concerned about people appending credit card numbers and such to files, not embarrassment.
This Happens Already (Diebold/BlackBoxVoting Ref) (Score:2)
http://www.bbvforums.org/cgi-bin/forums/board-aut
And search for "properties".
Re:This Happens Already (Diebold/BlackBoxVoting Re (Score:2)
Lobbyists and Corps have a long history of writing legislation etc and literally giving it over to our public representatives.
Still disappoints me though.
One day I'll be jaded enough to say "whatever"
Here is quick fix (Score:3, Insightful)
Yes, if they manage to apply a rights based system system-wide, something like OS X, it won't be a problem.
I mean if they are stealing, steal it completely
Note I had to 'sudo ls -la' to see it even.
(os x 10.4 "tiger")
what planet are these people from? (Score:3, Insightful)
Kind of like Gnome has been doing for a few years now? How out of touch are these people???
Re:Stupid (Score:2)
This is idiocy - any disclosure of data which is unwanted can be damaging; so, are we not to have it?
No it is not idiocy. Sharing metadata can be both useful and disastrous, as shown by the metadata often shared with Word files. The concern is that, like MS Word, Vista will include metadata in shared files without providing a proper UI that informs the user and makes sure they are aware of that metadata. MS does not exactly have a stellar record in this regard. Third parties currently provide application
Re:Search your data? (Score:2)
Of course, people that can't be bothered to give their files descriptive names aren't very likely to fill out metadata info, either. So it's not goin
Re:Couldnt care less (Score:4, Funny)
Re:what will happen to the file name? (Score:2)
Folder structure can also be thought of as another piece of metadata (e.g., tag this file/folder with the "owning folder" tag).
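The "folder as a tag" view is easy to sketch in Python. The `folder_tags` and `find_by_tag` helpers and the sample paths are hypothetical, written only to illustrate the idea: each enclosing folder name is treated as one tag, and a search by tag no longer cares where in the hierarchy the folder sits.

```python
from pathlib import PurePosixPath


def folder_tags(path):
    """Treat every enclosing folder name as a tag on the file."""
    return list(PurePosixPath(path).parent.parts)


def find_by_tag(paths, tag):
    """Return the paths that carry `tag`, regardless of nesting order."""
    return [p for p in paths if tag in folder_tags(p)]


files = [
    "projects/2006/vista/notes.txt",
    "vista/screenshots/desktop.png",
    "projects/2006/budget.xls",
]

tags = folder_tags(files[0])
vista_files = find_by_tag(files, "vista")
```

Both files with a "vista" folder somewhere in their path are found, even though that folder sits at different depths in the two hierarchies.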
Re:I doubt Gartner knows what they're talking abou (Score:2)
Yes and no. WinFS could support a concept similar to the resource fork concept in MFS/HFS/HFS+/etc. on the Macintosh. The "content" of the file could be one fork, the metadata could be stored in a second fork, and the forks combine to comprise the file object. I think NTFS might already support such a concept (I vaguely recall reading something a