Gravatars Can Leak Users' Email Addresses
abell writes "Gravatar offers a global avatar service, using an MD5 hash of the user's email as avatar ID. This piece of information in some cases is enough to retrieve the original email address. Testing a simple attack on stackoverflow.com, I was able to determine the email addresses of more than 10% of the site's users."
Public address (Score:5, Funny)
Here's my own Gravatar hash:
b835b33911b93c136d8e61cbbbe6736d [gravatar.com]
Who will be the first to crack it?
Re:Public address (Score:5, Funny)
Is it wagnerr@umich.edu?
Re: (Score:2, Informative)
It is, actually. If you don't include the -n option for echo, it appends a \n to the string, which changes the MD5; that is the hash you got.
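The effect of that trailing newline can be sketched in Python instead of the shell. This is only an illustration: the helper name `md5_hex` and the address `user@example.com` are made up for the example and do not come from the thread.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of a string encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


address = "user@example.com"  # hypothetical address, not from the thread

without_newline = md5_hex(address)       # what `echo -n ADDRESS | md5sum` hashes
with_newline = md5_hex(address + "\n")   # what plain `echo ADDRESS | md5sum` hashes

print(without_newline)
print(with_newline)
print("same digest?", without_newline == with_newline)
```

A single extra byte gives an unrelated digest, so a guess hashed with the newline will never match the Gravatar ID even when the address itself is right.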
Re: (Score:2)
try again
$ echo -n wagnerr@umich.edu|md5sum -
Re: (Score:2)
How about wagnerr@umich.edu? /completelymissingthepoint
Re: (Score:2, Funny)
Who will be the first to crack it?
Fixed that for you.
Re: (Score:2, Informative)
Wagner Computer Science program -- Page Not Found. [wagner.edu] Looks like that answered your question.
Re: (Score:2)
That took all of one second to find in an md5 lookup database. And thirty seconds for me to realize that I could have looked two lines higher to see it in plaintext next to your userid. :wallbash:
Re:Public address (Score:5, Funny)
> That took all of one second to find in an md5 lookup database. And thirty seconds for me to realize that I could have looked two lines higher to see it in plaintext next to your userid. :wallbash:
Upside: You get to keep your geek card.
Downside: You'll never survive the world outside your basement.
8^)
Re: (Score:2)
That's just what I was thinking. I use Stack Overflow, and my username is (predictably) Satanicpuppy. 5 seconds of googling will give you my email address, because I treat all information on public forums as public information.
This is only a problem for people who think that they can really be sure of their privacy because some website takes a half-assed precaution.
Possible workaround (Score:3, Insightful)
Can anyone tell me if the "you can add extra stuff after a +" trick that GMail lets you do is standard in the RFC for all email addresses? If it is, then to "fix" this you could sign up to Gravatar with an email address that has a random string after an added "+", which would make the brute-force search on hashes much, much harder. (Assuming that your email provider implements that part of the standard.)
Re: (Score:3, Informative)
The RFC is actually pretty promiscuous; it's only implementations of it that fall short. Did you know that apostrophes are legal in the u
Re: (Score:3, Informative)
Heck; it's amazing how many sites forbid the '+' sign that Google takes advantage of
Here's what happened in hotmail when I tried to e-mail to [name]+bananas@hotmail.com
http://i49.tinypic.com/fbjh1j.png [tinypic.com]
I googled that odd character and it seems to be Chinese [google.com]
Hotmail treats the "send a message from one of your disposable addresses" generated by Spamgourmet as a typo.
Re: (Score:2)
I don't hold out much hope for slashdot not breaking this link, but here goes: http://translate.google.com/#zh-CN|en| [google.com]
Did you see the images part of your google search? Seems one of the meanings (a verb) is used more than the others...
(if I need to be less subtle, the image search for that character is NSFW)
Re: (Score:2)
To be fair, sub-addressing (using both the '-' and '+' characters) was around well before the creators of Google graduated from high school.
Re: (Score:2)
> Did you know that apostrophes are legal in the username portion of the email address? Yet how many web sites do you think would allow you to sign up as "First_O'Last@mailserver.net"?
I recently sent an e-mail to a firstname.o'connell@host.gov.uk (no need to let her get random spam) in order to submit my response to a consultation the British Civil Service is making on policy relating to voter registration. Crossed my fingers and sent it via gmail.
Re: (Score:2)
It's a common convention that has been around at least since the '90s and probably earlier; I don't know where it started. I use underscores
So? (Score:2)
Unless I'm missing something, the article can be summarized: "Guess the person's email address, check if the md5 hash of the address you guessed matches the Gravatar. If it matches you guessed correctly."
Nothing to see here. Move along...
In other news, all password hashes can eventually be cracked by brute force... Oh noes!
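The guess-and-check loop summarized above can be sketched in a few lines of Python. Treat this as a rough sketch under assumptions: the function names, the candidate usernames, and the example.org address are all invented for illustration, and the trim-and-lowercase step reflects how Gravatar hashes are commonly described rather than anything stated in this comment.

```python
import hashlib
from typing import Iterable, Optional


def gravatar_style_hash(email: str) -> str:
    """MD5 of the trimmed, lowercased address, as a hex string."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def guess_email(target_hash: str,
                usernames: Iterable[str],
                domains: Iterable[str]) -> Optional[str]:
    """Try every username@domain pair and return the first one whose hash matches."""
    domain_list = list(domains)
    for name in usernames:
        for domain in domain_list:
            candidate = f"{name}@{domain}"
            if gravatar_style_hash(candidate) == target_hash:
                return candidate
    return None


# Hypothetical demo: pretend this hash was scraped from an avatar URL.
target = gravatar_style_hash("jane.doe@example.org")
found = guess_email(target,
                    usernames=["jdoe", "janedoe", "jane.doe"],
                    domains=["example.com", "example.org"])
print(found)
```

The attacker learns nothing unless one of the guesses is correct, which is why the quality of the candidate list matters far more than the hash function.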
Re: (Score:2)
Exactly. Not like it matters anyway. I even post my email up on my website so people can like, you know, email me!
At first glance... (Score:2)
Re: (Score:2)
No, Gravatar is the son of Gappa, the Triphibian monster.
Re: (Score:2, Informative)
And you didn't think of Gravitar instead? Kids these days...
http://en.wikipedia.org/wiki/Gravitar [wikipedia.org]
Re: (Score:2)
gravatar also sounds like an alternate name for a black hole.
Why is this a problem? (Score:2, Insightful)
Re: (Score:2)
> I used to post a fair bit on Usenet and I am fairly sure most of the spam I
> get is from spammers who picked up my email address there.
I still post a fair bit on Usenet. Most of the spam I get is not from spammers who picked up my email address there.
e9af4cb49c97162d6be3ea8c6ca90a46 (Score:2, Funny)
I actually *just* (20 minutes ago) put my picture up there. Can you guess my email ;)
Re: (Score:2)
I'm not sure if you were sarcastic or not, but your email address is at gmail, and I'm gonna mention Fight Club, and there's no I in team. Do you want me to post your email address more plainly?
So, yeah, posting email hashes is only a little bit safer than posting the full text.
Re:e9af4cb49c97162d6be3ea8c6ca90a46 (Score:5, Interesting)
Your email is: tyler.szabo _AT_ gmail.com
md5 -s "tyler.szabo@gmail.com"
MD5 ("tyler.szabo@gmail.com") = e9af4cb49c97162d6be3ea8c6ca90a46
For bonus points, your name is Tyler Szabo, you go to University of Waterloo and plan on graduating in 2011. You work at Amazon. You are in a relationship with a Kaylan Elizabeth L. (last name withheld as a courtesy, I'm sure you know who I mean :) ).
I found out you registered this, looked up your avatar on Gravatar, found you on Stack Overflow which gave me your real name (searched for Szabo assuming that was something to do with you). Using this, I looked you up on Facebook, Twitter, and various other sites. Your single avatar helped me link everything together. Once I had your real name from Stack Overflow it became easy.
Good times. Perhaps this reveals another security vulnerability? One avatar links -ALL- your social networking.
I also have your parents, previous employers, etc, but won't post those here :)
Re: (Score:3, Funny)
> Your email is: tyler.szabo _AT_ gmail.com
> md5 -s "tyler.szabo@gmail.com"
Nice job obfuscating his email in the first line.
use email+whatever@domain.com (Score:3, Insightful)
Use your email address with "+randomsequence"@
The random sequence will have to be the same on every site where the user wants the Gravatar to work, but it will produce an MD5 hash different from that of their plain address; and if a site sends email to the tagged address, the user will still receive it.
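A minimal Python sketch of that idea follows. The helper names and the address are hypothetical, and it assumes the mail provider really does deliver "local+tag@domain" to "local@domain", which not every provider does.

```python
import hashlib
import secrets


def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of a string encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def tagged_address(address: str, tag_bytes: int = 8) -> str:
    """Insert a random '+tag' before the '@' (assumes the provider supports sub-addressing)."""
    local, domain = address.rsplit("@", 1)
    tag = secrets.token_hex(tag_bytes)  # 2 * tag_bytes hex characters
    return f"{local}+{tag}@{domain}"


plain = "user@example.com"        # hypothetical address
tagged = tagged_address(plain)    # generate once, then reuse on every site

print(tagged)
print(md5_hex(plain))
print(md5_hex(tagged))
```

The tag has to be generated once and stored, because a fresh random tag on each site would produce a different hash and therefore a different (missing) avatar.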
easier than other methods? (Score:3, Interesting)
But is this significantly easier than other methods of harvesting email addresses? Spammers already do dictionary attacks on big providers like yahoo. It's not clear to me that this method is a better way of generating a list of email addresses. If you carry out a dictionary attack on yahoo.com, you're going to come up with probably tens of millions of valid email addresses. If you carry out this attack on gravatar.com, how many addresses are you going to get for your trouble? 10% of gravatar's users, apparently -- which I'm guessing is not really that big a number. Remember, once a spammer has a botnet, it costs him zero to send out one more spam to test whether a particular address is valid. Therefore the dictionary attack is free.
The defense against dictionary attacks is also exactly the same as the defense against this attack: either don't use a big email provider, or use a big email provider but pick a username that has a lot of characters (so it's not vulnerable to brute-forcing) and is also not vulnerable to dictionary attacks.
Not A Bug (Score:3, Insightful)
Email addresses are usernames. They are not secret information. If somebody can be bothered to find your email address by brute-forcing its MD5 hash, you've got bigger problems.
Far more than "10% of stackoverflow.com's users" can have their email addresses GUESSED far faster. Likely your email address is also FAR easier to establish through a simple Google search on your pseudonyms.
If for some odd reason you want your email address to be secret, for the same reason as wanting a secret pseudonym or using a false name when signing up, register a fake email address instead (and set it up for forwarding). You're giving your email address in clear text to the site's owner and all the internet hops in between him and you ANYWAY.
It's important to learn to distinguish between what is a secret and what is not; and if you want to make things secret, at what level you should put your trust.
Re: (Score:2)
I'd disagree. Every site I can think of off the top of my head uses e-mail addresses and usernames as separate entities if the username is public. For instance netflix uses e-mail addresses to login, but they have you create a separate username for posting reviews and other shared/social parts of the site. Likewise slashdot makes sharing your e-mail address optional. And even the site in question tried to hide the e-mail address, they just did a very poor job of it.
Public Key Encryption (Score:2)
What if Gravatar published a public key, and sites displaying Gravatars pointed their image links to encrypt(gravatar_id + random_salt)? It seems like this would solve the problem, since people viewing the page can't get access to the users' real Gravatar IDs. Sure, the forum sites would still see your Gravatar ID, but they already have your email address in the first place.
Re: (Score:2)
Two points.
Firstly, the image files can't be static if you're using the salt, since the gravatar backend would have to remove it and look up the gravatar_id; this would increase running costs for gravatar by a considerable amount. Second, if you're using a gravatar_id why bother with the encryption? As long as there's no way for the gravatar ID to be resolved back to an email address it doesn't matter if people know it, especially since knowing the encrypted version would necessarily be functionally identic
In the grand scheme of things this is pretty minor (Score:2, Funny)
It's not exactly big news that a system based on MD5 hashes is susceptible to dictionary-style attacks; this should be obvious to anyone who understands how hashes work. In order for this particular attack to work, the attacker already has to have some reasonable guesses as to what your e-mail address is; the Gravatar trick only confirms the address. So it seems to me that the amount of additional data leaked is fairly small.
OTOH, I suppose I'm somewhat desensitized to this sort of thing, since I've had the
Could provide an API (Score:2, Interesting)
From Gravatar's FAQ:
MD5 isnt strong enough encryption, they’ve cracked that havent they?
MD5 is plenty good for obfuscating the email address of users across the wire. if you’re thinking of rainbow tables, those are all geared at passwords (which are generally shorter, and less globally different from one another) and not email addresses, furthermore they are geared at generating anything that matches the hash, NOT the original data being hashed. If you are thinking about being able to reproduce a collision, you still don’t necessarily get the actual email address being hashed from the data generated to create the collision. In either case the work required to both construct and operate such a monstrocity would be prohibitively costly. If we left your password laying around in the open as a plain md5 hash someone might be able to find some data (not necessarily your password) which they could use to log in as you... Leaving your email address out as an md5 hash, however, is not going to cause a violent upsurge in the number of fake rolex watch emails that you get. Lets face it there are far more lucrative, easier, ways of getting email address. I hope this helps ease your mind.
So, they might have already thought about this vulnerability and dismissed it as not interesting.
They could still fix their concept by providing an API where a website wanting to discover the avatar for a given email first hashes the email with MD5 and then the Gravatar URL which is generated redirects them to a link to the image (which contains no information about the email address, or perhaps uses a salted [wikipedia.org] hash). This, in conjunction with rate limiting the number of queries per websit
Simple way to protect yourself (Score:2, Insightful)
Some email providers have a simple way of giving you a throwaway ID. E.g., example+slashdotnospam@gmail.com is sent to example@gmail.com.
Say my name is Lary Page. If my email id is lary.page@gmail.com, I can still protect myself so that you will never get my email id.
MD5 (lary.page@gmail.com) = "1b8dbe98e2b1138fd3ba34e26fc55107".
So I provide my email id as lary.page+1b8dbe98e2b1138fd3ba34e26fc55107@gmail.com. If I gave you the md5 of that id, you'll find it hard to get back to lary.page@gmail.com.
Try, the
In other news, Water is wet (Score:2)
I think most of us figured out this possibility within 30 seconds of seeing how Gravatar worked.
One solution would be to have a private salt known only to Gravatar and the implementing website. Gravatar could determine the correct salt to use based on the referrer.
Of course this would mean each subscriber would need to be hashed against each salt in the Gravatar database.
In either case, I don't think it's really that big a deal.
a pinch of salt (Score:2)
Call me when he finds a way to determine the email after gravatar starts adding a pinch of salt to the hashed emails...
Who cares? (Score:2, Interesting)
Using &#64; instead of @ is enough to stop most e-mail harvesting bots; I don't see them brute-forcing MD5s any time soon.
How is that news? (Score:2)
It's obvious to everyone who understands how it works.
Geez...
As the email of Gabe (from Valve) is well known, and Gravatars can be used in a pseudo-anonymous way, I tried to search the internet for the hash of his email on images.google.com. Not found. Either Gabe doesn't talk in Gravatar-powered forums, or he uses a different email address.
So, if you use Gravatars and other people know your email, they can search for your posts. This is obvious from the use of MD5. With your address hashed with MD5, spam bots can't collect addr
Thats why I use.... (Score:2)
That's why I use a new Hotmail address, usually made from the site's name and my own, to keep logs of everything that comes from there; if anything is compromised, I usually know where it came from. Also, I have no worries if someone gets that address, as it is not my real one.
Not News (Score:2)
The important part of the trick is that you have to assume the email address is the same as the username and then compare the hashes of that name @yahoo.com, @hotmail.com, @gmail.com, and other popular email services. Because people that use those webmail addresses have never received spam before.
If any spammer did try this, I would expect them to be very pissed off to discover that after all that work they already had 99% or more of those addresses to begin with.
If your email address is simple... (Score:2)
If your email address is common-word@famousprovider.com, then the spammers have already put your email address into their lists. Why not? They don't care if 95% of the mail they send bounces, and they don't care if they target any specific person; the "hit" rate they need to make a profit is negligible. I see spam attempts to thousands of never-existed addresses on my colo, and my home domain is pretty damn obscure. I'm sure Gmail gets hits from aaron.aardvark through zephram.zymurgy continually.
Re:So let's change the algorithm. (Score:4, Insightful)
No, it's not related to MD5 itself. Period.
Re:So let's change the algorithm. (Score:5, Insightful)
I [benramsey.com] disagree. [gromweb.com]
Granted, those are basically very unsophisticated databases that just store lookup values, but it's relatively easy to bruteforce an MD5 hash down into one of the possible original strings (obviously with any algorithm that has a fixed output size with limitless inputs like MD5 there are infinite inputs that will hash down to a single md5sum, but when you're trying to get a valid email address out of a hash it's easy to pick the right one). Couple that with the fact that in this situation, you know that the entire string is lowercased and probably 60% of the gravatar emails (probably more like 90% actually) are going to come from one of four or five domains... reversal becomes quite easy. If you're bored, you could spin up a few Amazon EC2 or Rackspace Cloud Server instances to dump out some large tables. One each for gmail, yahoo, msn, aol, whatever else; it'd be a very simple script to make. You could probably cover every alphanumeric email address under 12 characters overnight, at a cost of about a dollar and ten minutes of scripting.
The thing to realize here is that Gravatar doesn't MD5 emails to protect people who want to obscure their identity, just to hide the addresses from spambots. So it's really a non-issue. If you're that concerned, leave your blog comments with a fake email address.
Re: (Score:2)
You hit the nail on the head. If one uses these, they should use an alias (I know Hushmail and Yahoo both offer alias functionality) that they can filter incoming mail with.
Even better, because Gravatar is essentially Alice and Bob, they should have gone with a salt (64 bits is "meh", 128 is decent, 256 is good for the foreseeable future), SHA-256, and a site key that only their backend database knows. This way, it would be immensely difficult to associate the hash with an E-mail addr
Re:So let's change the algorithm. (Score:4, Insightful)
Doubt it. There are 26 letters and 10 digits, and in addition "." is very common in email addresses. Thus you get 37 possibilities for each position. 37 to the 12th power is 6582952005840035281 hashes to run, and even if you do 10^9 hashes per second (i.e. one giga-hash a second, which would require on the order of a few hundred cores), you'd still need 208 years to do that many hashes. Then you need to look up each of them in Gravatar and analyze the result for a hit or miss.
"Every alphanumeric email address under 12 characters" is in fact much too large a keyspace to reasonably cover overnight with a "very simple script".
It's not a large enough keyspace to be cryptographically secure, but it's large enough to not be trivially exhaustible.
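The arithmetic above is easy to re-check. The short Python sketch below reuses the comment's own figures (37 symbols, 12 positions, 10^9 hashes per second); the constant names are made up here, and counting only strings of exactly 12 characters follows the comment rather than the stricter "under 12" wording.

```python
ALPHABET_SIZE = 26 + 10 + 1      # letters, digits, and "."
LENGTH = 12                      # local-part length used in the comment
HASHES_PER_SECOND = 10 ** 9      # the comment's assumed hash rate
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

keyspace = ALPHABET_SIZE ** LENGTH
seconds = keyspace / HASHES_PER_SECOND
years = seconds / SECONDS_PER_YEAR

print(keyspace)                  # 6582952005840035281
print(f"{seconds:.3e} seconds")
print(f"about {int(years)} years")
```

Changing `LENGTH` to 8 drops the estimate from centuries to under an hour at the same rate, which shows how sharply the result depends on the assumed address length.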
Re: (Score:2)
> get a list of email addresses
If they had that, they wouldn't need to do anything now would they?
Re: (Score:2)
True. And -that- is feasible. I was just commenting on the claim that you can exhaustively search all 12-character alphanum strings in a trivial amount of time. You cannot.
Re: (Score:2)
Actually, it's not infeasible. If you own a botnet with 100,000 machines at your disposal, you could set them to cranking through these hashes. If they could crank through at your estimated speeds (which are generous given that most infected machines are likely to be slower, but it still gets the point across) they'd crack it in less than a day. Even if the problem was two orders of magnitude harder than you suggest, it's still doable in about two months. You are still correct with your tightly qualifie
Re: (Score:2)
So why bother searching? Since the only reason Gravatar obscures email addresses is to stop spammers, the spammers can just send email to all addresses that correspond to [common user name]@[common domain]. In fact, that's exactly what they do. There's no need to waste time and money breaking MD5 hashes.
Re: (Score:3, Insightful)
That's assuming email addresses are random sequences of letters, digits and dots.
If you're a spammer and don't mind missing the email of mr. q9x7.3f.1zzp@hotmail.com, a phone book would probably provide an effective dictionary for narrowing that keyspace considerably
Re: (Score:3, Insightful)
Or, use john -incremental -stdout. This will test reasonable names first, while not being restricted to RL names only.
Re: (Score:2)
This programmer used a bot to gather over 8k email addresses. So it's pretty useless against spam.
Re: (Score:2)
If all you have is the hash, then bots would be pretty useless.
Re:So let's change the algorithm. (Score:4, Informative)
Not really, since the salt would need to be publicly known for Gravatar to work (and it would break any backwards compatibility to add it in now). This was a 'social engineering' attack, not a rainbow table lookup – it pieced the name together with common providers to find a matching MD5. Salt would just add a single extra step.
I believe it's exactly the same problem/attack as was brought up about MicroID [wikipedia.org] in the past. The idea of Pavatar [pavatar.com] is a much better way to do this sort of avatar-finding (though the decentralisation comes with its own problems), since it relies on a public web address instead of a semi-private e-mail address.
Re: (Score:2)
Wull, anyway
No thanks to kdawson who perpetrated this and alerted every spammer with a slashdot tab open that they could start harvesting email addresses there and how to do it.
DORK! No pie for you!
Re:So let's change the algorithm. (Score:5, Informative)
1) register as a website with gravatar, find out how long the salt is
2) register on stackoverflow with your email address
3) enumerate the possibilities until you find the hash of your own address and therefore the salt
4) extract 8000+ emails from stackoverflow
5) repeat for other sites
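Step 3 above can be sketched as follows. This is a toy illustration under strong assumptions that the thread does not confirm: the salt is short, drawn from lowercase letters and digits, and simply prepended to the address before MD5. The function names, the salt "k9z", and the address are all invented for the example.

```python
import hashlib
import itertools
import string
from typing import Optional


def salted_md5(salt: str, email: str) -> str:
    """Hypothetical scheme: MD5 over salt + email, as hex."""
    return hashlib.md5((salt + email).encode("utf-8")).hexdigest()


def recover_salt(known_email: str,
                 observed_hash: str,
                 max_len: int = 3,
                 alphabet: str = string.ascii_lowercase + string.digits) -> Optional[str]:
    """Enumerate every salt up to max_len characters until the known address matches."""
    for length in range(1, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            salt = "".join(chars)
            if salted_md5(salt, known_email) == observed_hash:
                return salt
    return None


my_email = "attacker@example.com"             # the address registered in step 2
observed = salted_md5("k9z", my_email)        # stands in for the hash seen on the site
print(recover_salt(my_email, observed))
```

The search space grows by a factor of 36 per extra salt character here, so this only works against short or low-entropy salts; a long random salt would put the enumeration out of reach.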
Re: (Score:2)
How is that going to work? If each site uses a different salt, it will produce a different hash for the same email, thereby defeating the whole purpose of Gravatar.
Re: (Score:2)
You can generate a list of potential emails using usernames, so you could run through GameboyRMH@aol, gmail (got me!), h
Re: (Score:3, Informative)
MD5 collisions actually don't help the attacker here, in fact, an MD5 collision would simply be a false positive for this case (the attacker thinks they've found the email address, but they haven't).
No need (Score:3, Insightful)
It would have been trivial for them to just add a secret salt string to the email before hashing, and that would have solved most of the problem. It is possible that they wanted to be "nice", in that in the case they go out of business, anyone can regenerate the ID's without them. But, as this guy has shown, that's not a great idea.
You could add a salt yourself (Score:2)
I guess you could add a salt yourself, at least if your email provider works like Gmail and allows you to supply a meaningless string after a +. If the first part of your email address is guessable from your username, you could do something like:
homburg+randomsalt@gmail.com
Re: (Score:2)
No, it would not be easy to reverse engineer the salt string. Even if you know half of the source text and the hash, that does not make it much easier to get the second half of the source. You would still have to try all possible combinations. The secret salt could be:
example@gmail.com124235rjcw475tvye
example124235rjc@w475tvyegmail.com
e1x2a4m2p3l5er@jgcmwa4i7l5.tcvoyme
124235rjcw475tvyemoc.liamg@elpmaxe
There are just too many possibilities.
Re: (Score:3, Informative)
It's not, any hashing function would be subject to the same problem. If you RTFA you'll find that they just brute force combinations of the user name and common email domains.
To actually fix this would require not hashing (only) the email address: you could mix in some secret salt with the email before hashing, or you could use encryption (with a secret key), or you could just hand out unique identifiers which are associated only in the Gravatar database. I don't know if any of these are feasible for this parti
Re: (Score:3, Interesting)
It's quite well known that MD5 shouldn't be used for anything privacy related, given the fact that it's been exploited quite publicly in recent history.
An email address isn't private... I suspect that MD5 was just a convenient way to get a fixed length id. I'd be more worried about collisions, but I'm too lazy to calculate how many avatars would be required before that might become a problem.
Re: (Score:2)
>> An email address isn't private... I suspect that MD5 was just a convenient way to get a fixed length id. I'd be more worried about collisions, but I'm too lazy to calculate how many avatars would be required before that might become a problem.
(2^128)^.5 = 2^64
Phew, I'll have to take a break after that one.
Re: (Score:2)
> (2^128)^.5 = 2^64
> Phew, I'll have to take a break after that one.
That's an approximation for a 50% probability though... right? I'd be more inclined to think that anything over a one-in-a-million probability of a collision is unacceptably high, as a collision would break things, but that's what I'm too lazy to figure out. It's not purely a maths problem.
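The one-in-a-million figure can be estimated with the standard birthday approximation p ≈ 1 - exp(-n^2 / (2N)), where N = 2^128 is the number of possible MD5 values. The Python sketch below is a back-of-the-envelope calculation, not a statement about Gravatar's real user counts; the function name is made up, and it assumes MD5 output is uniformly distributed over ordinary email addresses.

```python
import math

HASH_BITS = 128
SPACE = 2 ** HASH_BITS  # number of possible MD5 digests


def items_for_collision_probability(p: float, space: int = SPACE) -> float:
    """Solve 1 - exp(-n^2 / (2 * space)) = p for n (birthday approximation)."""
    return math.sqrt(2 * space * -math.log1p(-p))


n_half = items_for_collision_probability(0.5)
n_millionth = items_for_collision_probability(1e-6)

print(f"50% collision chance:          ~{n_half:.2e} items")
print(f"one-in-a-million chance:       ~{n_millionth:.2e} items")
print(f"2^64 for comparison:            {2 ** 64:.2e}")
```

Under these assumptions even the one-in-a-million threshold sits in the tens of quadrillions of hashed addresses, far beyond any plausible number of accounts.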
Re: (Score:2)
Yes. For a birthday attack, the math says you need about the square root of the number of possible hash values before a collision becomes likely.
2^64 is a lot of items, which is why hashing is still useful.
But back to TFA, these items should be salted with a secret salt to make the data unusable to outsiders.
eg: md5('mypass'+$youremail) = useless information to hackers
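The one-liner above can be written out as a small Python sketch. The function name, the salt values, and the address are placeholders invented for illustration, and the salt-then-email order simply mirrors the comment. In practice a keyed construction such as HMAC (Python's `hmac` module) is usually preferred over plain concatenation, but that is a swapped-in suggestion, not something the comment proposes.

```python
import hashlib


def salted_id(secret_salt: str, email: str) -> str:
    """MD5 over secret_salt + normalized email, as hex (mirrors the comment's md5('mypass'+email))."""
    normalized = email.strip().lower()
    return hashlib.md5((secret_salt + normalized).encode("utf-8")).hexdigest()


email = "user@example.com"                 # hypothetical address
plain_id = hashlib.md5(email.encode("utf-8")).hexdigest()
id_with_salt_a = salted_id("salt-A", email)  # placeholder secrets
id_with_salt_b = salted_id("salt-B", email)

print(plain_id)
print(id_with_salt_a)
print(id_with_salt_b)
```

An outsider who guesses the right address still cannot confirm it without the salt, but every party that must compute the ID needs that same secret, which is the practical obstacle other comments in the thread point out.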
Re: (Score:2)
If the gravatar makes the pairing trivial then it's trivial to automate. And so the spam filter will have to iterate.
Re: (Score:2)
By using this exploit, spammers get additional useful user data: They'll know each user's full name in most cases. They'll know that the user is interested in the site he's commenting on. They'll know what language he speaks. Basically, they can compose much more compelling emails with a higher probability of getting through and even being seen as relevant to the recipient.
Re: (Score:2, Insightful)
Correct: the attack here is:
Take a big site with thousands of users, many using their (sorta) "real names".
Permute these names with some known big email provider hostnames.
Send them all some spam.
It does not really matter if 90% of those email addresses are incorrect; the rest will hit.
I would not do the MD5 validation thing, why should I?
Re: (Score:2)
You're dead on about using thousands of hashes. The practice hurts an attacker far more than it hurts legitimate users. It's called key stretching [wikipedia.org], or key strengthening.
Not the algorithm (Score:3, Interesting)
This is not related to the MD5 algorithm or use of salts. The fact is that Gravatar wants sites to use Gravatar without sending loads of requests to gravatar.com. Therefore Gravatar must provide a "client-side" API for generating Gravatar avatar URLs based on the known constant, email addresses. Sure, they could have salted things, but whatever they do, there's an essentially open source function somewhere that takes an email address and converts it to a Gravatar URL. As the algorithm is available to an
Re: (Score:2)
A normal implementation of salt (with the salt in plaintext along with the hash) would not help in this case.
Re: (Score:2)
In order for Gravatar to work, the algorithm has to be publicly known. Which means every site uses the same salt (pointless) or each domain has its own salt, which can be determined from the referrer header (not only also pointless since a potential attacker knows what site they're on, but it would also make the service pretty much impossible to implement). The only other option would be two-way encryption with some sort of per-domain shared key, but given that most of the point of Gravatar is simplicity o
Re: (Score:2)
Salting would help a bit here, but far more effective would be key stretching [wikipedia.org]. Hash the email, then feed the hash back through the hash function a few thousand times. The extra computation doesn't have much of an impact when generating a single email identifier, because hash functions are blazing fast, and 1,000 iterations is still blazing fast. But the extra computation grievously hurts people who are using brute force to create rainbow tables, making the whole thing take thousands of times longer.
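A minimal sketch of the iterated hashing described above, in Python. The function name and the round count of 1,000 are illustrative choices, and chaining raw MD5 is used only because it matches the comment; a purpose-built function such as `hashlib.pbkdf2_hmac` from the standard library would be the more usual choice today.

```python
import hashlib
import time


def stretched_md5(email: str, rounds: int = 1000) -> str:
    """Hash the normalized email, then feed each digest back through MD5 (rounds total)."""
    digest = hashlib.md5(email.strip().lower().encode("utf-8")).digest()
    for _ in range(rounds - 1):
        digest = hashlib.md5(digest).digest()
    return digest.hex()


email = "user@example.com"  # hypothetical address

start = time.perf_counter()
single = stretched_md5(email, rounds=1)      # identical to a plain MD5 of the address
stretched = stretched_md5(email, rounds=1000)
elapsed = time.perf_counter() - start

print(single)
print(stretched)
print(f"both computed in {elapsed:.4f} s")
```

One identifier still costs well under a second to compute, while anyone enumerating candidate addresses pays the 1,000x multiplier on every single guess.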