A Look At Google's Email Spam Prevention
CNet has a story about the security measures Google employs to protect their email systems and fight the never-ending war on spam. Their Postini team, acquired two years ago, has a variety of monitoring tools and automated response systems to find and block undesirable messages. Quoting:
"The system scores each message on numerous combinations of criteria, assigning a weight to each and then comparing the score to those in a database of several hundred thousand message types that have been flagged as good or bad from Postini honey pots and customer spam reports. ... To block fresh spam attacks not covered by existing heuristic technologies and viruses not covered by existing signature databases Postini relies on proprietary Zero-Hour technology to identify new outbreaks that show up in the traffic patterns and quarantine them for later rescanning. Customers can also create and build out their own white lists of message senders they trust and blacklist others they don't trust. It takes an average of 150 milliseconds for a message to be scanned by the antivirus engines that Postini licenses from McAfee and Authentium.
Don't care how they do it.. (Score:5, Insightful)
Go gmail
Re:Don't care how they do it.. (Score:5, Informative)
Pfft.. the internet became sentient sometime ago and used to babble like a baby. Since whatever it said was pretty much garbage, it was impossible for anyone to correctly figure out whether the noise was the baby's (spam) or from the tv (non-spam?). Now that the internet speaks more coherently it is far easier for Google to figure out stuff that is coming from the internet - spam that is. It is rather obvious actually.
I wonder why yahoo has a miserable spam filter though; maybe Yahoo is like the careless parent who never gave a shit to figure out when the baby stopped babbling. And judging by the kind of spam I get in my hotmail box (it is all from microsoft), probably MS would be like those parents who insist on babbling themselves when the baby is around.
There, mystery solved! Now no one has to RTFA. Now if only someone made this into a car analogy for the greater good.
Re: (Score:2)
Okay, I hope the moderation of my post was an instance of meta-humor. Just in case someone who is ignorant about spam filtering techniques believes the moderation and thinks my post is actually informative or insightful: STOP! The internet is not really sentient (yet).
Re:Don't care how they do it.. (Score:4, Funny)
STOP! The internet is not really sentient (yet).
Am too!
Re: (Score:2)
I'm curious to know why you say this. I have both a Yahoo and gMail account and for both, equal amounts of spam make it into my inbox (which is maybe one message every 3 or 4 months). Both seem to have very good anti-spam technology to me. Now Hotmail, I have no idea about since I stopped using them about 8 or 9 years ago.
Re: (Score:2)
When you go live can you learn to read please :)
Re: (Score:2)
the internet became sentient sometime ago and used to babble like a baby.
So THAT is how babby is formed!?
Re:Don't care how they do it.. (Score:5, Insightful)
Don't care how they do it..
Then I suggest that you don't really belong on /. ...
Re: I do care how it works (Score:3, Interesting)
Re: (Score:3, Interesting)
Re: (Score:2)
...because it's actually not working - Gmail spam filter recently became very ineffective - I have to classify about 5-10 Viagra spams daily. (Google, have you heard of it? geez!) Then it occurred to me that a while ago the Gmail captcha was cracked, so I imagine spammers send themselves hundreds of spams only to classify them as "non-spam". As a consequence, spams are now slipping through the crowd-sourced filter because the crowd is infiltrated. C'mon Google, this can't possibly be that hard to fix!
Actually I think they already have. I noticed the same thing only I was receiving a far greater volume. I think I suddenly went to a couple of hundred emails per day, some getting as far as my spam folder, some getting in to my inbox. Now just this weekend I noticed that this has now ceased and the number in my spam folder is working its way back down as they are deleted.
Re: (Score:3, Informative)
I get loads more spam than I used to. Something broke in Google's spam prevention about 4 months or so ago, and it's not been fixed yet. I redirect my email to my phone, where I get a notification of new email, and I've had to turn the sound and vibrate alert off because I got too much spam coming through.
Re: (Score:2, Interesting)
I've been told by some people that part of the reason for the recent suckage of gmail's spam filter is people who think they're smarter than google and automatically mark all their messages as ham so they can get them via pop or smtp to their computers and then run their own spamassassin/razor/bla tools on the mail. Thus, messages that are _obviously_ spam get marked as ham and are forwarded to the rest of the users. I don't think it's the main reason, but worth sharing anyway in case someone knows more about this 't
Re: (Score:3, Interesting)
Spam Assassin is a great complement to GMail's spam filters.
1) I use IMAP Spam Begone [rogerbinns.com] to check my google inbox and mark stuff as spam/not spam.
2) I use DMZ's remote SA-Learn [dmzs.com] to learn spam from my google spam folder (after I check it for false positives) and I use it to learn ham of stuff that IT marked wrong.
Result, I haven't had any spam make it through since I started using it.
(Both scripts do require editing. isbg.py hasn't been updated in 5 years, so to work with newer python I fixed some things and sa-l
Maybe there's less spam nowadays (Score:2)
I used to have 20,000+ in my spam folder every day for years. Recently it dropped to the low 400's.
But because there's much less spam, I actually check the spam folder quite often to see if there are false positives, and I almost always find a few. Makes me wonder how much mail I missed all this time?
Re:Don't care how they do it.. (Score:5, Interesting)
I've set up GMail to filter my email and by comparison I'd say one or two spams get through. So I'm very happy with GMail's level of coverage. It's not perfect but it makes things tolerable. I'm not at all happy with Yahoo's level of coverage. Yahoo allegedly also has spam filters, but I've yet to see they actually work. It's not uncommon to find my email box filled with Nigerian and other scams.
Re: (Score:3, Informative)
Wow, what Bayesian filter are you using that is only giving you a 20% catch rate?
I'm using spambayes [sourceforge.net] (a pop3 proxy) and I would estimate it catches well above 95% of my spam. My inbox would be utterly unusable without it.
It requires some training - the more training you give it and the more religious you are, the better it works. I've trained it on around 3000 ham and 3000 spam messages and it is incredibly accurate (almost scarily, sometimes) at catching spam. False positives are extremely low - here's the
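To show why training volume matters so much, here is a toy word-count Bayesian classifier in Python. It is plain naive Bayes with add-one smoothing, written only for illustration. The class and method names are made up, and it is not how spambayes itself is implemented.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes spam classifier (illustrative only)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.msgs = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Record the words of one message under 'spam' or 'ham'."""
        self.words[label].update(text.lower().split())
        self.msgs[label] += 1

    def _log_score(self, tokens, label, vocab_size):
        total = sum(self.words[label].values())
        # Smoothed prior based on how many messages of each kind were seen.
        score = math.log((self.msgs[label] + 1) / (sum(self.msgs.values()) + 2))
        for tok in tokens:
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((self.words[label][tok] + 1) / (total + vocab_size))
        return score

    def is_spam(self, text: str) -> bool:
        tokens = text.lower().split()
        vocab = len(set(self.words["spam"]) | set(self.words["ham"])) or 1
        return self._log_score(tokens, "spam", vocab) > self._log_score(tokens, "ham", vocab)
```

With only a handful of training messages the word counts are tiny and a single odd word can swing the result, which fits the advice above to feed it a few thousand of each kind.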
Re: (Score:2)
1 out of 3 manually fed spams was classified incorrectly? That's incredibly high. After feeding it the first few hundred spam e-mails, it should already be working well enough that the end ratio should be much lower.
Anyhow, how fast is it? GMail's system appears to be horribly slow. 150
Re: (Score:2)
"Anyhow, how fast is it? GMail's system appears to be horribly slow. 150 ms per message means it can only handle 8 messages per second. And that's likely on fairly new hardware -- my old PIII mail server can do far better than that."
I would bet you that this stat is PER SERVER (which google has whole buildings full of)
Re: (Score:3, Interesting)
20% on a Bayesian filter is ridiculously low; so low in fact I believe you are stretching the truth to make a point, or you're not training it.
My gmail account is quite old (gotten when only google employees were giving out beta requests), using an extraordinary common firstname.lastname account name, and since Jun 17, I've gotten 2247 spams. So that's what, 19 days? Gmail has *let through* probably fewer than 10 actual spam in that time frame (0.44%), and I haven't checked for any false positives.
Re: (Score:2)
Beyond the obvious keyword flags (any various drug names and the various ways to spell mortgage) I have three pretty simple rules :
1. If it has invalid html tags in the text, it is probably spam.
2. If the originating IP address isn't from within the US, it is probably spam.
3. If my email address isn't the only email address the email was sent to, it is probably spam. Anybody who emails me knows that if it isn't worth sending me my very own copy, it probably isn't worth me reading either.
Honestly for me tho
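Rule 3 is the easiest of the three to sketch in code. The Python below uses only the standard library `email` package. The function names are made up for the example, and it looks only at the visible To: and Cc: headers, so anything that reached you via Bcc or a mailing list would also trip it.

```python
from email import message_from_string
from email.utils import getaddresses


def visible_recipients(raw_message: str) -> set:
    """Collect every address listed in the To: and Cc: headers."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    return {addr.lower() for _name, addr in getaddresses(headers) if addr}


def fails_rule_3(raw_message: str, my_address: str) -> bool:
    """True when I am not the one and only visible recipient."""
    return visible_recipients(raw_message) != {my_address.lower()}
```

Rules 1 and 2 would need an HTML validator and an IP geolocation database respectively, so they are left out of this sketch.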
Re: (Score:2)
Email is for old people. the rest is done through chat/twitter/facebook/what-not-else-other-sns.
Most of my mails nowadays are notifications from other services that I have a mail waiting there :)
"Postini"? (Score:5, Insightful)
My previous ISP switched me over to Postini with no advance notice (we got a cheery note from marketing after the deed was done). Blocked half the spam and half the ham. They told us how to disable the filtering "features" but it turned out that all the filtering could not be turned off.
I'm not with that ISP any more.
Re: (Score:3, Interesting)
Re: (Score:2)
I wish that my ISP's filter would find Snopes candidates from my mother-in-law and relegate them to the bit bucket. But it never learns. Bayesian filter? No.... it learns only when a user spanks it.
Re:"Postini"? (Score:4, Funny)
Now, instead of getting emails from her with "I wonder if this is true. It sounds so amazing!", I get "I already checked Snopes, and while this one isn't real, it makes for a good story!" MLIA.
Re: (Score:2)
[smacks forehead, mumbles something]
Re: (Score:2)
--
Mum,
<wife>'s just come in and told me that you forwarded the email that you got from "Terry" to her.
I'm not going to have a go at you or anything, you didn't send it to me, as I requested =) But, if you are going to do stuff like that, let me show you the best way to do it so that you don't do to others what I don't want doing to me.
When you get the email that you want to forward it will have loads of addresses in it already, which don't need t
Re:"Postini"? (Score:5, Interesting)
Google is the only mail service that I know of who still just won't accept my emails. They make it very difficult to contact them. There is a form buried somewhere in their help system, but it says that they won't respond unless they need additional info from you, which leads me to believe that they never actually read anything submitted through that form. (I have tried a few times.) They also specifically say they don't take whitelist requests. I have SPF records, I have correct reverse DNS, I'm not on any blacklists, etc.
This means when I send emails to my friends who use Gmail, or companies who use Postini, I get blocked without cause. Then I have to use a different server. It's kind of annoying.
(Why do I use my own email server? Because I can. This is
Re:"Postini"? (Score:5, Interesting)
I had a similar [wordtothewise.com] experience [slashdot.org]; I run my own mail server, send no bulk mail whatsoever, and both Postini and GMail independently decided I was a spammer. No DNSBLs had me listed, ReturnPath was happy, etc. Meanwhile, I was blocked from sending mail to my lawyer, my financial advisor, my chiropractor, etc., all of whom turned out to be downstream from Google. Despite Google's claims that the customer is in full control of filtering, none of them were able to get at my e-mail without getting their sysadmins involved - which often required discovering that they had sysadmins at all.
Worse, Postini's spam filtering takes its own output as input. Once it's scored a message of yours as spam, future messages will be more likely to score as spam - which of course makes any subsequent messages even more likely to score as spam. Brilliant. At one point, my spam score from a triple-signed (SPF/DK/DKIM) server was 98 out of a possible 100.
Google's philosophy of "we don't do it unless we can automate it" works horribly when it comes to customer service. There's no feedback loop, no whitelisting, no channels, no nothing. It's SPEWS all over again, or perhaps the Kafka International Airport [theonion.com].
But Google has no reason to worry about false positives; the more messages they call spam, the more spam they can say they blocked. Perverse incentives.
I have a guess... (Score:2)
You probably misspelled your mail server's "user agent" string as postfux ;-)
Re:"Postini"? (Score:5, Interesting)
For what it's worth, Gmail has been just the opposite for me. It's Yahoo and AOL which randomly decide to block me -- sometimes with some cause, sometimes just because it's on a residential connection.
Yet Gmail never so much as greylists me -- everything goes straight through, every time.
Re: (Score:2)
Ditto. I'm not saying the gp is wrong about his experience. But in my own case, I've found that both Yahoo and AOL just stopped accepting email from me ca. 2008. I run my own server on my own domain (not via a residential connection). In Yahoo's case, it was fairly easy to fix; I filled out a form, and after a while Yahoo users started receiving my emails again. With AOL, I haven't looked into
Re: (Score:3, Informative)
I helped a customer get off AOL's blacklist a couple months ago.
It was a straightforward process with an immediate automated reply.
In order to complete the process you must be able to receive an email at abuse@, postmaster@, or the technical or administrative contact for your domain.
The final email was from a human. It was completed the day following.
Re:"Postini" - why I use my own mail server (Score:2)
Re: (Score:2)
Re: (Score:3, Interesting)
Re: (Score:3, Insightful)
Tell him to look up the definition of "whitelist".
My guess is the system runs much more optimally when your entire address book is whitelisted.
Re:"Postini"? (Score:4, Interesting)
Have you noticed? GMail gives one no way at all to sort the captured spam. Since I still endure false positives from the system and there is NO way to disable or bypass it, having means to sort all of it by From:, To:, and other criteria would make it easier to identify the false positives and rescue them from the trash bin.
Well, I'll take that back, in part: that applies to the Webmail interface, but if one uses IMAP with a local IMAP client, then the spam folder could be subscribed and sorted within the client. God only knows how GMail's system interprets the dragging of a message from Spam to Inbox via IMAP: does that automatically whitelist that sender in the future, or do I have to still log into the Web site and identify it as Not Spam manually?
Re:"Postini"? (Score:4, Insightful)
"there is NO way to disable or bypass it"
Have you looked into filters? They added an option to "Never send it to Spam" about a year ago. You can create custom white lists with this, or just include everyone in the filter and totally bypass the spam filter.
Re: (Score:2)
Have you noticed? GMail gives one no way at all to sort the captured spam. Since I still endure false positives from the system and there is NO way to disable or bypass it, having means to sort all of it by From:, To:, and other criteria would make it easier to identify the false positives and rescue them from the trash bin.
I haven't noticed - filters make this pretty trivial:
in:spam from:blah
Re: (Score:3, Insightful)
That's irrelevant: you'd have to KNOW who it was from in order to employ a SEARCH like that. That's not useful at all when you aren't looking for something specific.
Re: (Score:3, Insightful)
having means to sort all of it by From:, To:, and other criteria would make it easier to identify the false positives
Now you say:
That's irrelevant: you'd have to KNOW who it was from in order to employ a SEARCH like that. That's not useful at all when you aren't looking for something specific.
If you don't know who it's from, to ,etc how is sorting by these fields going to help you filter out false positives? Since you now posit that you don't know who it's from, then that won't give you any information that you can use. In addition, you don't need to be searching for something specific to use the filters that are available.
Re: (Score:2)
You've never done this, so you haven't thought it through well enough to recognize why it would be useful. A big part of the benefit comes from being able to quickly exclude and delete what are obviously not false positives... thus quickly winnowing the list to something manageable to find those that actually might be. This is possible because, for instance:
Re: (Score:2)
Take a deep breath dude, was trying to give you info that I thought might help. Now it seems that you've presented a moving target.
I don't know what the GP is thinking, but here's my thoughts...
At first I thought your help was awesome. But then I realized that being able to filter on From: is not the same as being able to sort on From:
For example...
Joe sends me email which gmail sends to spam. Since I know gmail does this frequently with his emails and that Joe sends me email once a day, I can easily use the filter to find his message in spam.
Now comes the problem. Sally sends me an email which gmail sends to spam. She doesn't normally s
Re: (Score:2)
Whose brain-cell-murdering Kool-Aid have you been sipping, hmmm?
Re: (Score:2)
I would like spam sorting also; I never empty my spam folder without reviewing it for false positives (which I rarely get, but still), and it would be very handy to sort it by subject, so that I can bypass duplicate spams all at once.
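Until the web interface can sort, the grouping can be done client-side once the spam folder has been pulled down (over IMAP, say). This Python sketch assumes the messages have already been reduced to hypothetical (sender, subject) pairs. It does not talk to Gmail at all.

```python
from collections import defaultdict


def group_by_subject(messages):
    """Group (sender, subject) pairs by normalized subject.

    Returns a list of (subject, [senders]) with the biggest groups first,
    so runs of duplicate spam can be skipped in one go.
    """
    groups = defaultdict(list)
    for sender, subject in messages:
        groups[subject.strip().lower()].append(sender)
    # Largest group first, ties broken alphabetically by subject.
    return sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
```

Whatever ends up in the single-message groups at the bottom is the short list worth eyeballing for false positives.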
Re: (Score:2)
Who uses their ISP's email/webspace anymore anyway? It makes switching ISPs much more difficult, unless you don't mind breaking all your old addresses or feeling stuck with your current ISP.
Re: (Score:2)
> Who uses their ISP's email/webspace anymore anyway?
Not me any more, and that was one of the reasons. I pay Newsguy for Usenet and email service and also have another address provided by friends. All I want CenturyTel to do is handle packets.
Yawn... (Score:2)
...has a quick look and goes back to catching up with news on the MailScanner mailing list.
Postini may or may not work, (Score:5, Funny)
but what I really want to tell you is that I've inherited a great deal of money and I need someone to help me transfer it to the US. I live in Nigeria. You all seem to be great gentleman, so I will pay appropiately.
Contact me.
Re: (Score:3, Funny)
You must be new here.
Re: (Score:2)
Hi
I like money :)
Please reply to this message with your contact information
better than their phishing 'algorithm' (Score:2, Funny)
part of gmails phishing filter seems to do this
if (hyperlink in email ends in ".exe")
{
    isphishing = true;
}
Even if this is an email from someone in your whitelist and is merely quoting text from your own message you sent them.
And there seems to be NO way to prevent a message with .exe in it to be marked this way :(
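For comparison, here is a guess at that rule written out in Python, alongside a variant that honors a whitelist. Both functions are hypothetical. Nobody outside Google knows what the phishing filter really checks, and the regex is a deliberately crude link matcher.

```python
import re

# Crude matcher for http/https links; good enough for a sketch.
LINK_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def naive_is_phishing(body: str) -> bool:
    """Flag the message if any link in the body ends in .exe."""
    for link in LINK_RE.findall(body):
        # Drop trailing punctuation that often sticks to quoted links.
        if link.lower().rstrip(".,)>").endswith(".exe"):
            return True
    return False


def whitelist_aware_is_phishing(body: str, sender: str, whitelist: set) -> bool:
    """Same rule, but skipped for trusted senders.

    `whitelist` is assumed to hold lowercase addresses.
    """
    if sender.lower() in whitelist:
        return False
    return naive_is_phishing(body)
```

The second function is the behavior the complaint above is asking for: a trusted sender quoting your own link back to you would no longer be flagged.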
Toughest spam (Score:5, Funny)
It was me! (Score:2)
So, the product is still great. Tech support has gone downhill though. Anyone who has tried to deal with Google tech support for anything will know how it feels..
Re: (Score:3, Interesting)
I signed up with Postini just as it was acquired by Google. Before that I'd used SpamSoap, which worked great but was declining in effectiveness (more false negs) but not in price ($30 per month is a lot for a small business). Postini and then Google were far more reasonable at just $3 per year per address (for the less-flexible controls). I get maybe one or two delivered spam per week, usually when I also see a corresponding spike in filtered spam which indicates a new attack of some kind. I get only one o
Praise Gmail (Score:3, Interesting)
But what about spam from "me"? (Score:3, Interesting)
Re: (Score:2)
I believe e-mail spoofing (where the spammers spoof the header to make it look like it comes from you) is completely different than sending e-mails to yourself, and gmail knows this. That said, when is the last time a spoofed e-mail actually made it to your inbox?
Re: (Score:2)
Weird.
This happens to me a lot on my own server (where I don't put very high weight on SPF records), but I guess I assumed that Gmail was better controlled than that.
That said: Whatever Google is doing, seems to be working. I haven't had a legitimate email tagged as spam by them in years, and my spam folder (which used to get hundreds of spams daily) has shrunk to having only a dozen or so in the past month.
Re: (Score:2)
strange, this never happens to me in google. All that spam that comes from "me" is correctly tagged as spam, my mail that comes from me (via google) is never tagged wrong.
The only spam that comes through is some weird Russian spam every other day or so.
Re:But what about spam from "me"? (Score:5, Insightful)
Keep in mind:
It's perfectly legitimate (and common) for non-webmail users to have their outgoing server be their local ISP. So if google did what you're suggesting, all those people that use an IMAP client to receive their gmail, and send via their ISP wouldn't be able to send to other gmail users
Re: (Score:2)
Keep in mind:
It's perfectly legitimate (and common) for non-webmail users to have their outgoing server be their local ISP. So if google did what you're suggesting, all those people that use an IMAP client to receive their gmail, and send via their ISP wouldn't be able to send to other gmail users
Also keep in mind, Google is actively marketing their email services to ISPs... many ISPs are using GMail for their email services.
Mine switched from an internal co-located email service to completely outsourced Gmail based, ISP branded email solution in less than a month. They lost a lot of control - but saved a metric shit-ton of cash in the process.
Re: (Score:2)
It's perfectly legitimate (and common) for non-webmail users to have their outgoing server be their local ISP. So if google did what you're suggesting, all those people that use an IMAP client to receive their gmail, and send via their ISP wouldn't be able to send to other gmail users
This does not make it legitimate (though it may be common) to forge the From address line. They should use Reply To if they want to send From another address/mail server, and have replies go to their gmail account.
If you want to send with the correct From header, you should be using secured email and sending via the gmail servers (SMTP is the protocol used in any case). No ISPs I know of block the ports for secured email, so you can easily send via the google servers.
Forged From headers are a big problem fo
Re: (Score:2)
So why on earth don't you sign your mail. Some really clever people have come up with some pretty good ways of proving your electronic identity.
Alternatively, why not tell your system where your mail comes from and then reject anything that doesn't come from those sources.
It's not that hard to persuade your own mail system what mail is really from you and not a fake.
There's no need to lose functionality, you just have to think around the problem
.
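The "tell your system where your mail comes from" idea can be sketched in a few lines of Python. The network list, addresses, and function names below are placeholders (documentation IP ranges), and a real setup would more likely publish the allowed sources via SPF in DNS than hard-code them. This sketch does no DNS lookups.

```python
import ipaddress

# Hypothetical networks that are allowed to send mail claiming to be "me".
MY_OUTBOUND_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def claims_to_be_me(from_address: str, my_address: str) -> bool:
    """True if the From: address is my own address."""
    return from_address.strip().lower() == my_address.lower()


def should_reject(from_address: str, connecting_ip: str, my_address: str) -> bool:
    """Reject mail that claims to be from me but arrives from elsewhere."""
    if not claims_to_be_me(from_address, my_address):
        return False  # Not claiming my identity; other filters decide.
    ip = ipaddress.ip_address(connecting_ip)
    return not any(ip in net for net in MY_OUTBOUND_NETWORKS)
```

As the parent posts point out, the catch is keeping that list complete: mail you legitimately send through an ISP relay that is not on the list would be rejected too.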
McAfee (Score:5, Interesting)
Re: (Score:2)
Yeh, that's information that would have been useful yesterday. Now I feel dirty.
Look at it this way:
At least it's not Norton !
Re: (Score:2)
Not really. Think of it like beta testers, or even better - neutral sensors. They aren't paying for the service, so they can't call your helpdesk and bitch about its effectiveness or about that one spam message that really offended them.
McAfee gets millions of email accounts to monitor and use as sensors for new spam, allowing them to gather that data, crunch it, and redistribute the new spam identification data to their own paying customers, including gmail.
This allows them to be more accurate and have bett
Postini works (Score:2)
In my humble and largely anecdotal experience, Postini works well. We send out e-mail that can often be flagged as SPAM when we perform penetration testing, and Postini seems to be the toughest to get around. We see in-house devices such as IronMail, and outsourced services such as MXLogic and FrontBridge/hosted Exchange, but Postini seems to do the best at stopping illegitimate messages. The company I work for uses it as well, and logging into my Postini inbox I see a lot of spam but no false positive
Gmail and Me (Score:2)
I've logged in to it four times, and I deleted something like 2000 spam messages.
I'll continue to not use it, thanks.
Re: (Score:2)
Every gmail account I've opened has been flooded with spam. One I never sent a single message from.
Re: (Score:3, Insightful)
Did you have an easy to guess username?
Just because you didn't send email from "robogun@gmail.com" doesn't mean your robogun@att.net isn't on a spam list somewhere. How do you increase the size of a spam list exponentially? strip all the domains from the addresses and find common names... then generate one email address for each domain you want to hit.
Ta-da... spam email sent to accounts that were never used. This could indicate that google's directory harvest attack identification methods need some fine tu
Great (Score:2)
Now apply this technology to Google Groups.
Yeah, I know it's usenet, but they could apply it to their web interface (see comp.lang.c++ for a sample of what it has to deal with).
SPAM volume patterns (Score:3, Informative)
What I find telling is how my SPAM volume rises and falls according to the American holidays. Whenever the Yanks have a holiday, SPAM drops to a trickle.
That to me is a clear indication that most SPAM originates in the US even though it mostly gets relayed through Asian proxies.
Incoming spam isn't the problem, outgoing spam is (Score:2)
Spam and scams originating from Gmail have been so bad lately that several clients of mine have actually requested that I block gmail entirely. I have been tempted to do so with my home account as well since it's rendered craigslist all but unusable. When do they plan to address that...but then what could they really do??
No spam at all (Score:2)
I have had two gmail accounts for a couple years now. One of them has my name on it (in the form of: "firstname.lastname@gmail.com") and the other is a nick (not the same as my /. one) that I often use in forums/games. Curiously enough, neither of these accounts gets any spam at all. And by this I don't mean that the spam filters are effective, because there is no spam to be filtered. I can understand that my name based account doesn't get spam, after all I rarely give it out to anyone except people I know in per
GMail Spam Filter != Postini (Score:2, Informative)
From the article:
"Google's Gmail antispam efforts are separate from those of Postini, which Google acquired two years ago, although it follows similar computerized operations and the teams have started to integrate the processes."
I've had email at an ISP that uses Postini, and I have email at Gmail. IMHO, Gmail > Postini.
Is Google complicit in spam? (Score:2)
I find it very strange that my Gmail account received so much spam long before I ever started actively using it. It's not like my e-mail address is made up of one or two words. I cannot for the life of me understand how anyone would possibly guess my e-mail address (two letters plus an uncommon word). I'm guessing someone got a hold of their user list. Anyhow, their spam filter is fairly accurate.
Gmail spam (Score:2)
my email spam prevention system (Score:2)
Barking up the wrong tree... (Score:2)
They have the resources, they should fight the war the right way - by going after the people who sponsor spam. They are electronically reading our gmail email, they can see the headers. They know where the spam comes from, and when. They know what domains are being sp
Summary... (Score:2)
...is misleading. New summary:
Bayesian filtering.
NEXT PLEASE.
Most decent-sized companies (Score:2)
...are better off doing their own solution using a combination of sendmail, mimedefang, spamassassin, and greylisting. If you are big enough to 'need' postini, you likely have a staff that can do it better themselves using open tools and tuning that solution to your particular environment. But nowadays, nobody wants to hire competent staff, it seems.
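Of the pieces listed, greylisting is the simplest to show in code. The idea is to temporarily refuse the first delivery attempt from an unknown (client IP, sender, recipient) triplet and accept a retry after a delay, on the theory that real mail servers retry and much spamware does not. The Python below is a bare-bones sketch with made-up names and an assumed 5-minute delay, not a drop-in milter.

```python
class Greylist:
    """In-memory greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay_seconds: int = 300):
        self.min_delay = min_delay_seconds
        self.first_seen = {}  # triplet -> time of first delivery attempt

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        """Return 'accept' or 'tempfail' for an attempt at time `now` (seconds)."""
        triplet = (client_ip, sender.lower(), recipient.lower())
        # Remember when this triplet first showed up.
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.min_delay:
            return "accept"
        return "tempfail"  # The MTA would answer with a 4xx "try again later".
```

A production setup would also expire old entries and whitelist known-good senders, which is part of the tuning work the comment above has in mind.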
Re: (Score:2)
One of my complaints about Postini (and whatever it is that CenturyTel uses) was that the "virus" filters cannot be turned off. I have no Microsoft or Apple software.
Re: (Score:3, Insightful)
As an email administrator - I wouldn't give a user the ability to disable virus filtration on their email account - even if I knew they weren't a direct threat to any known virii. Too many stupid people out there know how to use the FWD button.
I know what you're saying, but since you're probably the smartest user out of the tens of thousands that use your email server - they're not likely to give you a one-off option.
Re:now am worried !! (Score:5, Insightful)
150 milliseconds sounds fast, but equates to only 7 messages per second.
Sure that may be faster, presuming it's a deep intensive scan, than what one can do on their home PC, and yes Google has zillions of boxes ... but anyways, my point is that 7 messages per second illustrates the very real, high cost of dealing with spam; scanning of just a million messages, which is a fraction of the spam volume, at 7 messages per second, takes well over a day of computer time.
Ron
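The arithmetic is easy to check. This small Python sketch assumes messages are scanned strictly one at a time on a single machine, which the article does not actually say. As other posters note, the figure is more plausibly per server or per scanning thread.

```python
def messages_per_second(ms_per_message: float) -> float:
    """Throughput of one scanner handling messages back to back."""
    return 1000.0 / ms_per_message


def hours_to_scan(message_count: int, ms_per_message: float) -> float:
    """Total scanner time, in hours, for the given number of messages."""
    return message_count * ms_per_message / 1000.0 / 3600.0


rate = messages_per_second(150)         # about 6.7 messages per second
hours = hours_to_scan(1_000_000, 150)   # about 41.7 hours for a million messages
print(f"{rate:.1f} msg/s, {hours:.1f} hours per million messages")
```

So "about 7 per second" and "well over a day per million messages" both hold up, under that single-scanner assumption.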
Re: (Score:2)
No, it ought to trip the "STOP TYPING IN ALL CAPS! IT'S LIKE SHOUTING!!" filter and prevent you from posting it.