Google's Next Challenge, Spam Results
krou writes "The Guardian's tech blog is running an interesting piece on Google's next big challenge, which is dealing with the spammers it helped create. 'Google is the 900-pound gorilla of search, with around 90% of the market (excluding China and Russia), and there's an entire industry which has grown up specifically around tickling the gorilla to make it happy and enrich the ticklers.' They quote Paul Kedrosky who notes that 'Google has become a snake that too readily consumes its own keyword tail. Identify some words that show up in profitable searches — from appliances, to mesothelioma suits, to kayak lessons — churn out content cheaply and regularly, and you're done. On the web, no-one knows you're a content-grinder.' Whether searching for reviews, products, businesses, or even conducting academic research, scraper sites are ranking higher than original content. The article speculates that Google may try to fix the problem but, from Google's perspective, most of these types of sites use AdSense ads, and generate revenue for Google (89% of clicks come from the first page of results), so Google may not have an incentive to change things too much. Alternatively, people could stop using Google, 'because its search is damn well broken... The question is whether it would be visible enough — that is, whether enough people would do it — that it would show up on Google's radar and be made a priority.'"
Fix the spam filter on blogger... (Score:2)
They could also fix the spam filter they've added to Blogger that you can't disable. It's hilarious to see legitimate posts get flagged and hidden while Chinese clothing spammers and porn spam get through.
mesothelioma (Score:3, Interesting)
This puzzled me: "profitable searches — from appliances, to mesothelioma suits, to kayak lessons"
I'm thinking, "Mesothelioma suits? What's that, a protective suit you wear when you're working around asbestos?"
Before Google came up I realized he was talking about lawsuits. Geez, lawyers and businessmen talking sure confuse this old nerd sometimes. To a businessman, "suit" is what lawyers bring; to a nerd, it's usually protective gear.
If you go talking about RAM here, I'm going to think "memory". If you
Re: (Score:2)
Then I'm the kind of punster who, in the computer game I run, makes it so that a "law suit" is also a kind of clothing, but you take psychic damage from wearing it.
Re: (Score:3)
i remember an article, awhile back, that said the most expensive word in google's adword program (where advertisers pay a biddable amount each time someone clicks on their ad when someone searches for that word) was not some sex-related term, not some date site term, but... drum roll please... mesothelioma
if you searched for that word, and clicked on an ad next to the search results, you were costing that advertiser something like $10 just for that click
jeepers
Re: (Score:2)
mesothelioma
if you searched for that word, and clicked on an ad next to the search results, you were costing that advertiser something like $10 just for that click
I just went and clicked on 9 mesothelioma text ads.
Any other keywords that are stupidly expensive?
Re: (Score:2)
methinks slashdot is about to make it a very expensive day...
Re: (Score:2)
Youthinks correctly. I've now gone and done it myself, plan to do it for all my PCs and may keep it going for a week.
Re: (Score:3)
LOL
someone needs to introduce you to 4chan and anonymous
Broken? (Score:5, Insightful)
Re: (Score:2)
then again, you get my example from today:
searching for Cub Scout stencils for an upcoming project. google search for Cub Scout Stencils, Cub Scout Wolf logo, Cub scout wolf head logo, etc.
Eventually found what I was looking for, but the first page had 7 out of the 10 results as something like:
www.kompai. com/hMg9Yaoe/
OR
mitchelljm. us/lz-printable-cub-cadet-stencil.htm
the first 3-4 pages of each search were full of those.
Now, just 3 hours later as I went back to get the form of those URLs, they're almost al
I was thinking the same thing (Score:2)
Drug Interaction info is unfindable (Score:4, Informative)
If you want to get information about how Drug $A interacts with Drug $B, Google's pretty useless - you mostly get sites that want to sell you drugs and list $A and $B, or at best lists of medical papers, usually scraped by reformatters, which have some paper on $A and another paper on $B. (Of course, if you want information on how Drug $A interacts with Drug $$V, then you're totally out of luck :-)
I've given up on Google and use Wikipedia for any medical information.
Re: (Score:2)
Does + still work? I use - and quotes, but + seems to be ignored.
Re: (Score:3)
It largely ignores quotes too now. Which pisses me off to no end.
I'm sick of the time relevance problem too. If I type in something about a current thing with previous versions, topics about the previous version push the current one way down in the results. Try searching for a common Ubuntu problem in the current version, for example. You have to narrow it down "in the last year" EVERY single damn time you search. Do I really care about configuring the screensaver in Kubuntu 5.5, when 99% of users are
Re: (Score:2)
been using google since beta, have 7 google accounts (that i'm working to reduce down to 2), have my own google apps domain, google voice is my primary number, chrome is my browser, perform dozens of searches a day
and had a workaround for an annoying problem ... google's use of synonyms makes it hard to search for something specific, appending "&nfpr=1" to a query disables it. even have a keyword search set up to automatically append it
and after all this time i learn that the "+" operator does exactly w
Re:Broken? (Score:5, Informative)
Re: (Score:2)
Duck Duck Go
? Duck Duck Go is a game featuring rubber duckies, not a search engine.
Re: (Score:2)
I find Blekko.com to be decent.
Quoted from a techcrunch article [techcrunch.com]
In addition to providing regular search capabilities like Google’s, Blekko allows you to define what it calls “slashtags” and filter the information you retrieve according to your own criteria. Slashtags are mostly human-curated sets of websites built around a specific topic, such as health, finance, sports, tech, and colleges. So if you are looking for information about swine flu, you can add “/health” to your query and search only the top 70 or so relevant health sites rather than tens of thousands spam sites. Blekko crowdsources the editorial judgment for what should and should not be in a slashtag, as Wikipedia does. One Blekko user created a slashtag for 2100 college websites. So anyone can do a targeted search for all the schools offering courses in molecular biology, for example. Most searches are like this—they can be restricted to a few thousand relevant sites. The results become much more relevant and trustworthy when you can filter out all the garbage.
Re: (Score:3)
I do try the other search engines every now and then, but even when searching for something rather obscure Google returns more relevant results than the others.
Seriously? I haven't noticed any limitation in the amount of ground that blekko.com [blekko.com] or duckduck.go [duckduckgo] cover... and it really doesn't matter if there are a bajillion pages you index when two thirds of them are spam pages with content ripped from wikipedia.
Also, when I search for something really obscure, google always wants to push me in the most obvi
Re: (Score:2)
Some other search engine will eventually come along. They will provide a better service, and Google will swallow them up like a ripe, juicy tomato.
FTFY
Re: (Score:2, Insightful)
If someone does, who will find out about it first: users or Google? If Google finds out first, they just have to stop the revenue-generating pollution to a degree that they remain best, and no one will ever know that the newcomer had briefly been better.
For all its "brokenness" Google just has to remain best and they'll win. And if that brokenness is a result of allowing noise because it makes them money, rather than a technology limitation, then it's something they have control of. I wouldn't bet on Goo
Re: (Score:2)
FWIW duckduckgo [duckduckgo.com] does a better job from my experience than Google does.
Uh (Score:4, Insightful)
Re: (Score:3)
Why is slashdot providing us with opinions?
Because it's a for-profit gossip rag, and more gossip = more ad views. Thankfully you can normally wait 5 minutes then look at the comments and see 10 posts along the lines of "here's why the article is bull and the slashdot editor is a tard" and get some links to actually informative sites.
What really confuses me is why the editors seem to reject submissions with links to source data, and approve submissions that come in hours or days later linking to some third party's useless opinion blog o_O
Companies that serve the almighty dollar (Score:2)
are left in shock and awe as soon as consumers find a better option and flock away in huge numbers. There will be no customer loyalty for Google if we continue to get served up crap. A bunch of clicks now may see revenue for Google now, but they'll feel the bottom line fall out from under them like a hangman's trap when 60%-80% switch to another search engine that focuses more on the science of search than the profitability of it (like Google used to be).
Re: (Score:2)
Anyone with a brain can recognize affiliate links or overhyped profiteering or just plain lies that generally accompany the spam/fake reviews/miracle product sites. Those who can't are going to lose their money to email spam, or offline advertisements, or late night informercia
Mixed metaphor alert (Score:5, Funny)
So which is it? Is Google a gorilla or a snake? Make your mind up!
Re: (Score:2)
Coming soon to a theater near you, Samuel L. Jackson stars in....GORILLASNAKES on a plane!
Re:Mixed metaphor alert (Score:5, Funny)
Re:Mixed metaphor alert (Score:5, Funny)
Re:Mixed metaphor alert (Score:4, Funny)
train has sailed??
wow. you can lead a whore to water but you can't make her think.
Re: (Score:2)
If we hit that bullseye the rest of the dominoes will fall like a house of cards. Checkmate.
Re: (Score:2)
A mixed metaphor is if you mix your metaphors within the same sentence or metaphor construction.
Completely different sentences with obviously separate metaphors -- especially when they're quotes from different people -- is just called writing.
Re: (Score:2)
Exactly, it is distracting. First you have gorillas getting tickled, then you have snakes eating their own tails. I was half expecting Freud to pop up at the end.
There are some sick puppies out there.
Re: (Score:2)
Snakorilla
Dear Jeebus, please don't let any SyFy execs read this post.
Re: Is Google a gorilla or a snake? (Score:2)
Manimal.
Re: (Score:2)
We'll just let winter roll in and render the problem academic.
They are still far better than what came before (Score:3)
The thing is, though, that the other search engines before Google were terrible in this regard.
Before Google the SEO business was rife with dodgy practices. It was only when Google showed that these dodgy practices were not going to help get to the top of their results that the SEO market grew up and started being more constructive for the web as a whole.
Before this they would just do whatever they could to game their clients' pages higher up the ranks, using whatever means they could, just to get their clients page hits. This was a nice easy metric that clients could easily track and understand with a minimum of technical knowledge. Their customer-to-visitor ratios might have been going down as the page hits went up, but this was very hard to track before the whole web metric industry grew up. One might even say that the web metric industry owes much to Google in this regard, as now any money spent on SEO and advertising usually needs to be justified by also spending money on tracking.
Re: (Score:3)
The article raises this point in the second paragraph: 'In fact, the problem that plagued the first generation of search engines such as Altavista now seems to be gaining traction on Google, which outdistanced those earlier rivals precisely because it dumped the spam so effectively.'
Re: (Score:2)
OTOH, were Google to implement some way of handling software version numbers intelligently, it would make it a much more useful service.
Waddaya mean, next challenge? (Score:2)
The whole reason Google rose to dominance was that 10 years ago it was doing a far better job of hiding the spam results than its now-mostly-defunct major competitors. Since then, spammers, scammers, and pranksters have been trying to game the results, often with noticeable effects.
People change.... only for something better (Score:3, Interesting)
People won't change while there's nothing better to change to...
I still don't "see" these issues with Google that supposedly exist, and I know others who aren't as web savvy as me don't see them either. But if they DO exist, it's only when something better comes along that people will switch. I tried Bing..... and couldn't even get it to find Microsoft Security Essentials when searching for MSE, as it's normally known.
Re: (Score:2)
I tried Bing..... and couldn't even get it to find Microsoft Security Essentials when searching for MSE, as it's normally known.
Just tried it. Microsoft security essentials was the second result (after Micro Solutions Enterprises, whoever they might be, but they have the domain mse.com).
Re:People change.... only for something better (Score:5, Informative)
People won't change while there's nothing better to change to...
I see some nerds switching to http://duckduckgo.com/ [duckduckgo.com]
Re: (Score:3)
Wow, thanks! I have a few keywords that I use to test a search engine, and this one really is better and more relevant than google. I can't believe it, I've been waiting for this for years. Thanks!
Re:People change.... only for something better (Score:4, Insightful)
...
I still don't "see" these issues with Google that supposedly exist...
Never got expertSexChange.com (and the like) in your results? I get them frequently and it's annoying.
What scrapers? (Score:5, Interesting)
I don't have this problem - when I search for things on Google, I get relevant results from real pages. Either I regularly search for things that nobody scrapes, or there's actually some skill involved in getting relevant results that most people can't be bothered with.
The biggest problem I've had of late searching on Google is trying to find reviews of hardware and getting ninety billion pages trying to sell it to me with "Be the first person to review this product!" I need to find a different keyword on that.
Re: (Score:2)
It seems to me that with an ideal search engine it wouldn't require a special skill to get relevant spam-free results.
Re: (Score:2)
The difference is that I'm not getting spammy scraper sites, but actual retail outlets with ratings from Google. If they were scraper sites, I'd be much more upset about the whole deal, but as it stands, it's just my poor choice of keywords, not Google rating fake sites above real ones. Lack of user knowledge is a bigger problem than lack of proper search formulas.
Re: (Score:3)
I find exactly the opposite problem. I want to buy something and all I get is pages of reviews for the damn thing. Finding an actual shop is proving almost impossible, and it's not as if I'm looking for unusual items either.
Yeah, I have noticed all the extra cruft creeping into Google lately. It's definitely not as good as it was a few years back.
Re: (Score:2, Informative)
I've found the easiest way to get around this problem is to remove words from google search. I usually use the following form:
search: "product name" review -buy -first
It's not perfect but it's much better than the results w/o the eliminated words.
Re: (Score:3)
-"0 reviews" -"first to review"
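The exclusion trick in the two comments above can be wrapped in a small helper. This is only a sketch: `build_query` and `search_url` are hypothetical names, the base URL is an assumption for illustration, and the only operators used are the quoted phrase and the leading minus that the posters describe.

```python
from urllib.parse import urlencode


def build_query(phrase, include=(), exclude=()):
    """Return a search string: quoted phrase, extra terms, then -excluded terms."""
    parts = ['"{}"'.format(phrase)]
    parts.extend(include)
    for term in exclude:
        # A multi-word exclusion is quoted so the minus applies to the whole phrase.
        parts.append('-"{}"'.format(term) if " " in term else "-" + term)
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """URL-encode the query string; the base URL is an assumption."""
    return base + "?" + urlencode({"q": query})


query = build_query(
    "product name",
    include=["review"],
    exclude=["buy", "first", "0 reviews", "first to review"],
)
print(query)
print(search_url(query))
```

Saving the resulting URL pattern as a browser keyword search would avoid retyping the exclusions for every product.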
Re: (Score:2)
The biggest problem I've had of late searching on Google is trying to find reviews of hardware and getting ninety billion pages trying to sell it to me with "Be the first person to review this product!" I need to find a different keyword on that.
$10 says that these sites are mostly scrapers trying to get page hits. For instance, I noticed the article that DRAM prices are at their lowest yet. I currently have 8GB in my machine and wanted to figure out what the max was (my manual is at home). So I went and looked up my motherboard model number from the website that sold it to me. I then proceeded to google: "XFX MBN790IUL9 LGA 775" and probably 80% of the sites that came up were scraped. The manufacturer's website wasn't even in the first two pa
Normal people are doomed. (Score:3)
I see 3 problems with normal people using Google:
- Normal people can't tell the difference between a scam and an honest page. Their preferences are reversed: what you know to be the honest page of a hacker (people like Stallman, or the homepage of a project like MediaWiki) will look scary and dangerous to them, while they will love a page full of Flash ads that are probably trying to install spyware.
- Normal people are the target of spammers. If you search for technical problems with ocropus, you will see less spam tar
Re: (Score:2)
I wish. Try searching for old LCD, stepper, or TTL chip numbers, you'll get tons of "datasheetsRus.com" type sites that consist solely of part numbers and what look like Markov word chains.
Re: (Score:2)
Maybe the solution is to let the owner of the page define what kind of page it is with restricting keywords, for example, they can label their own site as a "shopping" site, or "product review" site, but NOT BOTH. Yes, yes, I know there are many shopping sites that DO have both, (such as Amazon) but perhaps we need to get them to commit. If you can simply pick up a bunch of semi-random keywords and plaster them into your meta tags regardless of your actual content, it sort of kills the whole point for the r
Re:What scrapers? (Score:5, Insightful)
Try doing any programming or system administration related search and *not* have at least one of the first five results populated with the following worthless domains:
- experts-exchange.com
- ehow.com
- about.com
- scribd.com
- ittoolbox.com
These sites don't necessarily scrape and repost content, but the content they do provide is invariably worthless or too difficult to navigate in order to be worth my time. In fact, I really don't mind mailing list, wikipedia, and StackOverflow scrapers because at least they provide useful content as long as you block all the ads and javascript by default.
Spammers have gotten pretty darn good at figuring out how to game Google and Google's countermeasures are increasingly ineffective. What Google really needs to do is place some control over the results returned in the user's hand. I would pay actual money to Google if they would let me customize search results as follows:
- A way to mark results as useful or not for the query entered, and refine later searches based on those
- Blacklist certain domains from showing up in my results, ever.
- Add content qualification (for example, prefer sites that have a certain text-to-graphics ratio)
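Until something like that exists, the second item on the wish list (a personal domain blacklist) is easy to approximate client-side. A minimal sketch, assuming the result URLs have already been obtained somehow; `BLOCKED`, `is_blocked` and `filter_results` are hypothetical names and the sample URLs are made up for illustration.

```python
from urllib.parse import urlparse

# Domains taken from the list in the comment above.
BLOCKED = {
    "experts-exchange.com",
    "ehow.com",
    "about.com",
    "scribd.com",
    "ittoolbox.com",
}


def is_blocked(url, blocked=BLOCKED):
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked)


def filter_results(urls, blocked=BLOCKED):
    """Drop every result that lives on a blocked domain, keeping the order."""
    return [u for u in urls if not is_blocked(u, blocked)]


results = [
    "http://www.experts-exchange.com/q/1",   # hypothetical sample URLs
    "http://stackoverflow.com/questions/1",
    "http://linux.about.com/od/commands/",
    "http://notabout.com/page",
]
print(filter_results(results))
```

Matching on the host suffix with a leading dot keeps `notabout.com` from being caught by the `about.com` entry.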
Predicted future news: (Score:2)
Google changes the rules to close old loopholes, spammers start gaming the new rules. The media is shocked that a massively profitable business category is capable of changing to meet the new challenges, unlike the *AA groups.
Google may fail, but it has a lot of momentum (Score:2)
I've noticed Google getting less and less effective all the time. I do a search, and 3/4 of the sites are 'fake' results that send me to ad pages with their own (totally useless) search results.
On important searches, I often spend 10-15 minutes tuning my query to help eliminate those sites so I can get to the real results.
Hey, Google - here's a free idea for you... do domain lookups on all your listings, and adjust PageRank based on who registers the domains. That should work for a few months before they
Re: (Score:2)
For me it's the same, but for a different reason: it's Google breaking their own searches. Specifically, how they silently replace your actual keywords with what they think you mean.
Rememb
Re: (Score:2)
Hey, Google - here's a free idea for you... do domain lookups on all your listings, and adjust PageRank based on who registers the domains. That should work for a few months before they start taking care to register each new domain with unique contact information.
I've wondered about that myself, it seems like most of the link farms and such are created by the same people. Or at least the design is identical to the rest. I assume I'm missing something, but it doesn't seem like it would be too hard to eliminate those sites given the static nature of them.
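For what it's worth, the grouping step of that idea is trivial once you have registration data. A rough sketch, assuming the (domain, registrant email) pairs have already been collected by some lookup that is not shown here; the function names and the `.example` records are hypothetical. As the parent notes, it stops working the moment each domain is registered with unique contact details.

```python
from collections import defaultdict


def group_by_registrant(records):
    """Map a normalized registrant email to the domains registered with it."""
    groups = defaultdict(list)
    for domain, email in records:
        groups[email.strip().lower()].append(domain)
    return groups


def suspicious_clusters(records, threshold=3):
    """Return registrants that own at least `threshold` of the listed domains."""
    return {
        email: sorted(domains)
        for email, domains in group_by_registrant(records).items()
        if len(domains) >= threshold
    }


# Hypothetical registration records, for illustration only.
records = [
    ("cheap-kayaks.example", "Owner@Example.org"),
    ("kayak-lessons.example", "owner@example.org"),
    ("kayak-deals.example", "owner@example.org "),
    ("real-shop.example", "shop@example.net"),
]
print(suspicious_clusters(records))
```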
Content or scrapers (Score:2)
Hardly limited to search-engine spamming... (Score:3)
Google is now responsible for a fairly large portion of the plain old spam I get. As in, their computers send it. Their latest gimmick is a new "feature" of Google Groups:
1. You can't send emailed abuse reports, they don't process those.
2. You have to go to the group's home page and click "Report This Group".
3. But you can't unless you're logged into a Google account, and your Google account is a member of the group. Otherwise, you just get the "you must be a member of this group to see this page" page.
4. You can directly navigate to groups.google.com/abuse/, but...
5. They don't do anything about spam reports anyway.
Similarly, they are apparently rapidly becoming a world leader in Usenet spam, because they don't have any particular objection to people posting spam. Or, if they do, it has not yet risen to the level of the kind of objection that results in doing something to stop it.
Google created... what? (Score:3)
"Google's next big challenge, which is dealing with the spammers it helped create."
Except, "No." Creating a profitable system does not mean one helped to create policy-infringers, law-breakers, and exploiters. If we accepted that irrationality, we could say that young, pretty boys and girls create child rapists, cars with windows spontaneously generate car thieves, and political systems create thieving dirty politicians. But that's not true.
Exploiters and criminals are created through a combination of their own high expectations, the lack of opportunity (by their standards), and their lack of ethical conviction. They only act opportunistically or impulsively on exploitable situations.
Duck Duck Go to the rescue (Score:2)
For the last couple of months I've been using Duck Duck Go [duckduckgo.com] with great results, and with much less spam than Google. Plus you get warm fuzzies from using it. Written in Perl on top of FreeBSD, respects your privacy [duckduckgo.com] and supports all manner of yummy syntax [duckduckgo.com].
Couple that with zero click info such as:
define sfumato [duckduckgo.com]
12 usd in eur [duckduckgo.com]
12 cm in inches [duckduckgo.com]
I find myself not missing Google in the slightest when it comes to search.
excluding russia and china? (Score:2)
http://en.wikipedia.org/wiki/World_population
russia and china make up 21.5% of the world's population - i am sure the 90% result will skew a lot more with these included.
Google has fixed some of the problems already... (Score:2)
Scraper sites outranking originals (Score:2)
I certainly hope so. The most galling thing for me, a prolific original content provider (if I say so myself!), is seeing these scraper sites out-ranking me with domains years younger than my own and no visible effort at SEO, black hat or otherwise. It would be nice if AdSense actually enforced their policy of not being allowed on content used without permission.
Re: (Score:2)
The most galling thing for me, a prolific original content provider (if I say so myself!), is seeing these scraper sites out-ranking me
A good robots.txt should take care of the ones that obey it, and it’s easy enough to detect robots that don’t obey it and IP-ban them.
E.g.
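A rough sketch of that approach, not the poster's own example: publish a robots.txt that disallows a path no human would visit, then flag any client in the access log that requests it anyway. The `/trap/` path, the log tuples and the helper names are assumptions for illustration; the rule checking uses the standard library's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is told to stay out of /trap/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def make_parser(text):
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def find_violators(parser, requests):
    """Return the IPs that requested a path the rules disallow for their agent.

    `requests` is an iterable of (ip, user_agent, path) tuples, e.g. taken
    from a web server access log.
    """
    violators = set()
    for ip, agent, path in requests:
        if not parser.can_fetch(agent, path):
            violators.add(ip)
    return violators


# Hypothetical log entries using documentation-only IP ranges.
log = [
    ("192.0.2.10", "ScraperBot/1.0", "/trap/page.html"),
    ("198.51.100.7", "GoodBot/2.0", "/index.html"),
    ("198.51.100.7", "GoodBot/2.0", "/articles/1.html"),
]
print(find_violators(make_parser(ROBOTS_TXT), log))
```

The resulting set can be fed to whatever firewall or server deny list is in use; scrapers that rotate IP addresses will still slip through.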
Re: (Score:2)
Like who? *Googles*
... ok, that's pretty neat, and a nifty idea whose time has likely come, but it's not a 'search engine for the world' like Google intends on being. It's something to follow, but not something that would supplant Google. Not anytime soon, at least.
Google already does some of this with their Maps, fetching local relevant results and so on. They're just talking about mapping to concepts rather than (always) physical locations.
"its search is damn well broken"? (Score:2)
I hadn’t noticed.
Re: (Score:2)
You may not have noticed but many others have and that includes me. I still use Google and will likely continue to do so but their search is getting increasingly unreliable.
You can feel the dishonesty (Score:2)
Google search results used to be reliable. You could refine your searches, based on previous search results, and progressively narrow your search until you get what you seek. Now you keep getting paid ad "search results."
I'm sick of this shit. Any ideas for an alternative search engine?
That works? (Score:2)
Google is the 900-pound gorilla of search, with around 90% of the market (excluding China and Russia)
You can do that? Well, in that case I'm the 900-pound gorilla of getting laid (excluding people who don't live in their mom's basement)
More options? (Score:2)
How about giving the searcher two more controls in 'Preferences'? First, a "radius" control, to set the discrimination for a tighter or looser match to the search criteria. That is, if I loosen the controls, I'll get more matches, but less accuracy. Conversely, I could eliminate anything that doesn't match, exactly, all terms.
Then, for some real fun, the Second control sets the "start page" to show results on, from 1 to 50, for example, or 'Random'. You may find y
people are fickle (Score:2)
Re: (Score:2, Interesting)
Yes, it's like people believe we're playing a game of chess: you move your pieces to get into a new, better position, and eventually you checkmate your opponent and you win. They believe one day spam will go away, or that we can do something to eliminate it and that something is "broken" because there is still spam.
It's more of a game of Go. Occasionally your opponent makes inroads into your territory, and you block them off. Occasionally you make inroads into their territory. Sometimes they make lif
Re:Playing the game changes the game (Score:5, Insightful)
Believe it or not it's easier to be a good ad company if you're also a good search company. One doesn't have to suffer to the benefit of the other. It's easier to sell a product if it's a good product.
Re: (Score:2)
Because they don't need a perfect search engine and it isn't their aim. They only need one that's good enough for advertisers. Often that corresponds to "good for users" but not necessarily. Take all the information about users they collect and sell to advertisers. That doesn't fit what I'd see as "good for users", but it's certainly "good for advertisers".
If you want to understand Google and what they do, you have to remember that they are an advertising company and us poor chumps trying to find stuff on t
Re:You're wrong. (Score:5, Insightful)
Re: (Score:3)
I think the point a lot of us are trying to make is that Google got where they are today by those means, but there's been a shift over the last couple years toward more short-term thinking. We all came to Google because they did what you said - they did search better than others.
The problem is that while not intentionally being evil (in my opinion) Google makes huge money from "Made for AdSense" type SEO garbage sites. If they took all those guys out, they'd make less. At the same time, they need to be care
Re: (Score:3)
There are two reasons for this, only one of which you can do anything about. The first is synonym matching. That's where you search for something like, I don't know, "website" and it will match "web page" as well. (I'm sure there are better examples). This is the one you can do something about, by putting +website, which forces it to appear as-is. You can also get this by putting a single word in quotes.
The second reason is that google matches not only text on the page, but also frequent text in anc