Cracking the Google Code... Under the GoogleScope 335

jglazer75 writes "From an analysis of the code behind Google's patents: "Google's sweeping changes confirm the search giant has launched a full-out assault against artificial link inflation & declared war against search engine spam in a continuing effort to provide the best search service in the world... and if you thought you cracked the Google Code and had Google all figured out ... guess again. ... In addition to evaluating and scoring web page content, the ranking of web pages is admittedly still influenced by the frequency of page or site updates. What's new and interesting is what Google takes into account in determining the freshness of a web page.""
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • Comment removed (Score:4, Insightful)

    by account_deleted ( 4530225 ) on Tuesday May 10, 2005 @12:31PM (#12489498)
    Comment removed based on user account deletion
  • by nganju ( 821034 ) on Tuesday May 10, 2005 @12:31PM (#12489506)

    The article is not written by a Google employee, nor did the author speak with anyone at Google. It's simply his analysis of the patent document filed by Google.

    Also, at the bottom of the article after the author's name, there's a link to some search optimization service's website.
  • by JVert ( 578547 ) <corganbilly@hotmai[ ]om ['l.c' in gap]> on Tuesday May 10, 2005 @12:34PM (#12489549) Journal
Doesn't seem like the best solution. This would work if you started from a clean slate, but spam pages are still out there and are still being clicked on. Not much you can do about that; I just hope it's not something silly like how much time you spend on a page. If I find a page that quickly answers my question, or at least part of it, and I click back for other links, I'd hate to think that that site would be marked as "spam".
  • by DrinkingIllini ( 842502 ) on Tuesday May 10, 2005 @12:35PM (#12489557)
Of course there is reason for concern; any company that gets too big and powerful becomes evil. Wal-Mart, Microsoft, Disney, Intel, Lucasfilm, they're all evil, and I'm sure they didn't set out to become that way; it's just the power of the dark side. Power corrupts, it's the nature of the beast.
  • by rm999 ( 775449 ) on Tuesday May 10, 2005 @12:37PM (#12489574)
Perhaps, or perhaps if Google changes its rankings enough, the SEOs' credibility will be destroyed (they will be seen as temporary and overpriced fixes).
  • by Veinor ( 871770 ) <veinorNO@SPAMgmail.com> on Tuesday May 10, 2005 @12:41PM (#12489615)
Almost any algorithm can be spoofed fairly easily: insert very small text that's the same color as the background, then change it whenever you want Google to think the page has been updated. The viewer can't tell the difference, but the source code changes. Or they could just use comments in JavaScript, or create JavaScript that never gets used.

    Also, a page with frames might get penalized since its content doesn't change, although the content of the frames may change frequently.
  • by Anonymous Coward on Tuesday May 10, 2005 @12:41PM (#12489616)
    There will always be >=11 sites wanting to be in the Top10
  • Re:Yes (Score:5, Insightful)

    by AKAImBatman ( 238306 ) * <akaimbatman@gmaYEATSil.com minus poet> on Tuesday May 10, 2005 @12:41PM (#12489619) Homepage Journal
Truthfully? The top results for "Tiger" should be furry creatures that eat meat and perform in Las Vegas.
  • by CrackedButter ( 646746 ) on Tuesday May 10, 2005 @12:48PM (#12489700) Homepage Journal
The WWF is a big organisation; they do work all over the world. They are not evil, are they? I don't see them killing pandas or anything.
  • Re:Yes (Score:1, Insightful)

    by Anonymous Coward on Tuesday May 10, 2005 @12:50PM (#12489736)
    you beat me to the same sentiment LOL.

Aside from this, TigerDirect gets more hits, has more pages containing the word "Tiger", is an older, more established concern than this OS name, is probably outbidding Apple for the keyword, and [1000 other reasons why Apple is not the center of the universe for most of us].

Just because you have an Apple computer, drive a VW, are probably overpaid, and have designer glasses, doesn't make your demographic more important than the other 99.99% of us, at least outside of your narcissistic little world.

    Get a clue and learn something about Google before criticizing their results. You might want to try searching for "Apple OS Tiger" as opposed to "Tiger". You really aren't as savvy as you think you are now are you?

    l8,
    AC

  • by Anonymous Coward on Tuesday May 10, 2005 @12:51PM (#12489740)

    This proxy "web-accelerator" thing really still has me freaked out.

    Freaked out in what sense? In the sense that Google can see what you are browsing (just like your ISP) or in the sense that some web applications break (because they were buggy to begin with)?

    Really, if something like the GWA "freaks you out", then you have a very low tolerance for the world in general and need to calm down a bit.

  • by CynicalGuy ( 866115 ) on Tuesday May 10, 2005 @12:52PM (#12489754)
    Is it the case that Google's search dominance is a direct result of it clinging onto a patent for PageRank ?

Their search dominance is a direct result of PageRank. That they have a patent on it prevents other companies from copying the idea or hiring their employees away (Microsoft is notorious for doing both of these things). So yes, the patent is important.

    Sorry kids, but patents and "Do no evil" are mutually incompatible concepts.

    You're retarded if you think that.
  • by Animats ( 122034 ) on Tuesday May 10, 2005 @12:53PM (#12489767) Homepage
    It's clear that Google is gearing up for a crackdown on search engine spamming. They've already started to kill off "link farms". They're checking spam blacklists. And they're not stopping there.

    Note that Google is now looking at domain ownership information. This may result in a much lower level of bogus information in domain registrations. It's probably a good idea to make sure that your domain registration information, business license, D&B rating, on-site contact info, and SSL certificates all match.

"Domain cloaking" will probably mean that you don't appear anywhere near the top in Google. So that's on the way out.
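The "make everything match" advice above can be sketched as a simple consistency check. This is a hypothetical illustration, not anything Google has published: the field names, sources, and scoring are invented for the example.

```python
# Hypothetical sketch: flag a domain whose registration records disagree.
# Sources and field names are illustrative, not Google's actual checks.

def consistency_score(records):
    """Fraction of sources agreeing on the registrant organization.

    `records` maps a source name (whois, SSL cert, on-site contact page)
    to the organization string it reports.
    """
    names = [org.strip().lower() for org in records.values()]
    most_common = max(set(names), key=names.count)
    return names.count(most_common) / len(names)

records = {
    "whois": "Example Widgets Inc",
    "ssl_cert": "Example Widgets Inc",
    "contact_page": "EW Holdings LLC",   # mismatch: lowers the score
}
print(consistency_score(records))  # 2 of 3 sources agree -> ~0.67
```

A real registrar-data check would of course normalize names far more aggressively; the point is only that mismatched records are cheap to detect.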

  • by Valar ( 167606 ) on Tuesday May 10, 2005 @01:00PM (#12489843)
Well, considering that it isn't .NET's fault that they didn't properly implement exception handling, I would say no. Also, combine this with the fact that the exception is caused simply by a server overload and you get a total non-issue.
  • by 4of12 ( 97621 ) on Tuesday May 10, 2005 @01:07PM (#12489904) Homepage Journal

Google has millions upon millions of click-history records on their search results that say what people are really looking for, as well as which results looked like good fodder for a first click.

    No one else has such a large database of what humans have actually picked.

    Such a click history and search term history asset is worth even more if it gets correlated with Evil Direct Marketing information from the cookie traders.

Although, it seems possible that large ISPs could also grab and analyze their members' Google interactions to figure out people's tastes, assuming such interactions remain unencrypted.

    I have to wonder how many companies with static IP addresses have, unbeknownst to them, built up extensive history logs at Google showing their search term preferences and click selections. If I were a technology startup with a hot idea to research I'd be a little more paranoid about something like that.

  • by baggins2002 ( 654972 ) on Tuesday May 10, 2005 @01:10PM (#12489932) Journal
    Companies need to start realizing that making money is about providing what customers want. Advertising is a great way of getting your name out, but only a good product or service will actually carry through. So in that frame of thinking, I highly recommend that companies:

Uhh, which world are you living in? Most companies have found that bigger profits can be made by convincing people that they want what the company has. And most customers find it easier to buy what they are told to buy.
    I like your world, but it's not the one I've been living in.
  • by Anonymous Coward on Tuesday May 10, 2005 @01:10PM (#12489935)
    Yes, they are out there, and some people do click and view them. The difference is:
    - They are visited much less often. Usually by mistake.
    - Few people follow links /out/ of them.

    Remember, Google's algorithms are based on statistics, not absolute decisions. If one person links to a page, that's not enough to believe it's a respectable page. A lot of people have to be interested in that page for it to become important.
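The statistical point above is the heart of the published PageRank idea: a page's score comes from the scores of the pages linking to it, so one lone inbound link counts for almost nothing. A minimal power-iteration sketch (the textbook algorithm, not Google's production system):

```python
# Textbook PageRank power iteration, for illustration only.
# Graph, damping factor, and iteration count are example choices.

def pagerank(links, damping=0.85, iterations=50):
    pages = set(links) | {p for targets in links.values() for p in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Everyone starts with the teleport share, then receives a slice
        # of each linking page's rank.
        new = {p: (1 - damping) / len(pages) for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new[dst] += damping * rank[src] / len(targets)
        rank = new
    return rank

links = {
    "a": ["popular"], "b": ["popular"], "c": ["popular"],
    "spam": ["shady"],        # a single lone link to a spam target
}
r = pagerank(links)
print(r["popular"] > r["shady"])  # many inbound links beat one -> True
```

Dangling pages leak some rank mass in this toy version, but the comparative result, many links beating one, is exactly the statistics-over-absolutes behavior the comment describes.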
  • by Anonymous Coward on Tuesday May 10, 2005 @01:11PM (#12489946)
    Please direct ALL google/applevertisments to mailto:cmdrtaco@slashdot.org [mailto] along with obligatory paypal payment.

    Thanks,
    Rob Malda
Yes, patents ARE a violation of Google's "do no evil" policy, as they give Google a monopoly on the good search engine algorithms.

    So they have monopoly. What's your point?

    When did a monopoly by google become ok?

Sometime around the 1790s, when the patent system was created in the US to give inventors a temporary and artificial monopoly on their inventions so as to encourage them to innovate. Google has not violated their policy of "do no evil" by properly utilizing the patent system, and it has had the intended side effect of preventing Microsoft from using their corporate muscle to crush Google.

but why support one company's attempts to cripple their opponents through legislation instead of competition?

    Why should a company with more money have a right to crush me with my own invention?

    The primary reason why the patent system sucks is that "invention" is far too loosely defined. Many patents get granted in cases where the patent office's own rules state that they should throw them out.
  • Sometimes search engine optimization isn't about making a hack site rank well. Sometimes it is about getting the traffic that a really nifty site deserves.

    Actually, pretty much everything you list falls under the issue of usability. Many of those options have lower usability for the user, and thus the search engine by extension.

    These companies don't need an SEO, they need to find a web designer that doesn't use Macromedia "tools".
  • Re:FAQs (Score:2, Insightful)

    by eluusive ( 642298 ) on Tuesday May 10, 2005 @01:33PM (#12490220)
    I say F-A-Q not FAQ. I pronounce IRC I-R-C not Irck. It makes me go irck when somebody says erck for IRC. I pronounce MySQL as My-S-Q-L not My Sequel. #$*#@$%&)(@#&%()*#@&%)(*#@% However, I do pronounce LASER as laser the word. Laser is no longer just an acronym.
  • Wait a second. (Score:2, Insightful)

    by ThePromenader ( 878501 ) on Tuesday May 10, 2005 @01:38PM (#12490283) Homepage Journal
Isn't this "page update frequency" hullabaloo a bit premature? If Google wants relevant results, I can only see update frequency being a minor factor in any page rank determination algorithm. For example: information sites (historical information, dictionaries, encyclopedias, collections, etc...) are often at once the most relevant (if info is what you're looking for) and the least updated sites. I can't really imagine the Oxford faculty meeting every week to decide new words for their dictionary to retain their www.oed.com pagerank. Just imagine what it would do to the English language : )

    Seriously, this little article is going to get Webmasters thinking a little more but I don't see anything to panic about. Not yet, anyways.
  • Re:SEO (Score:2, Insightful)

    by Anonymous Coward on Tuesday May 10, 2005 @01:49PM (#12490392)

    There is an art to SEO. Some of us employ spamming techniques that will force a website to the top of the list for a short period of time, and then become banned. To some people, this is desirable - such as when you know your product has a short lifespan.

Others like myself try to help businesses retool their websites to be search engine friendly. A lot of smaller businesses out there have websites that cram every bit of info on everything they do onto every page; that's bad. We show them how to break it into logical pieces, present it to the end user in a manner they will respond favorably to, AND build the site in a manner which will get crawled efficiently.

True SEO has two sides to it, the Optimization side and the Search side. You have to understand how your demographic searches for things. If you are selling women's jewelry online, you will build the site (SEO-wise) differently than you would a site that sells lab equipment. There are cultural differences in how these demographics search for things, and differences in the lifecycle of a sale to them. Some web developers can create sites that are easy to navigate and look great, yet they forget who they are targeting. Their content may be relevant, but it wouldn't spur Google to refer to it by the terms that the target would search with. Good SEO is about building a website with relevant content in the context that the target uses, not your perception of how it should be used.

I don't try to manipulate Google into thinking my websites are the authority on any subject. I try to build my sites to speak to the target demographic. When done properly, this drives traffic to your site because you knew what your target audience was going to put into the search box. Which, BTW, is what Google WANTS. They don't want page spam that will artificially inflate a page's ranking and dilute the accuracy of their product. They want to be able to determine what is relevant by the traffic a site gets, how many people link to it, and how often it is updated. For heavily used terms, there are some technical tricks that are employed to increase your ranking, but nothing outside what a good designer should want to do to bring attention to the most relevant content on a page.

The truth of the matter is, nothing on the internet is unpopular. If Furries exist, everything has a place. But speaking a demographic's language on the web is difficult, and quite often outside the scope of a web developer/copywriter.

  • by mejesster ( 813444 ) on Tuesday May 10, 2005 @01:55PM (#12490463)
    It seems nobody has asked the question: what if a spammer wants to lower the rank of more reputable companies? If a spammer link spams a site that is already fairly popular, couldn't it harm the page rank of a company that has nothing to do with the spam?
  • by Anonymous Coward on Tuesday May 10, 2005 @02:29PM (#12490835)
Even things like that are easily detectable. You could demand some minimum distance between the background and foreground color before text is deemed viewable, or you could take positioning into account. All of these are easy to check for programmatically. You could do more complex stuff, like a script that hides a section programmatically. If the script was complex, it might be difficult for an algorithm to determine whether the script would always hide the section; in general it might even be formally undecidable. However, in practice even this would be detectable, because the page has to render correctly on the end user's machine within a few seconds, so Google could simply have a "browser emulator" in their engine that lets the script run for a few seconds and then determines what is visible.
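The color-distance check described above fits in a few lines. The 100-unit threshold here is an arbitrary illustrative value, not anything Google has published:

```python
# Simple sketch of the color-distance check: text whose foreground is
# too close to the page background is treated as hidden. The threshold
# is an invented example value.

def is_hidden(fg, bg, min_distance=100):
    """True if the Euclidean RGB distance between foreground and
    background falls below the (illustrative) visibility threshold."""
    dist = sum((a - b) ** 2 for a, b in zip(fg, bg)) ** 0.5
    return dist < min_distance

white = (255, 255, 255)
print(is_hidden((250, 250, 250), white))  # near-white on white -> True
print(is_hidden((0, 0, 0), white))        # black on white -> False
```

Euclidean RGB distance is a crude proxy for perceived contrast (perceptual color spaces do better), but it already catches the white-on-white trick from the earlier comment.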
  • SEQUEL (Score:2, Insightful)

    by naph ( 590672 ) on Tuesday May 10, 2005 @03:08PM (#12491308) Homepage Journal

    "The history of SQL and relational databases traces back to E.F. Codd, an IBM researcher who first published an article on the relational database idea in June 1970. Codd's article started a flurry of research, including a major project at IBM. Part of this project was a database query language named SEQUEL, an acronym for Structured English Query Language. The name was later changed to SQL for legal reasons, but many people still pronounce it SEQUEL to this day."

    http://www.provue.com/proVUE/Fact_SQLServer.html [provue.com]

    just a bit of history.

  • by Juergen Kreileder ( 123582 ) <jk@blackdown.de> on Tuesday May 10, 2005 @03:12PM (#12491362) Homepage
    Exactly. I just found a whole page of this by searching 'web proxy' without the quotes and going down the search results to about page 6 or so. Interesting, when I reloaded the page all of that /url?sa= stuff was gone and the links were direct again.

    I guess it's a Google feature. They use the click-tracking URLs very sparingly. That makes it harder for SEOs to manipulate rankings that way.

It's also fairly simple to note that someone clicked a link and then immediately returned to the results list, by noting the "if-modified" request from the user's browser.

    A quick return would indicate that the page was not in fact what the user had requested.
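The "quick return" heuristic sketched above amounts to bounce detection over a click log. A minimal illustration, with the event format and the 5-second threshold invented for the example:

```python
# Hedged sketch of the "quick return" idea: a click followed by a return
# to the results page within a few seconds counts as a bounce.
# Timestamps and the threshold are invented for illustration.

BOUNCE_THRESHOLD = 5.0  # seconds; example value only

def bounces(events, threshold=BOUNCE_THRESHOLD):
    """events: list of (timestamp, action, url), where action is
    'click' (left the results page) or 'return' (results re-requested).
    Returns the URLs whose visits ended in a quick return."""
    out = []
    last_click = None
    for ts, action, url in events:
        if action == "click":
            last_click = (ts, url)
        elif action == "return" and last_click:
            click_ts, click_url = last_click
            if ts - click_ts < threshold:
                out.append(click_url)
            last_click = None
    return out

events = [
    (0.0, "click", "spam.example"),
    (2.1, "return", None),             # back in 2.1s: likely irrelevant
    (10.0, "click", "useful.example"),
    (95.0, "return", None),            # long dwell: probably relevant
]
print(bounces(events))  # ['spam.example']
```

As the parent notes, a back-button reload often carries an If-Modified-Since header, which is one way the "return" event could be observed server-side.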
  • by Doctor O ( 549663 ) on Tuesday May 10, 2005 @05:47PM (#12492966) Homepage Journal
Having been a professional webworker for more than 8 years now, I agree with you from experience, but I don't actually think you can blame Macromedia.

I will not say anything at all about Flash, because two camps who BOTH don't get it will start the usual pointless discussion. Flash is rarely used for what it's great at, visualizing data, and instead plagues us with wildly unnecessary and annoying l33t-masturbation stuff.

Dreamweaver itself is indeed a powerful timesaver in the hands of an experienced XHTML/CSS guy. If you look at it closely, you'll find that it is a very nice graphical frontend to HTML itself, with a great set of shortcuts so that you almost don't have to touch the mouse at all. The palettes just provide access to the most commonly needed attributes of the element you're working on. If you leave all those nasty "behaviours", "timelines" and whatnot alone, it produces nicely readable and well-formed code. I've been using Dreamweaver since the early betas, and even back then this was the case. I tend to think that this was an initial design goal behind DW.

The bad comes from the 'designers' who are taught print design at university and apply it to the Web, using all the nutty clicky-pointy tools that produce a JS-laden horror cabinet of non-standards-compliance they dare to call "HTML". It's a classic PEBKAC. Look at it this way: if DW didn't have those features, GoLive would've taken over long ago, and we don't want THIS to happen. IMNSHO the only thing worse would be Frontpage. At least the guys at Macromedia didn't invent bogus HTML extensions because they were incapable of providing a proper metadata infrastructure, like Adobe did.

    (I'm not a fanboy though, I just use what works best at the moment for the things I do. If someone shows me how to reproduce this "Apply Source Formatting" feature from DW in Kate/KDevelop and how to synchronize sites like in DW, I'm switching my machine at work from Win2K with DW to KDevelop/nvu on FreeBSD tomorrow, because it better fits the things I do nowadays. It will then match my setup at home.)

    While we're at it, SEO is, was and always will be BS, just like the whole Internet Advertising Myth which after nearly a decade of documented failure still isn't debunked. Duh.
