Tool Detects "In-Flight" Webpage Alterations

Posted by CmdrTaco on Wed Jul 25, 2007 10:24 AM
from the its-like-a-foil-hat-for-your-browser dept.
TheWoozle writes "In a follow-up to a recent story about ISPs inserting ads into web pages, the University of Washington security and privacy research group has teamed with the International Computer Science Institute (ICSI) to develop an online tool to help you identify if your ISP is inserting ads or otherwise modifying the web pages you request."

Related Stories

Your Rights Online: ISPs Inserting Ads Into Your Pages (434 comments)
TheWoozle writes "Some ISPs are resorting to a new tactic to increase revenue: inserting advertisements into web pages requested by their end users. They use a transparent web proxy (such as this one) to insert javascript and/or HTML with the ads into pages returned to users. Neither the content providers nor the end-users have been notified that this is taking place, and I'm sure that they weren't asked for permission either."
Your Rights Online: Study Confirms ISPs Meddle With Web Traffic (131 comments)
Last July, a research team from the University of Washington released an online tool to analyze whether web pages were being altered during the transit from web server to user. On Wednesday, the team released a paper at the Usenix conference analyzing the data collected from the tool. They found, unsurprisingly, that ISPs were indeed injecting ads into web pages viewed by a small number of users. The paper is available at the Usenix site. From PCWorld: "To get their data, the team wrote software that would test whether or not someone visiting a test page on the University of Washington's Web site was viewing HTML that had been altered in transit. In 16 instances ads were injected into the Web page by the visitor's Internet Service provider. The service providers named by the researchers are generally small ISPs such as RedMoon, Mesa Networks and MetroFi, but the paper also named one of the largest ISPs in the U.S., XO Communications, as an ad injector."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by nokilli (759129) on Wednesday July 25 2007, @10:29AM (#19983869)
    If that isn't desirable, do a patch to Apache that creates a header that holds a hash of the content.
    The hash gets calculated once for static content, which is usually the bulk of the traffic, no? So
    not too big of a hit.

    Browser sees content. Browser sees hash. Browser compares the two...

    --
    Censored [blogspot.com] by [blogspot.com] Technorati [blogspot.com] and now, Blogger too! [blogspot.com]
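A minimal sketch of the header idea in this comment, written in Python rather than as an Apache patch. The header name `X-Content-Hash` and both helper functions are hypothetical, and SHA-256 is an assumed choice since the comment does not name a hash.

```python
import hashlib

HASH_HEADER = "X-Content-Hash"  # hypothetical header name, not a real standard


def add_hash_header(headers: dict, body: bytes) -> dict:
    """Server side: return a copy of headers with a hex SHA-256 of the body attached."""
    out = dict(headers)
    out[HASH_HEADER] = hashlib.sha256(body).hexdigest()
    return out


def body_matches_header(headers: dict, body: bytes) -> bool:
    """Browser side: recompute the hash and compare it with the header value."""
    expected = headers.get(HASH_HEADER)
    if expected is None:
        return False  # no header present, so there is nothing to verify against
    return hashlib.sha256(body).hexdigest() == expected


page = b"<html><body>Hello</body></html>"
headers = add_hash_header({"Content-Type": "text/html"}, page)
tampered = page.replace(b"</body>", b"<div>ad</div></body>")

print(body_matches_header(headers, page))      # body as served
print(body_matches_header(headers, tampered))  # body changed in flight
```

For static content the hash could be computed once and cached, as the comment suggests. A check like this only catches a party that changes the body and leaves the header untouched.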
    • What if the ISP is simply putting the web-page in its own frame, and the advertisement in a second frame? Unless you add the ability for web-pages to dictate that they should not be in frames, this one can't really be trapped for like that. The ISP could create its own hash for the served web-page that holds the frames.
      • While I'm not sure why frames are any different from whatever other kind of content modification, you're right that the ISP could modify the hash, so GP's idea apparently won't help. SSL would...
      • I don't think they could manage to do this without it being obvious to the user. Frames aren't exactly subtle.
      • Re: (Score:2, Interesting)

        ...Unless you add the ability for web-pages to dictate that they should not be in frames, this one can't really be trapped for like that...

        <script type="text/javascript">
        // Frame-buster: if this page has been loaded inside a frame,
        // reload it as the top-level document.
        if (top !== self)
        {
        top.location.href = document.location.href;
        }
        </script>
        That should do it. ;)
    • What if the ISP, having the server's (Apache HTTPD) code, recomputes the hash in the same manner?

      Browser sees content. Browser sees hash. Browser compares the two...gets an OK.
      • by eheldreth (751767) on Wednesday July 25 2007, @12:45PM (#19985897) Homepage

        What if the ISP, having the server's (Apache HTTPD) code, recomputes the hash in the same manner? Browser sees content. Browser sees hash. Browser compares the two...gets an OK.

        1.) Claim the hash is to protect the copyright on your site
        2.) Sue any ISP that alters the site without permission under the DMCA
        3.) ???
        4.) Profit!
    • But the ISP would just need to alter the header with the new hash for the adulterated page (which he can calculate as easily as the browser can). Also, this is no good for Ajax...
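The objection in these replies can be made concrete. In the Python sketch below, a hypothetical `inject_ad` middlebox rewrites both the body and an unkeyed hash header (the made-up name `X-Content-Hash`), and a plain recompute-and-compare check still passes. All names here are illustrative assumptions.

```python
import hashlib
from typing import Dict, Tuple

HASH_HEADER = "X-Content-Hash"  # hypothetical header name


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def inject_ad(headers: Dict[str, str], body: bytes) -> Tuple[Dict[str, str], bytes]:
    """Hypothetical middlebox: insert an ad, then recompute the unkeyed hash."""
    new_body = body.replace(b"</body>", b"<div>ad</div></body>")
    new_headers = dict(headers)
    new_headers[HASH_HEADER] = sha256_hex(new_body)
    return new_headers, new_body


def naive_check(headers: Dict[str, str], body: bytes) -> bool:
    """Browser side: compare the body against whatever hash the header carries."""
    return headers.get(HASH_HEADER) == sha256_hex(body)


original = b"<html><body>Hello</body></html>"
served_headers = {HASH_HEADER: sha256_hex(original)}

seen_headers, seen_body = inject_ad(served_headers, original)
print(seen_body != original)                 # the page was altered
print(naive_check(seen_headers, seen_body))  # yet the check passes
```

Anyone who can compute the hash can also replace it, so the check proves nothing unless the value depends on a secret the ISP does not hold.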
    • Not just a hash, but a message digest [google.com].
      • by vux984 (928602) on Wednesday July 25 2007, @12:14PM (#19985485)
        All these ideas are neat, but ultimately losers.
        MOVE TO ANOTHER PROVIDER TODAY.

        Why should I do that if I don't know the ISP is modifying the web pages in flight? Maybe I need a tool that could somehow detect that? That would sure be useful. Oh wait...Isn't that what this discussion is about?
  • Do ISPs really do this? I've never really noticed anything like this.
    • Re: (Score:2, Interesting)

      by Anonymous Coward
      My hosting service (the University of Minnesota) sticks a little legal disclaimer (some h5 tags) in a contrasting color at the bottom of every HTML page it serves for non-official accounts. It's the typical "The University of Minnesota is not responsible for the content...blah blah blah" message.
  • When was the last time I saw an ad of a rival to Verizon in my verizon dsl line, I wonder.
  • by db32 (862117) on Wednesday July 25 2007, @10:32AM (#19983923) Journal
    Do we sue the ad folks for inserting ads and stealing content? I mean, in just about any other medium this would wind up in court overnight as copyright and stolen content and so on. But now we have a circumvention tool to detect it...so are we going to get sued under DMCA like nonsense for attempting to circumvent the ad insertion?

          • by db32 (862117) on Wednesday July 25 2007, @03:10PM (#19987765) Journal
            Not exactly. A book is just a book. Words on paper. A webpage is FAR more visual than text on page (unless you have been sleeping the last few dozen years). Inserting ads could easily be considered a derivative work since you are altering the look of the site. What if I didn't want ads? What if my design is a nice soft brown and then you start inserting pink flashing ads? Or God forbid, these clowns insert one of those drive-by installer ads, now your business reputation is completely screwed because some major ISP decided to make a buck without checking their sources and your website infected thousands of consumers. Good luck explaining to your customers how it was the ISP magically sneaking ads onto your website.
  • I like UW and their tools. I think they've done wonderful work. Paint.NET is fun, easy, and I love that they are still working on it.

    Who/what is able to add to your pages:

    • Host ISP
    • browser
    • plug-ins
    • End User ISP? - in other words, your hosting ISP most definitely can add to your page. But can the end user's ISP insert it into the stream as it passes through? Technically, this would be feasible. Are there examples of this?
  • by proverbialcow (177020) on Wednesday July 25 2007, @10:33AM (#19983929) Journal
    ISPs intercepting, altering results from online security tool
    • Lest you think I'm merely joking, FTFA:

      Caveat 2: Our integrity checking mechanism is not cryptographically secure. If a "party in the middle" were modifying web pages that you visit, it could modify our scripts as well. Instead, our mechanism acts as a "tripwire" that is likely to catch any party that is currently unaware of our experiment. In the future, we could create a huge number of variants on the JavaScript tripwire. This would make it more difficult for a "party in the middle" to reliably determine
    • by nweaver (113078) on Wednesday July 25 2007, @11:08AM (#19984493) Homepage
      We are specifically worried about this case. But we have some thoughts on how to make it more difficult for someone to do that, which will probably end up in a full paper later.
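The "tripwire" quoted from the article boils down to comparing what arrived with what was sent. The real tool does this in JavaScript in the browser; the Python sketch below only illustrates the comparison step, and the function name, sample markup, and ad URL are all made up.

```python
import difflib


def added_lines(expected: str, received: str) -> list:
    """Return the lines that appear in `received` but not in `expected`."""
    diff = difflib.ndiff(expected.splitlines(), received.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


expected_page = "<html>\n<body>\n<p>test page</p>\n</body>\n</html>"
received_page = (
    "<html>\n<body>\n<p>test page</p>\n"
    '<script src="http://ads.example/inject.js"></script>\n'
    "</body>\n</html>"
)

for line in added_lines(expected_page, received_page):
    print("inserted in flight:", line)
```

As the caveat says, this is not cryptographically secure: a party that knows about the check can alter or strip it along with the page.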
  • by nweaver (113078) on Wednesday July 25 2007, @10:33AM (#19983931) Homepage
    We (the authors of the page) will be answering questions in this thread.
    • make a package that can be used as a simple drop-in to a website to detect this. If enough websites implement something that alerts users that the webpage was altered, isp will be forced to stop doing this.
      • isp will be forced to stop doing this.

        That, or ISPs will work harder to defeat the detection.
        • That is a war that this package will win - probably with some cryptographic checks in version 2.0.
          • Not quite... (Score:4, Interesting)

            by nweaver (113078) on Wednesday July 25 2007, @01:44PM (#19986705) Homepage
            This is a war however which we can make damn difficult by using virus-like mutation techniques, so that every checker looks different: force THEM to solve the AV defender arms race.

            That works as long as the actual API used by the Javascript is common enough that the ad-injectors can't recognize and block our code by keying in on the API calls rather than the overall Javascript.

            The proper solution, adding integrity checking to all HTTP, seems like it's not happening.
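A toy illustration of the "every checker looks different" idea, assuming nothing about how the authors would actually do it. This Python sketch emits JavaScript checker variants whose variable names differ on every run. The template, the length-based check, and `reportMismatch` are all hypothetical.

```python
import random
import string

# Toy checker template: compares the page length with an expected value.
# `reportMismatch` is a hypothetical reporting function, not part of any real tool.
TEMPLATE = (
    "var {html} = document.documentElement.innerHTML;\n"
    "var {size} = {html}.length;\n"
    "if ({size} !== {expected}) {{ reportMismatch({size}); }}\n"
)


def random_name(rng: random.Random) -> str:
    """Random JavaScript identifier; the leading underscore avoids reserved words."""
    return "_" + "".join(rng.choice(string.ascii_letters) for _ in range(8))


def make_variant(expected_length: int, rng: random.Random) -> str:
    """Fill the template with freshly generated, distinct variable names."""
    html_name = random_name(rng)
    size_name = random_name(rng)
    while size_name == html_name:
        size_name = random_name(rng)
    return TEMPLATE.format(html=html_name, size=size_name, expected=expected_length)


variant_a = make_variant(1234, random.Random(1))
variant_b = make_variant(1234, random.Random(2))
print(variant_a)
print(variant_b)
```

Renaming variables is the weakest form of mutation. Note that both variants still call `document.documentElement.innerHTML`, which is exactly the point made above about the API calls needing to be too common to filter on.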
    • Re: (Score:2, Funny)

      by Anonymous Coward
      Hi,

      What is your favorite flavor of ice cream?
      • Analyses (Score:3, Informative)

        We've seen a couple cases of NebuAd, one other that looks interesting, and a fair amount of ad-blocking/firewall software (e.g., ZoneAlarm does some modifications).

        We are waiting for the Slashdot and Digg deluges to pass, however, before we have a more detailed analysis.
      • Re: (Score:3, Informative)

        HTTPS, when certificates are properly used, is designed to prevent man in the middle viewing and modification.
      • Because people don't use HTTPS for everything.

        I agree that doing things cryptographically-authenticated would be a good thing (one could probably do a more lightweight opportunistic mechanism, myself and others at ICSI have an upcoming paper in HotSec on the possibility), but most people don't use https, and a lot of web sites don't SUPPORT https for many things.

          • Re: (Score:3, Informative)

            One of the big reasons is the certificate model...

            If you self-sign, everyone gets a nag panel every time they visit your web page. If you have VeriSign or someone else provide you with a certificate, it costs real money.

            Also, the HTTPS handshake is expensive, figure ~0.1 CPU second per visitor to handle the public key exchange, and it starts to add up. There is a reason why GOOGLE doesn't use https for gmail by default (you have to manually type in https://mail.google.com/ [google.com] to get gmail through SSL), the key
      • Re: (Score:3, Informative)

        Because people don't use SSL, and ISPs are actively inserting ads into web pages.

        And click the link anyway, we want to have as many people try it as possible.
        • Re: (Score:3, Informative)

          Actually, our test page happens to answer these questions, to some extent.

          All of our test pages are marked with "Pragma: no-cache" and "Cache-control: no-cache" in the HTTP response headers, but we're observing changes to the pages anyway.

          Our integrity checking mechanism uses AJAX requests (XMLHttpRequest) to fetch the test page. ISPs can't distinguish between an AJAX request and a normal page request (i.e., they both look like normal HTTP requests), so they inject ads into both. However, we're

          • Re: (Score:3, Insightful)

            ISPs can't distinguish between an AJAX request and a normal page request (i.e., they both look like normal HTTP requests), so they inject ads into both.

            Under normal circumstances AJAX and "normal" requests are the same; however, XMLHttpRequest has a "setRequestHeader" method that can be used to set additional headers. This is significant in that HTTP/1.1 states:

            The Cache-Control general-header field is used to specify directives that MUST be obeyed by all caching mechanisms along the request/response chain.

            You'v

  • No need for thousands of "All good in Kalamazoo" & "Up to date in Kansas City" posts.

  • A friend of mine had a similar problem with his webpages. They were on a free host (rolls eyes). I wrote a script for him to store special tags to denote the beginning and the end of his webpage content. After the webpage was loaded, a script erased everything and replaced all the html with his marked content. Ta-da, no ads!

    If you want to be stricter, encode your webpage content with base64 to make sure the ads don't intrude your precious content.
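The marker trick described here ran as a script in the page itself. The Python sketch below only shows the logic: keep what sits between two markers, discard the rest, and optionally carry the content as base64 so injected markup cannot blend into it. The marker strings and function names are hypothetical.

```python
import base64

BEGIN = "<!--CONTENT-BEGIN-->"  # hypothetical marker strings
END = "<!--CONTENT-END-->"


def extract_marked(html: str) -> str:
    """Return only the text between the begin and end markers."""
    start = html.index(BEGIN) + len(BEGIN)
    end = html.index(END, start)
    return html[start:end]


def encode_content(content: str) -> str:
    """Base64-encode the content for embedding in the page."""
    return base64.b64encode(content.encode("utf-8")).decode("ascii")


def decode_content(encoded: str) -> str:
    """Reverse of encode_content."""
    return base64.b64decode(encoded).decode("utf-8")


served = (
    "<div>host banner ad</div>"
    + BEGIN + "<p>My actual page</p>" + END
    + "<div>host footer ad</div>"
)
print(extract_marked(served))
print(decode_content(encode_content("<p>My actual page</p>")))
```

`str.index` raises `ValueError` when a marker is missing, which is a reasonable signal that the host stripped or mangled the markers.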
    • by Raistlin77 (754120) on Wednesday July 25 2007, @11:01AM (#19984411)
      I'll bet that his user agreement with that free host also clearly states that circumventing their added content in the manner that your script does is prohibited. If they discover your script, they'll likely disable his account.
    • by Excors (807434) on Wednesday July 25 2007, @11:23AM (#19984737)
      For sites like GeoCities that add

      </object></layer></div></span></style></noscript>< /table></script></applet>(...adverts...)
      to the bottom of your page to stop you trying to hide their adverts, it could be good to add <plaintext style="display: none"> to your page just before the point where they add their junk. plaintext is the unstoppable monster [htmlcodetutorial.com] of HTML – there is no closing tag, and the rest of the page will be treated as plain text instead of HTML. It's a slightly obscure feature, but it has better support between web browsers than many other parts of HTML and it can be fun to play with...
  • International Computer Science Institute (ISCI)
    It's ICSI. Pronounced Ee-ksee. It's where they exile you if you're not nerdy enough for Berkeley Computer Science proper, or something ;)
  • by NeoTerra (986979) on Wednesday July 25 2007, @10:51AM (#19984249)
    A certain ISP in Canada [userfriendly.org] dealt with this not long ago...
  • I've wondered about this for a while as a way to defeat XSS attacks, but would adding some sort of ability to sign the content in an HTML response be beneficial here? You could use your SSL cert to simply add a signature of the response body for content transmitted over http. A way to inform the browser to expect the signature that the ISP can't strip out may be problematic though.
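A rough sketch of why a keyed signature behaves differently from a bare hash. The comment proposes signing with the site's SSL certificate, which is a public-key operation. Python's standard library has no public-key signing, so this sketch swaps in HMAC-SHA256 with a shared secret purely as a stand-in. The key values and function names are made up.

```python
import hashlib
import hmac


def sign_body(key: bytes, body: bytes) -> str:
    """Keyed digest of the body; stands in for a real certificate-based signature."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify_body(key: bytes, body: bytes, signature: str) -> bool:
    """Recompute the keyed digest and compare in constant time."""
    return hmac.compare_digest(sign_body(key, body), signature)


site_key = b"hypothetical-site-secret"
page = b"<html><body>Hello</body></html>"
signature = sign_body(site_key, page)

# A middlebox without the key alters the page and signs with a guess.
tampered = page.replace(b"</body>", b"<div>ad</div></body>")
forged = sign_body(b"isp-guess", tampered)

print(verify_body(site_key, page, signature))   # untouched page verifies
print(verify_body(site_key, tampered, forged))  # forged signature does not
```

With a real public-key scheme the browser would verify against the site's public key instead of holding a secret. The open problem the comment raises remains: the ISP can simply strip the signature unless the browser knows to expect one.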

    The XSS idea would be to have the ability to have multi-part responses from the web server. The browser would put the page together fr
  • I can think of one way to do it - but it wouldn't be too hard for a determined ISP to defeat:

    Step 1: Calculate md5sum of webpage, store in separate location.
    Step 2: Include on the webpage some javascript to md5sum itself and compare this to md5sum in known location. Issue an alert if it differs.
    Step 3: Profit!
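Steps 1 and 2 can be sketched in Python. The comment has the comparison running as JavaScript in the page; here a dictionary stands in for the "separate location", and the URL and function names are hypothetical.

```python
import hashlib

# Step 1: digests kept somewhere other than the page itself.
# A dict stands in for that separate, trusted location.
known_digests = {}


def publish(url: str, page: bytes) -> None:
    """Record the MD5 of the page as it left the author."""
    known_digests[url] = hashlib.md5(page).hexdigest()


def is_unaltered(url: str, received: bytes) -> bool:
    """Step 2: hash what actually arrived and compare with the stored digest."""
    return hashlib.md5(received).hexdigest() == known_digests.get(url)


url = "http://example.org/index.html"
page = b"<html><body>Hello</body></html>"
publish(url, page)

print(is_unaltered(url, page))
print(is_unaltered(url, page + b"<!-- injected -->"))
```

The scheme is only as trustworthy as the channel used to fetch the stored digest. If that also travels over plain HTTP through the same ISP, it can be rewritten too.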

    Of course, this is awkward for dynamically generated pages and if the ISP is happy to mess around with the page to insert ads, they're probably also happy to mess around with any javascript which dete
  • It seems that everyone is concerned about downstream modification, and is completely ignoring the possibility of upstream modification. What if Sprint [verizon.com] started modifying upstream http [amazon.com]-posts to start a more viral ad distribution system? Not only would they be able to target their customers [barnesandnoble.com], they would also be able to target the customers of anyone who could read the post!

    This is the reason that we need to push for network neutrality [handsoff.org]. When the only choices are between a giant douche [summerseve.com] which alters content and a turd sandwich [panerabread.com] which alters content, the customer ends up screwed [lowes.com] in the end.

  • by ookabooka (731013) on Wednesday July 25 2007, @11:12AM (#19984559)
    These guys actually want as much traffic as they can get to get a good idea of what ISPs are doing what. Go ahead, click online tool. [washington.edu] It's pretty nifty.
  • Old stuff. (Score:4, Interesting)

    by TheLink (130905) on Wednesday July 25 2007, @11:31AM (#19984845) Journal
    Years ago on one April Fool's day, I got a list of ad sites (from the usual /etc/hosts files out there), then got the internal DNS server to resolve them to a server that served up the company logo instead (for all possible url paths).
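The DNS prank amounts to turning a hosts-style blocklist into overrides that point at one internal server. A sketch in Python, where the sample hostnames other than ad.doubleclick.net and the address 10.0.0.5 are invented for illustration. How the overrides are loaded into a particular DNS server is left out.

```python
def parse_blocklist(hosts_text: str) -> list:
    """Collect hostnames from /etc/hosts-style lines such as '127.0.0.1 ad.example'."""
    names = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        # The first field is the address; the rest are hostnames.
        names.extend(h for h in fields[1:] if h != "localhost")
    return names


def override_records(names: list, logo_server_ip: str) -> dict:
    """Map every blocked hostname to the internal server that serves the logo."""
    return {name: logo_server_ip for name in names}


sample = """
# ad servers
127.0.0.1 localhost
127.0.0.1 ad.doubleclick.net
127.0.0.1 ads.example.com banners.example.net  # two names on one line
"""

records = override_records(parse_blocklist(sample), "10.0.0.5")
for name, ip in sorted(records.items()):
    print(name, ip)
```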

    FWIW, seemed only one person noticed that the forbes page they loaded somehow had the company logos everywhere :). Nope I didn't get fired or even reprimanded - plus even better - I was saving company bandwidth (remember this was years ago)... Nobody complained about the lack of ads from ad.doubleclick.net and gang.

    I toyed with the idea of substituting ads with reminders (meeting at 2pm, or "you have been on slashdot for 2 hours!") and other more useful information.

    Lastly, I don't think their naive hashing thing checks if you are altering the images - the content may remain unchanged, but linked-to content may change (it isn't checked from what I see), so it doesn't work for my scenario where different ads are substituted for the unaltered URL.

    That said, I'm still curious on:
    1) How many ISPs would bother modifying traffic from those 7 destinations they are testing.
    2) What the various laws around the world say about this.
    3) What those laws say about "sponsored internet access" where an ISP gives a cheaper package/plan where the ads are replaced with those of the ISP's advertisers, with the risk of some corrupted info.
    4) What those laws say about "streamlined internet access" where an ISP provides a package/plan where ads and other crap are removed (or modified) for their customer.
  • by Sloppy (14984) on Wednesday July 25 2007, @12:16PM (#19985511) Homepage Journal

    ..why not just use SSL?

    I can understand how this wouldn't help with hosting ISPs who insert ads into their own customers' pages, but if you're worried about your readers' ISPs modifying your pages, SSL seems like a no-brainer.

    What's the downside? It can't still be CPU, can it? It's 2007 now, and processing power is ridiculously cheap/fast.

    • Re: (Score:3, Informative)

      They're not talking about the ISP hosting the web page, they're talking about your ISP adding ads to random sites that you visit. Client-side, not server-side.
    • by spun (1352) <loverevolutionary&yahoo,com> on Wednesday July 25 2007, @01:31PM (#19986539) Journal
      Are you pretending to be mentally challenged in order to troll, or do you really not understand even after having it explained to you a little further up the page? It is not the developer's ISP, or the hosting ISP that is doing this! It is the ISP of the people looking at the page. So, you left out a step in your patented eyeball method: signing up for every ISP in existence and loading your page, to see if that particular ISP does it.
    • 1. Your hosting ISP may not do this, but the ISPs of the people who view your webpage may. How would you know? Are you going to sign up for an account with every ISP in existence and test each one for yourself?

      2. There are plenty of people who would never know: people who use adblocking software, for one. In any event, many commercial webpages are so overrun with advertising anyway, how would you know that one in the crowd was inserted by your ISP and not original to the page?

      The subject doesn't make
    • Re: (Score:3, Informative)

      It's not the host ISP that's inserting the ads, it's the "Client" ISP. For example, Joe Smith buys a computer and buys high speed internet from "ECI", the Evil Cable ISP. Joe Smith visits Bob's Website. Bob, who hates ads, never put any on his webpage, and instead makes his money through online sales of his product. Now Joe loads up Bob's webpage to purchase a widget from Bob, and he sees Ads all over Bob's Website. Bob, who has GHI (Good Highspeed ISP), visits his website and there are no ads. ECI is puttin