Netscape Restores RSS DTD, Until July

Posted by CmdrTaco on Wed Jan 17, 2007 10:25 AM
from the that's-kinda-lame dept.
Randall Bennett writes "RSS 0.91's DTD has been restored to it's rightful location on my.netscape.com, but it'll only stay there till July 1st, 2007. Then, Netscape will remove the DTD, which is loaded four million times each day. Devs, start your caching engines."

Related Stories

[+] Ask Slashdot: Is Dedicated Hosting for Critical DTDs Necessary? 140 comments
pcause asks: "Recently there was a glitch, when someone at Netscape took down a page that had an important DTD (for RSS), used by many applications and services. This got me thinking that many or all of the important DTDs that software and commerce depend on are hosted at various commercial entities. Is this a sane way to build an XML-based Internet infrastructure? Companies come and go all of the time; this means that the storage and availability of those DTDs is in constant jeopardy. It strikes me that we need an infrastructure akin to the root server structure to hold the key DTDs that are used throughout the industry. What organization would be the likely custodian of such data, and what would be the best way to ensure such an infrastructure stays funded?"
  • Redirect (Score:3, Insightful)

    by cynicalmoose (720691) <giles.robertson@westminster.org.uk> on Wednesday January 17 2007, @10:30AM (#17646718) Homepage
    And they can't set up a redirect to the new hosting location?
    • Re: (Score:3, Insightful)

      Wouldn't they then be serving 4 million redirects per day? The point is that they need to eventually break it to make people stop relying on that path.
    • Re:Redirect (Score:4, Funny)

      by AndroidCat (229562) on Wednesday January 17 2007, @10:45AM (#17646984) Homepage

      HTTP/1.1 301 Moved Permanently
      Content-Type: text/html
      Location: http://127.0.0.1/
    • Re:Redirect (Score:5, Insightful)

      by werewolf1031 (869837) on Wednesday January 17 2007, @10:45AM (#17646986) Homepage
      And they can't set up a redirect to the new hosting location?
      What in the world would be the point? That would merely duplicate the problem to a different location. As was clearly stated in the article by Mr. Finke, four-million hits every day is a crapload of bandwidth wasted re-downloading a file that will never change. The RSS 0.91 spec is finished, complete, and yes, for all intents and purposes, written in stone. Stop looking at it every damned day. It will not change. Ever. It's truly stupid for client-side software to be accessing it over the Internet to read its forever-static contents. That's like checking the writings of a dead poet every day to see if anything's changed.

      And any dev who codes his app to check a file like this every day instead of caching it client-side should be smacked oh-my-god-so-frickin-hard.
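A minimal sketch of the client-side caching argued for above, assuming a hypothetical `cached_fetch` helper and a stand-in `fake_fetch` function in place of a real HTTP call. The placeholder bytes are not the real DTD. The point is only that the network is touched once and every later read comes from disk.

```python
import hashlib
import os
import tempfile
from typing import Callable

DTD_URL = "http://my.netscape.com/publish/formats/rss-0.91.dtd"


def cached_fetch(url: str, cache_dir: str, fetch: Callable[[str], bytes]) -> bytes:
    """Return the body for url, calling fetch(url) only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    # Hash the URL so the cache file name is always filesystem-safe.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    body = fetch(url)
    with open(path, "wb") as f:
        f.write(body)
    return body


# Stand-in for a network download; records how often it is called.
calls = []


def fake_fetch(url: str) -> bytes:
    calls.append(url)
    return b"<!-- placeholder, not the real RSS 0.91 DTD -->"


with tempfile.TemporaryDirectory() as cache_dir:
    first = cached_fetch(DTD_URL, cache_dir, fake_fetch)
    second = cached_fetch(DTD_URL, cache_dir, fake_fetch)
```

The cache never expires here, which is only reasonable because the file in question is described as frozen. A cache for content that can change would need an invalidation rule.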
      • Re:Redirect (Score:5, Funny)

        by AndroidCat (229562) on Wednesday January 17 2007, @10:59AM (#17647172) Homepage
        And any dev who codes his app to check a file like this...
        They might not even know that they're doing it if they're using Microsoft's Swiss Army Chainsaw XMLHTTP COM object and set the flags wrong.
          • And naturally that's Microsoft's fault? Not the developer who doesn't know anything about their tool?

            I wouldn't worry about it, many developers have firsthand experience with their tools...

          • Re: (Score:3, Informative)

            I didn't say that it was Microsoft's fault. It's just that it's a powerful tool with thousands of uses that's simple (on the surface) to use, but it pays to read the fine print carefully because many things aren't obvious. (/me remembers wasting time wondering why my XPath queries weren't working...)
      • Re: (Score:3, Interesting)

        And any dev who codes his app to check a file like this every day instead of caching it client-side should be smacked oh-my-god-so-frickin-hard.

        Ironic because Netscape is guilty of this poor practice themselves. I have an old sun u2 box that I recently revived. I had a copy of netscape messaging server/netscape enterprise server on it (used by the isp where I worked at the time). I wanted to archive some old mail off of it before I wiped the drive. I couldn't start it up because there were so many files containing references to http://developer.netscape.com/products/servers/enterprise/dtds/nes-webapps_6_1.dtd [netscape.com] which of course doesn't even e

      • Re: (Score:3, Funny)

        It's truly stupid for client-side software to be accessing it over the Internet to read its forever-static contents.

        Hey, you're challenging one of the cherished principles on which the web was based.

        The next thing you know, you're going to be talking about the separation of document id from location.

    • Re: (Score:3, Insightful)

      To be fair, the article points out that they have already put in place a redirect.

      They point out that it might not be entirely sensible for millions of newsreaders to rely upon downloading a static file from the web each time they open a feed. Most newsreaders (like the one built into Firefox) use a local cached copy.

      They restored the file so these newsreaders will continue to work for a period long enough that they can be altered to use a local copy.

      Whether it's reasonable or not for them to remove the
    • Sending Expires and Cache-Control headers [slashdot.org] that say "Don't bother retrying for 3 years" might help mitigate some of the bandwidth waste.

      That said, he's got a point that the feed readers should work if the DTD isn't retrievable -- but deliberately removing it looks like a great way to say "Netscape isn't reliable."
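As a rough illustration of the "don't bother retrying for 3 years" idea, here is a sketch that builds the two response headers for a given lifetime. The function name `long_lived_cache_headers` and the choice of `public` in `Cache-Control` are assumptions made for the example, not anything Netscape is known to have sent.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def long_lived_cache_headers(now: datetime, days: int) -> dict:
    """Build Cache-Control and Expires values for a file that will not change."""
    lifetime = timedelta(days=days)
    return {
        # max-age is expressed in seconds from the time of the response.
        "Cache-Control": f"public, max-age={int(lifetime.total_seconds())}",
        # Expires is an absolute HTTP date in GMT; now must be UTC-aware.
        "Expires": format_datetime(now + lifetime, usegmt=True),
    }


# Three years (3 * 365 days) from the story's posting time, taken here as UTC.
headers = long_lived_cache_headers(
    datetime(2007, 1, 17, 10, 25, tzinfo=timezone.utc), days=3 * 365
)
```

One caveat: RFC 2616 says HTTP/1.1 servers should not send Expires dates more than one year in the future, so a conforming server would cap the value well below three years. Either way, headers only help with clients that honor them.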
      • Re:URIs (Score:5, Informative)

        by Schraegstrichpunkt (931443) on Wednesday January 17 2007, @02:27PM (#17650624) Homepage
        No. This is the perfect example of why a URI is not necessarily supposed to be treated as a URL. http://my.netscape.com/publish/formats/rss-0.91.dtd is just a unique identifier for the RSS DTD. It used to also be hosted there as a convenience, but your software isn't supposed to rely on that.
  • by Anonymous Coward on Wednesday January 17 2007, @10:32AM (#17646770)
    Developers who made the mistake of using that external resource in their code most likely don't have the brain resources to adapt by July.

    (This is not a troll. Resignation and bitterness, maybe. But not a troll.)
    • by Anonymous Coward on Wednesday January 17 2007, @10:39AM (#17646900)
      That is kind of like declaring PI to be a volatile double variable, in case it changes in real time...
      • In Greg Bear's book Eon, one of the ideas is building with geometry. A mathematician investigating one such structure asked some engineers to build a pi-meter to use when she was exploring. I wondered what such a thing could mean, and indeed how one would build such a device...
          • You could also do the same thing with a piece of string and a ruler, but it wouldn't be convenient enough to call it a "pi meter".

            Yeah, but if you attached the string to a sleeping cat's tail, then when the value of pi changes it would pull the cat's tail and the cat would jump, hitting the lever above its head, which would release a ball which would roll down a spiral ramp into a container of water balanced on a thin beam, so that when the ball sinks to the bottom of the container it would tip it over o

    • then they can use that time to find a new job?
  • by Anonymous Coward on Wednesday January 17 2007, @10:34AM (#17646806)
    Developers should take the opportunity to move to Atom. In the meantime we could use something as simple as round-robin DNS to share the load, or have Mozilla, Google or the Internet Archive host it. It's a historical document and should reside at a permanent URI.
      • Re: (Score:3, Informative)

        You know, if you're gonna be a smartass on this topic, you should at least understand the difference between a URI and a URL.

        There's nothing flawed about the notion of a permanent URI. A permanent URL is the tricky bit.
          • by metamatic (202216) on Wednesday January 17 2007, @01:26PM (#17649716) Homepage Journal
            URLs are a subset of URIs. A URL defines a location where a resource can be accessed. A URI may merely be the name of a resource, i.e. a URN.

            For example, globally unique IDs in Atom feeds are often URNs, and hence URIs; but URNs aren't URLs, and you shouldn't need or want to try to connect to something just because it's used as a globally unique identifier in an Atom feed and looks a bit like a URL.

            This is relevant because many Internet specifications use URNs (or in the case of HTML, FPIs) as spec identifiers. For instance, XML namespace identifiers are URIs; and while some of them happen to be URLs too, the XML namespace recommendation [w3.org] says:

            The namespace name, to serve its intended purpose, should have the characteristics of uniqueness and persistence. It is not a goal that it be directly usable for retrieval of a schema (if any exists).

            In the case of RSS 0.91, Netscape wrote the spec, and they used a URL and told people to connect to it to fetch the necessary information to parse the file. They could have used a URN, but I'm guessing they wanted to keep their options open as far as changing the spec on the fly.

            (Of course, Dave Winer has a different approach to changing RSS specs on the fly...)
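The URL/URN distinction above can be sketched by looking only at the scheme of an identifier. The helper `classify_identifier` and the set of schemes it treats as locators are assumptions for illustration, and the `urn:uuid:` value is just a sample identifier of the kind an Atom feed might carry.

```python
from urllib.parse import urlsplit

# Schemes treated as "locators" in this sketch; the list is illustrative.
LOCATOR_SCHEMES = {"http", "https", "ftp"}


def classify_identifier(uri: str) -> str:
    """Label an identifier as 'URN', 'URL', or plain 'URI' by its scheme."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme == "urn":
        return "URN"  # a name; there is nothing to connect to
    if scheme in LOCATOR_SCHEMES:
        return "URL"  # shaped like a location, which does not mean "fetch me"
    return "URI"


atom_id = classify_identifier("urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6")
dtd_id = classify_identifier("http://my.netscape.com/publish/formats/rss-0.91.dtd")
```

Even when the result is "URL", that says only that the identifier could be dereferenced. Whether software should do so on every parse is the question the thread is arguing about.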
        • Re: (Score:3, Interesting)

          Kind of like Example.com [example.com]. That was set up in RFC-2606.
            • Re: (Score:3, Interesting)

              I bet the bandwidth costs from attempted email delivery are huge even though there are no MX records and the server doesn't accept SMTP connections. In addition to spam harvesting, people like me have been using xyz@example.com to satisfy email address requirements for years.

              That's what the .invalid TLD is for, also defined in RFC 2606 [ietf.org].

              ".invalid" is intended for use in online construction of domain names that are sure to be invalid and which it is obvious at a glance are invalid.

  • CmdrTaco (Score:5, Funny)

    by MagicM (85041) on Wednesday January 17 2007, @10:40AM (#17646906)
    Netscape Restores RSS DTD, Until July - from the that's-kinda-lame dept.
    Two Stargate SG1 Films Announced - from the good-for-them dept.
    Linux: x86 Linux Flash Player 9 is Final - from the i-still-hate-flash dept.

    Looks like somebody is having a case of the mondays.

    (On Wednesday.)
  • I don't get it (Score:5, Interesting)

    by Thansal (999464) on Wednesday January 17 2007, @10:40AM (#17646910)
    I admit, I am not familiar enough with RSS. However this is a 2.3KB file that is not supposed to change. Why would developers NOT hardcode it into their RSS tools?
    • by 140Mandak262Jamuna (970587) on Wednesday January 17 2007, @10:51AM (#17647046) Journal
      No one ever writes a new XML (or most other Web 2.0) application from scratch. They all take an app they are familiar with and modify it to do new things, and some of the initial bootstrap processes are never looked into. The "if it works, don't mess with it" attitude is pervasive. So someone long ago, maybe in a galaxy far away, wrote an application that was replicated and mutated by developers; others took it, made more mutations, and it propagated. One side effect of this and similar cut-and-paste development tactics is that bugs, security holes, inefficient algorithms, and brain-dead implementations also propagate.

      Richard Dawkins asks this very fundamental question: why reproduce (sexually or asexually) using seeds and embryos? Why not propagate by cuttings and cloning? It happens in nature. Many fern-like plants do it. Bananas have been reproducing by new shoots. He then discusses how harmful mutations propagate too, and how going back to basics and recreating the embryo selects the beneficial mutations and puts a check on deleterious ones. Books: The Selfish Gene, Climbing Mount Improbable.

      • Re: (Score:3, Interesting)

        That was insightful (hint to mods).

        Now we need software that can breed sexually.

        Or, more realistically, software that has a finer granularity and greater modularity so that the piece of ancient code that does this can be easily identified and swapped out, without needing to be understood by developers.
         
        • Now we need software that can breed sexually.

          Nahh, the risk of virus transmission is too high...

    • Re:I don't get it (Score:5, Informative)

      by jrumney (197329) on Wednesday January 17 2007, @10:58AM (#17647138) Homepage

      Developers use off the shelf XML parsers, which generally take care of validation for you. Netscape created this problem themselves when they stated in the spec for RSS 0.91 that well-formedness was not enough, RSS 0.91 feeds should be validated against the DTD. They then specified that document authors must use a PUBLIC doctype specifier, so the option of using a SYSTEM one (where the DTD is looked up in a local catalog) is not an option.

      • Re: (Score:3, Insightful)

        by Anonymous Coward
        PUBLIC doctypes simply give the URI of the DTD, and are expected to always resolve to the same content. But there's no requirement that you use the default resolver.
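To make "you don't have to use the default resolver" concrete, here is a sketch of a resolver that answers from copies shipped with the application and never touches the network. The function `resolve_dtd`, the two lookup tables, and the placeholder bytes are all assumptions for the example; wiring this into a particular XML parser is parser-specific and not shown.

```python
from typing import Optional

# Placeholder bytes stand in for the DTD text an application would bundle.
BUNDLED_BY_PUBLIC_ID = {
    "-//W3C//DTD HTML 4.01//EN": b"<!-- bundled copy of strict.dtd -->",
}
BUNDLED_BY_SYSTEM_ID = {
    "http://my.netscape.com/publish/formats/rss-0.91.dtd":
        b"<!-- bundled copy of rss-0.91.dtd -->",
}


def resolve_dtd(public_id: Optional[str], system_id: Optional[str]) -> bytes:
    """Return a bundled DTD for the given identifiers, or fail loudly."""
    if public_id in BUNDLED_BY_PUBLIC_ID:
        return BUNDLED_BY_PUBLIC_ID[public_id]
    if system_id in BUNDLED_BY_SYSTEM_ID:
        return BUNDLED_BY_SYSTEM_ID[system_id]
    # Deliberately no network fallback: an unknown DTD is an error here.
    raise LookupError(f"no bundled DTD for {public_id!r} / {system_id!r}")


rss_dtd = resolve_dtd(None, "http://my.netscape.com/publish/formats/rss-0.91.dtd")
```

Raising on an unknown identifier is one possible policy. A feed reader might instead prefer to skip validation and carry on parsing.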
    • Re: (Score:3, Interesting)

      I'm also not an expert, but from what I know about DTDs they are supposed to be referenced when the content should validate against them. For example:

      <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
      "http://www.w3.org/TR/html4/strict.dtd">

      This is at the top of every Slashdot page. Should IE or FF break if the W3 were to remove that file? Certainly not. But should it be loaded and validated if possible? I believe so.

      If any XML or RSS gurus want to correct me on thi

    • Re:mirror ;) (Score:5, Informative)

      by geoffspear (692508) on Wednesday January 17 2007, @10:52AM (#17647054) Homepage
      Great, the entire internet community can rely on one random person's server instead of on one really big corporation's server. That should fix things.
      • I'm not entirely joe-random user (i say jokingly yet seriously - i registered nether.net before aol registered aol.com), but what i'll say is that it's useful to have copies of these files around for all sorts of reasons, either historical or otherwise. Folks are welcome to add my host in as one in their list of places to find this. I've survived slashdottings in the past before with not a lot of effort (as my pages are primarily static, no ads), and hosted/mirrored large content before without trouble an
  • .. and I thought it was only Microsoft and Google that tried to "break the web" on purpose ....
  • by KrisWithAK (32865) on Wednesday January 17 2007, @11:01AM (#17647202)
    As I replied for the previous Netscape RSS DTD article http://slashdot.org/comments.pl?sid=216818&cid=17603480 [slashdot.org], caching DTDs from the network is not the answer if there is the possibility they will not be there in the future:

    The proper thing to do is for your application to use an XML catalog for resolving entities/URIs and bundle the DTD files with the application. There is a good article at http://xml.apache.org/commons/components/resolver/resolver-article.html [apache.org] that helped me out. In addition, if you are using Eclipse with the web tools platform, you can customize the catalog so it resolves DTDs and entities locally. See http://wiki.eclipse.org/index.php/Using_the_XML_Catalog [eclipse.org].
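For readers who have not seen an XML catalog, the following sketch reads a small OASIS-style catalog and maps identifiers to local files. The local paths under `dtds/`, the helper names `load_catalog` and `lookup`, and the "system first, then public" order are assumptions for the example. Real resolvers such as the Apache one linked above implement more rules (`prefer`, rewriting, delegation) than this does.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Tuple

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# A tiny catalog; the uri values are hypothetical paths to bundled files.
CATALOG_XML = """<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <system systemId="http://my.netscape.com/publish/formats/rss-0.91.dtd"
          uri="dtds/rss-0.91.dtd"/>
  <public publicId="-//W3C//DTD HTML 4.01//EN"
          uri="dtds/html4-strict.dtd"/>
</catalog>
"""


def load_catalog(xml_text: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (public_map, system_map) built from catalog entries."""
    root = ET.fromstring(xml_text)
    public = {el.get("publicId"): el.get("uri")
              for el in root.iter(f"{{{CATALOG_NS}}}public")}
    system = {el.get("systemId"): el.get("uri")
              for el in root.iter(f"{{{CATALOG_NS}}}system")}
    return public, system


def lookup(public: Dict[str, str], system: Dict[str, str],
           public_id: Optional[str], system_id: Optional[str]) -> Optional[str]:
    """Return the local path for an identifier pair, or None if unmapped."""
    if system_id in system:
        return system[system_id]
    return public.get(public_id)


public_map, system_map = load_catalog(CATALOG_XML)
rss_path = lookup(public_map, system_map, None,
                  "http://my.netscape.com/publish/formats/rss-0.91.dtd")
```

Keeping the mapping in a catalog file rather than in code means a DTD can be added or relocated without rebuilding the application.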
  • by mmurphy000 (556983) on Wednesday January 17 2007, @11:03AM (#17647226)

    (I tried posting this as a reply to the blog posting, but I'm not getting the confirmation email, so I'll post it here)

    From a purely technical standpoint, I agree with your assertion that, for well-baked files like RSS DTDs, clients should not be relying on a file hosted by an arbitrary service.

    That being said, please understand that the emotional message you're sending is: "Don't rely on Netscape".

    Why?

    Back when RSS was first starting out, Netscape's documentation said to use Netscape URLs for the RSS DTDs. Witness this page [archive.org], published by Netscape, from late 2000:

    Now, a shade over six years later, Netscape is saying "Oh, yeah, what we told you to do? Never mind. We're not supporting it any more."

    If Netscape/AOL was shutting its doors, that'd be one thing. If the service in question was obviously onerous, that too would be understandable. Or, if Netscape told people "For the love of all that is holy, don't use our URLs for your DTD needs!" from the get-go (based on that document, you didn't), any such reliance would be our own fault.

    But, because AOL does not want to serve up two static files, each of which is smaller than the "Netscape Reports" graphic on the netscape.com home page, Netscape is abandoning a service they told people to use.

    So what are we to think about Netscape's current services and their long-term usability?

  • I never understood why web pages need references to these external things (or do they?). Why embed into a page a pointer to a document that you don't have direct control over? My own dumb pages do this as well since I switched from plain HTML to using CSS and SVG, but I don't have the time to figure out why it's in there or if it's needed. I just pasted it in like the examples I found. Now if I thought my web page was really important, I'd look into this a bit more...
  • You have five months to update your apps to use RSS DTD version 0.92!
  • by kabdib (81955) on Wednesday January 17 2007, @12:09PM (#17648386) Homepage
    This is why I snicker whenever I hear the words "architecture" and "web" in the same sentence. Impolite, but OMFG who designed this junk?

    Oh, right. Nobody, really. It's amazing it works at all (... and sometimes it doesn't!)

    Dijkstra's quip, "If programmers built houses the way they built programs, the first woodpecker to come along would topple civilization" was and remains insightful.
    • Probably (Score:2, Funny)

      by Anonymous Coward
      As that would give Google another way to track your every online move.
    • Re: (Score:3, Informative)

      What's so fucking hard about spelling "its" correctly?

      An old Jedi mind trick:

      Its apostrophe is missing, because it's been moved over here.