Celebrate the XML Decade

Posted by CowboyNeal on Thu Nov 16, 2006 09:49 PM
from the happy-birthday dept.
IdaAshley writes "IBM Systems Journal recently published an issue dedicated to XML's 10th anniversary. Take a look at XML application techniques, and general discussion of the technical, economic and even cultural effects of XML. Learn why XML has been successful, and what it would take for XML to continue its success."
  • Celebrate the XML Decade
    I tried. Oh Lord, how I tried!

    I started this morning by talking to everyone in XML.

    I hope the black eye my coworker gave me heals before my presentation to the CTO tomorrow morning :-(
    • by Duhavid (677874) on Thursday November 16 2006, @10:45PM (#16880022)
      Really...

      We all needed to leave the first post in this to the guy with
      the sig

      "XML is like violence: if it doesn't fix the problem, you aren't using enough"

      Or words to that effect.
      • by Dahamma (304068) on Friday November 17 2006, @03:45AM (#16881356)
        Someone put that in our Bugzilla quips a while back - it's still one of my favorites!

        My conspiracy theory is that XML was secretly invented by Intel in order to require 3GHz processors for the simplest of tasks.
    • by Randolpho (628485) on Thursday November 16 2006, @11:24PM (#16880262) Homepage Journal
      I started this morning by talking to everyone in XML.
      <conversation>
      <greeting type="friendly">Hello, fellow coworker type dude!</greeting>
      <response type="violent">Have a black eye!</response>
      </conversation>
      • by Anonymous Coward on Thursday November 16 2006, @11:43PM (#16880382)
        <greeting type="friendly">Hello, fellow coworker type dude!</greeting>
        That's a poorly designed format. You should make "greeting" a complex type and use elements to represent the greeting text and the greeting type. Then, the greeting type can be properly validated against a W3C XML Schema. There's no valid reason to use an attribute in cases like these.
        • by zootm (850416) on Friday November 17 2006, @05:31AM (#16881764)

          That's a poorly designed format. You should make "greeting" a complex type and use elements to represent the greeting text and the greeting type. Then, the greeting type can be properly validated against a W3C XML Schema. There's no valid reason to use an attribute in cases like these.

          I took the liberty of revising the format a little, is this better?

          <?xml version="1.0" encoding="UTF-8" standalone="no"?>
          <conversation
          xmlns="http://slashdot.org/sarcasm/XML/conversation"
          xmlns:html="http://www.w3.org/1999/xhtml">

          <participants>
          <participant>
          <short-name>OP</short-name>
          <full-name>Original poster</full-name>
          </participant>
          <participant>
          <short-name>CW</short-name>
          <full-name>Unwitting coworker</full-name>
          </participant>
          </participants>

          <relationships>
          <two-way-relationship name="coworker">
          <person>OP</person>
          <person>CW</person>
          </two-way-relationship>
          </relationships>

          <greeting time="2006-11-17T10:12:10Z" speaker="OP" targets="CW">
          <type>
          <demeanour>friendly</demeanour>
          </type>
          <speech>
          <text type="text/plain">
          Hello, fellow coworker type dude!
          </text>
          </speech>
          </greeting>

          <response time="2006-11-17T10:12:34Z" speaker="CW" targets="OP">
          <type>
          <demeanour>angry</demeanour>
          <context>
          <divorce type="messy"/>
          <custody-battle type="messy"/>
          </context>
          </type>
          <speech>
          <text type="application/xhtml+xml">
          Have a <html:em>black eye</html:em>!
          </text>
          </speech>
          <action>
          <punch>
          <recipient>OP</recipient>
          <aim>eye</aim>
          </punch>
          </action>
          </response>

          </conversation>

          I'm sort of disappointed that I only got to use two namespaces. Can't get indentation to work either, unfortunately.

    • Actually, I was looking at the title and I did a double-take, since the first time I saw it I thought it said "Celebrate the XML Debacle". Oop. I thought, surely it's not that bad...

      Eh, what do I know? Maybe it is that bad. =)

    • I started this morning by talking to everyone in XML.

      Care to share the DTD and schema you used for that?
  • by Ant P. (974313) on Thursday November 16 2006, @10:02PM (#16879730) Homepage

    Marketing to PHBs, mostly.

    However here on earth a lot of people still hand-code the stuff. IMO a C-like syntax using nested {}s would've been better.

    • by MP3Chuck (652277) on Thursday November 16 2006, @10:15PM (#16879812) Homepage Journal
      "IMO a C-like syntax using nested {}s would've been better."

      JSON [wikipedia.org]?
      • by Ankh (19084) * on Friday November 17 2006, @12:19AM (#16880588) Homepage
        A lot of people ask about using a different syntax, such as @name{....} as Scribe (and later LaTeX) did. Note that @element{xxx} is in fact a possible syntax that can be defined using SGML. But we were after something different.

        When we designed XML, we had over a decade of solid experience with interoperability in the world of SGML, and we also knew about the kinds of problems that different sorts of users had with different sorts of syntax.

        The primary users of SGML-based documentation systems were not programmers. They were people who were often not likely to know about a bracket-matching option in an editor or about code indenting, for example. But they were still legitimate users.

        You can't easily test the markup in a declarative system: if in an HTML document I used H3 instead of P in a document it might not look right, but it would still parse OK. If I muddle up Author and Title in a bibliography, same thing.

        So, the redundancy of end tags in XML is there because, in practice, if you didn't have it, we had learned that our users had problems correcting their documents, and we knew that, in general, it was only rarely possible for software to give the users much help. There were some experiments early on with </>, allowed by SGML (with various options set) to end any element; it soon became obvious that this caused more problems than it was worth, and even Microsoft disabled the troublesome feature in their XML parser.

        It's true that today XML is used in lots of situations we didn't predict. We were amazed that by the time we got XML published as a Recommendation there were over 200 users. So no, we didn't predict the future perfectly. But the popularity of XML shows we can't have done all that badly, really ;-)

        Liam

        (Liam Quin, currently W3C XML Activity Lead)
        • Re: (Score:3, Insightful)

          > So, the redundancy of end tags in XML is there because, in practice, if you didn't have it, we had learned that our users had problems correcting their documents, and we knew that, in general, it was only rarely possible for software to give the users much help. There were some experiments early on with </>, allowed by SGML (with various options set) to end any element; it soon became obvious that this caused more problems than it was worth, and even Microsoft disabled the troublesome feature in their XML
          • by Ankh (19084) * on Friday November 17 2006, @10:40AM (#16884552) Homepage
            > The error message does not help people all that much

            One case where it helps most is when an incorrect start tag was applied; with the empty end tag this could not be detected, and it turned out to be more common than one might expect. You're right that the error messages often aren't good, but did you ever try debugging a large SGML document with OMITTAG and SHORTREF in use? The error message was almost always "characters found after end of document" because the strategy required by SGML (in one of the most common error situations) was to close elements until you got a match, so the parser typically closed elements all the way up the tree to the document element, and then gave up.
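
A minimal sketch of the point about named end tags, using Python's standard `xml.etree.ElementTree` parser. The `<entry>`, `<title>` and `<author>` element names and the `check_well_formed` helper are made up for illustration; they do not come from the thread. The example only shows that a wrong start tag paired with a named end tag is reported by the parser, which an empty end tag like `</>` could not reveal.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return None if text parses as XML, otherwise the parser's error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return str(err)


# Hypothetical bibliography entry: the author was opened with the wrong tag.
bad = "<entry><title>Liam Quin</author></entry>"
good = "<entry><author>Liam Quin</author></entry>"

print(check_well_formed(bad))   # an error message from the parser
print(check_well_formed(good))  # None
```

The parser cannot tell which of the two tags was the mistake, only that they disagree, which matches the comment's caveat that the error messages are often not very helpful.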

            We were bound, at the time, to strict SGML compatibility; perhaps if we had known XML would succeed we could have made more changes, but then we would have strayed further from the well-trodden path of implementation experience.

            As to comments for attributes, I agree with you; we lost them, though, because we needed a language simple enough that it could be processed, e.g., with Perl. We didn't dare dream that Perl would support XML natively!

            I agree with you that structured tools should generally be used. The redundancy and simplicity help computer-generated XML, and help to detect, say, missing portions of documents. If xml-rpc is scary, s-expr rpc is even scarier! :-)

            Liam
          • Re: (Score:3, Interesting)

            Thank you for your kind words :-)

            We weren't really aiming at HTML users.

            I'm afraid the only usability studies of SGML tools that I saw were not released to the public. At the time I worked for a vendor of SGML-based software (e.g. including an editor, a viewer, a development environment) and it was a matter of great concern to us.

            It's possible we could open up the archives of the XML Working Group, but it would mean getting the permission of several hundred people. I'll ask some people at the upcoming XM
        • You know there is an entire programming language designed to manipulate JSON data structures, a language that is quickly penetrating all of the web.

          It is known as: JavaScript
    • by porkThreeWays (895269) on Thursday November 16 2006, @10:20PM (#16879842)
      Sorta... XML came at a time when there weren't a whole lot of good, viable data representation standards. Those that existed (e.g. SGML) were too complicated for light use. XML was meant to be used by the masses while still technically remaining an SGML subset. We have better alternatives today, but once something is in widespread use, it's not going away for a while.
    • I keep hearing this and it seems foolish every time. If you just used {} how would you easily tell which tag you were closing? It would be too easy to mistake one brace for another, especially when there are several tags. Sure, it'd be more efficient, but the idea was to have something that was equally readable by machines and humans. You take any non-trivial piece of XHTML or other XML and convert it to your new {} syntax. Then go try to add some more mark-up to it. And to non-technical users it would be eve

      • It would be too easy to mistake one brace for another, especially when there are several tags

        I hack LISP, you insensitive clod!
      • People seem to have coped fine reading code without closing tags for the past 30 years.
    • by smallpaul (65919) <paul@pres c o d . n et> on Thursday November 16 2006, @11:07PM (#16880132)
      A curly brace syntax would have been a better format for "large scale enterprise publishing"? As someone who has spent more than a decade in that field, I must disagree strongly. A curly brace would have been better to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. [w3.org] Please do not confuse what XML is used for with what it was designed for. There is a reason that XML delivery units are called "documents" and not "messages".
  • by d3ik (798966) on Thursday November 16 2006, @10:02PM (#16879734)
    ... and most "enterprisey" Java developers have never met a problem that couldn't be fixed with more XML.
    • That gets my goat too.

      Nothing makes a set of code harder to deal with than taking half of it and writing it in a variety of XML config files and then scattering them throughout the distribution. That way you ensure that anyone doing something foolish like trying to understand it through javadoc or use their IDE to learn it gets nowhere.

      I'm going to go cry now.
    • by dch24 (904899) on Thursday November 16 2006, @11:26PM (#16880276) Journal
      My bosses were wary when I suggested XML as our data representation for a new project. Here were some of the arguments:

      Pro
      • Easy to change the schema, don't have to convert old data.
      • They didn't know exactly what XML was, so if I recommended it, ... (a.k.a. "gee whiz" factor?)
      • The other developers liked the idea
      Con
      • They weren't sure whether this would increase (better system = save time?) or decrease (reinvent the wheel = waste a lot of time in meetings?) productivity
      • Takes lots of space (no "binary XML")
      • Slow processing, right? (see "Takes lots of space")

      Eventually we settled on gzipped xml. It required a little more code, but everyone seemed happy. Oh, and we stored images as separate .png.

      I think my experience is pretty common, though. And from experience, libxml2 + libz is still very, very fast, and there's not a (whole lot) of wasted space.
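
A rough sketch of the "gzipped XML" approach described above. The comment mentions libxml2 + libz; this example swaps in Python's standard `xml.etree.ElementTree` and `gzip` modules instead, and the `records`/`record`/`image` element names and the `pack`/`unpack` helpers are invented for illustration.

```python
import gzip
import xml.etree.ElementTree as ET


def pack(root):
    """Serialize an element tree to gzip-compressed UTF-8 XML bytes."""
    return gzip.compress(ET.tostring(root, encoding="utf-8"))


def unpack(blob):
    """Decompress and parse bytes produced by pack()."""
    return ET.fromstring(gzip.decompress(blob))


# Hypothetical data set: repetitive markup, images referenced as separate .png files.
root = ET.Element("records")
for i in range(1000):
    rec = ET.SubElement(root, "record", id=str(i))
    ET.SubElement(rec, "image").text = f"img{i}.png"

raw = ET.tostring(root, encoding="utf-8")
blob = pack(root)
print(len(raw), len(blob))  # uncompressed size vs. gzipped size in bytes
```

Because tag names repeat so often, this kind of markup tends to compress well, which is consistent with the comment's remark that little space ends up wasted. The actual ratio depends on the data.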

      I'd like to hear other people's success stories, if anyone wants to reply... I liked reading the article, too.
      • by j. andrew rogers (774820) on Friday November 17 2006, @03:21AM (#16881266) Homepage
        The "slow processing" is caused by more than taking a lot of space. XML is basically a document markup but is frequently and regularly used as a wire protocol, which has very different design requirements if you want a good standard. And in fact we already have a good standard for this kind of thing called "ASN.1", which was actually engineered to be extremely efficient as a wire protocol standard. (There is also an ITU standard for encoding XML as ASN.1 called XER, which solves many of the performance problems.)

        Arguably the single biggest problem with XML that causes slow processing is that software can predict almost nothing about an XML stream and therefore has to allow for anything. The opening bracket tells you very little about what to expect, and creates few implicit failure or non-conformance tests that allow one to terminate processing, because there is no definition of "unreasonable". If I want to embed a terabyte of data between XML tags, there is no built-in basic mechanism to inform the software of how much data it should expect to see before a closing tag and no basic mechanism to cue the software as to the type of data to expect. (Yes, you can sort of do it with lots of other layers strapped on, but it isn't core and strapping it on adds complexity.) This is the primary reason it gives miserable performance as a wire protocol format -- the software cannot make decisions about the data without slurping most or all of it, with no way to predict what "most" or "all" actually is. In well-engineered standards such as ASN.1, they use the good old tag-length-value (TLV) format. The "tag" tells you what to expect, the length tells you how many bytes to expect, and the value is the actual data. In short, the encoding tells the software exactly what it is about to do before it does it in enough detail that the software can make smart and performant handling decisions.
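
A minimal sketch of the tag-length-value idea described above. This is a deliberately simplified framing (1-byte tag, 4-byte big-endian length), not ASN.1 BER or any real standard's encoding; the tag numbers and function names are assumptions made for the example.

```python
import struct

# Hypothetical tag numbers, chosen only for this example.
TAG_NAME = 1
TAG_BLOB = 2

HEADER = ">BI"                      # 1-byte tag, 4-byte big-endian length
HEADER_SIZE = struct.calcsize(HEADER)


def tlv_encode(tag, value):
    """Pack one record as tag, length, then the raw value bytes."""
    return struct.pack(HEADER, tag, len(value)) + value


def tlv_decode(buf):
    """Yield (tag, value) pairs from a buffer of concatenated records."""
    pos = 0
    while pos < len(buf):
        tag, length = struct.unpack_from(HEADER, buf, pos)
        pos += HEADER_SIZE
        # The length is known before any of the value is read, so a reader
        # can reject, skip, or pre-allocate for the record up front.
        if pos + length > len(buf):
            raise ValueError("truncated record")
        yield tag, buf[pos:pos + length]
        pos += length


stream = tlv_encode(TAG_NAME, b"OP") + tlv_encode(TAG_BLOB, b"\x00" * 10)
for tag, value in tlv_decode(stream):
    print(tag, len(value))
```

The contrast with XML is in the decode loop: the reader learns the size of each value from the header, whereas an XML parser has to scan forward for the closing tag.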

        The only real advantage XML has is that it is (sort of) human readable. Raw TLV formatted documents are a bit opaque, but they can be trivially converted into an XML-like format with no loss (and back) without giving software parsers headaches. There are buckets of irony in the fact that the deficiencies of XML are being fixed by essentially converting it to ASN.1 style formats so that machines can parse them with maximum efficiency. Yet another case of computer science history repeating itself. XML is not useful for much more than a presentation layer, and the fact that it is often treated as far more is ridiculous.
        • The only real advantage XML has is that it is (sort of) human readable.

          Actually, it is not. Many people I know, myself included, have trouble looking at XML config files that span more than a few rows. You need a tool that presents the XML document as a tree, so you can collapse some nodes in order to focus on the interesting ones.

  • This year I'll be sending out christmas cards in XML and then placing a large banner outside my house with the appropriate schema.

    Then with every following year, I'll be sending a stylesheet card which they can apply to the original XML.

    And if they need to locate their names on the card, they can use //recipient[@name='mum']
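
For anyone who wants to try the lookup, here is a small sketch using Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. The card markup is invented for the example. Note one assumption: ElementTree does not accept an absolute `//recipient` path on an element, so the equivalent relative form `.//recipient` is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical card document; only the recipient/@name shape comes from the comment.
card = ET.fromstring("""
<card season="christmas">
  <recipient name="mum">Merry Christmas, Mum!</recipient>
  <recipient name="dad">Merry Christmas, Dad!</recipient>
</card>
""")

# Relative form of //recipient[@name='mum']
matches = card.findall(".//recipient[@name='mum']")
for element in matches:
    print(element.text)
```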
  • by elving (133577) on Thursday November 16 2006, @10:07PM (#16879766)
    Strange that an article celebrating XML [w3.org]'s anniversary would neglect to mention XML's creator [tbray.org]. I wonder if the fact he works for a competitor [sun.com] has anything to do with it...
    • by tbray (95102) on Friday November 17 2006, @12:49AM (#16880714) Homepage Journal

      I have to do this once per year or so, here's the 2006 iteration: I am not XML's inventor. There were 150 people in the debating society and 11 people in the voting cabal and 3 co-editors of the spec. Of the core group, I (a) was the loudest mouth, (b) was independent so I didn't have to get PR clearance to talk, and (c) don't mind marketing work.
      -Tim

      • by jlowery (47102) on Friday November 17 2006, @01:39AM (#16880936)
        Al Gore disclaims the same every anniversary of the Internet.
      • you're too modest ;-)

      • Re: (Score:3, Informative)

        In addition, XML was never intended to be an "invention". It was a simplification. Some innovation slipped in, but the vast majority was just debating what aspects of SGML to strip out and how to fix some well-known flaws in it. The innovation primarily was about how to integrate modern standards like URLs and Unicode.
  • Take a look at XML application techniques, and general discussion of the technical, economic and even cultural effects of XML.

    Cultural Effects? This is a spec for structuring data, not a Picasso.
    • Take a look at XML application techniques, and general discussion of the technical, economic and even cultural effects of XML.
      Cultural Effects? This is a spec for structuring data, not a Picasso.

      Philistine. You just don't appreciate abstraction.

      8^)

  • XML Decade? (Score:5, Funny)

    by RealGrouchy (943109) on Thursday November 16 2006, @10:35PM (#16879954)
    Wait... let me figure this one out...

    MCMXC was 1990...
    MDCCCLX was 1860...

    I give up! Which decade was XML?

    - RG>
    • Converting XML to Decimal is 1060. Long time ago ;-)
      • Re: (Score:3, Informative)

        by Anonymous Coward
        Actually, that would be 1040 -- 'X' (10) before 'M' (1000) = 990 + 'L' (50) = 1040
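
The arithmetic above can be checked with a short sketch. It applies the usual subtractive rule (a symbol placed before a larger one is subtracted) without validating the numeral, which is an assumption worth stating: "XML" is not a well-formed Roman numeral under the conventional rules, so this only reproduces the reading used in the comment. The function name is made up for the example.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Sum symbol values, subtracting any symbol that precedes a larger one."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = VALUES[symbol]
        following = VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        total += -value if value < following else value
    return total


print(roman_to_int("XML"))      # 1040: (1000 - 10) + 50
print(roman_to_int("MCMXC"))    # 1990
print(roman_to_int("MDCCCLX"))  # 1860
```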
  • Stuck (Score:2, Insightful)

    So we're officially stuck with this crap forever.

    Yay! Let's party!

    XML is for data interchange, nothing else. Unfortunately, it's being used for everything but.
    • Re: (Score:2, Insightful)

      XML is for data interchange, nothing else.

      Isn't all data interchanged? From client to server, from blogger to browser, from developer to developer, etc. Any data which is not interchanged is either useless or forgotten. And XML has shown its strength in all these areas: Ease of human and computer parsing.

  • Apple replacing the perfectly fine, hand editable plist format with an XML version. ick.
  • a decade of ... (Score:3, Insightful)

    by The Pim (140414) on Thursday November 16 2006, @11:36PM (#16880358)
    vague semantics, confusing specifications, unwarranted complexity, standards proliferation, poor tools, and wildly inappropriate application. Not to mention rampant disregard for existing work in nearly every arena it entered. So the essence of XML is this: the problem it solves is not hard, and it does not solve the problem well. [bell-labs.com]
  • Celebrate the XML Decade:

    Bah, it's too late to tell us to celebrate during the decade of XML because that decade is now over!

    Yeah, should have done that; celebrating.
    • Re: (Score:3, Interesting)

      <?xml version="1.0"?>
      <content name="Shameless Self Promotion">

      Good point, though there's a better way to edit binary files.

      For example, I make a product called FileCarver which allows you to create a file format definition (in XML! heh), that describes the format of a binary file, and the program will automatically provide you with a GUI to edit it. Check it out at http://fizzysoft.net/filecarver/

      </content>
    • S-expressions, of the sort used in Common Lisp and Scheme, would have been a good alternative. They're simple, use a minimal number of characters, and are very easy to parse. Hell, most Comp Sci grads have written at least one such parser during their education.

      This article [prescod.net] argues the other side of that point. I'm not sure how convincing it is, but there are at least some benefits to the XML approach. Where the balance falls, I don't know.
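
To give a sense of the "very easy to parse" claim, here is a minimal sketch of an S-expression reader. It is an illustration only, under stated assumptions: atoms are bare tokens, there is no support for quoted strings, comments, or numbers as distinct types, and the `greeting` example data is invented. It does not settle the argument either way, since a real interchange format would also need escaping, encodings, and error reporting.

```python
def parse_sexpr(text):
    """Parse one S-expression into nested Python lists of string atoms."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        if token == "(":
            items = []
            pos += 1
            while pos < len(tokens) and tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            return items, pos + 1  # step past ")"
        if token == ")":
            raise ValueError("unexpected closing parenthesis")
        return token, pos + 1

    expr, pos = read(0)
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return expr


print(parse_sexpr("(greeting (type friendly) (text hello))"))
```

Note that a missing parenthesis is only reported at the end of input, with no hint of which element was left open, which is the trade-off the end-tag discussion earlier in the thread is about.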
    • Re: (Score:2, Informative)

      by Anonymous Coward
      HL7 [hl7.org] is a "standard" for moving patient information from system to system. I call it a "standard" because the 1.x and 2.x versions were largely "advisory", with more MAY than MUST, with a huge amount of wiggle room... I've worked on 4 information exchange projects now, and all of them started from scratch because none of their HL7 "specs" are compatible.

      Supposedly the new version 3 standard (which uses the "modeling approach") will be much more firm with the implementors, which will hopefully mean that ever
    • It flows better this way, and I think it originally referred to sex instead of XML:

      XML is like violence. If it's not solving your problem, just use more.

    • You are trolling, right? Your rant basically consists of few obvious misunderstandings or statements that are factually wrong.

      Seems like a knee-jerk reaction from someone who doesn't understand what XML is and its intended purpose. Seeing the HTML remark was rather amusing though. Way to go to show your ignorance on the subject.

      BTW: XML is not designed to or intended to be a SQL replacement. Only morons would think that, claim that or use it as such.