Tim Bray Says RELAX
twofish writes to tell us that Sun's Tim Bray (co-editor of XML and the XML namespace specifications) has posted a blog entry suggesting RELAX NG be used instead of the W3C XML Schema. From the blog: "W3C XML Schemas (XSD) suck. They are hard to read, hard to write, hard to understand, have interoperability problems, and are unable to describe lots of things you want to do all the time in XML. Schemas based on Relax NG, also known as ISO Standard 19757, are easy to write, easy to read, are backed by a rigorous formalism for interoperability, and can describe immensely more different XML constructs."
Couldn't agree more (Score:5, Insightful)
On the other hand, RELAX NG "just works".
(all IME of course...:)
ant.
I have to agree. (Score:4, Insightful)
Re:Don't do it. (Score:0, Insightful)
Re:it's a rather straightforward observation (Score:2, Insightful)
xml is a b**ch to read
Re:XML Totally Sucks - All of it! (Score:3, Insightful)
For flat data, sure a flat file is fine...for structured/hierarchical data, a flat file is
Re:XML Totally Sucks - All of it! (Score:2, Insightful)
Re:XML Totally Sucks - All of it! (Score:2, Insightful)
While XML may have its places (I've yet to encounter one in the commercial world), passing large amounts of data is not one of them. A good flat file design is a lot more efficient than XML, and short of hardware acceleration I don't see that changing.
I'm currently trying to assist a customer who's changing from one system to another. The current system generates flat files of approx 2 GB every couple of days (billing data); the new system produces files of approx 13 GB. The data in both sets of files results in the exact same bills being produced for the customers.
Needless to say, the extra disk space (yes, we do compress them) and the processing time to parse/compress are such a waste.
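To put a rough shape on the size gap described above, here is a minimal sketch that writes the same records once as a delimited flat file and once as element-per-field XML, then compares byte counts. The record layout and the field names (`account`, `minutes`, `amount`) are invented for illustration and have nothing to do with the poster's actual billing format; the ratio you see depends entirely on how long the tag names are relative to the values.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical billing records; field names and values are made up.
records = [
    {"account": str(1000 + i), "minutes": str(i * 3), "amount": f"{i * 0.25:.2f}"}
    for i in range(1000)
]

def as_flat(rows):
    """Serialize rows as comma-separated lines, one record per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        writer.writerow([row["account"], row["minutes"], row["amount"]])
    return buf.getvalue().encode("utf-8")

def as_xml(rows):
    """Serialize rows as XML with one child element per field."""
    root = ET.Element("bills")
    for row in rows:
        bill = ET.SubElement(root, "bill")
        for name, value in row.items():
            ET.SubElement(bill, name).text = value
    return ET.tostring(root, encoding="utf-8")

flat = as_flat(records)
xml_bytes = as_xml(records)
print(len(flat), len(xml_bytes), round(len(xml_bytes) / len(flat), 1))
```

Shorter tag names or attributes instead of child elements would narrow the gap, at some cost to readability.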
In my mind, XML trades shorter development time / 'portability' (well, so the theory goes) for greater resource usage (CPU/disk), whereas most customers I've dealt with would rather take a little longer to develop and have far fewer resource limitation issues on the production systems. The old method of 'just throw more hardware at it' just doesn't work in the real world anymore.
Great job, now to clean up XML itself (Score:3, Insightful)
Re:XML Totally Sucks - All of it! (Score:5, Insightful)
Yeah, well I have to look at EDI every day. I'd switch to XML in a heartbeat if it were up to me.
You picked some obvious strawmen to shoot down. XML isn't for building gigabyte databases (regardless of whether some people try to use it for that). It's for easily moving data between applications. If you think writing a flat text parser is easy, then you've never had to deal with nested data or escaped characters. Say what you will about XML, but it's nice to have one set standard that deals with all that, even if suboptimally, because I never want to write another ad-hoc parser for as long as I live. Been there, done that, have no desire to bother again.
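The point about nested data and escaped characters can be sketched in a few lines. This is only an illustration using Python's standard `xml.etree.ElementTree`; the `order`/`customer`/`note` structure and the sample text are hypothetical. The same values are pushed through a naive comma-joined line and through XML, and only the XML version comes back intact.

```python
import xml.etree.ElementTree as ET

# Hypothetical values containing the delimiter, quotes and markup characters.
name = "A&B, Inc."
note = 'He said "a < b", then left; see A&B, Inc.'

# Naive flat format: join on commas, split on commas.
naive_fields = ",".join(["42", name, note]).split(",")

# XML: the serializer escapes & and <, and nesting is explicit.
order = ET.Element("order", id="42")
customer = ET.SubElement(order, "customer")
ET.SubElement(customer, "name").text = name
ET.SubElement(order, "note").text = note

wire = ET.tostring(order, encoding="unicode")
parsed = ET.fromstring(wire)

print(len(naive_fields))                    # more than the 3 fields we wrote
print(parsed.find("customer/name").text == name)
print(parsed.find("note").text == note)
```

A proper CSV library with quoting would fix the flat case for one level of data, but the nesting would still need an ad-hoc convention on top.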
XML uses a binary format (Score:5, Insightful)
There are more sophisticated binary standards that are more efficient than ASCII and it wouldn't take a lot of effort to create viewers/editors for them as well. Of course most markup documents would be significantly smaller if tags didn't have to be S-P-E-L-L-E-D O-U-T character by character. Each HTML tag could be encoded in just two bytes with lots of room to spare.
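As a rough sketch of the "two bytes per tag" idea, the toy codec below maps each tag name to a 16-bit code, uses the high bit to mark an end tag, and reserves code 0 for a length-prefixed text run. The tag table, the bit layout and the function names are all assumptions made up for this example, not any existing binary XML or HTML standard, and attributes are not handled at all.

```python
import re
import struct

TAGS = ["html", "head", "title", "body", "p", "b", "i"]  # hypothetical tag table
CODE = {name: i + 1 for i, name in enumerate(TAGS)}      # code 0 is reserved for text
CLOSE = 0x8000                                            # high bit marks an end tag

def encode(markup):
    """Encode attribute-free markup as 2-byte tag codes plus text runs."""
    out = bytearray()
    for close, name, text in re.findall(r"<(/?)(\w+)>|([^<]+)", markup):
        if name:
            out += struct.pack(">H", CODE[name] | (CLOSE if close else 0))
        else:
            data = text.encode("utf-8")  # text runs are limited to 65535 bytes here
            out += struct.pack(">HH", 0, len(data)) + data
    return bytes(out)

def decode(blob):
    """Invert encode(): rebuild the spelled-out markup."""
    parts, pos = [], 0
    while pos < len(blob):
        (code,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        if code == 0:
            (length,) = struct.unpack_from(">H", blob, pos)
            pos += 2
            parts.append(blob[pos:pos + length].decode("utf-8"))
            pos += length
        else:
            slash = "/" if code & CLOSE else ""
            parts.append(f"<{slash}{TAGS[(code & 0x7FFF) - 1]}>")
    return "".join(parts)

doc = "<html><body><p>hi <b>there</b></p></body></html>"
blob = encode(doc)
print(len(doc.encode("utf-8")), len(blob))
```

The catch, as the rest of the thread argues, is that both ends now need the same tag table and a tool that understands the format before anyone can read the file.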
It always fascinates me that we have no problem making customers use a new specialized tool like a browser, but it's taboo to use a non-ASCII tool for development. So we continue to structure our data as if it were going to be processed by a VT100.
Re:One fix to XML I'd like to have... (Score:5, Insightful)
SGML is full of fun little hacks like that, and it was a pain in the ass to read. Omitting the tag name from the end tag makes it impossible to know you have an improperly closed tag til the end of the document, and then you have no idea which tag wasn't closed. And no, that wasn't a theoretical problem either, this became a real problem with giant SGML docs that used all the shortcuts.
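The difference can be shown with a toy well-formedness checker. This is a sketch only, not an SGML parser: it understands bare `<name>`, `</name>` and the anonymous `</>` closer, and the function name `check` and both sample documents are made up. With named end tags the missing `</para>` is caught at the first mismatching closer; with anonymous closers the same mistake only surfaces at the end of input, and the element left on the stack is the outermost one, not the culprit.

```python
import re

def check(markup):
    """Return (offset, message) for the first detectable error, or None if balanced."""
    stack = []
    for m in re.finditer(r"<(/?)(\w*)>", markup):
        closing, name = m.groups()
        if not closing:
            stack.append(name)
        elif not stack:
            return m.start(), "end tag with nothing open"
        else:
            opened = stack.pop()
            # An anonymous closer (empty name) matches whatever is open.
            if name and name != opened:
                return m.start(), f"expected </{opened}>, found </{name}>"
    if stack:
        return len(markup), f"still open at end of input: {stack}"
    return None

# Same mistake in both: the first <para> is never closed.
named = "<doc><sec><para>text</sec><sec><para>more</para></sec></doc>"
short = "<doc><sec><para>text</><sec><para>more</></></>"

print(check(named))
print(check(short))
```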
If you really hate XML's verbosity so much, realize that it was designed for easy reading, not easy writing. I whipped up my own xml mode in emacs and made '</' trigger an "electric-slash" behavior that closes the tag automatically. Not rocket science.
XML nightmare (Score:4, Insightful)
Re:Great job, now to clean up XML itself (Score:4, Insightful)
XML is like Electricity (Score:5, Insightful)
It's good for transmitting information/energy, but it's not good for storing it.
-Don
I call this the LineOfView (as in PoV) Problem (Score:5, Insightful)
The question now is: where do you draw the line of view? Along which line do I take my knife to cut open my n-dimensional structure to unravel it and flatten it out into a 1-dimensional string of characters? This is a problem that is impossible to solve satisfactorily for all possible PoVs or - as I say - Lines of View, or better yet, Horizons to the structure. Will I unravel my DB of books by authors? By issues? By vendors? By publishers or by weight and size?
What I'm getting at is this: mapping n-dimensional stuff to 1-dimensional structures will always suck one way or the other. It's just that with XML we all start agreeing on which way it's supposed to suck. I don't think that changing the schema standard (or worse: introducing additional standards) will actually attack this hard problem. I have a strong suspicion that RELAX NG's relief is illusory and short-term, and that it reintroduces downsides that XML Schema has already tackled with its pesky and strict nature. One of those would be consistency with the view horizon, once chosen, all the way through the given data structure. I don't know for sure - go test and find out - but I do know that universal serialization will always come with downsides, and RELAX NG (or any other schema) won't change that.
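The "line of view" problem can be made concrete with a small sketch: the same set of book records serialized twice, once nested under authors and once nested under publishers. The records, element names and the `serialize` helper are all hypothetical. Both documents carry identical information, yet each one makes one question cheap to answer and the other one awkward.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Hypothetical catalogue rows: (title, author, publisher).
books = [
    ("Book A", "Author X", "Publisher P"),
    ("Book B", "Author X", "Publisher Q"),
    ("Book C", "Author Y", "Publisher P"),
]

def serialize(rows, group_by):
    """Nest the rows under either 'author' or 'publisher' elements."""
    index = {"author": 1, "publisher": 2}[group_by]
    other = "publisher" if group_by == "author" else "author"
    groups = defaultdict(list)
    for row in rows:
        groups[row[index]].append(row)
    root = ET.Element("catalogue")
    for key in sorted(groups):
        node = ET.SubElement(root, group_by, name=key)
        for title, author, publisher in groups[key]:
            value = publisher if group_by == "author" else author
            ET.SubElement(node, "book", {"title": title, other: value})
    return ET.tostring(root, encoding="unicode")

by_author = serialize(books, "author")
by_publisher = serialize(books, "publisher")
print(by_author)
print(by_publisher)
```

Neither layout is wrong; a schema language can describe either one, but it cannot remove the need to pick.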
Re:Maximizing Composability and Relax NG Trivia (Score:2, Insightful)
"This document does not describe any algorithms for transforming a RELAX NG schema into simplified form, nor for determining whether a RELAX NG schema is correct."
From the Jing implementation:
"This version of Jing implements:
* RELAX NG 1.0 Specification,
* RELAX NG Compact Syntax, and
* parts of RELAX NG DTD Compatibility, specifically checking of ID/IDREF/IDREFS."
also from the Jing implementation:
"Jing also has experimental support for schema languages other than RELAX NG; specifically
* W3C XML Schema (based on Xerces-J);
* Schematron;
* Namespace Routing Language."
Implement the same level of functionality in Haskell as is being implemented in Jing, then come back and compare.
Also, the number of lines of code is only one measure. How does the Haskell implementation hold up under heavy loads? How well does it scale?
I personally think Jing tries to do too much, and I think there is definitely a need for a better Java implementation of a RELAX NG validator, but your post (largely dealing with a nonsensical argument about semantics) is rather lazy.
YES YES YES! (Score:1, Insightful)
Shit, you think people are born knowing what an asterisk postfix means? Terseness != Clarity.