Apache Software

Apache Releases Xerces 2.0 XML Parser

GeneOff writes: "Xerces 2.0 is out! You can get it here. This XML parser features full XML Schema conformance and partial support for DOM Level 3. The XNI (Xerces Native Interface) is very cool. Xerces is arguably the most widely deployed Java-based XML parser. Open source projects like those from Apache do a very good job of tracking upcoming standards like XML Schema (no more weird DTDs). Kind of like Linux 2.5 and the USB 2.0 spec."

  • Sweet (Score:4, Insightful)

    by TRoLLaXoR ( 181585 ) <trollaxor@trollaxor.com> on Tuesday February 05, 2002 @02:41PM (#2956854) Homepage
    Our product uses both Xerces and Xalan. Xalan-J just jumped to 2.2, and now Xerces has made a very important leap ahead. There have been a lot of issues with both, and the latest versions fix a lot of them. This is very cool indeed.
  • I'll have to upgrade my pet project right away.
  • Weird DTDs? (Score:3, Informative)

    by Sam Lowry ( 254040 ) on Tuesday February 05, 2002 @05:54PM (#2958316)
    >no more weird DTDs

    Don't know about the others, but when I write a spec, I do it in DTD, then convert to XML Schema and add the missing features.

    From then on, the XML Schema version serves as the format I validate against, while the DTD is the format I look at to refresh my memory of the structure.

    Tastes differ ;-)

    • That's certainly a way to do it. I should mention that DTD validation is always available in Xerces-J 2, but you have to turn it on by setting a feature, just as you do for XML Schema validation.
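As a rough illustration of "setting a feature", here is a minimal sketch that switches validation on through the standard SAX2 feature id `http://xml.org/sax/features/validation` and collects the validity errors. The class name, method name, and the tiny inline DTD are made up for the example, and it goes through JAXP rather than Xerces-specific classes. Xerces documents a separate feature id, `http://apache.org/xml/features/validation/schema`, for XML Schema validation; it is not exercised here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ValidationFeatureSketch {
    // Standard SAX2 feature id; validation stays off unless this is set to true.
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    /** Parses the document with validation on and returns the validity errors reported. */
    static List<String> validate(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setFeature(VALIDATION, true);

        List<String> errors = new ArrayList<>();
        // Validity errors are recoverable, so they arrive via error() instead of an exception.
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical DTD kept in the internal subset so the example is self-contained.
        String dtd = "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>";
        System.out.println(validate(dtd + "<note><to>Sam</to></note>"));
        System.out.println(validate(dtd + "<note><from>Sam</from></note>"));
    }
}
```

Without the `setFeature` call the second document would parse silently, since it is well-formed; the DTD is only enforced once validation is requested.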

      With XML Schema you get a richer array of data types as well as an XML syntax. I think the latter point is important because it can simplify automated schema generation as well as combining schemas in the same document.

      If you think about namespaces, combining data from multiple sources requires a better way of validating each separate namespace.
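To make the "richer array of data types" point concrete, here is a small sketch that checks element content against `xs:positiveInteger`, a constraint a DTD cannot express because it only knows `#PCDATA`. It assumes the `javax.xml.validation` API, which was added to JAXP after this release and is not the Xerces 2.0 interface itself; the schema, class name, and element name are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class SchemaTypeSketch {
    // Hypothetical schema: <quantity> must hold an xs:positiveInteger.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='quantity' type='xs:positiveInteger'/>"
      + "</xs:schema>";

    /** Returns true when the document validates against the schema above. */
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            // With no ErrorHandler installed, a validity error is thrown as a SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<quantity>3</quantity>"));
        System.out.println(isValid("<quantity>three</quantity>"));
    }
}
```

Because the schema is itself XML, the same string could just as well be produced by a program or a stylesheet, which is the schema-generation point made above.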
  • by jkakar ( 259880 ) on Wednesday February 06, 2002 @02:15AM (#2960345)
    It'd be nice if the story content mentioned that it's a new version of Xerces-J, i.e. Java. Our product (shameless plug: Expressus Design Studio [expressus.com]) is built around Xerces-C (C++), which seems to lag a bit behind Xerces-J in terms of features. I believe development on Xerces-C started AFTER Xerces-J, so it makes sense.

    I'd have to agree with a previous poster who mentioned the Apache folks as being good at keeping up with emerging technology. They've also proven quite helpful on the -dev mailing list.

    It'll be interesting to see if the XML Schema handling is close to or as fast as the DTD handling. I know that in our particular application (real-time automated classification) parsing the document takes almost as much time as what we have to do to "learn" where it belongs in a given data set.

    Another thought, partially off-topic, pertains to a previous poster's comments about working from a DTD and then migrating to XML Schema. I have to wonder how much of that is simply habit; I know that I've certainly had to solve problems that DTDs just can't handle. In my mind, even though habit may dictate starting with a DTD, starting with something that clearly will not accomplish the task at hand seems inherently flawed.

    I have a few questions:
    1. Are XML developers ever NOT programmers?
    2. If so, would XML developers be willing to use XML Schema design tools? My own reaction (as a [mostly C++] programmer) is that I'll stick with emacs and do everything by hand, thank you very much. I get the impression that most programmers shun code-generating type products. Of course, last I checked there was a holy war over such ideas... I may have just shot myself in the foot by mentioning emacs. =)
    3. Do programmers view XML (and its friends) as a programming language? I feel a bit ridiculous even asking such a question because I certainly don't view XML as anything similar to C, C++, Java, etc. But then, I have been surprised on more than one occasion in the past.

    Anyway, I ramble... =) Congratulations to the Apache folks on doing a great job with all of their projects I've come across!
    • Well... I would say the ML in XML states the obvious: it is not a programming language.
      But as a meta language it can be used to define programming languages.

      XSL-T for example... that is one weird language, an unholy mix of procedural programming and whatnot...

      Then, all you need is a "runtime" (like Xalan) and you're good to go.

      The chameleon-like capabilities XSL gives XML are the #1 reason (to me at least) XML is so useful.
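For anyone who hasn't seen the stylesheet-plus-runtime combination in action, here is a minimal sketch using the JAXP `javax.xml.transform` API, which Xalan implements. Whichever XSLT processor the JVM resolves will run it, so nothing here is Xalan-specific. The stylesheet, class name, and `<greeting>` element are invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltSketch {
    // Hypothetical stylesheet: turns <greeting>NAME</greeting> into plain text.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/greeting'>"
      + "<xsl:text>Hello, </xsl:text><xsl:value-of select='.'/><xsl:text>!</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the stylesheet over the given XML and returns the text output. */
    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform("<greeting>World</greeting>")); // prints "Hello, World!"
    }
}
```

Swapping in a different stylesheet yields XHTML, WML, or anything else from the same input, which is the chameleon part.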
      • Ummm... eXtensible Markup Language. The meta language (the description of a language) would be DTD, XML Schema, or RELAX/RELAX-ng. After all, you can't just throw any old bunch of tags together and expect something useful to come of it. Only a particular set of tags defined either explicitly or implicitly by a schema can work for a task at hand.

        This is analogous to C being a programming language with BNF being a common meta language for it. Or more appropriately, CORBA's IDL is closer to the concept of a schema than an XML document. XML by itself is analogous to the marshalled data that CORBA uses for its serialization between ORBs.

        Although a particular instance document or class of documents (according to a schema) can be used as a meta language (Mozilla's XBL comes to mind although that blurs the line somewhat), it is primarily a data/content markup language.

      • I assume by your affection for XSL-T that you are a web designer -- transforming XML data to XHTML, WML, etc. for presentation.

        If you haven't already, I would suggest checking out SVG and XSL-FO. As an XML-savvy web developer, I assume you are familiar with client-side DOM programming in a modern web browser.

        Well imagine being able to manipulate graphics in much the same way (think Flash-style graphics without the plugin). Creating pages that are aesthetically pleasing without having all of those GIFs and PNGs flitting about. Imagine being able to embed contextual information in this graphical data so that the visually impaired (or lynx/links users) can still use your site -- "alt" attribute on steroids.

        With regard to XSL-FO, imagine being able to write a stylesheet that represents exactly the appearance of the document. Whereas XHTML is intended for a wide variety of clients where height, width, and color depth are all variable, XSL-FO is perfect for generating PDF (think online software documentation as an atomic unit of data instead of a bunch of distinct pages in a tarball/zipfile). It's perfect for generating PostScript for consistent layout going to a printer. Or RTF, MS Word, or TeX/LaTeX for those who like to use a word processor (or emacs ;-)) to view their content.

        ...just some food for thought. I agree, the "chameleon-like qualities" are nice too. :)
    • Another thought, partially off-topic, pertains to a previous poster's comments about working from a DTD and then migrating to XML Schema. I have to wonder how much of that is simply habit; I know that I've certainly had to solve problems that DTDs just can't handle. In my mind, even though habit may dictate starting with a DTD, starting with something that clearly will not accomplish the task at hand seems inherently flawed.
      Perhaps it's because XML Schema is the most horrible concrete syntax for expressing syntactic grammars and associated data constraints that I have ever seen. DTDs are not so much better, but at least they resemble classic description possibilities from the programming world (i.e., EBNF) more than XML Schemas.

      Anyway, instead of XML Schemas or DTDs I'd go for canonical description possibilities every time. E.g., annotated grammars in EBNF like in ANTLR [antlr.org] with automatic transformation to XML Schemas for usage with XML tools.

      And besides, XML Schemas are not very powerful if one compares them to real syntax description systems (i.e., parser specs) and real data constraint systems (i.e., formal spec languages, be they axiomatic or algebraic).

      • While I have my issues with XML Schema, it was never intended to replace or compete with something like ANTLR or BNF. The latter were meant to describe an entire language down to the nitty gritty. XML Schema is only intended to describe the format of a document within the confines of well-formed XML syntax.

        Might I suggest taking a look at RELAX NG [oasis-open.org].
        It has the much simpler syntax for which you may be looking. But it is still a well-formed XML document and can therefore be parsed with the same tools as the instance documents.

        Don't forget that a primary reason XML Schema came to be was that it is a pain in the ass to make both an XML parser and a DTD parser (let alone the validation logic). I've done it. While not as hard as many other things out there, it's made a lot easier with having unified parsing code. And it allows for one schema syntax to be converted to another with an XSL-T stylesheet. Or for conversion to a group of schemas (think one stylesheet for generating the database schema and others to make your programming header files). Need to make some changes? A stylesheet is easier to update than classic source code especially when that source code requires a compiler which may not be handy on the box in question.

        And let's not forget documentation! Another stylesheet to convert to XHTML (or whatever).
  • Now if only Xerces and Xalan would really work together. Sure, there are ways to instantiate a Xalan object with a Xerces Document, but that's no real integration. I would love to just use my previously parsed Xerces document fragment for Xalan's Document object. There is a bunch of redundancy that needs to come out before these two projects in a similar arena can really play together. It would really make working with XML a breeze.
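For reference, the hand-off described above looks roughly like this when done through JAXP: the document is parsed once into a DOM tree and then wrapped in a `DOMSource` for the transformer. This is only a sketch of the generic JAXP route with a made-up stylesheet and class name; it says nothing about how Xalan represents the tree internally, which is where the redundancy complaint applies.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomReuseSketch {
    // Hypothetical stylesheet: replaces <list> with a <count> of its <item> children.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/list'>"
      + "<count><xsl:value-of select='count(item)'/></count>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Parses the XML once; the resulting tree can be reused by later steps. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // XSLT processing expects a namespace-aware DOM
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Feeds the already-parsed tree to the transformer without re-serializing it. */
    static String countItems(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        DOMResult result = new DOMResult();
        transformer.transform(new DOMSource(doc), result);
        Document output = (Document) result.getNode();
        return output.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<list><item/><item/></list>");
        System.out.println(countItems(doc)); // prints "2"
    }
}
```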
