W3C launches Binary XML Packaging
Spy der Mann writes "Remember the recent discussion on Binary XML? Well, there's news. The W3C just released the specs for XML-binary optimized packaging (XOP). In summary, they take binary data out of the XML, and put it in a separate section using MIME-Multipart. You can read the press release and the testimonials from MS, IBM and BEA."
Testimonial: Dancin Santa (Score:5, Funny)
That's when I found Binary XML. They were able to help with the debt. They got the creditors off my back and got me back on my feet.
Thanks Binary XML!
(I thought this was going to be about a standardization of compressing XML files that got rid of the excess bloat in the markup.)
Comment removed (Score:5, Insightful)
Re:More bloat! (Score:5, Insightful)
So did I. Then I looked at that example [w3.org] and my heart sank. What the hell! 12 lines of bloated crap text turned into 46+ lines of worse bloated crap!
The examples given in the article haven't included the binary data for brevity. The problem that exists now is that binary data has to be encoded into a form compatible with the charset of the document, which usually means base64. This increases the size of the binary data considerably (by roughly a third), and also requires CPU cycles to encode it.
Being able to send the binary data in a separate MIME payload means it doesn't need to be encoded in this manner, which is a big help for any reasonably sized binary resources. It also means they become first-class MIME objects and can have associated headers, which provides additional benefits.
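To put a rough number on the base64 overhead described above, here is a minimal Python sketch. The 300,000-byte random payload is a made-up stand-in for an image, not data from the spec or the article.

```python
import base64
import os

# Hypothetical payload: 300,000 random bytes standing in for an image.
payload = os.urandom(300_000)
encoded = base64.b64encode(payload)

# Base64 maps every 3 input bytes to 4 output characters.
overhead = len(encoded) / len(payload)
print(f"raw: {len(payload)} bytes, base64: {len(encoded)} bytes, ratio: {overhead:.3f}")
```

The ratio comes out at 4/3 before any line wrapping some encoders add, so the growth is about a third rather than a doubling.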
Re:More bloat! (Score:2)
What the hell is wrong with just gzipping it?
It's just another encoding that happens to be source-language agnostic and provide redundancy elimination.
You have no problem with the overhead of parsing binary XML, but dictionary lookups and tree rotations involved in decoding a compressed file.. that's out of the question?
Not to mention the added benefit that a standard compression layer shrinks not just the tags, but the content as well.
Look, stop thinking of gzip (or bzip, or whatever), as a "compression
Re:More bloat! (Score:2)
huh... last time I checked binary was the language the computer natively understood and it didn't need to be parsed or processed in any way by software.
Also, it seems to me that he did have a problem with the parsing of the XML part.
Re:More bloat! (Score:3, Insightful)
Technically, ASCII is binary, too. 'A' is 65, which is 01000001. Binary XML will not do away with parsing. The tags will still be there, the content will still be there. Only the restriction that the tags must be an alphanumeric string will be lifted.
Making things "binary" doesn't magically remove the burden of parsing. You know the binary executables you run? The system loader loads it.. and parses it, and arranges it in memory the way it needs to be a
Re:More bloat! (Score:2, Funny)
And as I suggested above, he did not like the XML tags either, calling them "12 lines of bloated crap" and all.
I feel like I'm talking to a two year old. I don't know what else to say. If you can't comprehend that binary is much faster to parse than XML there's nothing I can do. Oh I give up you're right. I propose to ch
Re:More bloat! (Score:2)
<ASM instruction="JMP"><PARAMETER type="32bitAddress">A3D2</PARAMETER></ASM>
<ASM instruction="NOP"></ASM>
<ASM instruction="ADDA"><PARAMETER type="32bit Integer">D22A</PARAMETER></ASM>
Re:More bloat! (Score:4, Insightful)
If you can't comprehend that binary is much faster to parse than XML there's nothing I can do.
Where is your numerical proof that binary is much faster to parse than text? It is amateurish to just assume this is true. Good parsers are damn fast and can operate in O(n) time.
Of course binary may be faster. I doubt that it will be much faster when compared to a decent parser and when you realise that the binary format should be platform agnostic for word size, endianness and forward and backward compatibility.
For instance, gzip'ed text files can sometimes be much faster to access than uncompressed binary files because it reduces the amount of file IO. e.g. 64 bits of binary to encode the number 1 rather than 8 bits of text.
While compression increases CPU usage, it can lead to an overall win because the disk is so much slower and the CPU might otherwise be idle waiting for it. The same may apply to a slow network link. Unless you measure, it is difficult to know. I've lost count of the number of binary formats I've seen that in a hex dump had vast numbers of zero bytes and were thus highly inefficient. The people who work at a "high level" designing such file formats without checking such simple things are poor programmers. Even when using indexes, saving a single extra random disk/network access can sometimes justify a huge amount of CPU usage.
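The "measure it" point can be tried with a small Python sketch. The data set (the integers 0 to 9999) and the two encodings are assumptions picked for illustration; real formats and real data will behave differently.

```python
import gzip
import struct

# Hypothetical data set: the integers 0..9999.
values = list(range(10_000))

# Fixed-width binary: every value takes 8 bytes, most of them zero padding.
binary = b"".join(struct.pack("<q", v) for v in values)

# Plain text: one decimal number per line.
text = "\n".join(str(v) for v in values).encode("ascii")

# Report raw and gzipped sizes for both encodings.
for name, blob in (("binary", binary), ("text", text)):
    print(f"{name}: {len(blob)} bytes raw, {len(gzip.compress(blob))} bytes gzipped")
```

For this particular data the fixed-width binary form is larger than the text form before compression. Which one wins after gzip, and which is faster end to end, is exactly the kind of thing to measure rather than assume.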
---
Don't be a programmer-bureaucrat; someone who substitutes marketing buzzwords and software bloat for verifiable improvements.
Re:More bloat! (Score:3, Insightful)
Even a horrendously slow XML parser operates in O(n) time.
Re:More bloat! (Score:3, Insightful)
Why are databases fast? Indexes. What do all XML databases do? Store XML internally in a way that machines read much faster, but makes it a pain for humans to update. Indexes. So if you have all these computer programs passing around data,
Re:More bloat! (Score:3, Insightful)
Re:More bloat! (Score:2)
Re:More bloat! (Score:2)
This is about packaging other binary data _within_ XML. RTFA
Well, doh... (Score:2)
1. External link (impractical)
2. XML/Base64 encoding (~450kB)
3. XOP/binary encoding (~300kB)
In that case, your 30+ lines of extra code are completely irrelevant. That being said, I was under the impression that you could do this already by sending your binary data in a "document fragment"
Re:More bloat! (Score:2)
Re:More bloat! (Score:2)
Why isn't XML like the following? It certainly has far fewer characters, plus it is similar to the most popular programming languages:
It's easy to parse, and editing tools need small modifications to handle it.
Re:More bloat! (Score:2)
The extra lines are the additional Mime and XOP packaging.
Re:Testimonial: Dancin Santa (Score:2)
Your life is a country song. For better results, try playing it backwards.
I got my wife back, my car back, my house back, and a full bottle of whiskey at the end!
nothing else to work on? (Score:3, Interesting)
Binary file formats are hard.
Let's use XML because it's easier.
No wait... let's represent that XML in a more efficient binary format.
Ah yeah that's the ticket - the best of both worlds!
Now let me just fire up my code-morphing processor which, through emulation, achieves x86 compatibility with "low" power consumption. Never mind it's slower overall and has worse MIPS/mW than an underclocked x86 - look Ma, we *invented* something!!!!
There are some real technical problems out there... why are people chasing non-problems like XML?
Re:nothing else to work on? (Score:5, Insightful)
Whatever happened to the virtues of simplicity, like a file containing a header record detailing the field names, and rows containing the data in either fixed-length or delimited form? Damn fast to implement, debug, read from and write to. Parsing? What parsing? Read the first line, split it to get your headers, and read 1 line per record.
Ideal for data exchange. Easy to manipulate via javascript on the client. Simple to display and manipulate via the DOM (Document Object Model). Not resource-hungry. Handles both text and binary data. Dirt easy on the server.
I ran a test to compare, and I'm able to select, format, and serve 1000 records this way in less time than 100 records in simple HTML, never mind XML. By doing this, the client can page through, say, 25 records at a time without having to hit the server every few seconds to see the next/prev pages.
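The header-plus-rows format described above can be sketched in a few lines of Python. The tab delimiter and the field names are assumptions for illustration; the post doesn't say which delimiter was used.

```python
# Hypothetical tab-delimited payload: a header line, then one record per line.
raw = "id\tname\tcity\n1\tAda\tLondon\n2\tLinus\tHelsinki\n"

lines = raw.splitlines()

# Read the first line and split it to get the field names.
headers = lines[0].split("\t")

# Every remaining line is one record; pair each value with its header.
records = [dict(zip(headers, line.split("\t"))) for line in lines[1:]]

print(records[1]["name"])
```

This only works as long as no field contains a tab or a newline.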
Re:nothing else to work on? (Score:2, Informative)
Re:nothing else to work on? (Score:2)
Well, I was actually talking about fixed-length records as well (even quicker to manipulate - no complicated parsing involved, random access for r/w, etc). Need some data?
Re:nothing else to work on? (Score:2)
The big advantage to XML is its ability to represent tree-based structures. The amount of fiddling needed to represent self-referential structures is on a par with table-based encoding of tree structures, so that extension is eas
Re:nothing else to work on? (Score:2)
If the issue is one of "Which external data format is better?", I don't think it's fair to answer it on the basis of which one is supported better by existing tools. While that is certainly a factor when implementation expediency is concerned, it is not the only characteristic worthy of optimizing.
The big issues with XML encoding are encoded-object size, and incremental
Re:nothing else to work on? (Score:2)
1) Endianness will probably cause problems when you least want them.
2) Parsing wide-character string data can be a pain.
I mostly think XML is 95% overrated and 5% genuine usefulness, but, in a world of people who have never heard of a big-endian computer despite holding a CS/CE/EE degree, it's a tough call.
You know, I think colleges should start offering 4-year degrees in XML. That way we would be assured of having a few people in the world who actually know how to use i
Re:nothing else to work on? (Score:2)
"
The solution to that is: Standardize on an endianness for binary serialisation and let the computer that doesn't follow the standard do the conversion.
See? That wasn't hard. No need for XML.
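As a rough illustration of fixing the byte order in the format rather than in the machine, here is a Python sketch. The function names and the choice of big-endian 32-bit unsigned integers are assumptions, not part of any standard mentioned in the thread.

```python
import struct

def serialize(value: int) -> bytes:
    # ">" forces big-endian regardless of the host CPU; "I" is an unsigned 32-bit int.
    return struct.pack(">I", value)

def deserialize(data: bytes) -> int:
    # The reader uses the same fixed byte order, so no per-platform negotiation is needed.
    return struct.unpack(">I", data)[0]

wire = serialize(1)
print(wire.hex(), deserialize(wire))
```

A little-endian host pays for the byte swap; a big-endian host does nothing. Either way the bytes on the wire are the same.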
Re:nothing else to work on? (Score:2)
Hehe, look at the standardization processes between companies and you'll see that it's more than hard.
Re:nothing else to work on? (Score:2)
It's true, though, that an ASCII character standard would be far more useful than XML
Re:nothing else to work on? (Score:2)
Re:nothing else to work on? (Score:2)
Re:nothing else to work on? (Score:4, Informative)
Then of course you have the problem that your data wants to be variable length. Then you want to have the delimiter actually in the data, so you have to invent escape codes. Then in some lines you want to allow multiple occurrences of some of the parameters so you put in some basic markup. Then you want to be sure that any data users enter is of the correct format, so you write a verifier. Then you are basically back at XML again.
XML isn't that great. However, taken at face value, it saves time and programming errors, the same way I wouldn't expect to have to write my own doubly-linked list or hash table. Neither is complicated, but my language should come with one pre-written which is safer and faster than one I could knock together.
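The delimiter-in-the-data problem, and the "use the pre-written one" argument, can be sketched with Python's standard csv module. The sample rows are invented for illustration.

```python
import csv
import io

# Hypothetical rows whose fields contain the delimiter, quotes and a newline.
rows = [
    ["id", "comment"],
    ["1", 'He said "hi", then left'],
    ["2", "line one\nline two"],
]

# The library writer adds the quoting and escaping that a hand-rolled join would miss.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
encoded = buf.getvalue()

# The matching reader undoes the escaping, so the data survives the round trip.
decoded = list(csv.reader(io.StringIO(encoded)))
print(decoded == rows)
```

A naive `line.split(",")` would break on the second row, which is the slippery slope the post describes.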
Re:nothing else to work on? (Score:3, Insightful)
Not a big deal. You don't necessarily need embedded escape codes (though they work well) - you can also use overflow buckets like databases have used for, say, 30 years
regexes make this easy to implement.
Not
Re:Noscript (Score:3, Interesting)
It's time to stop thinking of "web sites" and start thinking along the lines of "web apps" - not the old-style form-based "web app", but more along the lines of gmail - heavily client-side-scripted, nice presentation and data manipulation.
What I see is very few pages (or even just 1 page) as the UI, data exchanged between server and client w/o page refreshes (can be done just w. javascript by sticking the data in iframes with a width and height of 0px, and reading/w
Don't need an iframe (Score:2)
Re:Don't need an iframe (Score:2)
Re:Don't need an iframe (Score:2)
Re:Don't need an iframe (Score:2)
Re:Noscript (Score:2)
Where can I download Firefox for PocketPC [or whatever it's called this week] ?
Re:Noscript (Score:2)
it certainly isn't anything like "the primary focus of Minimo to date has been system with ~32-64 MB of RAM, running Linux"
I said where can I download Symbian not where can I look at a project that doesn't even have Symbian on the roadmap.
Re:nothing else to work on? (Score:2)
No, the script won't break when adding or removing a field:
Heck, you can even use a style sheet to determine how it's displayed by having a style for each field type. Don't like how it's displayed - change th
Re:nothing else to work on? (Score:2, Informative)
You, and whoever modded you up as "interesting", are an idiot.
This standard is not about representing XML in binary format.
This standard is about representing binary content in an XML document in binary format.
See, previously, if one wanted to include binary data in an XML file one had to Base64 encode it. This takes space and processor time.
This standard moves the bloated Base64 content into a pure binary MIME object.
Maybe you should have RTFA first, eh?
Mod Parent Up (Score:2)
Re:nothing else to work on? (Score:2)
Except that's not what they're doing at all. They're encoding binary data IN an XML document. They're using a principle similar to how one would go about attaching a file to an email.
Binary XML Lite (Score:4, Funny)
The following data is in binary.
UH)(&T^( @#t79nui**&tb x9#@ $Y*_@$ji[P{O@JIOHXIOU$HIIU#$hiuoHOP$UJ [etc.]
Re:Binary XML Lite (Score:2)
-b
Seems like a great way to package media... (Score:2, Insightful)
Re:Seems like a great way to package media... (Score:2)
How this is different than simply base64 encoding the image inside a tag is yet to be seen. Perhaps because it's a standard?
Re:Seems like a great way to package media... (Score:2, Interesting)
As a software developer (Score:4, Interesting)
While I myself would prefer to write a binary protocol and send the data through a TCP socket I can no longer do that.
When we land big contracts at work that deal in government and health the key thing they need now is interoperability with others. What does this mean? XML. Whether or not you like it, XML is here to stay. It's what everyone is pushing.
Therefore we had to adapt and start using it. Not just for B2B, our rich desktop clients now communicate with the server using XML web services.
The problem we've encountered is sending binary data. Right now we have to encode the data as base64 inside the XML, which uses lots of resources. I will give this a closer look, but it looks particularly good.
Re:As a software developer (Score:3, Insightful)
In your case, ASN.1 is what you should be using, not XML in the first place.
Re:As a software developer (Score:2)
Uhhh... (Score:5, Informative)
Thank you! (Score:4, Informative)
a bit confused (Score:2)
XML:
XOP:
Is this right? So the benefit is just standardizing the binary representation using MIME? But that doesn't make the tags less verbose... so how is
Re:a bit confused (Score:2)
Re:a bit confused (Score:3, Interesting)
XML, being a text format, requires proper text encoding. In particular, XML does not allow most of the codepoints (speaking in Unicode terms) between 0 and 31 (tab and newline excluded). If you use UTF-8, you cannot freely use byte values beyond 127, as those are used for forming higher-value Unicode characters. In addition, the five main XML markup characters (<, >, &, ' and ") can only be used in some places.
So, to make a long story short, you base64 everything. For every three bytes you have, yo
This is NOT binary XML (Score:5, Insightful)
This is simply a way to reference binary data from within an XML document and to have that binary data included in the same payload (using MIME).
Passing binary data in XML is a big problem. Everybody just invents their own method of doing it (although most are just variations on the theme presented here).
There is a need for this specification but it is not ground breaking or even particularly /. newsworthy.
Re:This is NOT binary XML (Score:2)
And you find this less absurd?!?
Re:This is NOT binary XML (Score:2)
Can you put arbitrary bytes into a "CDATA" section? Sure, prolly. But there are people out there who get paid by the hour to cook up new standards for stuff we'd already figured out. You don't want them to have to go out and get real jobs, do you?
Critiques (Score:5, Informative)
First of all, it's completely impossible to stream this format. All the binary chunks have to be read at some point in the future when the actual XML non-opaque content is complete. In a stream, that never happens. (Of course, XML isn't the most stream friendly protocol...you can't validate a stream.)
Secondly, this isn't wonderful for large files either; you're constantly seeking for binary data that can be many megabytes away. We solve this in web pages by having the images be completely separate (binary) files.
Thirdly, it's telling that they used a PNG as a data type. Besides being yet another file format that needs its own custom binary parser (heh, I like PNG, I'm just complaining about it in the XML whinespace), it's big and simple and there's just one there. One of the things I really liked about the various Binary XML formats was the degree to which they expressly typed things like arrays of floating point values or little-endian integers. Converting values between binary and string format is an enormously painful process, one that frankly I'm astonished hasn't received CPU acceleration at this point. Every other Binary XML format has seriously thought about how to efficiently but correctly manage large arrays of such values. XOP just says...heh...you wanna dump a lot of data efficiently? Check your typing at the door. Feel free to bring a buffer-overflow ridden parser in with you if you like, though.
Don't get me wrong, there's a fundamental simplicity to XOP, and I can certainly understand how it's appealing. But it seems to go so massively against what XML represents that I'm not entirely sure XOP encoded content deserves to be compliant with the very regulations that forced XML adoption in the first place: Opaque formats are too expensive to maintain for any amount of time, therefore either self-describe or don't get deployed. A self-describing document that says "All performance-critical content is opaque" seems rather counter to this spirit.
Re:Critiques (Score:2)
Results:
So we're looking at maybe 787K symbols per second on my machine, at 100% CPU. How does this translate to XML parsers? You're right, this is something I should look into.
Headline should read... (Score:5, Funny)
I, for one, welcome our new bandwidth eating plaintext overlords.
Dave
Maybe I forgot to mention... (Score:2)
The main application of this XML-referencing-to-binary-attachments is SOAP, and that means web services.
In other words, you can simplify your God-help-me-XML-handling-and-parsing-code into something maybe 10% simpler. This means leaving the binary stuff OUT OF THE XML PARSER, putting it into the upper levels of processing. Cleaner, faster.
Also, it helps adaptive compression (gzip) by tightening up the textual data - remember web services are about information transfer, not stora
Script Data Structures in place of XML (Score:4, Interesting)
In addition, the server code is written in Perl, so for storing status and configuration information I used serialized Perl data structures, and processing requirements fell dramatically. With serialized script you still have the clear-text editing and inspection capabilities without the speed and space issues. for example instead of It seems like serialized script code, in Perl, Python, or Java, provides the benefits of XML without the headaches.
Re:Script Data Structures in place of XML (Score:2)
Plus, on the server-side, you should be able to write a JavaScript serializer in 30 lines or less. For example, the one used in our project (sitellite.org) is exactly 30 lines of PHP code, and properly handles strings, numbers, booleans, arrays, and objects.
Oh so we store it using binary (Score:2)
Wait, you say this allows xml to reference binary data? I say "href" attribute, bi-atch, look it up.
You say, but no, it allows you to send the binary data along in the same stream / document? Check out multipart/mime. It's been around a long time.
Here's a wild thought. Have the XML file reference its binary resources by relative filenames. Tar the XML file together with the resources. Now pay
Full Circle (Score:2, Insightful)
Obviously someone needs a knock on the head - when you design your application, don't you think about such things as a balance between performance and maintainability first and then implement what is suited better for your specific case? Obviously not! Just a little while ago everyone and their grandmother switched to XML for whatever reason but then they realized:
This is awesome (Score:2)
Why not XBop? (Score:2)
I like it: XMLs strength abstract, not concrete (Score:2, Insightful)
XML has become at least two things since its evolution:
The interesting part of the story is that #2 came first. Since then, the W3C has recommended the Infoset [w3.org] abstract concept.
For the developers out there, think of how often you parse the "angle brackets" yourself. Most everyone these days (yes, I know there are exceptions) uses an API which presents ele
The buzzword importance (Score:3, Funny)
Basically we were promoting an automated trading system and the first question I get is...
"Does it use XML?"
There you have it.
Very exciting but ... (Score:2, Funny)
I hear it's going to introduce 263 special MS tags and nodes and extra layers into the standard that only works on MSWord in Windows XP. It won't validate as XML anymore but who cares. You will use a special version of Front Page to do this.
The files will be a little bigger too, so MSBinaryXML will add approx. 257k thanks to the special proprietary MS extensions, but will have superior functionality compared to other types.
It will be particu
Attachments, Not Binary XML (Score:3, Informative)
XML For Buzzword Based Engineers (Score:2)
XML will come full circle when true binary XML is a w3c standard. People will be using high-level GUIs to generate text-based XML files, which will be converted into binary XML. On the other end, somebody will receive binary XML, convert it
XML 'is' useful, just not this binary XML spec (Score:4, Insightful)
I create data-driven web apps for a living (i.e. data-driven graphics, UI and text via SVG and HTML), and I firmly believe that XML is the way to go for such creations. It offers a hierarchical structure that is excellent for temporarily storing data pulled from a database, which can then be converted to HTML or SVG or some UI markup (XUL, XForms, or your own thing) via XSLT.
I don't really care that XML is human-readable--I like the fact that because it is extremely well structured, it is easy to create with authoring applications as well as easy to manipulate in real time with script (i.e. manipulating its DOM).
I have long wished for a true binary XML spec to make the transmission and parsing/decoding quicker, and this spec isn't it. But I think one day we'll have it, and that won't mean that we've "come full circle" and therefore XML is useless. It just means that we'll have the best of both worlds--speed plus standardized, hierarchical data structures.
MS sez..... (Score:2)
Yea right.... It will be MSMTOM and won't work with anything BUT IE and M$ products. Look at what they did with their version of XML.
Re:Binary... XML... Nah! (Score:4, Insightful)
If you're using an XML file in a place where you need a high performance SQL database then you're doing something wrong. If you're using XML as data storage for some small webapp, who cares so long as it's fast enough for that particular application.
Re:Binary... XML... Nah! (Score:3, Insightful)
As you point out, it is the wrong tool for the job, much like using tables to layout HTML pages (as the CSS religionists like to point out).
My 64 million dollar question is why they put an acronym inside another acronym: XOP stands for XMLOP? WTF??!!
They REA
Re:Binary... XML... Nah! (Score:2)
GNU = GNU's Not Unix (and many other recursive acronyms)
To name a few
Re:Binary... XML... Nah! (Score:2)
Like I say, I'm sure there are some people who think the old "it's an acronym" joke is a real knee-slapper. But it's kind of a shame that the people w
Re:Binary... XML... Nah! (Score:2)
XPath
XSL
XLink
XHTML
XForms
XPointer
It makes sense to put the X in front of tech that uses XML. CSS doesn't need an X because it works on SGML too for instance
Re:Binary... XML... Nah! (Score:2)
Never mind the question why people say "acronym" when they mean "abbreviation"...
Re:Binary... XML... Nah! (Score:2)
Be fair...(and I am a Firefox guy) NO [ right...NO] browser fully conforms to CSS standards. I am a CSS relig
Re:Binary... XML... Nah! (Score:2)
Which kinda makes one question whether having such so-called "standards" is really worth all the trouble.
Amazingly, HTML compatibility was easier before it was "standards" this and "standards" that. There were certain constructs that only worked in certain browsers, sure, but we didn't have the god-awful mess of supposed-tos and should-nots that we have today.
It seems to me, from a distant perspective, that the problem with Web standards isn't that
I'd have to agree with you on this... ;-( (Score:4, Interesting)
Contrast this to IEEE standards -- they get developed when a bunch of companies are ready to invest several mega$$ for a chip spin -- and they just want to choose the best course, arguing with each other about technical merit of this or that approach. And in the whole HT|X/ML world there can be (almost) no competition on technical merits, just a bunch of guys arguing if it should be or BAR
I wish I'd have the time on my hands and their budgets to actually try something revolutionary. Like the original WWW, which was NOT designed by a committee...
Paul B.
Re:I'd have to agree with you on this... ;-( (Score:2)
Paul B.
Re:Binary... XML... Nah! (Score:4, Insightful)
Are you *sure* about that ?
<blink >
<marquee >
<object >
<bgsound >
No-one forces you to validate your html (unless you work for me =). Where I come from, it's conformance first, compatibility second.
So, You're Against Innovation? [pantos.org]
A common misconception is that folks who advocate HTML validation are retro-thinking, "backwater unix geeks" who stubbornly oppose innovation. It's true that many advocates of HTML validation are indeed seasoned computer professionals, who have learned the hard way that portability and compatibility are key elements to ensuring the longevity of any software product (including Web pages).
Re:Binary... XML... Nah! (Score:2)
Re:Binary... XML... Nah! (Score:2)
Amazingly, HTML compatibility was easier before it was "standards" this and "standards" that.
If you're referring to the days of Netscape 3 and IE 1, you must have a very bad memory or never have been engaged in making a webpage in those days. The XHTML standard, for instance, is really easy to understand, and as long as you use simple CSS, you'll get the same result in a lot of browsers. Yes, IE has misunderstood some of the CSS specification, and the CSS2 layers model is far too advanced for today's browsers.
Re:Binary... XML... Nah! (Score:2)
So much for the "standard," huh?
Look, I guess I was unclear or something. We have these massive and baffling standards, standards that overlap, standards that contradict, standards that no mere mortal could possibly understand. Nobody conforms to them. Your solution? "Just use very simple CSS." Um. Bad solution. Bad! Go lay down!
Better solution: Take a moment to wonder if, maybe, it's the standards themselves that suck. If nob
Obligatory ... (Score:2)
Re:Binary... XML... Nah! (Score:2, Insightful)
You mention SVG, but then you fail to see the benefit of reducing the size of, say, a large SVG file in a standards compliant way so that it can be transferred and take up less bandwidth. A good binary standard will DEFINITELY be smaller than the verbosity that is XML. Sure you can compress it, but when you compress a whole bunch of unneeded crap, you still have a whole bunch of unneeded crap... just compressed. If this standard reduces the amount of space it takes to
RTFA - Re:Binary... XML... Nah! (Score:3, Informative)
Re:Binary... XML... Nah! (Score:2)
I doubt a binary version of the XML format would see any popular use, but a decent compression system for XML would be a nice bonus, just something to put alongside the other compression standards commonly seen in HTTP 1.1's Accept-Encoding header.
Re:Binary... XML... Nah! (Score:3, Interesting)
Seriously, you guys need to re-read the article again.
The problem with XML binary payloads now is that you find out that you have a large chunk of payload too late in the game and can't avoid it
Re:Binary... XML... Nah! (Score:2)
Surely you must remember the Free iPods guy, that one wasn't so long ago.
I can't remember any other specific ones because they fade into insignificance.
Re:This business will get out of control. (Score:2)
Using an XML representation for method parametrization provides a common interchange format when linking between different implementation languages, and, obviously, across network boundaries. Furthermore, the natural XML specification of method signatures makes possible the compilation of said method signatures by different compilers.
Of course, the right thing to do is to compil
Re:I am enlightened (Score:2)
The main reason it is defined th
Re:base64? (Score:4, Informative)
And you're right, most people don't want to include huge binary stuff in their XML. But sometimes you DO need to combine XML with huge amounts of binary data. So far, the alternatives have been non-standard wrappers (including people doing more or less what this standard does, by using MIME multipart documents) or base64 or some other space wasting encoding inside the XML document, or wrapping everything in an archival format (like OpenOffice does, for instance).
All this does is define a standard way of letting you keep a document and associated raw binary data together, while allowing you to treat it as if it is inlined in the XML if you so choose.
The principles are exactly the same as for sending an HTML e-mail containing images (or other data) as attachments and referring to them with URLs of the format "cid:foo" (they refer to the MIME element with the matching "Content-ID: foo" header).
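The cid-reference idea described above can be sketched in Python as follows. This is a simplified illustration of the wire layout only, not the actual XOP format: the `build_package` helper, the boundary string, the `href` attribute on the sample element and the stand-in image bytes are all assumptions made up for the example.

```python
import base64

def build_package(xml: bytes, attachments: dict[str, bytes],
                  boundary: str = "example-boundary") -> bytes:
    """Assemble a simplified multipart body: the XML part first, then raw binary parts."""
    delimiter = b"--" + boundary.encode("ascii")
    parts = [b"Content-Type: application/xml\r\n\r\n" + xml]
    for cid, data in attachments.items():
        headers = (
            "Content-Type: application/octet-stream\r\n"
            "Content-Transfer-Encoding: binary\r\n"
            f"Content-ID: <{cid}>\r\n\r\n"
        )
        # The payload is appended as raw octets; there is no base64 step.
        parts.append(headers.encode("ascii") + data)
    body = b"".join(delimiter + b"\r\n" + part + b"\r\n" for part in parts)
    return body + delimiter + b"--\r\n"

# Hypothetical stand-in for image data (a real boundary must not occur in the data).
image = bytes(range(256)) * 100
xml = b'<doc><photo href="cid:foo"/></doc>'

package = build_package(xml, {"foo": image})
inline = b"<doc><photo>" + base64.b64encode(image) + b"</photo></doc>"
print(f"multipart package: {len(package)} bytes, base64 inline: {len(inline)} bytes")
```

The fixed header cost of the multipart wrapper is small next to the one-third growth from base64, so the package wins once the binary part is more than a few hundred bytes.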