Slashdot
Opera CTO Hits Back at Microsoft's Standards Push
Posted by
Zonk
on Sat Feb 24, 2007 01:22 AM
from the going-to-make-the-nightly-news dept.
Michael writes "Opera CTO Håkon Wium Lie hit back today at Microsoft's push to fast track Office Open XML into an ISO standard, in a blistering article on CNET. He also took a swipe at Open Document Format: 'I'm no fan of either specification. Both are basically memory dumps with angle brackets around them. If forced to choose one, I'd pick the 700-page specification (ODF) over the 6,000-page specification (OOXML). But I think there is a better way.' The better way being the existing universally understood standards of HTML and CSS. Putting this to the test, Håkon has published a book using HTML and CSS."
Related Stories
IT: Microsoft Wins Industry Standard Status for Office 281 comments
everphilski writes "The International Herald-Tribune reports that Microsoft has won industry standard status for Office. ECMA International, a group of hardware and software makers based in Geneva, approved the MS file formats with only one dissenting vote - IBM. IBM backs the OpenDocument standard, which was approved by the ISO in May of this year." From the article: "Bob Sutor, IBM's vice president for open source and standards, called Microsoft's Office formats technically unwieldy - requiring software developers to absorb 6,000 pages of specifications, compared with 700 pages for OpenDocument. 'The practical effect is the only people who are going to be in a position to implement Microsoft's specifications are Microsoft,' Sutor said."
Microsoft Blasts IBM Over XML Standards 323 comments
carlmenezes writes "Ars Technica has up an article discussing Microsoft's latest salvo against IBM. Microsoft's open letter to IBM adds fresh ammunition to the battle of words between those who support Microsoft's Open XML and OpenOffice.org's OpenDocument file formats. Microsoft has strong words for IBM, which it accuses of deliberately trying to sabotage Microsoft's attempt to get Open XML certified as a standard by the ECMA. In the letter, general managers Tom Robertson and Jean Paoli write: 'When ODF was under consideration, Microsoft made no effort to slow down the process because we recognized customers' interest in the standardization of document formats.' In contrast, the authors charge that IBM 'led a global campaign' urging that governments and other organizations demand that the International Organization for Standardization (ISO) reject Open XML outright."
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Yes. Well... (Score:3, Funny)
Ok... Cheese, anyone?
fsck'n ugly (Score:5, Insightful)
Re:fsck'n ugly (Score:5, Insightful)
You don't typeset with Microsoft Word, either. Which makes the entire argument specious. Word processors like MS Word and OOo Writer are for creating common documents like letters, memos, and maybe the occasional flyer. Neither one is particularly good at anything even close to professional publishing work. Even the book authors just use Word (or surprisingly, OOo Writer!) to do the text content. That text is then exported to a more sophisticated program, where the actual typesetting and page layouts are done.
I think this fellow's point is that HTML/CSS formats can store any information that a word processor might need to store, with no need to invoke new technologies. To a certain extent, he may be correct. Unfortunately, HTML/CSS may make a good intermediary format, but it is not particularly good from a performance or usability perspective. Then again, XML formats in general are fairly poor choices for the same reason.
I think if we want to break this conundrum, the industry is going to have to learn how to keep local data stores that are of high performance, while exporting intermediary formats when emailing or uploading to external computers. The only problem is finding a way of doing this so that it's completely transparent to users. The mythical "mom" doesn't want to worry about emailing a document in the right format, or having the right program to read the attachment she received. She just wants it to do what she tells it, with no bloody prompting with questions she has no answers for.
Re: (Score:3, Informative)
-too many versions of html (4, and perhaps 5 soon) and xhtml (1.0, 1.1, strict, transitional, etc)
-different versions of CSS, browser support for it varies quite a bit (and is pretty much non-existent for CSS3)
-too many render
Re:fsck'n ugly (Score:5, Informative)
This would have to be done by the tool displaying it, same as a self-updating TOC in a Word or OpenOffice Writer document. The information is present in a correctly-structured HTML document in the form of Hx tags.
Hell, how can you even tell the page numbers in a html "document" anyways?
The same way you would in a Word document. It doesn't make sense if you're looking at it as a web page in your browser, but if your editor used HTML it would work the same way. (This also partially alleviates the rendering issues.)
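To make the point about Hx tags concrete, here is a minimal sketch of how a tool could derive a table of contents from a correctly structured HTML document. It uses only Python's standard `html.parser`; the names `HeadingCollector`, `build_toc`, and `SAMPLE` are made up for illustration and do not come from any particular editor.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1..h6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []   # finished (level, text) pairs
        self._level = None   # level of the heading we are inside, if any
        self._parts = []     # text fragments of the current heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None


def build_toc(html):
    """Return one indented line per heading, in document order."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return ["  " * (level - 1) + text for level, text in collector.headings]


# Hypothetical sample document.
SAMPLE = """
<h1>Introduction</h1><p>Some text.</p>
<h2>Background</h2><p>More text.</p>
<h1>Conclusion</h1>
"""

print("\n".join(build_toc(SAMPLE)))
```

Page numbers are deliberately left out: as the comment says, they only exist once a specific tool has laid the document out.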
Re:fsck'n ugly (Score:5, Insightful)
Re:fsck'n ugly (Score:5, Insightful)
- position an image on page 4 of my document?
- add footnotes?
- embed fields (date, last editor...)?
- mark the embedded TOC as TOC so that it gets regenerated on reload?
etc.
And on the CSS side, there are quite a lot of shortcomings, too.
Of course, all of this would work with custom XML tags or special id/class conventions, BUT then you'd have to specify those. And getting this below 700 pages won't be easy.
So repeat after me:
HTML is *not* a description language suitable for word processing in its current state, and it is unclear it can be made so without sacrificing device independence.
Re:fsck'n ugly (Score:5, Interesting)
I just dug out the template I wrote, and the pagination and ToC worked fine in Safari. The auto-numbering of headers, however, didn't. This is due to a lack of support for counters in generated content, and the same problem with Mozilla was a significant reason for abandoning the whole idea in the first place; the only browser everything worked in was Opera.
Another significant reason for abandoning this idea (not entirely relevant when talking about document formats being generated by tools) was that HTML is a huge pain to type, and XHTML is even worse. Something semantically equivalent to XHTML but using S-expressions would have been fine, but typing XHTML just involves spending far too much time hitting > and < keys (not to mention the redundancy of close tags having the full tag name). I turned to LaTeX, which is easier to type and also (being a Turing-complete programming language) much easier to extend than HTML.
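For readers unfamiliar with what "counters in generated content" buy you, the sketch below mimics the numbering rules in plain Python: each heading increments the counter for its own level and resets the deeper ones. The function name `number_headings` and the sample data are assumptions for illustration; this is not the poster's template, and a browser with counter support would do the equivalent work from the stylesheet.

```python
def number_headings(headings):
    """Number (level, title) pairs the way nested counters would.

    `headings` is a list of (level, title) tuples with levels starting at 1.
    """
    counters = []   # counters[i] holds the current number at level i + 1
    numbered = []
    for level, title in headings:
        # Make sure a counter exists for every level up to this one.
        while len(counters) < level:
            counters.append(0)
        # A new heading discards all deeper counters (the "reset" step).
        del counters[level:]
        # Then it bumps its own counter (the "increment" step).
        counters[level - 1] += 1
        label = ".".join(str(n) for n in counters)
        numbered.append(f"{label} {title}")
    return numbered


# Hypothetical outline.
outline = [(1, "Intro"), (2, "Scope"), (2, "Terms"), (1, "Design"), (2, "Layout")]
for line in number_headings(outline):
    print(line)
```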
Re:fsck'n ugly (Score:5, Informative)
- position an image on page 4 of my document?
You don't, nor do you want to. But you can anchor, float or bind the images to the text easily enough. This would be handled by CSS... for the HTML side, it would just be div and object tags --- not that you would ever see them, since this is a word-processing app.
<p class="footnote">My footnote</p> with the appropriate CSS rule (presumably something like float: page or whatever.)
Using XML entities, presumably
Regenerated on reload? Come on, have some ambition.. it should be in sync at all times. Anyway, by keeping track of the header tags, presumably.
XHTML+CSS would need some expansions... but probably not much. A good layout program probably doesn't care about the device, but if it did, there are already @media rules to handle these situations. There are also a couple of other truly dedicated layout namespaces on w3 to consider.
But all this matters not. This is politics. Sadly.
Re:fsck'n ugly (Score:4, Insightful)
Footnotes are easy, too: Text Text that needs a footnote.<div class="footnote">This is the footnote</div>. That's the same concept as in LaTeX, the best typesetting software out there.
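As a rough illustration of how a tool might treat inline footnote markup like that, here is a sketch that pulls every `<div class="footnote">` out of the running text, leaves a numbered marker behind, and returns the notes separately. It assumes footnotes are not nested and relies only on Python's standard `html.parser`; `FootnoteExtractor` and `split_footnotes` are hypothetical names.

```python
from html.parser import HTMLParser


class FootnoteExtractor(HTMLParser):
    """Separate running text from <div class="footnote"> contents."""

    def __init__(self):
        super().__init__()
        self.body = []          # main text fragments, with [n] markers
        self.notes = []         # footnote texts, in document order
        self._in_note = False
        self._note_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "div" and ("class", "footnote") in attrs:
            self._in_note = True
            self._note_parts = []

    def handle_data(self, data):
        if self._in_note:
            self._note_parts.append(data)
        else:
            self.body.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self._in_note:
            self.notes.append("".join(self._note_parts).strip())
            self.body.append(f"[{len(self.notes)}]")
            self._in_note = False


def split_footnotes(html):
    """Return (text_with_markers, list_of_footnotes)."""
    extractor = FootnoteExtractor()
    extractor.feed(html)
    extractor.close()
    return "".join(extractor.body), extractor.notes


text, notes = split_footnotes(
    'Text that needs a footnote.<div class="footnote">This is the footnote</div> More text.'
)
print(text)
print(notes)
```

Where the notes end up on the page is still a layout decision, which is the part the thread is arguing about.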
Re: (Score:3, Funny)
Re: (Score:3, Insightful)
The M in HTML stands for MARKUP. And it means it. HTML is NOT a layout language.
Re: (Score:3, Interesting)
I think that's why he says HTML/CSS. HTML ta
Re:fsck'n ugly (Score:4, Insightful)
No offense, but I'm getting sick of this line of reasoning. You're right, mom wants the computer to read her thoughts, know exactly what she really meant when she said X, anticipate every need she might have, and pre-calculate its complexity out of existence.
In other news, my boss would like this entire website built in one hour ($40), never need support, and scale to 300,000 users.
At a certain point IT's job goes from "give every user what he/she wants" to "educate users about what is feasible in the current technological situation."
Classic quote for the books, gotta love XML play (Score:5, Insightful)
Is it mature enough? (Score:4, Interesting)
I'm a LaTeX junkie. LaTeX, though, is a PITA to create templates and styles for. Someone willing to take up the task to modernize LaTeX or completely replace it?
Re:Is it mature enough? (Score:5, Informative)
Re: (Score:3, Interesting)
For me, I write my user manuals [for my FL/OSS projects] in LaTeX because the layout is much better, and the process much simpler than wrestling with a word processor.
Why anyone writes books in anything else is beyond me.
My first book [math text] that was published was all LaTeX, and while it wasn't all super simple the vast majority of the layout and setting work was handled by TeX itself. My second b
Re: (Score:3, Informative)
Why anyone writes books in anything else is beyond me.
I couldn't agree more. I am currently writing a book, and I can't imagine how people use tools like Word. It has a lot of technical content, particularly code snippets. With LaTeX, I can easily insert a few lines from a code file, and have it automatically syntax highlighted. I never have to worry about copy-and-paste errors, since the source code is included directly from the source files, which I can compile and test.
I can also define short commands like \code{} for inline code snippets (e.g. variab
Re: (Score:3, Insightful)
Re:Is it mature enough? (Score:4, Informative)
Tables shouldn't be used for page layout -- that's what CSS is for. It's as simple as that.
Re:Is it mature enough? (Score:4, Funny)
huh? (Score:5, Funny)
Uhm. I'm no expert, but isn't a book that uses HTML and CSS called a website?
Re:huh? (Score:5, Informative)
Re: (Score:3, Insightful)
If they spelled everything out without any ambiguity it would make a better standard.. but then it would be another "600 page long" standard which is what he seems to be against in the first place.
CSS for Documents? (Score:5, Insightful)
Having a word processor act more like a web browser would be awesome. Ever since I started using word processors (which for me was a long time after I started using web browsers), I've always thought, why doesn't updating this style make all text with that style update? Why do I always have to change the same thing over and over again?
While turning word processors into web browsers would be stupid, things like CSS would be awesome to have in word processors.
Re:CSS for Documents? (Score:4, Informative)
Such things exist. TeX provides a decent base for such things, so it's a matter of finding a TeX-centric editor. LyX would be a good example, and indeed it has the sort of functionality and general approach to document creation that you seem to be after. Of course it doesn't necessarily have all the other features that other word processors might have (like mail merge or what have you).
Re: (Score:3, Insightful)
Lyx provides a GUI front end, but you lose a lot of flexibility.
Texmacs might work for you as well, although I found it very clunky.
Re:CSS for Documents? (Score:4, Interesting)
I am finding myself wishing that OpenOffice had pursued putting a vastly better interface on TeX and LaTeX, rather than writing their own standard. It would probably have been faster and certainly would have been a lot more stable. Microsoft couldn't have even thought about it: its clean, open standards would not have lent themselves to the proprietary "extend" part of Microsoft's "embrace and extend" approach, or Microsoft's software licensing models.
Re: (Score:3, Informative)
But it is tricky to use for any language other than English. Out of the box, it's English or nothing. Other European languages are complicated; more complex languages like Arabic, Hindi, or Chinese require some very involved hacks indeed.
Really? All of my LaTeX files are UTF-8, and most include some non-English characters. I tend to use the raw Unicode, rather than the LaTeX sequences, because they are easier to type on a Mac. I'm not using a custom version of LaTeX, although I vaguely remember having to include a package that told LaTeX to use UTF-8. Things like Greek letters and accents just work. I've not tried Arabic, Hindi or Kanji, however.
Re:CSS for Documents? (Score:4, Insightful)
Every word processor I've seen like forever has support for styles. The problem is:
1) It's impossible to avoid creating a million new styles by accident. Try looking at the styles list and you'll see it's full of junk
2) It's impossible to clean up a document with such a bunch of styles, for example say you have a document which has been completely fucked up with pseudo-styles. You've set "Normal" to be what the bulk text should be, and "Headings" to what they should be. What happened last time I tried it? Well, it was impossible to easily apply it without killing any bullet lists, bold, italics or any other intended variation of the normal text. Headers and numbering went berserk. Trying to do the same with the bullet list style led to numbers going completely nutzoid; for some reason it thought everything in the same style belonged to the same list, so later lists would start at some random number.
3) If for some reason you are stuck copying between different versions of Word (Norwegian and English come to mind) then you'll have double the number of styles, which obviously aren't in sync.
So to sum it up what I would like:
1) Don't auto-create styles
2) This sentence does not contain three styles
3) Sane "apply style" functions
- Particularly directed at fixing a mess
4) Make styles have an ID, at least for the default ones make them international so header 1 is header 1 in every language
5) Ability to "style-lock" documents for things like company standards, you can create new styles but not just randomly change around sizes and fonts
6) More visible styles (OpenOffice does this, MS Word doesn't) because people don't see them
Re:CSS for Documents? (Score:4, Insightful)
Word DOS (version 4 at least) had it back almost 20 years ago. And actually it was much easier to use styles back in the DOS version. Current versions try so hard to second guess you in the quest for user-friendliness and layering features on top of features that you can change or create new styles without knowing or intending to. Old-school required you to RTFA, but then you could use styles very efficiently. Now styles are much more sophisticated, but hardly anyone uses them correctly. I get documents from all kinds of people, including many university lecturers. None, out of hundreds over the last 15 years, has had a clue of how to style their documents. Headings are "Normal" with font commands to make them large; body text is "Heading 1" converted to 12-point Times; bulleted and numbered lists are a minefield, tables are a quagmire of hacks, spaces and tabs, etc...
I don't know that I agree completely (Score:5, Insightful)
Sure, it works, with enough tweaking, and CSS3, and a $350 download of a product to turn HTML/CSS3 into a PDF. This is better how? What about LyX, LaTeX, or even OpenOffice if you are just going to convert to PDF?
The whole HTML/CSS-to-print thing shoots the real argument in the foot.
Re:I don't know that I agree completely (Score:4, Insightful)
Re: (Score:3, Informative)
PostScript (and PDF) have the Adobe problem, but there is a better format that doesn't: DVI (the device independent format created by Donald Knuth).
The DVI format doesn't even have the capability to include bitmap images. LaTeX cheats and uses the comment section to point to an external encapsulated postscript file. dvips will read this and include the EPS, and so will some DVI viewers but this can lead to all sorts of hard-to-track-down bugs. I ditched latex for pdflatex a while ago, and haven't looked back.
How come? (Score:5, Funny)
Re: (Score:3, Insightful)
My speculation would be that no-one wants to sit and read a 6,000 page specification. 700 pages is far more palatable.
It's a crap way of judging the relative merits of specifications, but human nature will out.
Re:How come? (Score:5, Informative)
Since nobody gets it, I'll spoil it: That's how Håkon advises people to pronounce his name. It's even on his business card.
Re: (Score:3, Insightful)
Well, the basic parser isn't really an issue. I haven't investigated either standard in any detail, but assuming they're actual XML, or even reasonably close, there are a million libraries that can handle the parsing. Expat, Xerces, Arabica, the Qt XML parser, and the Java library XML parser come to mind.
The majority of the work is interpreting the tags and actually laying out the document in a standardized way.
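To show how little of the job the parser covers, here is a sketch using Python's standard `xml.etree.ElementTree`. The fragment is a heavily simplified stand-in that borrows the OpenDocument text namespace with `text:h` and `text:p` elements; the `doc` wrapper element and the `summarize` helper are made up for this example, and a real ODF file wraps its content in far more structure.

```python
import xml.etree.ElementTree as ET

# OpenDocument text namespace; the surrounding <doc> element is a made-up wrapper.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

SAMPLE = f"""<doc xmlns:text="{TEXT_NS}">
  <text:h text:outline-level="1">Chapter One</text:h>
  <text:p>First paragraph.</text:p>
  <text:p>Second paragraph.</text:p>
</doc>"""


def summarize(xml_source):
    """Return (kind, outline_level, text) for each heading and paragraph."""
    root = ET.fromstring(xml_source)
    items = []
    for element in root.iter():
        if element.tag == f"{{{TEXT_NS}}}h":
            level = element.get(f"{{{TEXT_NS}}}outline-level")
            items.append(("heading", int(level), element.text))
        elif element.tag == f"{{{TEXT_NS}}}p":
            items.append(("paragraph", None, element.text))
    return items


for item in summarize(SAMPLE):
    print(item)
```

Getting from this list to pages with line breaks, fonts, and floats is the expensive part, and no parsing library does it for you.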
Can != Should (Score:3, Insightful)
HTML + CSS vs. Word vs. OO.o seems to me to be an argument related to formatting documents, not a "book". It's not that you couldn't do it, but I'd consider using Quark or InDesign (what seems to be Adobe's successor to PageMaker) or even TeX and its variants (haven't used any TeX-based stuff, but heard wonderful things) for typesetting.
Arguments about standards aside, proof of concepts aside, I'd think that the real issue when it comes to any job is using the best tool for it. It's not a question of whether you can use these tools to typeset a book, but if you should.
The point of the proof of concept is to prove that the system is flexible or capable enough to go beyond its original intended use. I get that. But proving a chainsaw can be used to spread butter, doesn't mean it's inherently superior to a coping saw.
- Greg
to kill a mockingstandard (Score:3, Insightful)
fonts (Score:3, Informative)
Too true (Score:4, Insightful)
I wrote my thesis book this way (Score:3, Informative)
http://software-libre.rudd-o.com/ [rudd-o.com]
Used MediaWiki to write the chapters, wrote a small Python proggie (available there) to consolidate the wiki into a single HTML file (mostly conforming to the Boom! microformat), then used Prince and Håkon's book CSS to generate the PDF.
Great typesetting, collaborative book editing, screw LaTeX!
Håkon was right.
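The poster's actual script lives at the link above and is not reproduced here. Purely as an illustration of the consolidation step, a sketch along these lines would join already-rendered chapter bodies into one HTML file that a formatter could then turn into a PDF. The function name `consolidate`, the `chapter` class, and the `chapter-N` ids are all assumptions, not the Boom! microformat or the poster's code.

```python
from html import escape


def consolidate(title, chapters):
    """Join (chapter_title, html_body) pairs into one HTML document string."""
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>',
        "<body>",
        f"<h1>{escape(title)}</h1>",
    ]
    for number, (chapter_title, body) in enumerate(chapters, start=1):
        # Each chapter gets its own wrapper so a stylesheet can start it on a new page.
        parts.append(f'<div class="chapter" id="chapter-{number}">')
        parts.append(f"<h2>{escape(chapter_title)}</h2>")
        parts.append(body)   # body is assumed to be trusted, already-rendered HTML
        parts.append("</div>")
    parts.append("</body>")
    parts.append("</html>")
    return "\n".join(parts)


# Hypothetical input.
book = consolidate("My Thesis", [("Intro", "<p>Hi</p>"), ("R & D", "<p>x</p>")])
print(book)
```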
Re: (Score:3, Insightful)
Those who don't understand LaTeX are doomed to reinvent it... poorly.
Open Office Heresy (sold here) (Score:5, Interesting)
That said, ODF kind of blows. Really.
I write novel-length "books" and it is FREAKING IMPOSSIBLE to do some very basic things in any/every ODF based word processor I have tried to date.
Exercise for the Interested:
Make a "Book" with an automatic table of contents, said table to contain an "Authors Note", "Prologue", auto-numbered chapters 1 to N with their associated chapter titles (where the actual chapter number is the chapter number internal variable), and finally "Epilogue" all at the same level of the index.
This simple task is essentially impossible. The flaw is caused by the fact that everything goes through the "styles" and the styles don't inherit their list membership properties. You should be able to make a style "TOC Entry" that is assigned to a particular table of contents level (e.g. level 1) then make a sub-style "Chapter Heading" based on "TOC Entry" but with the chapter numbering magic attached, and in so doing, create "different styles" that go to the same level/point in the list.
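The behaviour being asked for can be sketched in a few lines. This is a toy model of the wished-for inheritance, not how ODF or any word processor actually stores styles; the `Style` class, its fields, and the `resolve` method are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Style:
    """A toy style whose unset properties fall back to its parent style."""

    name: str
    parent: Optional["Style"] = None
    outline_level: Optional[int] = None   # None means "inherit from parent"
    numbered: Optional[bool] = None       # None means "inherit from parent"

    def resolve(self, attribute):
        """Walk up the parent chain until the attribute has a value."""
        style = self
        while style is not None:
            value = getattr(style, attribute)
            if value is not None:
                return value
            style = style.parent
        return None


# "TOC Entry" owns the table-of-contents level; "Chapter Heading" only adds numbering.
toc_entry = Style("TOC Entry", outline_level=1, numbered=False)
chapter_heading = Style("Chapter Heading", parent=toc_entry, numbered=True)

print(chapter_heading.resolve("outline_level"))
print(chapter_heading.resolve("numbered"))
```

With this model, "Prologue" styled as "TOC Entry" and chapters styled as "Chapter Heading" would land at the same level of the contents, which is the outcome the exercise describes.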
Exercise for the Interested:
Make a "Book" with each chapter, and the prolog, and the epilog in separate sub documents. The linkage thing is a mess, it is hard to move "the pile of files" around especially if you want to use subdirectories (etc). If you have a custom style in the master document style list you have to _USE_ it in the master document if you want it to be pushed into the created sub-documents. Once the sub-documents are created it is a royal pain (read effectively impossible, or "supremely hidden feature required") to update those styles in those sub documents if you change that style.
Exercise for the Interested:
Put three separate "outlines" into one ODF Document. In ODF the outline is a function of the style headers, they only exist as implications of structure instead of first class abstractions. This is largely the fault of Microsoft Word, since the Word folks totally messed this up when they supplanted WordPerfect (which did this inset outline/object sort of thing right).
ODF was, IMHO, poisoned by the slavish attempt by someone trying to make a Word killer instead of a "good word processor."
And there are stacks more of these issues.
And all that said, I *STILL* use ODF (Open Office etc) because I CATEGORICALLY REFUSE to _RENT_ the right to access my own work from a third party. Microsoft has plainly stated that such rental model is their intended business plan, which makes them a non-starter.
In my opinion, having used both Word and OpenOffice for years, and having used WordPerfect and WordStar before them, ODF is a "workmanlike effort" to create a document format suitable for "normal business purposes". There is a reason that the legal profession never moved over to Word, and they likewise will not move to ODF: when you need to get to a tightly prescribed document format, both Word and ODF have a "you can't get there from here" fundamental limitation. Both formats simply refuse to represent some things because the designers "know" that a different format is better. Neither ODF nor Word has any allowances for _art_, professional or poetical.
So, governments should use ODF because it is "no worse" than Word in terms of the ability to represent the documents it can represent, and given that congruence, the shorter, 100% open standard is, or should be, a hard minimum requirement.
In terms of ODF being the be-all and end-all of document representation, I'd have to say "hardly!" I looked into the OpenOffice code base a while back to see if adding/changing the format to allow for "a book" would be reasonable. It didn't appear to be. Too many of the original StarOffice assumptions about document structure seemed pathologically uninspired. It was like looking at a big pile of Visual Basic. Everything in the standard is way too global, nothing "nests organically" it all nests pedagogically. (Every
You're using the wrong tool. (Score:4, Insightful)
Um... NO (Score:5, Informative)
Validation is relevant (Score:3, Insightful)
But if the Oasis pages did validate, the basic argument goes like this: "How can they claim to care about standards if they can't even bother to support that most universal standard of standards, HTML?" And indeed, I could still make that argument -- just look at the sad, sad state of affairs that is Internet Explorer's CSS [mis]handling.
Re: (Score:3, Informative)
An example of the HTMLDOC specific code used in the conversion