Opera CTO Hits Back at Microsoft's Standards Push

Posted by Zonk on Sat Feb 24, 2007 01:22 AM
from the going-to-make-the-nightly-news dept.
Michael writes "Opera CTO Håkon Wium Lie hit back today at Microsoft's push to fast track Office Open XML into an ISO standard, in a blistering article on CNET. He also took a swipe at Open Document Format: 'I'm no fan of either specification. Both are basically memory dumps with angle brackets around them. If forced to choose one, I'd pick the 700-page specification (ODF) over the 6,000-page specification (OOXML). But I think there is a better way.' The better way being the existing universally understood standards of HTML and CSS. Putting this to the test, Håkon has published a book using HTML and CSS."

Related Stories

[+] IT: Microsoft Wins Industry Standard Status for Office 281 comments
everphilski writes "The International Herald-Tribune reports that Microsoft has won industry standard status for Office. Ecma International, a group of hardware and software makers based in Geneva, approved the MS file formats with only one dissenting vote - IBM. IBM backs the OpenDocument standard, which was approved by the ISO in May of this year." From the article: "Bob Sutor, IBM's vice president for open source and standards, called Microsoft's Office formats technically unwieldy - requiring software developers to absorb 6,000 pages of specifications, compared with 700 pages for OpenDocument. 'The practical effect is the only people who are going to be in a position to implement Microsoft's specifications are Microsoft,' Sutor said."
[+] Microsoft Blasts IBM Over XML Standards 323 comments
carlmenezes writes "Ars Technica has up an article discussing Microsoft's latest salvo against IBM. Microsoft's open letter to IBM adds fresh ammunition to the battle of words between those who support Microsoft's Open XML and OpenOffice.org's OpenDocument file formats. Microsoft has strong words for IBM, which it accuses of deliberately trying to sabotage Microsoft's attempt to get Open XML certified as a standard by the ECMA. In the letter, general managers Tom Robertson and Jean Paoli write: 'When ODF was under consideration, Microsoft made no effort to slow down the process because we recognized customers' interest in the standardization of document formats.' In contrast, the authors charge that IBM 'led a global campaign' urging that governments and other organizations demand that International Standards Organization (ISO) reject Open XML outright."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by Frosty Piss (770223) on Saturday February 24 2007, @01:33AM (#18132136)

    Opera CTO Håkon Wium Lie sort of "bitch slapped" a picture of Bill Gates and splashed some white wine around...

    Ok... Cheese, anyone?

  • fsck'n ugly (Score:5, Insightful)

    by Anonymous Coward on Saturday February 24 2007, @01:37AM (#18132152)
    Yeah, but that "book" is fsck'n ugly. It doesn't even compare to a professionally typeset book, or something produced in LaTeX. I hope that isn't the "solution" to this standards "problem". Let's face it, the average Joe is going to use whatever Microsoft pushes at them. Case closed.
    • Re:fsck'n ugly (Score:5, Insightful)

      by AKAImBatman (238306) * <akaimbatman@gm a i l . com> on Saturday February 24 2007, @02:02AM (#18132278) Homepage Journal

      Yeah, but that "book" is fsck'n ugly. It doesn't even compare to a professionally typeset book, or something produced in LaTeX.

      You don't typeset with Microsoft Word, either. Which makes the entire argument specious. Word processors like MS Word and OOo Writer are for creating common documents like letters, memos, and maybe the occasional flyer. Neither one is particularly good at anything even close to professional publishing work. Even the book authors just use Word (or surprisingly, OOo Writer!) to do the text content. That text is then exported to a more sophisticated program, where the actual typesetting and page layouts are done.

      I think this fellow's point is that HTML/CSS formats can store any information that a Word Processor might need to store, with no need to invoke new technologies. To a certain extent, he may be correct. Unfortunately, HTML/CSS may make a good intermediary format, but it is not particularly good from a performance or usability perspective. Then again, XML formats in general are fairly poor choices for the same reason.

      I think if we want to break this conundrum, the industry is going to have to learn how to keep local data stores that are of high performance, while exporting intermediary formats when emailing or uploading to external computers. The only problem is finding a way of doing this so that it's completely transparent to users. The mythical "mom" doesn't want to worry about emailing a document in the right format, or having the right program to read the attachment she received. She just wants it to do what she tells it, with no bloody prompting with questions she has no answers for.
      • Re: (Score:3, Informative)

        by Anonymous Coward
        You're entirely right. Word/OOo aren't used for pro typesetting and page layout. But if we exclude that, then we still have many, many other formats, like RTF too (or why not even BBCode while we're at it?). Yes it's quite ugly, but I don't see (x)html + css as being the answer either:

        -too many versions of html (4, and perhaps 5 soon) and xhtml (1.0, 1.1, strict, transitional, etc)
        -different versions of CSS, browser support for it varies quite a bit (and is pretty much non-existent for CSS3)
        -too many render
        • Re:fsck'n ugly (Score:5, Informative)

          by EvanED (569694) <evaned&gmail,com> on Saturday February 24 2007, @03:10AM (#18132490)
          html/css sucks at MANY things - how about a self-updating TOC? (don't even try to say some javascript parsing the DOM for header tags with certain IDs to generate it dynamically!)

          This would have to be done by the tool displaying it, same as a self-updating TOC in a Word or OpenOffice Writer document. The information is present in a correctly-structured HTML document in the form of Hx tags.

          Hell, how can you even tell the page numbers in a html "document" anyways?

          The same way you would in a Word document. It doesn't make sense if you're looking at it as a web page in your browser, but if your editor used HTML it would work the same way. (This also partially alleviates the rendering issues.)
        • Re:fsck'n ugly (Score:5, Insightful)

          by Anonymous Coward on Saturday February 24 2007, @03:26AM (#18132554)

          I don't see (x)html + css as being the answer either:
          Only because you can't tell the difference between "XHTML + CSS" and "web pages".

          -too many versions of html (4, and perhaps 5 soon) and xhtml (1.0, 1.1, strict, transitional, etc)
          So? Pick one as your word-processor standard, and rule all the others out. The existence of too many versions of MS Word doesn't seem to have hurt the .doc format.

          -different versions of CSS, browser support for it varies quite a bit (and is pretty much non-existent for CSS3)
          What does browser support have to do with word processing? We're talking about word processors, not web sites.

          -too many rendering engines, css hacks required so the content displays the same in most of them, etc
          And this is different from word processors how? Microsoft's XML format is absolutely crammed full of hacks to duplicate obscure rendering features of obsolete versions of Word, WordPerfect, etc. And it would surprise me very much if the rendering of ODF was pixel-identical between all the products that support it.

          -html/css sucks at MANY things - how about a self-updating TOC? (don't even try to say some javascript parsing the DOM for header tags with certain IDs to generate it dynamically!)
          You're thinking of web pages, not HTML. HTML used for a document could easily have an auto-generated table of contents. Remember that we're talking about using HTML as the file format for a word processor. A word processor can trivially parse the DOM for header tags and update a table of contents without requiring any JavaScript at all. It's kind of what word processors are for.

          Hell, how can you even tell the page numbers in a html "document" anyways?
          By looking at the little "Page N of N" display in your word processor, I would assume.

          -while word/OOo formats aren't real typesetting (like InDesign CS2 would do), at least they have half-way decent typography. Yeah, no fancy glyphs or super precise kerning, but it's still usable. On the web there's only a handful of "just OK" fonts one can use (unless everything is rendered server-side as images).
          What does "on the web" have to do with word processors? We're not talking about the web here. We're talking about word processors, which will have access to all the fonts the user owns, just like any other application.

          -if people use html/css, there would basically be no standards *at all* or anything even resembling it (much like anything we see on the web).
          Why not? We're talking about word processors, not the web. We're talking about computer-generated HTML, not something some 13-year-old hacked together by copying-and-pasting examples into Notepad. It would be trivial to enforce valid XHTML 1.1 + CSS2.1, for example.
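
As a rough illustration of what "enforcing" a fixed document subset might look like inside a word processor, the Python sketch below checks that a file is well-formed XML and uses only an allowed set of elements. The whitelist is a hypothetical stand-in, not the XHTML 1.1 DTD, and the function name `check_document` is invented for this example; real validation would run against the actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist; a real tool would validate against the XHTML 1.1 DTD.
ALLOWED_TAGS = {"html", "head", "title", "meta", "body",
                "h1", "h2", "h3", "h4", "p", "div", "span", "img", "ul", "ol", "li"}

def check_document(source: str) -> list[str]:
    """Return a list of problems; an empty list means the document passed."""
    try:
        root = ET.fromstring(source)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = []
    for element in root.iter():
        # Strip an XML namespace prefix such as {http://www.w3.org/1999/xhtml}.
        tag = element.tag.rsplit("}", 1)[-1]
        if tag not in ALLOWED_TAGS:
            problems.append(f"element <{tag}> is not allowed")
    return problems

good = "<html><head><title>Memo</title></head><body><h1>Memo</h1><p>Hi</p></body></html>"
bad = "<html><body><p>unclosed</body></html>"          # mismatched tags
tag_soup = "<html><body><font>Hi</font></body></html>"  # element outside the subset

for sample in (good, bad, tag_soup):
    print(check_document(sample))
```

A generator that only ever writes documents passing such a check never has to cope with the hand-written tag soup the parent comment worries about.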
          • Re:fsck'n ugly (Score:5, Insightful)

            by Lost my low ID nick (1035980) on Saturday February 24 2007, @06:24AM (#18133080)
            So, McSmarty, how do I
              - position an image on page 4 of my document?
              - add footnotes?
              - embed fields (date, last editor...)?
              - mark the embedded TOC as TOC so that it gets regenerated on reload?
            etc.

            And on the CSS side, there are quite a lot of shortcomings, too.

            Of course, all of this would work with custom XML tags or special id/class conventions, BUT then you'd have to specify those. And getting this below 700 pages won't be easy.

            So repeat after me:

            HTML is *not* a description language suitable for word processing in its current state, and it is unclear it can be made so without sacrificing device independence.
            • Re:fsck'n ugly (Score:5, Interesting)

              by TheRaven64 (641858) on Saturday February 24 2007, @08:07AM (#18133464) Homepage Journal
              I had a little go at using HTML for this kind of thing a few years ago. One thing that you might not be aware of is that CSS has a few things related to pagination. While you can't say 'put this image on page 4,' you can say 'if you need to put a page break in, put it before or after this div, so that this text and this image are on the same page.' For the table of contents, I wrote some ECMAScript that scanned the DOM tree for h1-4s and built a set of nested lists to display it, with links to the real headings. It didn't print the page number because, although this is possible with CSS it wasn't implemented in any browsers when I tried it. The embedded fields are already supported by meta tags in the document head. Footnotes, however, are a tremendous pain to get right with HTML.
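
The heading scan described above can be sketched in a few lines. This is not the commenter's ECMAScript; it is a Python approximation using the standard `html.parser` module, and it produces an indented plain-text outline rather than nested lists with links. The class and function names are made up for the example.

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1-h4 elements in document order."""
    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "h4"):
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None

def build_toc(html: str) -> str:
    """Return an indented plain-text table of contents."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return "\n".join("  " * (level - 1) + text for level, text in collector.headings)

sample = "<h1>Intro</h1><p>x</p><h2>Scope</h2><h2>Terms</h2><h1>Design</h1>"
print(build_toc(sample))
```

An editor could rerun this whenever a heading changes, which is the "always in sync" behaviour other replies ask for.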

              I just dug out the template I wrote, and the pagination and ToC worked fine in Safari. The auto-numbering of headers, however, didn't. This is due to a lack of support for counters in generated content, and the same problem with Mozilla was a significant reason for abandoning the whole idea in the first place; the only browser everything worked in was Opera.
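
For readers unfamiliar with counters in generated content: CSS 2.1 provides `counter-reset` and `counter-increment`, which together yield hierarchical numbers such as 2.1 for headings. The Python below is only a sketch of that numbering logic, not how any browser implements it, and `number_headings` is a name invented here.

```python
def number_headings(levels: list[int]) -> list[str]:
    """Map heading levels (1 = h1, 2 = h2, ...) to labels like '2.1'."""
    counters: list[int] = []
    labels = []
    for level in levels:
        # Deeper counters reset whenever a shallower heading appears.
        del counters[level:]
        # A skipped level (h1 straight to h3) starts the missing counters at 0.
        while len(counters) < level:
            counters.append(0)
        counters[level - 1] += 1
        labels.append(".".join(str(n) for n in counters))
    return labels

print(number_headings([1, 2, 2, 1, 2, 3]))
```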

              Another significant reason for abandoning this idea (not entirely relevant when talking about document formats being generated by tools) was that HTML is a huge pain to type, and XHTML is even worse. Something semantically equivalent to XHTML but using S-expressions would have been fine, but typing XHTML just involves spending far too much time hitting > and < keys (not to mention the redundancy of close tags having the full tag name). I turned to LaTeX, which is easier to type and also (being a Turing-complete programming language) much easier to extend than HTML.

            • Re:fsck'n ugly (Score:5, Informative)

              by EsbenMoseHansen (731150) on Saturday February 24 2007, @08:26AM (#18133512) Homepage

              So, McSmarty, how do I
              - position an image on page 4 of my document?

              You don't, nor do you want to. But you can anchor, float or bind the images to the text easily enough. This would be handled by CSS... for the HTML side, it would just be div and object tags --- not that you would ever see them, since this is a word-processing app.

              - add footnotes?

              <p class="footnote">My footnote</p> with the appropriate CSS rule (presumably something like float: page or whatever.)

              - embed fields (date, last editor...)?

              Using XML entities, presumably

              - mark the embedded TOC as TOC so that it gets regenerated on reload?

              Regenerated on reload? Come on, have some ambition.. it should be in sync at all times. Anyway, by keeping track of the header tags, presumably.

              HTML is *not* a description language suitable for word processing in its current state, and it is unclear it can be made so without sacrificing device independence.

              XHTML+CSS would need some expansions... but probably not much. A good layout program probably doesn't care about the device, but if it did, there are already @media rules to handle these situations. There are also a couple of other truly dedicated layout namespaces on w3 to consider.

              But all this matters not. This is politics. Sadly.

            • Re:fsck'n ugly (Score:4, Insightful)

              by TheoMurpse (729043) <kylegoetz AT gmail DOT com> on Saturday February 24 2007, @11:53AM (#18134556) Homepage

              So, McSmarty, how do I
                  - position an image on page 4 of my document?
                  - add footnotes?
                  - embed fields (date, last editor...)?
                  - mark the embedded TOC as TOC so that it gets regenerated on reload?
              I'm on your side in this debate, but as a web dev I have knowledge of these things which you apparently do not. To embed a field, how about <meta name="author" content="TheoMurpse">. As for marking the embedded TOC, how about <div id="TOC">? For positioning an image on page 4, well, I don't know if you've ever looked at a DOC or ODT file, but the file itself says nothing about where page 3 ends and page 4 begins. Instead, you see that once the word processor has rendered the file. Thus, I see no difference between HTML and any other format. Hell, I don't even know if you can say "put this on page 4" in a LaTeX document. First of all, you'd never want to put it on page 4. Instead, you'd want to put it in between other elements, which may end up placing it on page 4, but then when you update your text on page 3, it may cause the image to need to be on page 5.
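
To show how a tool might read such fields back, here is a small Python sketch using the standard `html.parser` module. The `author` field comes from the example above; the `date` field and the names `MetaFields` and `read_fields` are assumptions made for illustration.

```python
from html.parser import HTMLParser

class MetaFields(HTMLParser):
    """Collect <meta name="..." content="..."> pairs into a dict."""
    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            if name is not None:
                self.fields[name] = attributes.get("content") or ""

def read_fields(html: str) -> dict:
    """Return the document's meta fields keyed by name."""
    parser = MetaFields()
    parser.feed(html)
    parser.close()
    return parser.fields

doc = ('<head><meta name="author" content="TheoMurpse">'
       '<meta name="date" content="2007-02-24"></head>')
print(read_fields(doc))
```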

              Footnotes are easy, too: Text Text that needs a footnote.<div class="footnote">This is the footnote</div>. That's the same concept as in LaTeX, the best typesetting software out there.
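
What a renderer would then do with that markup can be sketched as follows: pull each `<div class="footnote">` out of the running text, leave a numbered marker behind, and list the notes at the end. This Python version is a simplification under stated assumptions (plain-text output, no page boundaries, invented names `FootnoteExtractor` and `render`), not a description of any real word processor.

```python
from html.parser import HTMLParser

class FootnoteExtractor(HTMLParser):
    """Replace <div class="footnote"> blocks with [n] markers and collect their text."""
    def __init__(self):
        super().__init__()
        self.body = []    # running text with [n] markers
        self.notes = []   # footnote texts, in order
        self._depth = 0   # > 0 while inside a footnote div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            self._depth += 1   # nested div inside a footnote
        elif "footnote" in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self.notes.append("")
            self.body.append(f"[{len(self.notes)}]")

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.notes[-1] += data
        else:
            self.body.append(data)

def render(html: str) -> str:
    """Return the body text followed by the numbered footnotes."""
    parser = FootnoteExtractor()
    parser.feed(html)
    parser.close()
    lines = ["".join(parser.body)]
    lines += [f"[{i}] {note.strip()}" for i, note in enumerate(parser.notes, 1)]
    return "\n".join(lines)

sample = 'Text that needs a footnote.<div class="footnote">This is the footnote</div> More text.'
print(render(sample))
```

Placing each note at the bottom of the right page is the hard part, and it belongs to the layout engine rather than the file format.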
        • Looks to me like Opera has only one tool: a hammer (or is that a web browser?)
          Actually, I think it's a high-pitched voice capable of shattering glass.
      • Re: (Score:3, Insightful)

        ***I think this fellow's point is that HTML/CSS formats can store any information that a Word Processor might need to store, with no need to invoke new technologies. To a certain extent, he may be correct. Unfortunately, HTML/CSS may make a good intermediary format, but it is not particularly good from a performance or usability perspective. Then again, XML formats in general are fairly poor choices for the same reason.***

        The M in HTML stands for MARKUP. And it means it. HTML is NOT a layout language.

        • Re: (Score:3, Interesting)

          The M in HTML stands for MARKUP. And it means it. HTML is NOT a layout language. Never has been, and apparently never will be despite unending attempts to use it for page layout. In fact, HTML documents look different in every browser -- which is not, I think, a characteristic that most users are going to desire for a large subset of documentation. How, for example, can you specify an OCRable form, if the rendering program is free to move the damn boxes around?

          I think that's why he says HTML/CSS. HTML ta

      • Re:fsck'n ugly (Score:4, Insightful)

        by Mateo_LeFou (859634) on Saturday February 24 2007, @09:36AM (#18133786) Homepage
        "The mythical "mom" doesn't want to worry about emailing a document in the right format, or having the right program to read the attachment she received. She just wants it to do what she tells it, with no bloody prompting with questions"

        No offense, but I'm getting sick of this line of reasoning. You're right, mom wants the computer to read her thoughts, know exactly what she really meant when she said X, anticipate every need she might have, and pre-calculate its complexity out of existence.

        In other news, my boss would like this entire website built in one hour ($40), never need support, and scale to 300,000 users.

        At a certain point IT's job goes from "give every user what he/she wants" to "educate users about what is feasible in the current technological situation."
  • by Tablizer (95088) on Saturday February 24 2007, @01:38AM (#18132154) Homepage Journal
    "Both are basically memory dumps with angle brackets around them."
  • Is it mature enough? (Score:4, Interesting)

    by Goalie_Ca (584234) on Saturday February 24 2007, @01:38AM (#18132160)
    HTML and CSS are quite capable of rendering and displaying webpages. What happens with a simple thing like a file header showing page number and author name? Footers with footnotes? How about dealing with table of contents etc.? How would a page in a document be broken down? Anyone who's tried to print HTML knows there are many issues with layout. What's sad though is that even HTML and CSS are not supported the same in all browsers.

    I'm a latex junkie. Latex though is a PITA to create templates and styles for. Someone willing to take up the task to modernize latex or completely replace it?
    • by willy_me (212994) on Saturday February 24 2007, @01:45AM (#18132194)

      I'm a latex junkie. Latex though is a PITA to create templates and styles for. Someone willing to take up the task to modernize latex or completely replace it?
      Done. It's called ConTeXt [pragma-ade.com].
    • Re: (Score:3, Interesting)

      The trick to using LaTeX safely is automation. The less TeX twiddling you have to do manually the better.

      For me, I write my user manuals [for my FL/OSS projects] in LaTeX because the layout is much better, and the process much simpler than wrestling with a word processor.

      Why anyone writes books in anything else is beyond me.

      My first book [math text] that was published was all LaTeX, and while it wasn't all super simple the vast majority of the layout and setting work was handled by TeX itself. My second b
      • Re: (Score:3, Informative)

        Why anyone writes books in anything else is beyond me.

        I couldn't agree more. I am currently writing a book, and I can't imagine how people use tools like Word. It has a lot of technical content, particularly code snippets. With LaTeX, I can easily insert a few lines from a code file, and have it automatically syntax highlighted. I never have to worry about copy-and-paste errors, since the source code is included directly from the source files, which I can compile and test.
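
The comment does not say which LaTeX package does the including, so the following is only a Python sketch of the underlying idea: the document stores a file name and a line range, and the snippet is read from the tested source file at build time instead of being pasted in. The helper `include_lines` and the throwaway file are hypothetical.

```python
from pathlib import Path
import tempfile

def include_lines(path: Path, first: int, last: int) -> str:
    """Return lines first..last (1-indexed, inclusive) of a source file."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return "\n".join(lines[first - 1:last])

# Demonstration with a throwaway file standing in for real, tested source code.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "example.py"
    source.write_text(
        "import math\n\ndef area(r):\n    return math.pi * r * r\n",
        encoding="utf-8",
    )
    snippet = include_lines(source, 3, 4)

print(snippet)
```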

        I can also define short commands like \code{} for inline code snippets (e.g. variab

    • Re: (Score:3, Insightful)

      Keep in mind this was published by a bigwig at Opera. The Opera web browser tends to stay way ahead of the other browsers in terms of standards compliance. This includes things like the ability to use the page elements to force page breaking and to help create layouts useful for things like books, reports, etc. Opera is a great engine for rendering HTML & CSS, I personally just can't get past the UI.
      • by indiechild (541156) on Saturday February 24 2007, @02:47AM (#18132406)
        Tables are not obsolete. Tables are still used for tabular data, which is what they were originally intended to be used for, and that has not changed.

        Tables shouldn't be used for page layout -- that's what CSS is for. It's as simple as that.
      • by MrNaz (730548) on Saturday February 24 2007, @03:42AM (#18132604) Homepage
        You mean you display tabular data *without* tables? Dude, you missed the point in a big way. Like say for example Andre Agassi was serving a tennis ball at you, by "missed" I mean he was serving the ball on a court in California while you were standing waiting to receive on a court in Florida.
  • huh? (Score:5, Funny)

    by User 956 (568564) on Saturday February 24 2007, @01:40AM (#18132166) Homepage
    Putting this to the test, Håkon has published a book using HTML and CSS.

    Uhm. I'm no expert, but isn't a book that uses HTML and CSS called a website?
    • Re:huh? (Score:5, Informative)

      by 8-bitDesigner (980672) on Saturday February 24 2007, @01:55AM (#18132232) Homepage
      Actually one of the highlights of the CSS spec is support for non-standard display types, such as screen readers, projectors, PDA, and yes, print. CSS is a rather brilliant standard, but since W3C hasn't really seen fit to publish a reference platform for it, there's no real compliance checking in the major browsers.
      • Re: (Score:3, Insightful)

        CSS would be a great standard, but it leaves too much to the people who implement it; is this a block type or inline? What should the default for this nonstandard tag be? etc, etc.

        If they spelled everything out without any ambiguity it would make a better standard.. but then it would be another "600 page long" standard, which is what he seems to be against in the first place.
  • CSS for Documents? (Score:5, Insightful)

    by zaydana (729943) on Saturday February 24 2007, @01:42AM (#18132174)

    Having a word processor act more like a web browser would be awesome. Ever since I started using word processors (which for me was a long time after I started using web browsers), i've always thought, why doesn't updating this style make all text with that style update? Why do I always have to change the same thing over and over again?

    While turning word processors into web browsers would be stupid, things like CSS would be awesome to have in word processors.
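
The CSS-like behaviour being asked for comes down to one level of indirection: paragraphs store a style name, and the style definition lives in one place. A minimal Python sketch, with made-up style names and values:

```python
from dataclasses import dataclass

@dataclass
class Style:
    font: str
    size_pt: int

# Named styles play the role of CSS rules; paragraphs only store a style name.
styles = {"body": Style("Times", 12), "heading": Style("Helvetica", 18)}
paragraphs = [
    ("heading", "Report"),
    ("body", "First paragraph."),
    ("body", "Second paragraph."),
]

def render(paragraphs, styles):
    """Resolve each paragraph's style name at render time."""
    return [f"{styles[name].font} {styles[name].size_pt}pt: {text}"
            for name, text in paragraphs]

before = render(paragraphs, styles)
styles["body"].size_pt = 11          # one edit to the style...
after = render(paragraphs, styles)   # ...changes every paragraph that uses it

print(before)
print(after)
```

Direct formatting applied to a single paragraph bypasses this lookup, which is why it does not follow later style changes.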

    • by Coryoth (254751) on Saturday February 24 2007, @01:56AM (#18132240) Homepage Journal

      Ever since I started using word processors (which for me was a long time after I started using web browsers), i've always thought, why doesn't updating this style make all text with that style update? Why do I always have to change the same thing over and over again?

      Such things exist. TeX provides a decent base for such things, so it's a matter of finding a TeX centric editor. LyX would be a good example, and indeed it has the sort of functionality and general approach to document creation that you seem to be after. Of course it doesn't necessarily have all the other features that other word processors might have (like mail merge or what have you).
    • Re: (Score:3, Insightful)

      LaTeX: it's not that hard to learn.

      Lyx provides a GUI front end, but you lose a lot of flexibility.

      Texmacs might work for you as well, although I found it very clunky.
      • by Antique Geekmeister (740220) on Saturday February 24 2007, @02:50AM (#18132418)
        Indeed: LyX is extremely handy for providing to undergraduates or research assistants whose thesis advisors insist on using TeX or LaTeX, who lack the time to learn yet another language. LyX is the difference between having slightly more elegant .tex files, and getting an hour more of sleep a night when writing your thesis because you can edit in a GUI and don't have to debug your .tex files.

        I am finding myself wishing that OpenOffice had pursued putting a vastly better interface on TeX and LaTeX, rather than writing their own standard. It would probably have been faster and certainly would have been a lot more stable. Microsoft couldn't have even thought about it: its clean, open standards would not have lent themselves to the proprietary "extend" part of Microsoft's "embrace and extend" approach, or Microsoft's software licensing models.
        • Re: (Score:3, Informative)

          But it is tricky to use for any language other than English. Out of the box, it's English or nothing. Other European languages are complicated; more complex languages like Arabic, Hindi, or Chinese require some very involved hacks indeed.

          Really? All of my LaTeX files are UTF-8, and most include some non-English characters. I tend to use the raw unicode, rather than the LaTeX sequences because they are easier to type on a Mac. I'm not using a custom version of LaTeX although I vaguely remember having to include a package that told LaTeX to use UTF-8. Things like Greek letters and accents just work. I've not tried Arabic, Hindi or Kanji, however.

    • by Kjella (173770) on Saturday February 24 2007, @07:59AM (#18133420) Homepage
      Having a word processor act more like a web browser would be awesome. Ever since I started using word processors (which for me was a long time after I started using web browsers), i've always thought, why doesn't updating this style make all text with that style update? Why do I always have to change the same thing over and over again?

      Every word processor I've seen like forever has support for styles. The problem is:

      1) It's impossible to avoid creating a million new styles by accident. Try looking at the styles list and you'll see it's full of junk
      2) It's impossible to clean up a document with such a bunch of styles, for example say you have a document which has been completely fucked up with pseudo-styles. You've set "Normal" to be what the bulk text should be, and "Headings" to what they should be. What happened last time I tried it? Well, it was impossible to easily apply it without killing any bullet lists, bold, italics or any other intended variation of the normal text. Headers and numbering went berserk. Trying to do the same with the bullet list style led to numbers going completely nutzoid, for some reason it thought everyone in the same style belonged to the same list so later lists would start at some random number.
      3) If you for some reason are stuck copying between different versions of Word (Norwegian and English come to mind) then you'll have double the number of styles, which obviously aren't in synch.

      So to sum it up what I would like:
      1) Don't auto-create styles
      2) This sentence does not contain three styles
      3) Sane "apply style" functions
            - Particularly directed at fixing a mess
      4) Make styles have an ID, at least for the default ones make them international so header 1 is header 1 in every language
      5) Ability to "style-lock" documents for things like company standards, you can create new styles but not just randomly change around sizes and fonts
      6) More visible styles (OpenOffice does this, MS word doesn't) because people don't see them
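
Point 4 in the list above can be sketched as a lookup table: the document stores a language-neutral style ID, and only the user interface translates it. In the Python below the IDs, the display names, and both function names are illustrative assumptions, not taken from any real product.

```python
# Stable, language-neutral style IDs mapped to localized display names.
# The IDs and the Norwegian labels are illustrative only.
DISPLAY_NAMES = {
    "en": {"heading-1": "Heading 1", "body": "Normal"},
    "nb": {"heading-1": "Overskrift 1", "body": "Normal"},
}

def display_name(style_id: str, locale: str) -> str:
    """Look up the UI label for a style; fall back to the raw ID."""
    return DISPLAY_NAMES.get(locale, {}).get(style_id, style_id)

def merge_styles(target: dict, pasted: dict) -> dict:
    """Merge pasted style definitions by ID, keeping the target's definition on a clash."""
    merged = dict(pasted)
    merged.update(target)
    return merged

english_doc = {"heading-1": "bold 18pt"}
norwegian_doc = {"heading-1": "bold 16pt", "body": "12pt"}
merged = merge_styles(english_doc, norwegian_doc)
print(merged)
print(display_name("heading-1", "nb"))
```

Because both documents key their styles by the same ID, pasting between language versions does not double the style list.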
      • by 1u3hr (530656) on Saturday February 24 2007, @06:36AM (#18133118)
        though at least before 2007 (which I haven't used so can't comment on) they haven't done much to bring attention to the feature

        Word DOS (version 4 at least) had it back almost 20 years ago. And actually it was much easier to use styles back in the DOS version. Current versions try so hard to second guess you in the quest for user-friendliness and layering features on top of features that you can change or create new styles without knowing or intending to. Old-school required you to RTFA, but then you could use styles very efficiently. Now styles are much more sophisticated, but hardly anyone uses them correctly. I get documents from all kinds of people, including many university lecturers. None, out of hundreds over the last 15 years, has had a clue of how to style their documents. Headings are "Normal" with font commands to make them large; body text is "Heading 1" converted to 12-point Times; bulleted and numbered lists are a minefield, tables are a quagmire of hacks, spaces and tabs, etc...

  • by Evardsson (959228) on Saturday February 24 2007, @01:44AM (#18132190) Homepage
    While I do agree that the ISO doesn't need more than one standard for printable documents, I don't think that Håkon Wium Lie is on the right track with HTML/CSS for print.

    Sure, it works, with enough tweaking, and CSS3, and a $350 download of a product to turn HTML/CSS3 into a PDF. This is better how? What about LyX, LaTeX, or even OpenOffice if you are just going to convert to PDF?

    The whole HTML/CSS-to-print thing shoots the real argument in the foot.
    • by panaceaa (205396) on Saturday February 24 2007, @05:08AM (#18132876) Homepage Journal
      Why is anyone even talking about the opinion of a CEO? Opera is an HTML company -- they make HTML browsers. Why would the CEO of Opera have anything objective to say about OOXML or OpenXML? He wouldn't, which is why he pushes his own company's core competency: HTML. While Opera doesn't have a huge market share, if the market for HTML viewers grows, his company's likely to take a piece of that pie. But it's completely bunk because HTML's a mess of different standards, with many people using HTML 4.01 Transitional to this day, and the idea of people adopting CSS3 and writing documents using HTML is pretty far fetched. But you would never hear that from the CEO of an HTML browser company.
        • Re: (Score:3, Informative)

          PostScript (and PDF) have the adobe problem, but there is a better format that doesn't: DVI (the device independent format created by Donald Knuth).

          The DVI format doesn't even have the capability to include bitmap images. LaTeX cheats and uses the comment section to point to an external encapsulated postscript file. dvips will read this and include the EPS, and so will some DVI viewers but this can lead to all sorts of hard-to-track-down bugs. I ditched latex for pdflatex a while ago, and haven't looked back.

  • How come? (Score:5, Funny)

    by ShaunC (203807) * on Saturday February 24 2007, @01:56AM (#18132242) Homepage

    If forced to choose one, I'd pick the 700-page specification (ODF) over the 6,000-page specification (OOXML).
    So I'd ask Håkon, "how come?" :)
    • Re: (Score:3, Insightful)

      My speculation would be that no-one wants to sit and read a 6,000 page specification. 700 pages is far more palatable.

      It's a crap way of judging the relative merits of specifications, but human nature will out.

    • Re:How come? (Score:5, Informative)

      by PCM2 (4486) on Saturday February 24 2007, @04:54AM (#18132822) Homepage

      So I'd ask Håkon, "how come?" :)

      Since nobody gets it, I'll spoil it: That's how Håkon advises people to pronounce his name. It's even on his business card.

          • Re: (Score:3, Insightful)

            I'm no programmer but it wouldn't take me a whole lot of time to write a basic parser.

            Well, the basic parser isn't really an issue. I haven't investigated either standard in any detail, but assuming they're actual XML, or even reasonably close, there are a million libraries that can handle the parsing. Expat, Xerces, Arabica, the Qt XML parser, and the Java library XML parser come to mind.

            The majority of the work is interpreting the tags and actually laying out the document in a standardized way.
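To illustrate how little of the job the parsing step is, here is a minimal sketch using Python's standard-library XML parser. The sample document is a hypothetical, heavily trimmed stand-in for an ODF `content.xml`; only the two namespace URIs are taken from the real format. It pulls out headings and paragraphs and does nothing about layout, which is where the real work would be.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by ODF content.xml files.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

# Hypothetical, heavily trimmed stand-in for a real content.xml.
SAMPLE = """<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <office:body>
    <office:text>
      <text:h text:outline-level="1">Prologue</text:h>
      <text:p text:style-name="Standard">It was a dark night.</text:p>
    </office:text>
  </office:body>
</office:document-content>"""


def extract_blocks(xml_source):
    """Return (kind, text) pairs for headings and paragraphs, in document order."""
    root = ET.fromstring(xml_source)
    body = root.find("office:body/office:text", NS)
    blocks = []
    for element in body:
        # ElementTree reports tags as "{namespace-uri}localname".
        local_name = element.tag.split("}", 1)[1]
        if local_name in ("h", "p"):
            blocks.append((local_name, "".join(element.itertext())))
    return blocks


blocks = extract_blocks(SAMPLE)
```

Getting from `blocks` to a correctly paginated, styled page is the part neither specification makes easy.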

  • Can != Should (Score:3, Insightful)

    by gbulmash (688770) <semi_famous&yahoo,com> on Saturday February 24 2007, @02:58AM (#18132438) Homepage Journal
    Been a long time since I typeset anything, but I used Adobe PageMaker when I typeset a couple of college magazines in the mid-90s and FrameMaker when I was maintaining courseware in the late '90s for Nortel.

    HTML + CSS vs. Word vs. OO.o seems to me to be an argument related to formatting documents, not a "book". It's not that you couldn't do it, but I'd consider using Quark or InDesign (what seems to be Adobe's successor to PageMaker) or even TeX and its variants (haven't used any TeX-based stuff, but heard wonderful things) for typesetting.

    Arguments about standards aside, proofs of concept aside, I'd think that the real issue when it comes to any job is using the best tool for it. It's not a question of whether you can use these tools to typeset a book, but whether you should.

    The point of the proof of concept is to prove that the system is flexible or capable enough to go beyond its original intended use. I get that. But proving a chainsaw can be used to spread butter doesn't mean it's inherently superior to a coping saw.

    - Greg
  • by mennucc1 (568756) <d3@tonelli.sns.it> on Saturday February 24 2007, @02:58AM (#18132440) Homepage Journal
    An extract of Håkon Wium Lie's arguments:

    ODF is an XML-based dump of the internal data structures of OpenOffice, while OOXML is an XML-based dump of the internal data structures of Microsoft Office.

    In 2006, a year or so after ODF entered the fray, Microsoft submitted OOXML to the standardization process. Are we seeing a pattern here? Is Microsoft undermining standards by submitting them? Could it be that it wants both ODF and OOXML to fail?
    So Wium proposes to build a new standard from scratch, starting from HTML and CSS; but, recognizing that they would not cover all "Office" documents, he goes on to say:

    Additional semantics (say, formulas in spreadsheets) can be encoded as attributes, as do microformats, and CSS 3 offers advanced features for printing (e.g., footnotes and header and footers).
    My thoughts:
    • Suppose Microsoft were to listen to Wium (which they won't). Guess what? Those additional fields containing formulas (and anything else that makes {MS,Open}Office much more useful than HTML) again would be just an XML-based dump of the internal data structures of so and so.
    • More generally, I don't like this article. Wium is saying that Microsoft is proposing OOXML to kill ODF; and at the same time he is proposing to kill ODF in favour of a non-existent extension of HTML+CSS. It is like the guy saying: "I don't like the power plugs in my new house, let's tear the house down and rebuild it", and at the same time saying "why are they taking so much time to build the house?". Suppose Microsoft were to use arguments like Wium's to convince ISO to reject ODF and then start a new draft based on HTML, drafted in cooperation between Microsoft and other partners (including OpenOffice). That would really kill any hope of an ISO standard for "office" documents.
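For a concrete picture of "formulas encoded as attributes", here is a rough sketch. The `data-formula` attribute name and the formula syntax are made up for illustration; nothing in the article or in any specification defines them. The point is only that the formula string rides along in the markup and still has to be interpreted by whichever application wrote it.

```python
from html.parser import HTMLParser

# Hypothetical markup: the attribute name and formula syntax are assumptions.
SAMPLE = """<table>
  <tr><td>2</td><td>3</td><td data-formula="SUM(A1:B1)">5</td></tr>
</table>"""


class CellCollector(HTMLParser):
    """Collect (displayed text, formula or None) for every table cell."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False
        self._formula = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self._formula = dict(attrs).get("data-formula")
            self._text = []

    def handle_data(self, data):
        if self._in_cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self.cells.append(("".join(self._text).strip(), self._formula))
            self._in_cell = False


collector = CellCollector()
collector.feed(SAMPLE)
cells = collector.cells
```

A browser would show 2, 3 and 5 and ignore the attribute; a spreadsheet would need its own grammar for whatever sits inside it.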
  • fonts (Score:3, Informative)

    by cybpunks3 (612218) on Saturday February 24 2007, @03:33AM (#18132576)
    The problem with using HTML for publishing is that to this day there is no viable downloadable font system. So you are limited to a lowest-common-denominator list of 2-3 fonts like Verdana and Times New Roman. With Flash and PDF you can do a lot more, but obviously authoring becomes a problem.

  • Too true (Score:4, Insightful)

    by iamacat (583406) on Saturday February 24 2007, @03:41AM (#18132598)
    700 pages is not understandable by anyone but the authors. The "C Programming Language" book is 1/3 the size, has endured for 20 years and was instrumental in solving many more problems than word processing. Also, creating an ODF document is a minor function in most applications and is not worth the effort to understand such a huge standard. Proponents of both standards should come up with a modular design instead. At the base level, stick with basic HTML - bold and italic tags, fonts and sizes, paragraph breaks. Define many extensions that can be implemented independently or in any combination, in a manner convenient for both computers and, in a pinch, humans. The Opera guy is biased as well - while basic HTML is great at its limited function, CSS is not very readable by humans. Nor does it solve pagination, collaborative editing, resolution independence, color profiles for printing...
  • by Rudd-O (20139) on Saturday February 24 2007, @03:44AM (#18132616) Homepage
    And it worked out great.

    http://software-libre.rudd-o.com/ [rudd-o.com]

    Used MediaWiki to write the chapters, wrote a small Python proggie (available there) to consolidate the wiki into a single HTML file (mostly conforming to the Boom! microformat), then used Prince and Håkon's book CSS to generate the PDF.

    Great typesetting, collaborative book editing, screw LaTeX!

    Hakon was right.
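The consolidation step described above might look roughly like the sketch below. This is not the script linked from that site; the function name, the `chapter` class and the `book.css` filename are placeholders, and fetching the chapters from the wiki is left out entirely.

```python
from html import escape


def build_book(title, chapters, stylesheet="book.css"):
    """Join (heading, html_fragment) pairs into one printable HTML document."""
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        "<head>",
        '<meta charset="utf-8">',
        f"<title>{escape(title)}</title>",
        f'<link rel="stylesheet" href="{escape(stylesheet, quote=True)}">',
        "</head>",
        "<body>",
    ]
    for heading, fragment in chapters:
        # One wrapper per chapter so the print stylesheet can force page breaks.
        parts.append('<div class="chapter">')
        parts.append(f"<h1>{escape(heading)}</h1>")
        parts.append(fragment)
        parts.append("</div>")
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


book_html = build_book(
    "Example Book",
    [("One", "<p>First.</p>"), ("Two", "<p>Second.</p>")],
)
```

Writing `book_html` to `book.html` and running something like `prince book.html -o book.pdf` would then produce the PDF, assuming Prince is installed and the stylesheet sits next to the file.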
    • Re: (Score:3, Insightful)

      Great typesetting, collaborative book editing, screw LaTeX!

      Those who don't understand LaTeX are doomed to reinvent it... poorly.
  • by IBitOBear (410965) on Saturday February 24 2007, @05:09AM (#18132884) Homepage Journal
    I use OpenOffice. I support Open Document Format over MS/XML and .doc.

    That said, ODF kind of blows. Really.

    I write novel-length "books" and it is FREAKING IMPOSSIBLE to do some very basic things in any/every ODF based word processor I have tried to date.

    Exercise for the Interested:

    Make a "Book" with an automatic table of contents, said table to contain an "Author's Note", "Prologue", auto-numbered chapters 1 to N with their associated chapter titles (where the actual chapter number is the chapter number internal variable), and finally "Epilogue" all at the same level of the index.

    This simple task is essentially impossible. The flaw is caused by the fact that everything goes through the "styles" and the styles don't inherit their list membership properties. You should be able to make a style "TOC Entry" that is assigned to a particular table of contents level (e.g. level 1) then make a sub-style "Chapter Heading" based on "TOC Entry" but with the chapter numbering magic attached, and in so doing, create "different styles" that go to the same level/point in the list.
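The inheritance being asked for can be sketched in a few lines. This is a toy model of the wished-for behaviour, not of how ODF or OpenOffice actually store styles; the `Style` class and its fields are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Style:
    name: str
    parent: Optional["Style"] = None
    toc_level: Optional[int] = None   # None means "inherit from parent"
    numbered: bool = False            # chapter-numbering "magic" on or off

    def effective_toc_level(self):
        """Walk up the parent chain until some style sets a TOC level."""
        style = self
        while style is not None:
            if style.toc_level is not None:
                return style.toc_level
            style = style.parent
        return None


toc_entry = Style("TOC Entry", toc_level=1)
chapter_heading = Style("Chapter Heading", parent=toc_entry, numbered=True)

levels = {s.name: s.effective_toc_level() for s in (toc_entry, chapter_heading)}
```

Here "Prologue" would use `TOC Entry` and each chapter would use `Chapter Heading`, and both land at level 1 of the table of contents while only the chapters get numbers.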

    Exercise for the Interested:

    Make a "Book" with each chapter, and the prolog, and the epilog in separate sub documents. The linkage thing is a mess, it is hard to move "the pile of files" around especially if you want to use subdirectories (etc). If you have a custom style in the master document style list you have to _USE_ it in the master document if you want it to be pushed into the created sub-documents. Once the sub-documents are created it is a royal pain (read effectively impossible, or "supremely hidden feature required") to update those styles in those sub documents if you change that style.

    Exercise for the Interested:

    Put three separate "outlines" into one ODF Document. In ODF the outline is a function of the style headers, they only exist as implications of structure instead of first class abstractions. This is largely the fault of Microsoft Word, since the Word folks totally messed this up when they supplanted WordPerfect (which did this inset outline/object sort of thing right).

    ODF was, IMHO, poisoned by the slavish attempt by someone trying to make a Word killer instead of a "good word processor."

    And there are stacks more of these issues.

    And all that said, I *STILL* use ODF (Open Office etc) because I CATEGORICALLY REFUSE to _RENT_ the right to access my own work from a third party. Microsoft has plainly stated that such rental model is their intended business plan, which makes them a non-starter.

    In my opinion, having used both Word and OpenOffice for years, and having used WordPerfect and WordStar before them, ODF is a "workmanlike effort" to create a document format suitable for "normal business purposes". There is a reason that the legal profession never moved over to Word, and they likewise will not move to ODF: when you need to get to a tightly prescribed document format, both Word and ODF have a "you can't get there from here" fundamental limitation. Both formats simply refuse to represent some things because the designers "know" that a different format is better. Neither ODF nor Word has any allowances for _art_, professional or poetical.

    So, governments should use ODF because it is "no worse" than Word in terms of the ability to represent the documents it can represent, and given that congruence, the shorter, 100% open standard is, or should be, a hard minimum requirement.

    In terms of ODF being the be-all and end-all of document representation, I'd have to say "hardly!" I looked into the OpenOffice code base a while back to see if adding/changing the format to allow for "a book" would be reasonable. It didn't appear to be. Too many of the original StarOffice assumptions about document structure seemed pathologically uninspired. It was like looking at a big pile of Visual Basic. Everything in the standard is way too global, nothing "nests organically" it all nests pedagogically. (Every
    • by Luke (7869) on Saturday February 24 2007, @11:39AM (#18134466)
      Using a word processor to write a book is like using stone tablets and an abacus for spreadsheets. You really ought to look at markup-based typesetters like LaTeX or DocBook or software specifically designed for book production.
  • Um... NO (Score:5, Informative)

    by salesgeek (263995) on Saturday February 24 2007, @07:26AM (#18133282) Homepage
    ODF is not about web pages or word processing. It's a standard for office documents including spreadsheets, presentations and word processing. That's a big difference from what Opera's CTO is talking about. CSS/HTML might make a good format for one part of the suite (word processing) with a lot of work on the standard. The issue: that's not what is needed for a standard. It's about doing for office documents what HTML did for websites. ODF is actually an opportunity for Opera - extend the browser to support ODF so people can post ODF documents, make dynamic applications render to ODF and so on. It takes the web to the next level and further erodes the big monopoly.
    • Only problem is, the Oasis page itself doesn't validate. However, it seems Wikipedia does...

      But if the Oasis pages did validate, the basic argument goes like this: "How can they claim to care about standards if they can't even bother to support that most universal standard of standards, HTML?" And indeed, I could still make that argument -- just look at the sad, sad state of affairs that is Internet Explorer's CSS [mis]handling.
    • Re: (Score:3, Informative)

      CSS notwithstanding, you can use HTMLDOC [htmldoc.org] to produce PDFs from HTML pages. If you are creating reports etc. dynamically anyway, just create a temporary HTML file and convert it through HTMLDOC. I use Perl to generate reports and interface with HTMLDOC, but YMMV.
      An example of the HTMLDOC specific code used in the conversion :

      # Run HTMLDOC to provide the PDF file to the user...
      system "htmldoc --continuous --browserwidth 800 --landscape --size A4 --header ... --left 1in --embedfonts -f $fileref.pdf $filename"
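For anyone not on Perl, roughly the same call can be made from Python. This is a sketch under assumptions: the function names and file names are placeholders, the flags are copied from the command above, and the `--header` argument is left out because its value is elided in the original. Passing the arguments as a list avoids shell quoting problems with odd file names.

```python
import subprocess


def htmldoc_command(html_path, pdf_path):
    """Build the htmldoc argument list (flags copied from the Perl example)."""
    return [
        "htmldoc",
        "--continuous",
        "--browserwidth", "800",
        "--landscape",
        "--size", "A4",
        "--left", "1in",
        "--embedfonts",
        "-f", pdf_path,
        html_path,
    ]


def convert(html_path, pdf_path):
    """Run htmldoc; raises CalledProcessError if it exits non-zero."""
    subprocess.run(htmldoc_command(html_path, pdf_path), check=True)


# Placeholder file names; nothing is executed until convert() is called.
command = htmldoc_command("report.html", "report.pdf")
```

Calling `convert("report.html", "report.pdf")` would then do the actual conversion, assuming `htmldoc` is on the PATH.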