Microsoft Software Technology

Opera CTO Hits Back at Microsoft's Standards Push 246

Posted by Zonk
from the going-to-make-the-nightly-news dept.
Michael writes "Opera CTO Håkon Wium Lie hit back today at Microsoft's push to fast track Office Open XML into an ISO standard, in a blistering article on CNET. He also took a swipe at Open Document Format: 'I'm no fan of either specification. Both are basically memory dumps with angle brackets around them. If forced to choose one, I'd pick the 700-page specification (ODF) over the 6,000-page specification (OOXML). But I think there is a better way.' The better way being the existing universally understood standards of HTML and CSS. Putting this to the test, Håkon has published a book using HTML and CSS."
This discussion has been archived. No new comments can be posted.

  • by Frosty Piss (770223) on Saturday February 24, 2007 @02:33AM (#18132136)

    Opera CTO Håkon Wium Lie sort of "bitch slapped" a picture of Bill Gates and splashed some white wine around...

    Ok... Cheese, anyone?

  • fsck'n ugly (Score:5, Insightful)

    by Anonymous Coward on Saturday February 24, 2007 @02:37AM (#18132152)
    Yeah, but that "book" is fsck'n ugly. It doesn't even compare to a professionally typeset book, or something produced in LaTeX. I hope that isn't the "solution" to this standards "problem". Let's face it, the average Joe is going to use whatever Microsoft pushes at them. Case closed.
    • Re:fsck'n ugly (Score:5, Insightful)

      by AKAImBatman (238306) * <akaimbatmanNO@SPAMgmail.com> on Saturday February 24, 2007 @03:02AM (#18132278) Homepage Journal

      Yeah, but that "book" is fsck'n ugly. It doesn't even compare to a professionally typeset book, or something produced in LaTeX.

      You don't typeset with Microsoft Word, either. Which makes the entire argument specious. Word processors like MS Word and OOo Writer are for creating common documents like letters, memos, and maybe the occasional flyer. Neither one is particularly good at anything even close to professional publishing work. Even the book authors just use Word (or surprisingly, OOo Writer!) to do the text content. That text is then exported to a more sophisticated program, where the actual typesetting and page layouts are done.

      I think this fellow's point is that HTML/CSS formats can store any information that a Word Processor might need to store, with no need to invoke new technologies. To a certain extent, he may be correct. Unfortunately, HTML/CSS may make a good intermediary format, but it is not particularly good from a performance or usability perspective. Then again, XML formats in general are fairly poor choices for the same reason.

      I think if we want to break this conundrum, the industry is going to have to learn how to keep local data stores that are of high performance, while exporting intermediary formats when emailing or uploading to external computers. The only problem is finding a way of doing this so that it's completely transparent to users. The mythical "mom" doesn't want to worry about emailing a document in the right format, or having the right program to read the attachment she received. She just wants it to do what she tells it, with no bloody prompting with questions she has no answers for.
      • Re: (Score:3, Informative)

        by Anonymous Coward
        You're entirely right. Word/OOo aren't used for pro typesetting and page layout. But if we exclude that, then we still have many, many other formats, like RTF too (or why not even BBCode while we're at it?). Yes it's quite ugly, but I don't see (x)html + css as being the answer either:

        -too many versions of html (4, and perhaps 5 soon) and xhtml (1.0, 1.1, strict, transitional, etc)
        -different versions of CSS, browser support for it varies quite a bit (and is pretty much non-existent for CSS3)
        -too many render
        • Re:fsck'n ugly (Score:5, Informative)

          by EvanED (569694) <evaned&gmail,com> on Saturday February 24, 2007 @04:10AM (#18132490)
          html/css sucks at MANY things - how about a self-updating TOC? (don't even try to say some javascript parsing the DOM for header tags with certain IDs to generate it dynamically!)

          This would have to be done by the tool displaying it, same as a self-updating TOC in a Word or OpenOffice Writer document. The information is present in a correctly-structured HTML document in the form of Hx tags.

          Hell, how can you even tell the page numbers in a html "document" anyways?

          The same way you would in a Word document. It doesn't make sense if you're looking at it as a web page in your browser, but if your editor used HTML it would work the same way. (This also partially alleviates the rendering issues.)
        • Re:fsck'n ugly (Score:5, Insightful)

          by Anonymous Coward on Saturday February 24, 2007 @04:26AM (#18132554)

          I don't see (x)html + css as being the answer either:
          Only because you can't tell the difference between "XHTML + CSS" and "web pages".

          -too many versions of html (4, and perhaps 5 soon) and xhtml (1.0, 1.1, strict, transitional, etc)
          So? Pick one as your word-processor standard, and rule all the others out. The existence of too many versions of MS Word doesn't seem to have hurt the .doc format.

          -different versions of CSS, browser support for it varies quite a bit (and is pretty much non-existent for CSS3)
          What does browser support have to do with word processing? We're talking about word processors, not web sites.

          -too many rendering engines, css hacks required so the content displays the same in most of them, etc
          And this is different from word processors how? Microsoft's XML format is absolutely crammed full of hacks to duplicate obscure rendering features of obsolete versions of Word, WordPerfect, etc. And it would surprise me very much if the rendering of ODF was pixel-identical between all the products that support it.

          -html/css sucks at MANY things - how about a self-updating TOC? (don't even try to say some javascript parsing the DOM for header tags with certain IDs to generate it dynamically!)
          You're thinking of web pages, not HTML. HTML used for a document could easily have an auto-generated table of contents. Remember that we're talking about using HTML as the file format for a word processor. A word processor can trivially parse the DOM for header tags and update a table of contents without requiring any JavaScript at all. It's kind of what word processors are for.
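A minimal sketch of the point above, assuming the word processor stores the document as HTML and only needs the h1-h6 tags to rebuild its table of contents. The names `HeadingCollector`, `build_toc` and `SAMPLE` are made up for illustration; only the standard-library `html.parser` module is used.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element in a document."""

    def __init__(self):
        super().__init__()
        self.headings = []      # finished (level, text) pairs, in document order
        self._level = None      # level of the heading currently open, if any
        self._parts = []        # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            # Collapse whitespace so line breaks in the source do not matter.
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def build_toc(html_text):
    """Return the table of contents as indented plain-text lines."""
    collector = HeadingCollector()
    collector.feed(html_text)
    collector.close()
    return ["  " * (level - 1) + text for level, text in collector.headings]


# Hypothetical sample document, not taken from any real file.
SAMPLE = """
<h1>Introduction</h1><p>Some text.</p>
<h2>Background</h2><p>More text.</p>
<h1>Conclusion</h1>
"""

if __name__ == "__main__":
    for line in build_toc(SAMPLE):
        print(line)
```

An editor could call something like `build_toc` after every change to the document, which is how the table of contents would stay current without any script embedded in the file itself.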

          Hell, how can you even tell the page numbers in a html "document" anyways?
          By looking at the little "Page N of N" display in your word processor, I would assume.

          -while word/OOo formats aren't real typesetting (like InDesign CS2 would do), at least they have half-way decent typography. Yeah, no fancy glyphs or super precise kerning, but it's still usable. On the web there's only a handful of "just OK" fonts one can use (unless everything is rendered server-side as images).
          What does "on the web" have to do with word processors? We're not talking about the web here. We're talking about word processors, which will have access to all the fonts the user owns, just like any other application.

          -if people use html/css, there would basically be no standards *at all* or anything even resembling it (much like anything we see on the web).
          Why not? We're talking about word processors, not the web. We're talking about computer-generated HTML, not something some 13-year-old hacked together by copying-and-pasting examples into Notepad. It would be trivial to enforce valid XHTML 1.1 + CSS2.1, for example.
          • Re:fsck'n ugly (Score:5, Insightful)

            by Lost my low ID nick (1035980) on Saturday February 24, 2007 @07:24AM (#18133080)
            So, McSmarty, how do I
              - position an image on page 4 of my document?
              - add footnotes?
              - embed fields (date, last editor...)?
              - mark the embedded TOC as TOC so that it gets regenerated on reload?
            etc.

            And on the CSS side, there are quite a lot of shortcomings, too.

            Of course, all of this would work with custom XML tags or special id/class conventions, BUT then you'd have to specify those. And getting this below 700 pages won't be easy.

            So repeat after me:

            HTML is *not* a description language suitable for word processing in its current state, and it is unclear it can be made so without sacrificing device independence.
            • Re:fsck'n ugly (Score:5, Interesting)

              by TheRaven64 (641858) on Saturday February 24, 2007 @09:07AM (#18133464) Journal
              I had a little go at using HTML for this kind of thing a few years ago. One thing that you might not be aware of is that CSS has a few things related to pagination. While you can't say 'put this image on page 4,' you can say 'if you need to put a page break in, put it before or after this div, so that this text and this image are on the same page.' For the table of contents, I wrote some ECMAScript that scanned the DOM tree for h1-4s and built a set of nested lists to display it, with links to the real headings. It didn't print the page number because, although this is possible with CSS it wasn't implemented in any browsers when I tried it. The embedded fields are already supported by meta tags in the document head. Footnotes, however, are a tremendous pain to get right with HTML.

              I just dug out the template I wrote, and the pagination and ToC worked fine in Safari. The auto-numbering of headers, however, didn't. This is due to a lack of support for counters in generated content, and the same problem with Mozilla was a significant reason for abandoning the whole idea in the first place; the only browser everything worked in was Opera.
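For readers unfamiliar with the counters mentioned above, here is a rough sketch of the logic a renderer has to apply to auto-number headers. It is not the template described in this comment; `number_headings` is a hypothetical helper that mimics the reset-and-increment behaviour of nested counters.

```python
def number_headings(levels):
    """Return labels such as '1', '1.1', '2' for a sequence of heading levels.

    Works like nested counters: a heading increments the counter at its own
    level and resets every deeper counter.
    """
    counters = []   # counters[i] is the current count at level i + 1
    labels = []
    for level in levels:
        if level < 1:
            raise ValueError("heading levels start at 1")
        # Drop counters deeper than this level (the "reset" step).
        del counters[level:]
        # Pad with zeros if the document skips a level, e.g. h1 straight to h3.
        while len(counters) < level:
            counters.append(0)
        counters[level - 1] += 1   # the "increment" step
        labels.append(".".join(str(c) for c in counters))
    return labels


if __name__ == "__main__":
    # Levels of an assumed document: h1, h2, h2, h1, h2, h3.
    print(number_headings([1, 2, 2, 1, 2, 3]))
```

The padding rule for skipped levels is a design choice made for this sketch; a real renderer might treat a skipped level differently.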

              Another significant reason for abandoning this idea (not entirely relevant when talking about document formats being generated by tools) was that HTML is a huge pain to type, and XHTML is even worse. Something semantically equivalent to XHTML but using S-expressions would have been fine, but typing XHTML just involves spending far too much time hitting > and < keys (not to mention the redundancy of close tags having the full tag name). I turned to LaTeX, which is easier to type and also (being a Turing-complete programming language) much easier to extend than HTML.

            • Re:fsck'n ugly (Score:5, Informative)

              by EsbenMoseHansen (731150) on Saturday February 24, 2007 @09:26AM (#18133512) Homepage

              So, McSmarty, how do I
              - position an image on page 4 of my document?

              You don't, nor do you want to. But you can anchor, float or bind the images to the text easily enough. This would be handled by CSS... for the HTML side, it would just be div and object tags (not that you would ever see them, since this is a word-processing app).

              - add footnotes?

              <p class="footnote">My footnote</p> with the appropriate CSS rule (presumably something like float: page or whatever.)
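What the rendering side of this might look like, as a sketch only: the tool pulls every element marked class="footnote" out of the running text, leaves a numbered marker behind, and keeps the notes in a list to place at the bottom of the page. The class name comes from the example above; `FootnoteExtractor` and `extract_footnotes` are invented names, and nested footnotes are assumed not to occur.

```python
from html.parser import HTMLParser


class FootnoteExtractor(HTMLParser):
    """Split a document into running text with [n] markers and a footnote list.

    Assumes footnotes are marked with class="footnote" and are not nested.
    """

    def __init__(self):
        super().__init__()
        self.body = []          # fragments of the running text
        self.footnotes = []     # footnote texts, in order of appearance
        self._tag = None        # tag name of the footnote element being read
        self._depth = 0         # nesting depth of same-named tags inside it
        self._parts = []        # text fragments of the footnote being read

    def handle_starttag(self, tag, attrs):
        if self._tag is None:
            classes = (dict(attrs).get("class") or "").split()
            if "footnote" in classes:
                self._tag, self._depth, self._parts = tag, 1, []
        elif tag == self._tag:
            self._depth += 1

    def handle_endtag(self, tag):
        if self._tag is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                self.footnotes.append(" ".join("".join(self._parts).split()))
                # Leave a numbered marker where the footnote was anchored.
                self.body.append(f"[{len(self.footnotes)}]")
                self._tag = None

    def handle_data(self, data):
        target = self._parts if self._tag is not None else self.body
        target.append(data)


def extract_footnotes(html_text):
    """Return (running text with markers, list of footnote texts)."""
    parser = FootnoteExtractor()
    parser.feed(html_text)
    parser.close()
    return " ".join("".join(parser.body).split()), parser.footnotes
```

Where the collected notes end up on the page is still a layout decision, which is the part the CSS rule would have to describe.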

              - embed fields (date, last editor...)?

              Using XML entities, presumably

              - mark the embedded TOC as TOC so that it gets regenerated on reload?

              Regenerated on reload? Come on, have some ambition.. it should be in sync at all times. Anyway, by keeping track of the header tags, presumably.

              HTML is *not* a description language suitable for word processing in its current state, and it is unclear it can be made so without sacrificing device independence.

              XHTML+CSS would need some expansions... but probably not much. A good layout program probably doesn't care about the device, but if it did, there are already @media rules to handle these situations. There are also a couple of other truly dedicated layout namespaces on w3 to consider.

              But all this matters not. This is politics. Sadly.

            • Re:fsck'n ugly (Score:4, Insightful)

              by TheoMurpse (729043) on Saturday February 24, 2007 @12:53PM (#18134556) Homepage

              So, McSmarty, how do I
                  - position an image on page 4 of my document?
                  - add footnotes?
                  - embed fields (date, last editor...)?
                  - mark the embedded TOC as TOC so that it gets regenerated on reload?
              I'm on your side in this debate, but as a web dev I have knowledge of these things which you apparently do not. To embed a field, how about <meta name="author" content="TheoMurpse">. As for marking the embedded TOC, how about <div id="TOC">? For positioning an image on page 4, well, I don't know if you've ever looked at a DOC or ODT file, but the file itself says nothing about where page 3 ends and page 4 begins. Instead, you see that once the word processor has rendered the file. Thus, I see no difference between HTML and any other format. Hell, I don't even know if you can say "put this on page 4" in a LaTeX document. First of all, you'd never want to put it on page 4. Instead, you'd want to put it in between other elements, which may end up placing it on page 4, but then when you update your text on page 3, it may cause the image to need to be on page 5.
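To illustrate why the page number is a result of rendering rather than something stored in the file, here is a toy paginator. It assumes every block (paragraph, image) has a known height and that a block never splits across pages; `paginate` and the numbers below are invented for the example and do not describe any real word processor.

```python
def paginate(block_heights, page_height):
    """Assign each block to a page number (1-based), in document order.

    A block that does not fit in the space left on the current page starts a
    new page. Blocks taller than a page simply get a page to themselves.
    """
    pages = []
    page, used = 1, 0
    for height in block_heights:
        if used > 0 and used + height > page_height:
            page += 1
            used = 0
        pages.append(page)
        used += height
    return pages


if __name__ == "__main__":
    # The last block is the image; heights are in arbitrary units.
    before = [100, 100, 90, 60, 30]
    # Same document after adding a short paragraph (height 20) before the image.
    after = [100, 100, 90, 20, 60, 30]
    print(paginate(before, 100)[-1])
    print(paginate(after, 100)[-1])
```

In this toy case the image sits on page 4 before the edit and on page 5 after it, although nothing about the image itself changed.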

              Footnotes are easy, too: Text Text that needs a footnote.<div class="footnote">This is the footnote</div>. That's the same concept as in LaTeX, the best typesetting software out there.
        • Re: (Score:3, Funny)

          by TheoMurpse (729043)

          Looks to me like Opera has only one tool: a hammer (or is that a web browser?)
          Actually, I think it's a high-pitched voice capable of shattering glass.
      • Re: (Score:3, Insightful)

        by vtcodger (957785)
        ***I think this fellow's point is that HTML/CSS formats can store any information that a Word Processor might need to store, with no need to invoke new technologies. To a certain extent, he may be correct. Unfortunately, HTML/CSS may make a good intermediary format, but it is not particularly good from a performance or usability perspective. Then again, XML formats in general are fairly poor choices for the same reason.***

        The M in HTML stands for MARKUP. And it means it. HTML is NOT a layout language.

        • Re: (Score:3, Interesting)

          by lahvak (69490)

          The M in HTML stands for MARKUP. And it means it. HTML is NOT a layout language. Never has been, and apparently never will be despite unending attempts to use it for page layout. In fact, HTML documents look different in every browser -- which is not, I think, a characteristic that most users are going to desire for a large subset of documentation. How, for example, can you specify an OCRable form, if the rendering program is free to move the damn boxes around?

          I think that's why he says HTML/CSS. HTML ta

      • Re:fsck'n ugly (Score:4, Insightful)

        by Mateo_LeFou (859634) on Saturday February 24, 2007 @10:36AM (#18133786) Homepage
        "The mythical "mom" doesn't want to worry about emailing a document in the right format, or having the right program to read the attachment she received. She just wants it to do what she tells it, with no bloody prompting with questions"

        No offense, but I'm getting sick of this line of reasoning. You're right, mom wants the computer to read her thoughts, know exactly what she really meant when she said X, anticipate every need she might have, and pre-calculate its complexity out of existence.

        In other news, my boss would like this entire website built in one hour ($40), never need support, and scale to 300,000 users.

        At a certain point IT's job goes from "give every user what he/she wants" to "educate users about what is feasible in the current technological situation."
  • by Tablizer (95088) on Saturday February 24, 2007 @02:38AM (#18132154) Homepage Journal
    "Both are basically memory dumps with angle brackets around them."
    • ...and sig'd in tribute.

      Even more classic perhaps, 'The "layer" element?!' Sure raised my eyebrow; a huge change from "Netscape engineers are weenies!" by any metric. :)

  • Is it mature enough? (Score:4, Interesting)

    by Goalie_Ca (584234) on Saturday February 24, 2007 @02:38AM (#18132160)
    HTML and CSS are quite capable of rendering and displaying webpages. What happens with a simple thing like a file header showing page number and author name? Footers with footnotes? How about dealing with table of contents etc.? How would a page in a document be broken down? Anyone who's tried to print HTML knows there are many issues with layout. What's sad though is that even HTML and CSS are not supported the same in all browsers.

    I'm a latex junkie. Latex though is a PITA to create templates and styles for. Someone willing to take up the task to modernize latex or completely replace it?
    • by willy_me (212994) on Saturday February 24, 2007 @02:45AM (#18132194)

      I'm a latex junkie. Latex though is a PITA to create templates and styles for. Someone willing to take up the task to modernize latex or completely replace it?
      Done. It's called ConTeXt [pragma-ade.com].
      • Context has its weaknesses too.

        For example, it cannot produce print and HTML versions of the same document. This may not matter to everyone, but it was something I needed, so I stuck to Latex.
        • by lahvak (69490)
          There isn't actually any inherent reason in ConTeXt that would prevent it from doing that. It just that the tools have not been created yet.
      • Re: (Score:2, Informative)

        by Anonymous Coward
        If you want to stay in Latex use the memoir [ctan.org] document class.
    • by lewp (95638)
      http://www.alistapart.com/articles/boom [alistapart.com] describes how they handled that stuff using CSS2 and proposed CSS3 features.

      I'm not writing a book any time soon, and if I were I wouldn't take this approach, but it is an interesting read.
      • Little tip, if your book has any technical edge to it, learn LaTeX and do the layout/setting yourself.

        Being in control of the layout and setting is very important if you value your creation at all. ... Just saying. not bitter.

        Tom
    • Re: (Score:3, Interesting)

      by tomstdenis (446163)
      The trick to using LaTeX safely is automation. The less TeX twiddling you have to do manually the better.

      For me, I write my user manuals [for my FL/OSS projects] in LaTeX because the layout is much better, and the process much simpler than wrestling with a word processor.

      Why anyone writes books in anything else is beyond me.

      My first book [math text] that was published was all LaTeX, and while it wasn't all super simple the vast majority of the layout and setting work was handled by TeX itself. My second b
      • Re: (Score:3, Informative)

        by TheRaven64 (641858)

        Why anyone writes books in anything else is beyond me.

        I couldn't agree more. I am currently writing a book, and I can't imagine how people use tools like Word. It has a lot of technical content, particularly code snippets. With LaTeX, I can easily insert a few lines from a code file, and have it automatically syntax highlighted. I never have to worry about copy-and-paste errors, since the source code is included directly from the source files, which I can compile and test.

        I can also define short commands like \code{} for inline code snippets (e.g. variab

    • Re: (Score:3, Insightful)

      by sweetooth (21075)
      Keep in mind this was published by a bigwig at Opera. The Opera web browser tends to stay way ahead of the other browsers in terms of standards compliance. This includes things like the ability to use the page elements to force page breaking and to help create layouts useful for things like books, reports, etc. Opera is a great engine for rendering HTML & CSS, I personally just can't get past the UI.
    • by lahvak (69490)

      HTML and CSS are quite capable of rendering and displaying webpages. What happens with a simple thing like a file header showing page number and author name? Footers with footnotes? How about dealing with table of contents etc.? How would a page in a document be broken down? Anyone who's tried to print HTML knows there are many issues with layout. What's sad though is that even HTML and CSS are not supported the same in all browsers.

      All of these are problems with browsers, not the actual file format. What's

  • huh? (Score:5, Funny)

    by User 956 (568564) on Saturday February 24, 2007 @02:40AM (#18132166) Homepage
    Putting this to the test, Håkon has published a book using HTML and CSS.

    Uhm. I'm no expert, but isn't a book that uses HTML and CSS called a website?
    • Re:huh? (Score:5, Informative)

      by 8-bitDesigner (980672) on Saturday February 24, 2007 @02:55AM (#18132232) Homepage
      Actually one of the highlights of the CSS spec is support for non-standard display types, such as screen readers, projectors, PDA, and yes, print. CSS is a rather brilliant standard, but since W3C hasn't really seen fit to publish a reference platform for it, there's no real compliance checking in the major browsers.
      • Re: (Score:3, Insightful)

        by kestasjk (933987) *
        CSS would be a great standard, but it leaves too much to the people who implement it; is this a block type or inline? What should the default for this nonstandard tag be? etc, etc.

        If they spelled everything out without any ambiguity it would make a better standard.. but then it would be another "600 page long" standard which is what he seems to be against in the first place.
        • Agreed with all that CSS is very useful, but as much as I like the Opera browser it seems this guy is suffering a bit of "if you're a carpenter you think every problem can be solved with a hammer" syndrome.
    • by natrius (642724)
      Isn't a book that uses Microsoft Word's .doc format called a Word document?

      A document doesn't turn into a physical book until you hit print. The book itself is about the content, not the physical form. Dive into Python [diveintopython.org] is a book that I just happened to read online in web page form.
  • CSS for Documents? (Score:5, Insightful)

    by zaydana (729943) on Saturday February 24, 2007 @02:42AM (#18132174)

    Having a word processor act more like a web browser would be awesome. Ever since I started using word processors (which for me was a long time after I started using web browsers), I've always thought, why doesn't updating this style make all text with that style update? Why do I always have to change the same thing over and over again?

    While turning word processors into web browsers would be stupid, things like CSS would be awesome to have in word processors.
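A small sketch of the behaviour being asked for, under the assumption that paragraphs store only the name of their style and the formatting lives in one shared table, much as CSS rules do. `Document` and its fields are hypothetical and not modelled on any real word processor's file format.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Paragraphs refer to styles by name instead of copying the formatting."""
    styles: dict = field(default_factory=dict)       # style name -> properties
    paragraphs: list = field(default_factory=list)   # (style name, text) pairs

    def add(self, style_name, text):
        self.paragraphs.append((style_name, text))

    def resolved(self):
        """Return (properties, text) pairs with each style looked up on demand."""
        return [(self.styles[name], text) for name, text in self.paragraphs]


# Assumed example content; the style names and properties are illustrative.
doc = Document(styles={"heading": {"color": "black", "size": 18},
                       "body": {"color": "black", "size": 11}})
doc.add("heading", "Chapter 1")
doc.add("body", "Some text.")
doc.add("heading", "Chapter 2")

# One change to the style definition restyles every heading at once.
doc.styles["heading"]["color"] = "red"
```

Because the lookup happens when the document is rendered, no paragraph has to be touched when a style changes.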

    • by athakur999 (44340)
      WordPerfect has had a feature like this for years called "show tags" (I think). It'd show you where formatting markers started and stopped (similar to an HTML source listing). It was pretty useful. I'd love to see OpenOffice incorporate a feature like this (if it doesn't already).
    • by Coryoth (254751) on Saturday February 24, 2007 @02:56AM (#18132240) Homepage Journal

      Ever since I started using word processors (which for me was a long time after I started using web browsers), I've always thought, why doesn't updating this style make all text with that style update? Why do I always have to change the same thing over and over again?

      Such things exist. TeX provides a decent base for such things, so it's a matter of finding a TeX-centric editor. LyX would be a good example, and indeed it has the sort of functionality and general approach to document creation that you seem to be after. Of course it doesn't necessarily have all the other features that other word processors might have (like mail merge or what have you).
    • Re: (Score:3, Insightful)

      by the_womble (580291)
      Latex: it's not that hard to learn.

      Lyx provides a GUI front end, but you lose a lot of flexibility.

      Texmacs might work for you as well, although I found it very clunky.
      • by Antique Geekmeister (740220) on Saturday February 24, 2007 @03:50AM (#18132418)
        Indeed: LyX is extremely handy for providing to undergraduates or research assistants whose thesis advisors insist on using TeX or LaTeX, who lack the time to learn yet another language. LyX is the difference between having slightly more elegant .tex files, and getting an hour more of sleep a night when writing your thesis because you can edit in a GUI and don't have to debug your .tex files.

        I am finding myself wishing that OpenOffice had pursued putting a vastly better interface on TeX and LaTeX, rather than writing their own standard. It would probably have been faster and certainly would have been a lot more stable. Microsoft couldn't have even thought about it: its clean, open standards would not have lent themselves to the proprietary "extend" part of Microsoft's "embrace and extend" approach, or Microsoft's software licensing models.
      • Re: (Score:2, Informative)

        by Haeleth (414428)

        Latex: it's not that hard to learn.

        But it is tricky to use for any language other than English. Out of the box, it's English or nothing. Other European languages are complicated; more complex languages like Arabic, Hindi, or Chinese require some very involved hacks indeed.

        It can be done, some of the time, but it's very, very easy to mess up. I have tried numerous times to get Japanese support, using one of the several special Japanese versions that exist (it seems it simply can't be done with standard TeX

        • by zsau (266209)
          To be fair, there's a Unicode version of TeX called Omega or some such. I'd doubtless have found it very useful if I'd ever managed to get it to work at all.

          Take a look at XeTeX [sil.org]. It installed without a hitch on my computer (ppc debian) once I altered the Debian control stuff to compile against the TeXLive TeX packages rather than teTeX. Or if you run on a more normal platform (x86 ubuntu/debian/SuSE, MacOS X, maybe Windows) there are precompiled packages for you. It will use any OpenType (or TTF, or on OSX th
          • Thanks for the XeTeX link. I'm sure that will be useful for many of us (even though the installation failed for me).

            However, I think the parent post's main point was that LaTeX is not here and now usable across the globe. With MS Word or OpenOffice, I can type and mix Japanese, Korean, Russian and French in one and the same document, and I can share it with millions of users on different platforms.

            With the default installations of LaTeX, that is impossible.

            • by zsau (266209)
              With the default installation of LaTeX, that is difficult. With XeLaTeX, it's no harder than with Word or OpenOffice, and you have the advantage that it will look the same. You type a UTF-8 file, use OpenType fonts, and get a PDF that people who can't process XeLaTeX can still read.

              I'm sure that will be useful for many of us (even though the installation failed for me).

              I suppose the regular disclaimers like "make sure you have all the dependencies installed" apply. It's a pity GNU/Linux distributions are st
        • Re: (Score:3, Informative)

          by TheRaven64 (641858)

          But it is tricky to use for any language other than English. Out of the box, it's English or nothing. Other European languages are complicated; more complex languages like Arabic, Hindi, or Chinese require some very involved hacks indeed.

          Really? All of my LaTeX files are UTF-8, and most include some non-English characters. I tend to use the raw unicode, rather than the LaTeX sequences because they are easier to type on a Mac. I'm not using a custom version of LaTeX although I vaguely remember having to include a package that told LaTeX to use UTF-8. Things like Greek letters and accents just work. I've not tried Arabic, Hindi or Kanji, however.

      • Incidentally, the site linked to from my sig is generated from a latex file. I have some TCL scripts that parse the Latex and generate more Latex files for the index pages.

        I did it this way so that I could also do a print [amazon.co.uk] version [amazon.com] from the same source document.
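The scripts themselves are not shown here, so the following is only a guess at the general shape of such a tool, written in Python rather than TCL: scan the LaTeX source for sectioning commands and emit a small LaTeX list that could serve as an index page. `HEADING`, `index_entries` and `index_page` are made-up names, and the pattern ignores titles with nested braces.

```python
import re

# Matches \chapter{...}, \section{...} and \subsection{...}, starred or not.
# Titles containing nested braces are not handled by this simple pattern.
HEADING = re.compile(r"\\(chapter|section|subsection)\*?\{([^{}]*)\}")


def index_entries(latex_source):
    """Return (command, title) pairs for each sectioning command found."""
    return [(m.group(1), m.group(2).strip())
            for m in HEADING.finditer(latex_source)]


def index_page(latex_source):
    """Render the entries as a LaTeX itemize list, one item per heading."""
    items = [f"  \\item {title}" for _, title in index_entries(latex_source)]
    return "\n".join(["\\begin{itemize}", *items, "\\end{itemize}"])


if __name__ == "__main__":
    # Assumed sample input, not the poster's actual source file.
    sample = r"\chapter{One} Some text. \section{Intro} \subsection*{Details}"
    print(index_page(sample))
```

A regular expression is enough for a sketch like this; a document that uses custom macros for its headings would need a real parser.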

    • by zsau (266209)
      In Word, modify your formatting toolbar. Get rid of almost everything from it, except for lists and the first dropdown (and the button before it). Click the button before it. Now you have a setup like mine (when I'm forced to use a word processor--I much prefer TeX). Use the styles. When you think "this would be better in red", just create a new style and format it as red.

      I've been doing this since Word 6.0, when I first used a GUI wordprocessor. Stylesheets aren't by any means a new thing: They're just one
    • by 1u3hr (530656)
      I've always thought, why doesn't updating this style make all text with that style update? Why do I always have to change the same thing over and over again?

      The idea of styles didn't originate in CSS, it was used in page layout decades before the web. I use Ventura, which features this heavily, but PageMaker, Quark, etc all have styles.

      And actually, Word does too, but using styles correctly in Word is fraught with difficulty. The method of updating styles is capricious. I do DTP for a living, and when

    • by Kjella (173770) on Saturday February 24, 2007 @08:59AM (#18133420) Homepage
      Having a word processor act more like a web browser would be awesome. Ever since I started using word processors (which for me was a long time after I started using web browsers), I've always thought, why doesn't updating this style make all text with that style update? Why do I always have to change the same thing over and over again?

      Every word processor I've seen like forever has support for styles. The problem is:

      1) It's impossible to avoid creating a million new styles by accident. Try looking at the styles list and you'll see it's full of junk
      2) It's impossible to clean up a document with such a bunch of styles, for example say you have a document which has been completely fucked up with pseudo-styles. You've set "Normal" to be what the bulk text should be, and "Headings" to what they should be. What happened last time I tried it? Well, it was impossible to easily apply it without killing any bullet lists, bold, italics or any other intended variation of the normal text. Headers and numbering went berserk. Trying to do the same with the bullet list style led to numbers going completely nutzoid, for some reason it thought everyone in the same style belonged to the same list so later lists would start at some random number.
      3) If you for some reason are stuck copying between different versions of Word (Norwegian and English come to mind) then you'll have double the number of styles, which obviously aren't in synch.

      So to sum it up what I would like:
      1) Don't auto-create styles
      2) This sentence does not contain three styles
      3) Sane "apply style" functions
            - Particularly directed at fixing a mess
      4) Make styles have an ID, at least for the default ones make them international so header 1 is header 1 in every language
      5) Ability to "style-lock" documents for things like company standards, you can create new styles but not just randomly change around sizes and fonts
      6) More visible styles (OpenOffice does this, MS Word doesn't) because people don't see them
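
      To make wishes 4 and 5 a bit more concrete, here is a minimal sketch in Python. It is not how Word or OpenOffice store styles; the `StyleSheet` class, the `render` helper and the style IDs are all made up for illustration. Text only carries a style ID, so changing the style once changes every paragraph that uses it, and a locked sheet still accepts new styles but refuses changes to existing ones.

```python
class StyleSheet:
    """Hypothetical style table: paragraphs refer to styles by stable ID only."""

    def __init__(self):
        self._styles = {}
        self.locked = False  # wish 5: "style-lock" for company standards

    def define(self, style_id, **properties):
        # A locked sheet allows brand-new styles but not edits to existing ones.
        if self.locked and style_id in self._styles:
            raise PermissionError(f"style {style_id!r} is locked")
        self._styles[style_id] = dict(properties)

    def resolve(self, style_id):
        # Return a copy so callers cannot silently mutate the shared style.
        return dict(self._styles[style_id])


def render(paragraphs, sheet):
    """Pair each paragraph's text with the current properties of its style."""
    return [(text, sheet.resolve(style_id)) for style_id, text in paragraphs]


sheet = StyleSheet()
sheet.define("heading-1", font="Helvetica", size=18)
sheet.define("normal", font="Times", size=11)

document = [
    ("heading-1", "Chapter One"),
    ("normal", "Some body text."),
    ("heading-1", "Chapter Two"),
]

# One change to the style; every heading picks it up at render time.
sheet.define("heading-1", font="Helvetica", size=24)
rendered = render(document, sheet)
```

      Because the ID ("heading-1") rather than a localized display name is what the document stores, the same approach would also avoid the doubled Norwegian/English style lists.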
  • by Evardsson (959228) on Saturday February 24, 2007 @02:44AM (#18132190) Homepage
    While I do agree that the ISO doesn't need more than one standard for printable documents, I don't think that Håkon Wium Lie is on the right track with HTML/CSS for print.

    Sure, it works, with enough tweaking, and CSS3, and a $350 download of a product to turn HTML/CSS3 into a PDF. This is better how? What about LyX, LaTeX, or even OpenOffice if you are just going to convert to PDF?

    The whole HTML/CSS-to-print thing shoots the real argument in the foot.
    • by Rosyna (80334)
      Sure, it works, with enough tweaking, and CSS3, and a $350 download of a product to turn HTML/CSS3 into a PDF. This is better how? What about LyX, LaTeX, or even OpenOffice if you are just going to convert to PDF?

      Yes, exactly. Instead of taking one of two specifications created just for rich document formats, he suggests making a brand new specification by extending CSS/HTML to do something it doesn't yet seem ready to do.
      • Yes, exactly. Instead of taking one of two specifications created just for rich document formats, he suggests making a brand new specification by extending CSS/HTML to do something it doesn't yet seem ready to do.

        This talk of not creating new standards is ludicrous: there are already existing XML schemes geared towards this sort of task. Why should HTML/CSS be extended for publishing non-web documents when the work's already been done elsewhere? It gets even more ridiculous when you stop to think about a

    • PDF would have been a candidate, but Adobe's licensing and that of its ancestor, PostScript, are awkward to deal with. That's hindered their acceptance in other uses, such as PostScript display systems. (It could have been a superior display system to X, and much easier to display remotely.)

      But it hardly takes a $350 tool to handle: PDFcreator, available over at sourceforge.net, and the old Ghostview viewer both rely on Ghostscript to process PDF and work more quickly and reliably than Adobe's conversion tools,
    • by panaceaa (205396) on Saturday February 24, 2007 @06:08AM (#18132876) Homepage Journal
      Why is anyone even talking about the opinion of a CEO? Opera is an HTML company -- they make HTML browsers. Why would the CEO of Opera have anything objective to say about OOXML or OpenXML? He wouldn't, which is why he pushes his own company's core competency: HTML. While Opera doesn't have a huge market share, if the market for HTML viewers grows, his company's likely to take a piece of that pie. But it's completely bunk because HTML's a mess of different standards, with many people using HTML 4.01 Transitional to this day, and the idea of people adopting CSS3 and writing documents using HTML is pretty far-fetched. But you would never hear that from the CEO of an HTML browser company.
      • by lahvak (69490)
        Why is anyone even talking about the opinion of a CEO?

        Maybe because, surprising as it may be, part of what he says makes a lot of sense. I completely agree with him that the two proposed "standards" are both complete crap. And I also agree that HTML/CSS combination has, in principle, a lot of merit.

        A good portable document format should not have anything about internal representation of the document in the memory, neither it should any specific software, or even a specific version of such software, be ment
  • Hmm... both of these standards suck. I know what, we need another choice!

    Somehow I don't think that's going to fix the problem. Oh, and pointing out that the Microsoft letter doesn't validate: isn't that a little petty?
    • Only problem is, the Oasis page itself doesn't validate. However, it seems Wikipedia does...

      But if the Oasis pages did validate, the basic argument goes like this: "How can they claim to care about standards if they can't even bother to support that most universal standard of standards, HTML?" And indeed, I could still make that argument -- just look at the sad, sad state of affairs that is Internet Explorer's CSS [mis]handling.
    • > Hmm... both of these standards suck. I know what, we need another choice!
      >
      > Somehow I don't think that's going to fix the problem.

      Depends on what you define the problem as: that there are too many "standards", or that all of them suck. If the latter, defining a new standard that does not suck solves the problem.
      • He is not proposing a new standard that doesn't suck. He is proposing a new standard that sucks, but is already partially supported in a variety of slightly different, not quite compatible, ways.

        To anyone who doesn't think XHTML/CSS sucks, look up how many ways there are of saying 'red' in CSS. I was implementing a partial CSS parser a while back, and the specification seems to have been written by document authors with no thought to implementers.
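
        To give a feel for what that means for an implementer, here is a rough Python sketch that accepts five common spellings of the same colour. It is only an illustration, not a conforming CSS parser: the `NAMED` table is cut down to three entries and `parse_css_color` is a made-up helper name.

```python
import re

# Deliberately tiny table of named colours; a real parser needs the full list.
NAMED = {"red": (255, 0, 0), "white": (255, 255, 255), "black": (0, 0, 0)}


def parse_css_color(value):
    """Return an (r, g, b) tuple for a handful of CSS colour notations."""
    v = value.strip().lower()

    # 1. Keyword: red
    if v in NAMED:
        return NAMED[v]

    # 2. Short hex: #f00 (each digit is doubled)
    m = re.fullmatch(r"#([0-9a-f]{3})", v)
    if m:
        return tuple(int(ch * 2, 16) for ch in m.group(1))

    # 3. Long hex: #ff0000
    m = re.fullmatch(r"#([0-9a-f]{6})", v)
    if m:
        digits = m.group(1)
        return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

    # 4. Integer triple: rgb(255, 0, 0), clamped to 255
    m = re.fullmatch(r"rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)", v)
    if m:
        return tuple(min(int(g), 255) for g in m.groups())

    # 5. Percentage triple: rgb(100%, 0%, 0%), clamped to 100%
    m = re.fullmatch(r"rgb\(\s*(\d+)%\s*,\s*(\d+)%\s*,\s*(\d+)%\s*\)", v)
    if m:
        return tuple(round(min(int(g), 100) * 255 / 100) for g in m.groups())

    raise ValueError(f"unsupported colour syntax: {value!r}")


SPELLINGS = ["red", "#f00", "#FF0000", "rgb(255, 0, 0)", "rgb(100%, 0%, 0%)"]
results = [parse_css_color(s) for s in SPELLINGS]
```

        Five branches for one colour, and that is before any of the notations added by later CSS drafts.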

  • How come? (Score:5, Funny)

    by ShaunC (203807) * on Saturday February 24, 2007 @02:56AM (#18132242)

    If forced to choose one, I'd pick the 700-page specification (ODF) over the 6,000-page specification (OOXML).
    So I'd ask Håkon, "how come?" :)
    • I don't know. I would also like to know how you can evaluate the strengths and weaknesses of a system based solely by its size.

      Besides, how often is a human planning on parsing the files manually? If you ask me, the only purpose these open document file formats serve is to be opened by other word processors, which means as long as it's standardized it could probably look like Chinese and it wouldn't faze me in the least.
      • by vmcto (833771) *
        So you don't think an application developer wanting to do something as simple as text searches to provide document integration capabilities should be considered? Let the word processor people make our documents as incomprehensible as possible?
      • by jlarocco (851450)

        I don't know. I would also like to know how you can evaluate the strengths and weaknesses of a system based solely by its size.

        I don't think he was evaluating the strengths and weaknesses by their size. He plainly said both standards suck, regardless of their size. Given that, if forced to choose which one to implement, it makes more sense to suffer through a "mere" 700 pages instead of 6000.

        Besides, how often is a human planning on parsing the files manually? If you ask me, the only purpose these ope

    • Re: (Score:3, Insightful)

      by _Shad0w_ (127912)

      My speculation would be that no-one wants to sit and read a 6,000 page specification. 700 pages is far more palletable.

      It's a crap way of judging the relative merits of specifications, but human nature will out.

      • by 1u3hr (530656)
        My speculation would be that no-one wants to sit and read a 6,000 page specification. 700 pages is far more palletable.

        Yep. You could certainly fit almost nine times as many on a standard pallet.

      • My speculation would be that no-one wants to sit and read a 6,000 page specification. 700 pages is far more palletable.

        You don't need to. The only people who read entire specifications are their authors and the standards bodies. As a designer, you only care about the parts of the specification that you are responsible for implementing.

        ODF is unfortunately rather incomplete in some areas. There are no specifications for which spreadsheet formulas have to be implemented, or how they are implemented. Tables ar

    • "Fly aeroplane."
    • Re:How come? (Score:5, Informative)

      by PCM2 (4486) on Saturday February 24, 2007 @05:54AM (#18132822) Homepage

      So I'd ask Håkon, "how come?" :)

      Since nobody gets it, I'll spoil it: That's how Håkon advises people to pronounce his name. It's even on his business card.

  • Prince is a commercial product. I have a minor need to produce PDFs from XHTML/CSS and I really don't want to deal with licensing. I would need to run it on a server where multiple people can access it, which means I would have to pay $3800 for Prince. Ouch! I don't need to do this that badly. Is there any way to do this with Free/Open Source software?
    • Re: (Score:3, Informative)

      by smoker2 (750216)
      CSS notwithstanding, you can use HTMLDOC [htmldoc.org] to produce PDFs from HTML pages. If you are creating reports etc dynamically anyway, just create a temporary HTML file and convert it through HTMLDOC. I use Perl to generate reports and interface with HTMLDOC, but YMMV.
      An example of the HTMLDOC specific code used in the conversion :

      # Run HTMLDOC to provide the PDF file to the user...
      system "htmldoc --continuous --browserwidth 800 --landscape --size A4 --header ... --left 1in --embedfonts -f $fileref.pdf $filename"

  • Can != Should (Score:3, Insightful)

    by gbulmash (688770) <semi_famous@yahooBLUE.com minus berry> on Saturday February 24, 2007 @03:58AM (#18132438) Homepage Journal
    Been a long time since I typeset anything, but I used Adobe PageMaker when I typeset a couple of college magazines in the mid-90s and FrameMaker when I was maintaining courseware in the late '90s for Nortel.

    HTML + CSS vs. Word vs. OO.o seems to me to be an argument related to formatting documents, not a "book". It's not that you couldn't do it, but I'd consider using Quark or InDesign (what seems to be Adobe's successor to PageMaker) or even TeX and its variants (haven't used any TeX-based stuff, but heard wonderful things) for typesetting.

    Arguments about standards aside, proofs of concept aside, I'd think that the real issue when it comes to any job is using the best tool for it. It's not a question of whether you can use these tools to typeset a book, but whether you should.

    The point of the proof of concept is to prove that the system is flexible or capable enough to go beyond its original intended use. I get that. But proving a chainsaw can be used to spread butter doesn't mean it's inherently superior to a coping saw.

    - Greg
  • by mennucc1 (568756) <d3slash@mennucc1.debian.net> on Saturday February 24, 2007 @03:58AM (#18132440) Homepage Journal
    An extract of H. Wium's arguments:

    ODF is an XML-based dump of the internal data structures of OpenOffice, while OOXML is an XML-based dump of the internal data structures of Microsoft Office.

    In 2006, a year or so after ODF entered the fray, Microsoft submitted OOXML to the standardization process. Are we seeing a pattern here? Is Microsoft undermining standards by submitting them? Could it be that it wants both ODF and OOXML to fail?
    So Wium proposes to build a new standard from scratch, starting from HTML and CSS; but, recognizing that they would not cover all "Office" documents, he goes on to say

    Additional semantics (say, formulas in spreadsheets) can be encoded as attributes, as do microformats, and CSS 3 offers advanced features for printing (e.g., footnotes and header and footers).
    My thoughts:
    • Suppose Microsoft were to listen to Wium (which they won't). Guess what? Those additional fields containing formulas (and anything else that makes {MS,Open}Office much more useful than HTML) again would be just an XML-based dump of the internal data structures of so and so.
    • More generally, I don't like this article. Wium is saying that Microsoft is proposing OOXML to kill ODF, and at the same time he is proposing to kill ODF in favour of a non-existent extension of HTML+CSS. It is like the guy saying "I don't like the power plugs in my new house, let's tear the house down and rebuild it" and at the same time saying "why are they taking so much time to build the house?". Suppose Microsoft were to use arguments like Wium's to convince ISO to reject ODF and then start a new draft based on HTML, drafted in cooperation between Microsoft and other partners (including OpenOffice). That would really kill any hope of an ISO standard for "office" documents.
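
    For anyone wondering what "formulas encoded as attributes" might look like in practice, here is a small Python sketch. The `data-formula` attribute name and the `=A1+B1` syntax are my own inventions for illustration, not anything taken from the article or from an existing microformat. The cell keeps its computed value as ordinary text, so a plain browser still shows "5", while a spreadsheet-aware tool can read the formula back out.

```python
from html.parser import HTMLParser

# Hypothetical markup: the formula rides along in an attribute, the cached
# result is the visible cell text.
SAMPLE = """
<table>
  <tr><td>2</td><td>3</td><td data-formula="=A1+B1">5</td></tr>
</table>
"""


class FormulaCollector(HTMLParser):
    """Collect (formula, displayed value) pairs from table cells."""

    def __init__(self):
        super().__init__()
        self.formulas = []
        self._in_cell = False
        self._formula = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self._text = []
            # attrs is a list of (name, value) pairs; names arrive lowercased.
            self._formula = dict(attrs).get("data-formula")

    def handle_data(self, data):
        if self._in_cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            if self._formula is not None:
                shown = "".join(self._text).strip()
                self.formulas.append((self._formula, shown))
            self._in_cell = False


def collect_formulas(html_text):
    parser = FormulaCollector()
    parser.feed(html_text)
    parser.close()
    return parser.formulas


found = collect_formulas(SAMPLE)
```

    Note that the sketch says nothing about what the formula language itself means, which is exactly the part someone would still have to specify.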
  • by zerblat (785)
    HTML sucks for books. The reason is simple. HTML was designed for web pages. HTML does a fairly good job of covering the things you need when you create a web page (although, why is there no <menu>, and a bunch of other stuff that need to be fudged by using elements that don't really fit). In HTML there is no <chapter>, no <footnote>, no <toc>, no <index>. Also, with HTML, one file == one document. If you're writing a book, it would be nice to be able to for example have one fi
  • fonts (Score:3, Informative)

    by cybpunks3 (612218) on Saturday February 24, 2007 @04:33AM (#18132576)
    The problem with using HTML for publishing is that to this day there is no viable downloadable font system. So you are limited to a lowest-common-denominator list of 2-3 fonts like Verdana and Times New Roman. With Flash and PDF you can do a lot more, but obviously authoring becomes a problem.

    • With Flash and PDF you can do a lot more, but obviously authoring becomes a problem.

      Well, maybe. Have you ever looked at font licenses? There are more than a few digital type foundries using licenses that expressly prohibit embedding outlines of the glyphs in other files. This is because the outlines of the glyphs (mathematically represented by the font software) are the one part of the font that's actually copyrightable; the actual glyphs themselves are not (this is why so many foundries have their own

  • Too true (Score:4, Insightful)

    by iamacat (583406) on Saturday February 24, 2007 @04:41AM (#18132598)
    A 700-page spec is not understandable by anyone but its authors. The "C Programming Language" book is 1/3 the size, has endured for 20 years and was instrumental in solving many more problems than word processing. Also, creating an ODF document is a minor function in most applications and is not worth the effort to understand such a huge standard. Proponents of both standards should come up with a modular design instead. At the base level, stick with basic HTML - bold and italic tags, fonts and sizes, paragraph breaks. Define many extensions that can be implemented independently or in any combination, in a manner convenient for both computers and, in a pinch, humans. The Opera guy is biased as well - while basic HTML is great at its limited function, CSS is not very readable by humans. Nor does it solve pagination, collaborative editing, resolution independence, color profiles for printing...
  • by Rudd-O (20139) on Saturday February 24, 2007 @04:44AM (#18132616) Homepage
    And it worked out great.

    http://software-libre.rudd-o.com/ [rudd-o.com]

    Used MediaWiki to write the chapters, wrote a small Python proggie (available there) to consolidate the wiki into a single HTML file (mostly conforming to the Boom! microformat), then used Prince and Håkon's book CSS to generate the PDF.

    Great typesetting, collaborative book editing, screw LaTeX!

    Håkon was right.
    • Re: (Score:3, Insightful)

      by AlXtreme (223728)

      Great typesetting, collaborative book editing, screw LaTeX!

      Those who don't understand LaTeX are doomed to reinvent it... poorly.
  • If HTML+CSS offered a better model instead of the box model (for example, the point-line model) and offered some way of doing basic data structures, I'd agree. The current box model is very limiting in its layout abilities.

    Modern documents have so many binary data types inserted in them (images, fonts, etc.) that HTML+CSS isn't enough. It isn't even enough on the web and that's why JavaScript and Flash are so prevalent. There needs to be another specification to support all the needs/wants of the users (who are

    • Modern documents have so many binary data types inserted in them (images, fonts, etc.)

      Firefox and Opera (iirc) both support base64 encoded binary data being stuffed inside img src. I can't remember if IE supports it or not, or if it's in the actual standard though. Not that I agree with using HTML for typesetting.
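
      For reference, the mechanism is the `data:` URL scheme (described in RFC 2397, as far as I know). A minimal Python sketch of building one follows; `to_data_uri` and `img_tag` are hypothetical helper names, and the bytes below are only the 8-byte PNG signature used as placeholder input, not a complete image.

```python
import base64
import html


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data: URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(data: bytes, mime_type: str, alt: str = "") -> str:
    """Build an <img> element whose src carries the bytes inline."""
    src = to_data_uri(data, mime_type)
    return f'<img src="{src}" alt="{html.escape(alt, quote=True)}">'


# Placeholder payload: just the PNG file signature, not a displayable image.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
tag = img_tag(PNG_SIGNATURE, "image/png", alt="inline example")
```

      Base64 inflates the payload by roughly a third, so this suits small images better than a book's worth of figures.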
  • I wonder if it's true, after all there are two implementations of ODF: OOo and KOffice, it'd be interesting to hear KOffice developers on the subject.

    Recently I heard a criticism of ODF by Miguel de Icaza: that ODF doesn't reuse standards like SVG as much as it should.
  • by IBitOBear (410965) on Saturday February 24, 2007 @06:09AM (#18132884) Homepage Journal
    I use OpenOffice. I support Open Document Format over MS/XML and .doc.

    That said, ODF kind of blows. Really.

    I write novel-length "books" and it is FREAKING IMPOSSIBLE to do some very basic things in any/every ODF based word processor I have tried to date.

    Exercise for the Interested:

    Make a "Book" with an automatic table of contents, said table to contain an "Author's Note", "Prologue", auto-numbered chapters 1 to N with their associated chapter titles (where the actual chapter number is the chapter number internal variable), and finally "Epilogue" all at the same level of the index.

    This simple task is essentially impossible. The flaw is caused by the fact that everything goes through the "styles" and the styles don't inherit their list membership properties. You should be able to make a style "TOC Entry" that is assigned to a particular table of contents level (e.g. level 1) then make a sub-style "Chapter Heading" based on "TOC Entry" but with the chapter numbering magic attached, and in so doing, create "different styles" that go to the same level/point in the list.

    Exercise for the Interested:

    Make a "Book" with each chapter, and the prolog, and the epilog in separate sub documents. The linkage thing is a mess, it is hard to move "the pile of files" around especially if you want to use subdirectories (etc). If you have a custom style in the master document style list you have to _USE_ it in the master document if you want it to be pushed into the created sub-documents. Once the sub-documents are created it is a royal pain (read effectively impossible, or "supremely hidden feature required") to update those styles in those sub documents if you change that style.

    Exercise for the Interested:

    Put three separate "outlines" into one ODF Document. In ODF the outline is a function of the style headers, they only exist as implications of structure instead of first class abstractions. This is largely the fault of Microsoft Word, since the Word folks totally messed this up when they supplanted WordPerfect (which did this inset outline/object sort of thing right).

    ODF was, IMHO, poisoned by the slavish attempt by someone trying to make a Word killer instead of a "good word processor."

    And there are stacks more of these issues.

    And all that said, I *STILL* use ODF (Open Office etc) because I CATEGORICALLY REFUSE to _RENT_ the right to access my own work from a third party. Microsoft has plainly stated that such a rental model is their intended business plan, which makes them a non-starter.

    In my opinion, having used both Word and OpenOffice for years, and having used WordPerfect and WordStar before them, ODF is a "workmanlike effort" to create a document format suitable for "normal business purposes". There is a reason that the legal profession never moved over to Word, and they likewise will not move to ODF, when you need to get to a tightly proscribed document format, both Word and ODF have a "you can't get there from here" fundamental limitation. Both formats simply refuse to represent some things because the designers "know" that a different format is better. Neither ODF nor Word has any allowances for _art_, professional or poetical.

    So, governments should use ODF because it is "no worse" than Word in terms of the ability to represent the documents it can represent, and given that congruence, the shorter, 100% open standard is, or should be, a hard minimum requirement.

    In terms of ODF being the be-all and end-all of document representation, I'd have to say "hardly!" I looked into the OpenOffice code base a while back to see if adding/changing the format to allow for "a book" would be reasonable. It didn't appear to be. Too many of the original StarOffice assumptions about document structure seemed pathologically uninspired. It was like looking at a big pile of Visual Basic. Everything in the standard is way too global, nothing "nests organically" it all nests pedagogically. (Every
    • by digitect (217483)

      Insightful comments, I've rarely heard the argument made so well. You obviously ARE a writer. ("Renting the right to access your documents", great way to put it.)
    • There is a reason that the legal profession never moved over to Word, and they likewise will not move to ODF, when you need to get to a tightly proscribed document format, both Word and ODF have a "you can't get there from here" fundamental limitation.

      Uh, just an FYI. The legal profession uses MS Word. For years they did keep to WordPerfect in the USA, but of the 20 lawyer friends of mine at Tier 1 USA law firms, all use MS Word. WordPerfect died for the legal profession when it failed to create

    • by Luke (7869) on Saturday February 24, 2007 @12:39PM (#18134466)
      Using a word processor to write a book is like using stone tablets and an abacus for spreadsheets. You really ought to look at markup-based typesetters like LaTeX or DocBook or software specifically designed for book production.
  • Google docs (Score:2, Interesting)

    by edxwelch (600979)
    He wants HTML/CSS documents? Isn't this what Google docs do?
    Anyways sounds like a good idea to me. I often have to share documents and I don't like to have to force people to install a specific application just to read them.
  • Um... NO (Score:5, Informative)

    by salesgeek (263995) on Saturday February 24, 2007 @08:26AM (#18133282) Homepage
    ODF is not about web pages or word processing. It's a standard for office documents including spreadsheets, presentations and word processing. That's a big difference from what Opera's CTO is talking about. CSS/HTML might make a good format for one part of the suite (word processing) with a lot of work on the standard. The issue: that's not what is needed for a standard. It's about doing for office documents what HTML did for websites. ODF is actually an opportunity for Opera - extend the browser to support ODF so people can post ODF documents, make dynamic applications render to ODF and so on. It takes the web to the next level and further erodes the big monopoly.
  • Build a word processor that uses html/CSS with options and flexibility comparable to OpenOffice.

    Actually, I have every confidence that Opera can. . . I've been a happy Opera user since 1999.
  • How about Microsoft and OpenOffice just keep their own XML formats? One of the great things about XML is that you can use XSLT to transform one XML document into another one with different syntax. As long as both products can open, display, and convert the other format then I don't really see the need for a standard in this situation.
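
    As a toy picture of that kind of vocabulary-to-vocabulary conversion, here is a Python sketch. It is not XSLT (Python's standard library has no XSLT processor, so the mapping is done by hand with ElementTree), and the element names on both sides are invented; neither resembles the real OOXML or ODF schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one made-up vocabulary to another.
TAG_MAP = {"doc": "document", "para": "p", "b": "strong"}


def transform(elem):
    """Recursively copy an element tree, renaming tags via TAG_MAP."""
    new = ET.Element(TAG_MAP.get(elem.tag, elem.tag), dict(elem.attrib))
    new.text = elem.text
    new.tail = elem.tail
    for child in elem:
        new.append(transform(child))
    return new


def convert(xml_text):
    root = transform(ET.fromstring(xml_text))
    return ET.tostring(root, encoding="unicode")


converted = convert("<doc><para>Hello <b>world</b>!</para></doc>")
```

    Renaming elements is the easy case. The conversion gets much harder when the two formats model things like lists or styles differently, because then there is no one-to-one mapping to write down.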

    A standard is going to limit innovation in word processors unless you specifically allow extensions in the standard, which kind of defeats the purpose of a standard.

    If the go
  • "2 standard formats is not a good idea"

    "That's the reason I'd wish to add a third"

    I wish he hadn't ruined the entire opinion with all those HTML/CSS pipe dreams; they are extremely unrealistic, never mind the mess it would be to adapt them in a useful way for office formats. Really, this was kind of a shame.

    I could also write a book in .txt but that doesn't make it any better of a format for office software.
