Only 4.13% of the Web Is Standards-Compliant

Posted by CmdrTaco on Thu Oct 16, 2008 09:03 AM
from the i-woulda-guessed-less dept.
Death Metal writes "Browser maker Opera has published the early results of an ongoing study that aims to provide insight into the structure of Internet content. To conduct this research project, Opera created the Metadata Analysis and Mining Application (MAMA), a tool that crawls the web and indexes the markup and scripting data from approximately 3.5 million pages."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • ...on which standard the designer chose.
    • 'Looks good in Internet Explorer and doesn't seem to crash Firefox or Opera' is not a standard.

      • by gnick (1211984) on Thursday October 16 2008, @09:48AM (#25399923) Homepage

        'Looks good in Internet Explorer and doesn't seem to crash Firefox or Opera' may not be a standard, but it satisfies the bulk of most web-sites' customers. I'm a FF user and include myself in that group. I realize that sites are tuned for IE because it's the leader and accept that my browser choice and add-ons sometimes make things look a little funny - As long as they work I don't care. I would guess that most visitors feel more or less the same (slashdot standards nazis excepted).

        Besides, if most of a web site's traffic is coming from a browser that doesn't support any standard but their own anyway, what motivation do they have to conform?

        • by Bogtha (906264) on Thursday October 16 2008, @10:08AM (#25400237)

          'Looks good in Internet Explorer and doesn't seem to crash Firefox or Opera' may not be a standard, but it satisfies the bulk of most web-sites' customers. I'm a FF user and include myself in that group.

          The problem with that attitude is that not so long ago, Firefox wouldn't be in the list, and for many developers (including some I worked with this week) Opera is still not on that list. It's like Internet Explorer only websites, except only slightly laxer. So you use Firefox. Lucky you! How about all the people who use something less popular, e.g. Konqueror? How about all the people who must use something that will never be popular, such as people with disabilities? Shall we just say "tough, get off the web"?

          As long as they work I don't care.

          "Working" is not a property of a website. "Working" is a property of a combination of a website and a browser. You can't say that a website "works", only that it works in particular browsers.

            • by Jellybob (597204) on Thursday October 16 2008, @11:22AM (#25401379) Journal

              The point is that if the site is built to standards, you don't have to spend a lot of money making that small number of people happy. Their browser is probably built to render sites as the standards specify, and so it'll probably work anyway.

              I build web applications at work, and only have to make it work in Firefox, but because I'm using standards, and think about what I'm doing, I can be fairly confident that it's going to work in most other things as well. I have users happily using my apps in some barely known browsers, without problems.

      • by jellomizer (103300) on Thursday October 16 2008, @09:54AM (#25400049)

        Internet Explorer is really the big troublemaker here. Any professional knows that their site needs to render flawlessly in IE first, good enough in Firefox, and perhaps workable in others. Following the "standards" barely gets you there, as IE handles the standards so poorly that you really need to break them. I am still trying to find the HTML tag that gives IE users an electric shock.

      • Yes it is. It is the standard that everyone shoots for. The de facto standard, if you will. It is not a rigorously defined standard published by an internationally recognized standards body. I'm afraid there is not a single standard definition of the word standard [google.com] in the English language.

        Isn't English fun, my compeer?
        • by remmelt (837671) on Thursday October 16 2008, @09:31AM (#25399643) Homepage

          It sure makes your Slashdot comment non-standard!

          • I disagree- I'd say that's a pretty standard Slashdot comment.
              • by Lachlan Hunt (1021263) on Thursday October 16 2008, @10:12AM (#25400321) Homepage

                Does using "blink" make my code non-standard?

                Yes, because blink is not defined as conforming in any standard. However, it is possible to make a page containing blink (or any other element or attribute you like) pass validation by providing a custom DTD or an internal subset.

                But note that the claim that "4.13% of the Web Is Standards-compliant" isn't quite accurate. The study only used the W3C markup validator, which is only able to detect a subset of the machine checkable conformance criteria. It's trivial to create a non-standards compliant page that passes validation.

              • Oh, well in that case, not at all.
                Both blink and marquee are two very useful tags that should be used on every website.
              • by bunratty (545641) on Thursday October 16 2008, @10:12AM (#25400331)
                If you haven't validated an HTML page, you can fairly safely assume it's not valid HTML. Just like if you type in a program and never run it through a compiler, it probably has a syntax error in it somewhere. It's the exception that a non-trivial program compiles on the first try. Likewise, if you don't validate your HTML, it likely contains syntax errors that cause it not to validate. You should cross your fingers that all browsers, past, current, and future, deal with the syntax error in the way that's favorable to you.
    • by g0dsp33d (849253) on Thursday October 16 2008, @09:22AM (#25399513)
      But if we completely reverse the standards we should be at 95.87% compliance!
      • by elrous0 (869638) * on Thursday October 16 2008, @09:46AM (#25399875)
        There are only two standards that I care about: "How does my page look and work in Internet Explorer?" and "How does my page look/work in Firefox?" Beyond that, I couldn't really give a shit less if the W3C does or doesn't like it. My clients aren't paying me to spend extra time designing perfectly W3C-compliant sites, they are paying me to design a site that reaches real-world customers in as efficient a manner as possible.
  • More like (Score:5, Funny)

    by ODiV (51631) on Thursday October 16 2008, @09:07AM (#25399263)

    OMG 4.13% of the Web is Standards-compliant!?

  • W3C (Score:5, Informative)

    by eldavojohn (898314) * <my/.username@@@gmail.com> on Thursday October 16 2008, @09:08AM (#25399305) Homepage Journal

    W3C's validation tools

    Normally I'd go on my own rant but I'm feeling lazy today and recently I read a good article at A List Apart that sums it up [alistapart.com]. As for the W3C, I like this list they compile:

    W3C's Pros & Cons

    Pros:

    • Global
    • Academic and scientific body
    • Multiple interests represented, but mostly from paid member companies
    • Attempting to be more open via certain teams such as the HTML5 and CSS Working Groups
    • Attempting to appeal more to work-a-day world via redesigns, blogs, and more human-friendly language throughout the site

    Cons:

    • Creates "open standards" by ideal, not necessarily fact
    • Incredibly slow moving in a highly evolutionary environment
    • Poor economic model that relies on membership monies
    • Discourages independents and open process
    • Passive: only creates specs and recommends, does not do real outreach
    • "Ivory tower" perception

    You should read that article, it's pretty spot on for this subject.

    • Re:W3C (Score:5, Insightful)

      by Bogtha (906264) on Thursday October 16 2008, @09:37AM (#25399747)
      • Incredibly slow moving in a highly evolutionary environment

      That's hilarious. We still can't use CSS tables or generated content on the web, features that were published by the W3C in the CSS 2 specification over a decade ago, because Internet Explorer doesn't support them yet. We need to use JavaScript frameworks or otherwise normalise event handling because Internet Explorer doesn't support DOM 2 Events, a specification published by the W3C eight years ago (even Internet Explorer 8 won't support this). And SVG anyone? XHTML? MathML?

      Get back to me when browsers make it out of the 90s before telling me the W3C is "incredibly slow moving".
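
The event-handling normalisation mentioned above typically looked something like the following minimal sketch. `addEvent` is a hypothetical helper name, not taken from any particular framework; the sketch assumes the standard DOM 2 `addEventListener` method where it exists and falls back to Internet Explorer's proprietary `attachEvent` otherwise.

```javascript
// Hypothetical helper: register a handler through DOM 2 Events when the
// browser supports it, otherwise through IE's proprietary attachEvent.
// Returns a label naming the path that was taken.
function addEvent(element, type, handler) {
  if (element.addEventListener) {
    // Standards path (DOM Level 2 Events); false selects the bubbling phase.
    element.addEventListener(type, handler, false);
    return 'dom2';
  }
  if (element.attachEvent) {
    // Legacy IE path: event names carry an "on" prefix.
    element.attachEvent('on' + type, handler);
    return 'ie';
  }
  throw new Error('No supported event registration method');
}
```

Truthiness checks are used rather than `typeof ... === 'function'` because older IE versions report host methods as `'object'`. A real framework would also have to smooth over differences in the event object itself, which this sketch does not attempt.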

        • Re:W3C (Score:4, Insightful)

          by Bogtha (906264) on Thursday October 16 2008, @12:09PM (#25402045)

          IE was the first browser to implement CSS 2 as specced by the W3C, who then, faced with a working implementation from a large company, decided to make major changes to the spec.

          This is quite simply delusional. Here is the first released specification for CSS 2 [w3.org]. Go and read the tables section. Go and read the generated content section. Go and find out when Internet Explorer had a working implementation of these features. Then go and inform Microsoft, because they, along with the rest of the world, seem to be under the impression that these are new features in the Internet Explorer 8 betas.

          In actual fact, there have been changes made to CSS 2 that make Internet Explorer more compatible. For instance, display: inline-block was originally an Internet Explorer proprietary feature that was added to CSS 2.1.

  • I wonder if (Score:3, Interesting)

    by Jane_Dozey (759010) on Thursday October 16 2008, @09:09AM (#25399321)

    I wonder if they're throwing away every page that doesn't fully comply or if they're actually including the pages that almost comply but have a typo or missing doctype or missing closing tag. I'm guessing the former by the numbers which seems a little unfair to me.

  • by cosmocain (1060326) on Thursday October 16 2008, @09:10AM (#25399339)
    ...the rest just renders perfectly in IE.

    (i would prefer if there wasn't any truth in it.)
  • Why is this a surprise? We are limited by non-standards compliant browsers.
    Unfathomable amounts of development time have been wasted over the years trying to get sites running and usable in multiple browsers.
    To complicate the issue, over the last few years there has been an explosion in the number of browsers on the market. It is really no fun navigating this modern tower of Babel.
    If I had one wish that would be granted, it would be that all browsers would be compliant to a standard. Literally millions of man years in development time could have been saved if this issue was somehow nipped in the bud earlier on.
  • by bboxman (1342573) on Thursday October 16 2008, @09:21AM (#25399489)

    This is sad. The situation is even worse in some non-English web domains.

    Why can't the web stick to something simple? 95% of the sites I use would be fine with just plain simple HTML 2.0. Instead, we've got JavaScript, CSS, XHTML, and other buzzwords, which in the end take control of how a web page looks out of the user's hands.

    I like to read text, on a monitor, green on black (or white on black). I would like to format a web page the way I want to see it.

    The vast majority of the web is simple formatted text. There is no reason for this to constantly evolve onwards and onwards.

  • 1. the web is still evolving, the standards keep changing. no pressing need to lock things in
    2. it is superior design to have a browser that gracefully degrades rather than being brittle and refusing to render every time someone forgets to close a <p> element. not simply because of nonstandard pages, but for a whole host of other reasons, including handling partial transmissions
    3. the strength of the web is open participation, low barrier to entry. hobbyists should publish, and this is a good sign. hobbyists should not be expected to be anal retentive standards zealots

    complete standards compliance should always be low on the web because this is a sign of a HEALTHY internet, because it means nonprofessionals are contributing content. this is always a good thing, this is what made the internet a powerful new form of media in the first place. if ever there were some sort of gatekeeper organization or rigorous technical specification that enforced standards compliance, you would raise the barrier to entry onto the web by regular joes. you would reduce the variety of the web, make it more monoclonal, and hurt a vibrant community

    low standards compliance is not only a complete nonissue and not a problem, it's a good sign. the lower standards compliance is, the better for us all

  • by thermian (1267986) on Thursday October 16 2008, @09:35AM (#25399707)

    Does it mean that 94% of websites did not find the standard useful?

    Or perhaps that the standard is poorly presented, causing fewer people to be aware of it?

    My personal leaning is that the standards body lost control of their 'standards' a long time ago, but they haven't realised yet. The only real thing most web devs care about is 'does my site/application run as required in the browsers I need it to?' If the answer is 'yes, if you don't follow the standard', then the standard is ignored.

  • You only need to make one mistake in your markup to be non-compliant. I would be interested to see what the degree of failure is for the other 95.87% of sites. My website, Wii Fit Forum [wiifitforum.org] currently fails on six counts, all just simple errors in the code which I plan to fix. But currently, the site displays just fine, so I have more important things to worry about. I think this is the same for many publishers.

    Unfortunately for the novice, the ignorant, the lazy or the just plain error-prone (the last two are me), the W3C and the browser industry do not make it that easy to be compliant.

    HTML standards are the current prime example of the old joke "the great thing about standards is that there are so many of them". The W3C really needs to stop pissing around with all this semantic web crap, and concentrate on making what is already there work better.

    We need a single standard which embodies all the best elements of the existing ones in a coherent form, and then the browser manufacturers need to get their arses in gear and implement it properly. The novice developer is currently confronted with a mish-mash of alternative doc-types, each of which has different pros and cons, and which may or may not work properly depending on your browser. It needs to be done soon, not over a ten year timescale.

    When you can stop worrying about whether your site will work in various browsers, then people will spend more time on compliance. Until then, people will worry about the important things, such as their readers being able to see their site properly.

    I know I should treat standards with more importance, but while the current mess persists it is hard to care.

      • As I said, I will, but it is low down my priority list because there are other things that will actually make a difference to my site's users that I need to do first. HTML standards compliance is, for most people, irrelevant. It's like using perfect grammar, it's a good thing to have, but failing doesn't make you unintelligible. Only when your grammar is really bad, or your standards compliance is utterly terrible, do you become unintelligible (to man and browser), at which point it becomes a problem.
  • by PortHaven (242123) <saj AT easternstorm DOT net> on Thursday October 16 2008, @09:40AM (#25399791) Homepage

    When they don't work with the tools (various browsers).

    Better to build a website that works than one that meets standards but displays poorly in the browsers of your users.

    Ask yourself this simple question. If it does not look good in the browser, is your client going to accept "Well it's coded to standards!". Heck no...

  • by Ed Avis (5917) <ed@membled.com> on Thursday October 16 2008, @10:25AM (#25400525) Homepage

    Nowadays making sure your site is valid HTML is easy. Just install the excellent HTML validator plugin [skynet.be] for Firefox. It gives you a tick or cross icon on each page; double-click the cross to view the page source with a list of errors. It does the validation locally on your machine, not sending the content off to some server, so it's fast.

    If you're writing dynamically generated pages it is a great way to find bugs in your code, and it's unobtrusive enough to leave it turned on all the time.

    • Re:How compliant? (Score:5, Informative)

      by DrSkwid (118965) on Thursday October 16 2008, @09:08AM (#25399301) Homepage Journal

      It is very simple http://validator.w3.org/ [w3.org]

          • by Bogtha (906264) on Thursday October 16 2008, @09:42AM (#25399815)

            Yes. HTML 4.01 and XHTML 1.0 each have two DTDs: a "transitional" DTD that allows presentational elements and a "strict" one that disallows them.

            No, that's something different. There aren't degrees of strictness when it comes to validity. If a document claims to be a Strict document, and makes a single mistake, then it is invalid. If a document claims to be a Transitional document, and makes a single mistake, then it is invalid. In both cases, it's an absolute rule with no laxity.

    • Isn't that a bit like saying, "my C code fails to compile whenever I pass it the flag for strict ANSI checking, but other than that my code is ANSI C compliant"?

    • Re: (Score:3, Interesting)

      Also depends on how old the websites they searched are..
      only recently added websites or also websites and old pages that exist longer than the standard they validated against exists ?

      • Re:How compliant? (Score:5, Informative)

        by Bogtha (906264) on Thursday October 16 2008, @09:26AM (#25399565)

        only recently added websites or also websites and old pages that exist longer than the standard they validated against exists ?

        MAMA didn't validate against a single document type. They validated against the document type that each individual document claimed to be. So all the ancient HTML 2.0 pages out there will correctly be identified as valid if they are, in fact, valid HTML 2.0.
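
As a rough illustration of "the document type that each individual document claimed to be", the sketch below pulls the public identifier out of a page's DOCTYPE declaration. It is not how MAMA or the W3C validator is implemented; `declaredPublicId` is a hypothetical helper, and the sample pages are made up for the example. Only the public identifiers themselves are the real ones for HTML 2.0 and XHTML 1.0 Strict.

```javascript
// Hypothetical helper: return the public identifier from a document's
// DOCTYPE declaration, or null if the page declares none.
function declaredPublicId(html) {
  const match = /<!DOCTYPE\s+html\s+PUBLIC\s+(["'])(.*?)\1/i.exec(html);
  return match ? match[2] : null;
}

// Made-up sample pages, each claiming a different document type (or none).
const samples = [
  '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><title>Old page</title>',
  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" ' +
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html></html>',
  '<html><body>No declaration at all</body></html>',
];

for (const page of samples) {
  console.log(declaredPublicId(page));
}
```

A validator would then check each page against the DTD named by that identifier, so an HTML 2.0 page and an XHTML 1.0 Strict page are each judged by their own rules.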

    • Re: (Score:3, Informative)

      Depends on how strict they're being.

      There aren't degrees of validity. A document is either valid or it isn't. You can't be "more strict" when validating something, if a tool offers you an option like that, then it is doing something other than validating, it's probably linting as well. There's at least one widely-used "validator" that doesn't actually validate at all.

      For example, I never close paragraph and line break tags, but otherwise my html is compliant.

      Yes you do. If you didn't close them

    • by sakdoctor (1087155) on Thursday October 16 2008, @09:15AM (#25399421)

      Well not in the least bit idiotic actually.
      It's up to me as a user to choose where a url opens, especially since we are all using the tabbed paradigm now.

      • It's up to me as a user to choose where a url opens, especially since we are all using the tabbed paradigm now.

        User agents currently do not allow the user to submit a form into a new window or tab. This is the nearly nine-year-old bug 17754 on bugzilla.mozilla.org with 99 votes.

      • by l0ungeb0y (442022) on Thursday October 16 2008, @09:43AM (#25399829) Homepage Journal

        XHTML-STRICT is not for everyone, it's intended for those (like me) who are more development oriented and wish to completely separate structure from presentation. A "target" attribute is clearly a presentation attribute since it defines how the linked reference is presented to the user and as the parent noted, it should be up to the user to make that choice.

        When wanting to control presentation in XHTML STRICT, you should use DOM or CSS; that way, the structure (XHTML) is separated from the presentation (JS/CSS). I typically link all scripts and stylesheets. That way the XHTML is made portable in terms of data, with the JS/CSS being limited to only affecting a web client. In the OP's case, a simple ID attribute for that particular anchor would work just fine: you could bind an event listener for a click event to that element and then execute your JavaScript popup code when that event is triggered, canceling the event so that the browser does not execute the link on its own. That way, your default browser clients could execute the JS instructions, while a 3rd party app (an AIR desktop or mobile device) could put their own custom behavior in if desired.

        While that sort of practice may seem extreme to a designer, as a developer I can swear to its scalability and transportability for supporting 3rd party access, such as when developing a web UI that needs to support many types of clients via one codebase.

        If none of those features make sense nor strike you as worthwhile, I suggest you stick to XHTML TRANSITIONAL, which is probably better suited to your needs.
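
The click-listener approach described in this comment could be sketched roughly as follows. `bindPopup`, the `help-link` ID, and the window name and size are all assumptions made for illustration; the sketch also assumes a browser that supports the standard `addEventListener`, which Internet Explorer of this era does not.

```javascript
// Hypothetical helper: intercept clicks on a link and hand its URL to a
// caller-supplied opener instead of letting the browser follow the link.
function bindPopup(link, openPopup) {
  link.addEventListener('click', function (event) {
    event.preventDefault(); // cancel the default navigation
    openPopup(link.href);   // run the custom behaviour instead
  }, false);
}

// Browser usage; this part is skipped when no DOM is available.
// "help-link" is an assumed ID on the anchor in the XHTML.
if (typeof document !== 'undefined') {
  const link = document.getElementById('help-link');
  if (link) {
    bindPopup(link, function (url) {
      window.open(url, 'help', 'width=400,height=300');
    });
  }
}
```

Because the anchor keeps an ordinary `href` and no `target` attribute, the markup stays valid XHTML 1.0 Strict, and a client that does not run the script simply follows the link.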

    • For example, xhtml-strict does not include support for "target" attributes in links. What kind of idiotic decision was that?

      A very good decision, there are two main uses for the "target" attribute:

      • Frame-based sites - Old-school, annoying way of designing sites that I and many others feel should not be used for new sites.
      • To automatically open links in a new window - Annoying behaviour by web developers who think no one could possibly want to, god forbid, leave their site in favor of another site.

      /Mikael

        • Not really a lot of point to it, though -- savvy users will simply middle-click on the link if they want it in a new tab/window. If they don't, that generally implies they want it right where it is, and your attempt to open a new tab/window is going to be annoying.

          But hey, at least using a target for that is better than linking to a javascript: URL. A lot of sites are even worse -- they add an onClick event, and they set the link href to #, or to javascript:void(), meaning that middle-clicking on it inevita

        • Re: (Score:3, Insightful)

          When a link is possibly important to a user but would in fact break the flow of their current activity, a link should be set to open in a new window - preferably one which does not go full screen to hide the window they are really using.

          If you use the target attribute, you have no control over the size of the window and it is very likely that it will obscure the current window. You need JavaScript to get the effect you desire, and if you are using JavaScript, why bother with a new window when you can d

    • Re: (Score:3, Interesting)

      The lack of a target attribute really bothered me when I first ran into it. Their argument was something like how websites shouldn't be controlling the browser, as in creating tabs/windows, etc. Of course you can hack it in with Javascript which is something I refused to do, what's the point of striving to be standards compliant when you break it a minute later with Javascript? Anyways, I thought about it and kind of agreed with the notion, so now I just externally link a lot less.

    • Re: (Score:3, Interesting)

      Even if your app is 100% standard compliant, it may not be cross browser. Not even if you pull IE out of the equation. Not even if you ONLY TARGET FIREFOX (there are differences between FF2 and FF3. FF2 doesn't even fully implement CSS 2 itself...)

      So by now, devs have reverted to another philosophy: make websites that are cross-browser, and -mostly- standards-compliant.

      If you look at some of the most heavily cross browser web sites out there, especially the ones that are extremely backward compatible (down t

    • Re: (Score:3, Interesting)

      You can make a site that works fine in every browser that's XHTML 1.0 Strict. Chances are that you'll have to use some non-standard (and probably invalid) CSS in a conditional stylesheet, but that a) doesn't count in terms of X/HTML validation and b) is only linked from the inside of a comment read only by IE, so even if CSS validity was part of being valid X/HTML it wouldn't be checked by the validator.

      Granted, you may need to add in some extraneous tags in order to implement the proper CSS hacks, but tha

      • by SanityInAnarchy (655584) <ninja@slaphack.com> on Thursday October 16 2008, @10:14AM (#25400353) Journal

        Serious issue?

        Well, yes. Unless you think the Internet isn't important, or that it wouldn't make a difference if the web was controlled by a single person. I think that certainly puts it above "what costume will I wear" kind of serious.

        And your sig betrays you -- you seem to take yourself just as seriously as the rest of us take things that actually matter.

        I mean, when I go to the bank to cash a check, I don't worry they won't give me money unless I can prove I'm using Firefox at home.

        Well, when you go to cash a check, you shouldn't really have to prove anything, other than that you can sign for it.

        But I've seen banks that only work on IE. I haven't seen banks that only work on Firefox.