
IE Shines On Broken Code 900

mschaef writes "While reading Larry Osterman's blog (he's a long-time Microsoftie, having worked on products dating back to DOS 4.0), I ran across this BugTraq entry on web browser security. Basically, the story is that Michal Zalewski started feeding randomly malformed HTML into Microsoft Internet Explorer, Mozilla, Opera, Lynx, and Links and watching what happened. Bottom line: 'All browsers but Microsoft Internet Explorer kept crashing on a regular basis due to NULL pointer references, memory corruption, buffer overflows, sometimes memory exhaustion; taking several minutes on average to encounter a tag they couldn't parse.' If you want to try this at home, he's also provided the tools he used in the BugTraq entry."
This discussion has been archived. No new comments can be posted.

  • by jonwil ( 467024 ) on Tuesday October 19, 2004 @07:29AM (#10563415)
    Apparently, XPSP2 (including the new IE) was recompiled with the latest Visual Studio and with all the options turned on to better catch issues.
  • by Anonymous Coward on Tuesday October 19, 2004 @07:32AM (#10563445)
    Nothing crashed. I got blank pages, all the weird HTML and all, but no errors and nothing crashed. w00t.
  • Konqueror and bugs (Score:3, Informative)

    by Anonymous Coward on Tuesday October 19, 2004 @07:35AM (#10563458)
    Konqueror shows a neat bug symbol in the lower-right corner when displaying buggy HTML code.
    I think this is a nice feature.
    I wish that konqueror would have been tested. It's a good browser.
  • by swinefc ( 91418 ) * on Tuesday October 19, 2004 @07:38AM (#10563471)

    Larry Osterman is about to demonstrate one of the great values of open source. He's identified a set of malformed HTML and within a few days/weeks someone will have fixed it.

    If this were a closed source / Microsoft browser, then there would have to be a complete release cycle before a non-security issue is resolved.

    All software has defects; it is the access to the code that allows someone to rapidly fix issues that sets open source apart.

  • Tested Konqueror (Score:5, Informative)

    by unixmaster ( 573907 ) on Tuesday October 19, 2004 @07:41AM (#10563495) Journal
    None of the samples in http://lcamtuf.coredump.cx/mangleme/gallery/ [coredump.cx] was able to crash Konqueror from KDE CVS Head. Heheh time to praise Khtml developers again!
  • by Ann Elk ( 668880 ) on Tuesday October 19, 2004 @07:41AM (#10563497)

    RTFA. Larry didn't find the broken HTML, he just referenced an article [securityfocus.com] which did.

  • by tomstdenis ( 446163 ) <tomstdenis@gma[ ]com ['il.' in gap]> on Tuesday October 19, 2004 @07:45AM (#10563520) Homepage
    This isn't insightful at all. First, you'll be the first person to bitch when a Mozilla virus comes out.

    Second, "crashing when invalid" as you and many others are alluding to is NOT a good idea. What if you had another tab open with email/urls/info you needed?

    What if other software took this route? Invalid operands to open()? Time to crash. Invalid socket used in send()? Time to crash. Segfault in application? Kill the kernel processes!

    It's a problem, it has to be fixed and there aren't two ways about it.

    Tom
  • by Jedi Alec ( 258881 ) on Tuesday October 19, 2004 @07:46AM (#10563526)
    I'd really prefer it to just refuse to parse the page mentioning that the code is bad instead of crash. As much as I like Firefox/Moz, when a piece of software is fed bad data, it should say so, not die on the spot, ever.
  • Re:Tested Konqueror (Score:2, Informative)

    by Anonymous Coward on Tuesday October 19, 2004 @07:49AM (#10563545)
    In the current version of Konqueror (3.3), I had quite a few crashes on the cgi version...
  • Re:Excellent! (Score:5, Informative)

    by metlin ( 258108 ) * on Tuesday October 19, 2004 @07:51AM (#10563555) Journal
    Actually, the code does not seem that great.

    Here's the mozilla_die1.html code
    <HTML><INPUT AAAAAAAAAA>
    And the mozilla_die2.html code
    <HTML>
    <HEAD>
    <MARQUEE>
    <TABLE>
    <MARQUEE HEIGHT=100000000>
    <MARQUEE HEIGHT=100000000>
    <MARQUEE HEIGHT=100000000>
    <MARQUEE HEIGHT=100000000>
    <MARQUEE HEIGHT=100000000>
    <MARQUEE HEIGHT=100000000>
    <MARQUEE HEIGHT=100000000>
    <MARQUEE HEIGHT=100000000>
    <MARQUEE HEIGHT=100000000>
    <MARQUEE HEIGHT=100000000>
    <MARQUEE HEIGHT=100000000>
    <TBODY>
    Attack of the marquees!
    It looks like he came across places where either boundary checks or type checks are not in place.

    Besides, he's had access to almost all the browser code, hasn't he?

    I mean, these bugs are bad, but I'm sure if I had access to IE's code I could come up with a zillion bugs.
  • Re:Security Issues (Score:4, Informative)

    by say ( 191220 ) <<on.hadiarflow> <ta> <evgis>> on Tuesday October 19, 2004 @07:53AM (#10563573) Homepage

    I'm not sure what the behaviour of 4.0 Transitional and 4.0 Strict is supposed to be

    It's kind of in the name. Transitional should best-guess. Strict should not.

  • by oever ( 233119 ) on Tuesday October 19, 2004 @07:53AM (#10563574) Homepage
    Try this:
    valgrind
    It's a Free Software alternative to Purify and works great!

  • Who's Who (Score:5, Informative)

    by Effugas ( 2378 ) * on Tuesday October 19, 2004 @08:00AM (#10563626) Homepage
    Ugh. Not the best written Slashdot entry.

    Larry Osterman -- former Microsoft guy; someone forwarded him a post to Bugtraq.

    Michael Zalewski -- absurdly brilliant [coredump.cx] security engineer out of Poland. Did the pioneering work on visualizing [wox.org] randomness [coredump.cx] of network stacks, passively identifying operating systems [coredump.cx] on networks, and way way more.

    Nothing bad against Larry. But this is all Zalewski :-)

    --Dan
  • by Diplo ( 713399 ) on Tuesday October 19, 2004 @08:01AM (#10563636) Homepage

    Never mind using random garbage to crash a browser; you can make IE6 crash with perfectly valid strict HTML.

    Try this page [nildram.co.uk] in IE6 and then hover your pointer over the link. Crash!!!

  • by evil_one666 ( 664331 ) on Tuesday October 19, 2004 @08:02AM (#10563638)
    Replicating the experiment on mozilla on linux (repeatedly refreshing the url: http://lcamtuf.coredump.cx/mangleme/mangle.cgi) causes NO CRASHES

    I think there may be some FUD here...
  • by Grey Ninja ( 739021 ) on Tuesday October 19, 2004 @08:04AM (#10563651) Homepage Journal
    Don't forget this one either. [neilturner.me.uk] (Mind you, this one has been fixed in XP SP2)
  • by bbuR_bbuB ( 804723 ) on Tuesday October 19, 2004 @08:05AM (#10563656)
    Huh? Mozilla/5.0 (X11; U; Linux i686; rv:1.7.3) Gecko/20040914 Firefox/0.10.1 No crashes here. I must not be lucky.
  • Re:Tested Konqueror (Score:1, Informative)

    by Anonymous Coward on Tuesday October 19, 2004 @08:07AM (#10563669)
    The mozilla tests all worked for me with firefox-1.0-preview, so I doubt it's fud.
  • Re:Is this for real? (Score:1, Informative)

    by Anonymous Coward on Tuesday October 19, 2004 @08:07AM (#10563670)
    Well I'm using Mozilla 1.7.2 on Debian (stable) and mozilla_die1.html and mozilla_die2.html both caused my Mozilla to crash. Whatever you may think of IE vs Mozilla vs whatever, you have to admit that a browser crashing can be a little annoying.
  • Re:Excellent! (Score:5, Informative)

    by EMN13 ( 11493 ) on Tuesday October 19, 2004 @08:09AM (#10563679) Homepage
    As he stated in the article, the crashes are sometimes platform-specific.

    I've tried this in 1.0PR firefox on win32, and the crashes do occur there.

    I've gotta say - this really looks like a great tool; a simple and effective way of finding some bugs!

    --Eamon
  • by millahtime ( 710421 ) on Tuesday October 19, 2004 @08:09AM (#10563681) Homepage Journal
    Test if your code is good or not at http://validator.w3.org/ [w3.org]
  • by Anonymous Coward on Tuesday October 19, 2004 @08:14AM (#10563705)
    Not so fast my friend (same goes for the moderators).

    Try the known examples that "work" here:
    http://lcamtuf.coredump.cx/mangleme/gallery/

    They work for me using the latest firefox on linux.
  • by Anonymous Coward on Tuesday October 19, 2004 @08:14AM (#10563707)
    *crashing* on malicious code is *GOOD*, while *running* malicious code is *BAD*.
    Holy crap! How absolutely untrue. If your program is crashing, you've lost all control. If you still had control, it wouldn't have crashed: it would have printed an error message.

    Once you've lost control of your program, all bets are off. The only difference between crashing and taking control is exactly WHAT bad data you feed into the program. These browsers simply crashed because RANDOM data was being fed in. That random data could be changed to carefully-crafted executable code, and BAM, your harmless "crash" is a security exploit.
  • by murdocj ( 543661 ) on Tuesday October 19, 2004 @08:16AM (#10563717)
    You aren't a security expert, are you? Now, your first lesson in computer security is, write this a hundred times: *crashing* on malicious code is *GOOD*, while *running* malicious code is *BAD*.

    It's true that *catching* bad input and deliberately aborting (hopefully with a somewhat reasonable error message) is good. According to the article, that's not what's going on... the browsers are NOT checking input, e.g. scanning into uninitialized buffer areas because they aren't finding an expected end marker, or a length is incorrect. So parent is exactly right... that kind of "buffer overrun" bug is exactly what can be exploited.

  • by Scarblac ( 122480 ) <slashdot@gerlich.nl> on Tuesday October 19, 2004 @08:18AM (#10563726) Homepage

    You aren't a security expert, are you? Now, your first lesson in computer security is, write this a hundred times: *crashing* on malicious code is *GOOD*, while *running* malicious code is *BAD*.

    HTML in a browser isn't code. It's data. Running any HTML as code is *BAD*.

    The fact that it does crash some browsers indicates that they probably are trying to run part of it as code - probably because of buffer overruns and the like. The whole reason it crashes is that it's running the code. That's very bad. It's not a matter of "either run, OR crash".

    A good job by Microsoft, and the rest has work to do.

  • That's odd... (Score:2, Informative)

    by jridley ( 9305 ) on Tuesday October 19, 2004 @08:19AM (#10563736)
    I just downloaded and ran his test suite against Firefox 1.0PR.
    Everything looks fine to me. No crashes. I also ran it against his CGI and the "die" gallery in the tarball. No problems, though it did stop for a few seconds a few times.

    Now, he does mention that sometimes the crashes are due to memory exhaustion. I've got the suite running right now on another tab of the browser I'm typing in, and the memory usage is going up, but only very slowly; I think it's due to the logging in the JavaScript console that's happening due to the bad JavaScript that's running over there. After 15 minutes of constant reloading, it's up from 39 to 44 megs. So maybe if you reloaded bad JavaScript constantly for a couple of days, you might eventually run out of memory. Certainly if there's a memory leak, it should be fixed, but IMHO that's not a security hole, while crashing indicates the possibility of one.
  • by Khazunga ( 176423 ) * on Tuesday October 19, 2004 @08:20AM (#10563741)
    (Besides which, as a user, I find it infuriating that Mozilla/Firefox are so stuck up on perfectly standard HTML that they just don't work with some web sites that are perfectly usable in IE anyway.)
    Have you been giving them feedback [mozilla.org](1) lately? The Mozilla team will either add stuff to quirks mode, or pass the site reference to evangelists.

    (1) You'll have to copy+paste the URL, as bugzilla doesn't like slashdottings.

  • Bug 265027 (Score:2, Informative)

    by Val314 ( 219766 ) on Tuesday October 19, 2004 @08:25AM (#10563769)
    FYI http://bugzilla.mozilla.org/show_bug.cgi?id=265027 (copy/paste, Bugzilla doesnt like /. links)
  • by UberGeeb ( 574309 ) on Tuesday October 19, 2004 @08:36AM (#10563845)
    I've come across plenty of sites that either don't work at all or are broken unless you use IE. Generally, it's because the site looks at the browser's identification tag and sends crippled pages to non-IE browsers. I can only think of one site I use regularly (a web app at work) that actually doesn't work in Opera if I set it to report itself as IE.

    You might make sure that the sites you're having trouble with in Firefox are actually providing the same data they're giving IE before you assume it's a problem with the browser.

  • by BenjyD ( 316700 ) on Tuesday October 19, 2004 @08:37AM (#10563852)
    Why? They're finding and fixing the bugs in Firefox already (check bugzilla).
  • by ptlis ( 772434 ) on Tuesday October 19, 2004 @08:43AM (#10563899) Homepage
    The moz ones certainly crash FF 0.10.1, I have the quality feedback agent installed so I simply added the url to the example HTML pages with the report.
  • by peterhoeg ( 172874 ) <{peter} {at} {hoeg.com}> on Tuesday October 19, 2004 @08:44AM (#10563905) Homepage
    Go to the "gallery" he mentions in his entry and try the mozilla_die?.htm files. With Firefox 1.0PR the first one did the trick for me and crashed Firefox.
  • by koi88 ( 640490 ) on Tuesday October 19, 2004 @08:46AM (#10563920)
    From many posts here I get the idea that most people didn't have the crashes the author had...
    So can those people who have tested his code write
    • used browser and version number
    • OS (exact)
    • result

    PS: I'm here at work on Mac OS 9 and all browsers are pretty old, so I don't write anything...
  • by ceeam ( 39911 ) on Tuesday October 19, 2004 @08:46AM (#10563923)
    OTOH - the fact that they did not _see_ crashes on MSIE does not mean it has no bugs. It may well mean that they silently swallow the exceptions. I've seen stuff like this:

    try
      p := nil; //EG
      p^ := ....;
    except
      // do nothing
    end;

    (if you speak Delphi).
  • by Anonymous Coward on Tuesday October 19, 2004 @09:08AM (#10564088)
    http://matt.ucc.asn.au/diesafari.html [ucc.asn.au] is a stripped-down version of the output of mangle with seed 0x5cdb0b39 (on 10.3.5, the seeding is probably different on other OSes). It certainly kills Safari here...
  • by pohl ( 872 ) on Tuesday October 19, 2004 @09:22AM (#10564193) Homepage
    I remember crashme, and I just checked the debian packages and anybody can "apt-get install crashme" to give it a whirl.

    I'd like to second the AC's suggestion of taking these HTML test cases and constructing an Apache module that creates garbage HTML like this. The result would be a great contribution to all browsers.

    The Mozilla project did have a test that sent the browser to random pages across the web, which exposed it to all sorts of garbled HTML, I'm sure, but generating randomly garbled HTML would probably be a more strenuous test.
  • by ibentmywookie ( 819547 ) on Tuesday October 19, 2004 @09:26AM (#10564231)
    I tried the mozilla ones with Mozilla 1.6 and firefox 0.9.1, and they both crashed and burned. The Opera test crashed my Opera 7.54 (opera remembers where it was, so yay).

    I tried the lynx (version 2.8.5rel.1) one, and um.. well.. Linux doesn't handle programs eating memory really quickly very well. I was getting mouse lag, and my system was really unresponsive. I hit ctrl-c about a thousand times, and managed to bring up KDE system guard and kill it.. it was using like 500MB of RAM. Nasty nasty stuff.
  • by Gilmoure ( 18428 ) on Tuesday October 19, 2004 @09:34AM (#10564289) Journal
    I tried each of his 'die*.html files and none of them killed my browser. Most just resulted in blank pages.

    I'm using Safari, on OS X (10.3.5)
  • by sysadmn ( 29788 ) <{sysadmn} {at} {gmail.com}> on Tuesday October 19, 2004 @09:40AM (#10564350) Homepage
    Possible, but unlikely to have impacted this test. The XPSP2 update is supposed to cause malformed code to crash an app, rather than subvert it. The point of the article is that IE didn't crash. Sure, it's because MS already does this sort of testing - but the point is that others ought to as well.
  • by sqlrob ( 173498 ) on Tuesday October 19, 2004 @09:47AM (#10564422)
    Executed by another process? What are you talking about? Processes in Windows cannot mess with each other's address space.

    They can intentionally, just not accidentally.

    ReadProcessMemory
    WriteProcessMemory
    CreateRemoteThread

    (NX bit works only on AMD64 processors and above, last time I checked)

    Celeron D is now shipping with NX enabled. I don't know whether XP will take advantage of it.

  • by kryptkpr ( 180196 ) on Tuesday October 19, 2004 @10:00AM (#10564531) Homepage
    The fix for that bug is non-trivial, and breaks many other sites. A quick Ctrl + +/Ctrl + - is a good workaround.
  • by SmilingBoy ( 686281 ) on Tuesday October 19, 2004 @10:04AM (#10564562)
    Weird! I checked this in detail again. It seems to make a difference whether other Firefox windows with several tabs are open or not. If I have other open windows and tabs (like I normally have when surfing around), mozilla_die1 just slows down the computer, but you can actually close the tab again and you are back to normal. mozilla_die2 also slows down the computer, and you can select other tabs, but you can't close the offending tab or load new pages in other tabs.

    If I only open mozilla_die 1 or 2 in a single tab in a single window and no other tabs are open, Firefox crashes immediately.

    mozilla_die3 never crashes Firefox.

  • by TheLink ( 130905 ) on Tuesday October 19, 2004 @10:17AM (#10564679) Journal
    Netscape used to crash very often. Looks like the Mozilla people didn't learn much from it.

    Mozilla is just as sucky security-wise as the old non-mozilla Netscape (3.x 4.x). Whether it is OSS or not doesn't make it secure/insecure, it's the programmers that count. Look at Sendmail and Bind (and many other ISC software), security problems year after year for many years. Look at PHPNuke - security problems month after month for years. Look at OpenSSL and OpenSSH and Apache 2.x - not very good track records. Compare with Postfix and qmail, djbdns.

    Most programmers should stick to writing their programs in languages where the equivalent of "spelling and grammar" errors don't cause execution of arbitrary attacker-code. Sure after a while some writers learn how to spell and their grammar improves but it sometimes takes years. For security you need _perfection_ in critical areas, and you need to be able to identify and isolate the critical areas _perfectly_ in your architecture.

    To the ignorant people who don't get it: crashing is bad. A crash occurs when the (browser) process writes or reads data in areas it shouldn't be touching, or tries to execute code where it shouldn't be executing. This often occurs when the process somehow mistakenly executes _data_ supplied by the attacker/bug finder, or returns to addresses supplied by the attacker...

    This sort of thing is what allows people to take over your browser, and screw up your data (and possibly take over your computer if you run the browser using an account with too many privileges).

    So while the FireFox people get their code up to scratch maybe people should reconsider IE - IE isn't so dangerous when configured correctly. Unfortunately it's not that simple to do that.

    To make even unpatched IE browsers invulnerable to 95% of the IE problems just turn off Active Scripting and ActiveX for all zones except very trusted zones which will never have malicious data. Since I don't trust Microsoft's trusted zone (XP has *.microsoft.com as trusted even though it doesn't show up in the menus), I create a custom zone and make that MY trusted zone.

    By all zones I mean you must turn that stuff off for the My Computer zone as well - but that screws up Windows Explorer in the default view mode (which is unsafe anyway).

    For more info read "Description of Internet Explorer security zones registry entries": http://support.microsoft.com/default.aspx?kbid=182569

    To make the My Computer zone visible change:
    (for computer wide policy)
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\Zones\0\Flags

    To: 0x00000001

    (for just a particular user)
    HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\Zones\0\Flags

    To: 0x00000001

    If you don't want to edit the registry and make the My Computer zone visible, you can still control the My Computer Zone settings from the group policy editor (gpedit.msc) or the active directory policy editor.

    You just have to know some Microsoft stuff. But hey, securing an OSS O/S and _keeping_ it secure (especially when you need to run lots of 3rd-party software) also requires some in-depth knowledge.
  • by Tiram ( 650450 ) on Tuesday October 19, 2004 @11:02AM (#10565217) Homepage Journal
    This works fine for me, using Opera 7.54. What did you search for?
  • Re:Security Issues (Score:2, Informative)

    by Anonymous Coward on Tuesday October 19, 2004 @12:51PM (#10566662)
    Arguing that browsers should half-support broken XHTML is like saying that a C compiler should do something whenever it encounters invalid C.

    While I understand the point you're making, C compilers generally do do something with invalid code. Constraint violations (sometimes referred to as syntax errors) require that the compiler output a diagnostic message, but they're free to continue to translate the code. Witness how gcc 2 dealt with:
    const int x = 0; x = 1;
    Then there's what's called undefined behavior. No diagnostic is required, but the code still violates the standard. It can compile, but the result is that anything could happen. For example:
    i = i++;
    So basically, C does what the OP is asking XHTML to do. Is this a good thing? I don't know. It makes compiler writers' jobs easier, and gives less overhead, and if you want all that safety, there are always languages like Ruby.
  • by CTachyon ( 412849 ) <chronos AT chronos-tachyon DOT net> on Tuesday October 19, 2004 @01:10PM (#10566876) Homepage

    A stack canary is a form of protection against stack overflows. And yes, the idea is named after the canaries used in coal mines. To put it in simple terms, a normal stack during a function call might look like this:

    Buffer: XXXX XXXX XXXX XXXX ...
    Saved registers: YYYY YYYY
    Return address: ZZZZ

    When the buffer is overflowed, the attacker fills it with more data than it can hold. The extra data first fills the saved registers, then overwrites the return address. The attacker can simply point the return address back into the buffer, or find more diabolical means ("return into libc", a few others), to run his own code.

    If a recent OS (first Linux, now Windows) is running on, say, an AMD64 system, then the entire stack is flagged with the NX (no execute) bit. If the attacker uses the normal technique of returning into the buffer, the processor will halt the program because it's trying to treat data as code without asking first. (This doesn't protect against return into libc attacks.)

    However, on ordinary x86 processors like Pentium 4 or Athlon XP, there is no NX bit. So, Microsoft altered their compiler to insert stack canaries into every function. The previous stack diagram is changed to something similar to this:

    Buffer: XXXX XXXX XXXX XXXX ...
    Canary: CCCC
    Saved registers: YYYY YYYY
    Return address: ZZZZ

    Ideally, the canaries are chosen randomly each time a function is called. However, this is too slow in practice, since functions get called *a lot*, so a program will randomly choose a single canary number once at startup and reuse it.

    Now the attacker can still overflow the buffer, but this time he has to overwrite the canary. If he already knows the canary, or guesses it correctly, everything works the same as in the case of an unprotected overflow. However, if he guesses wrong, the canary kicks in. To maintain the canary, there is some code inserted by the compiler at the start and end of every function. The start code inserts the canary into the stack, and the end code checks that the canary has not changed. If the canary changed, an error is triggered, and the program is halted before the function ever returns. This prevents the attacker's code from running if he doesn't know the canary number.

    There are still some scenarios that aren't protected by a stack canary, but it is rather effective overall, and actually protects against a few scenarios that the NX bit doesn't cover. It doesn't help against heap overflows, though, although there's no reason heap canaries can't be used as well. (The heap is a lot harder to explain than the stack, but a lot of programs put some or all of their buffers in heap memory instead, and the heap can be attacked too.)

  • by marsu_k ( 701360 ) on Tuesday October 19, 2004 @01:10PM (#10566879)
    Getting a bit offtopic, but while I really liked Code Complete, one of the most enlightening programming books I've read was The Practice of Programming [bell-labs.com]. Check it out if you haven't yet.
  • Re:Security Issues (Score:2, Informative)

    by DrPizza ( 558687 ) on Tuesday October 19, 2004 @01:44PM (#10567219) Homepage

    "Just to be clear, unparseable XHTML is not XHTML."

    And broken HTML is not HTML.

    The reason browsers try to parse broken HTML is not because the HTML spec requires them to do so (it doesn't, and it gives such documents no semantics). It's because neither early browsers nor page authors followed the specs strictly; early browsers would try to render malformed pages (either deliberately or through not explicitly rejecting such pages), and early page authors would (usually unwittingly) exploit this fact.

    If the first HTML renderers had followed the HTML spec and no more then the web would not be the mess it is today.

    XHTML doesn't really fix any of this; it resolves a small class of ambiguities that un-DTDed HTML hypothetically has (in HTML one needs to refer to the DTD to determine whether something of the form <img> is an empty element or a malformed element that lacks its closing tag (the DTD says whether things are empty or not); in XML (and hence XHTML) it's unambiguously an error because XML requires even empty elements to have closing tags, or use special shorthand). But it's not this that makes it easier to parse; that alone has negligible impact on ease of parsing.

    Instead, it's the attitude that goes along with it--if it's not well-formed, reject it with an error message. There's no reason that the HTML spec couldn't be held in similarly high esteem.

  • by roca ( 43122 ) on Tuesday October 19, 2004 @01:59PM (#10567382) Homepage
    I wouldn't quite call it technical leadership; fuzz testing is old and lots of people do it on all kinds of projects. But sure, they did a better job on this than Mozilla.

    In the case of Mozilla it's really a resource and prioritization issue more than anything else: see http://it.slashdot.org/comments.pl?sid=126192&cid=10564332
    Not that that's an excuse.
  • by prockcore ( 543967 ) on Tuesday October 19, 2004 @03:00PM (#10567974)
    Perhaps Konqueror is better than other browsers, or perhaps the involvement of Apple means that Safari is better tested than Mozilla or Opera.

    No, the reason Safari doesn't crash is because it stops reloading the page! Even though there's a meta refresh tag at the top, safari ignores it after 3 refreshes.

    If you keep reloading the page by hand, safari will eventually crash.

    So a bug in safari actually prevents the bug checker from running :)
  • by DunbarTheInept ( 764 ) on Tuesday October 19, 2004 @03:44PM (#10568498) Homepage

    NO, no, no, no!! It is a BAD thing, because at the very minimum it's a sign of non-existent exception handling. You should never get a runtime error from bad input. In some cases, you create an infinite loop-- is there any excuse for that?

    Yes. There is a perfect excuse for that - to fix it you have to solve the unsolvable halting problem from computer science, which I assume you are already aware of. Can a C compiler determine if the C code it is running will loop forever? No. Can an interpreted language like the Bourne shell figure out if the input shell script it is processing will result in an infinite loop? No - being an interpreted instead of compiled language doesn't let you fix the halting problem. Looked at this way, the HTML engine inside a browser is in fact a program interpreter, with HTML as the source code. Thus the only way to catch the halting problem is to deny possibly valid runs, as we all learned in CompSci. In this case, that's probably exactly what IE is doing (for example, in theory rendering a table of 10,000 columns is a finishable task and not an infinite loop, and therefore it would be wrong for an interpreter to deny the program the ability to do it. But in the case of a rendering engine for viewable content, it can safely assume that such a task would never work anyway, and cut it off at a max cap.)


    And considering the nature of the crashes (one of the links caused Firefox 1.0PR to die with a windows memory error, shutting down ALL instances of firefox) this means that some memory was accessed that shouldn't have been,

    This is not necessarily true. When some kinds of input trigger a crash and others don't, the cause MIGHT be a case where the input can stuff values into buffer overruns, but it doesn't have to be. The unusual input could trigger a conditional branch that is not normally run, and has a bug in it that crashes. The unusual input could cause a variable initialization to be skipped because of such a conditional check (such that it did cause a variable to be altered, but not in a way that the input could control). The unusual input could simply be a case of picking a bigger number than the program was expecting to have to handle, and thus causing a hardcoded loop somewhere to process too far through an array (in which case there is a buffer overflow, but not one that lets the user stuff whatever he likes into that overflow). It could be a case of the program not being able to handle the large amount of memory it would need to (validly) perform the request (as in, "try to render this 100,000 column table"), and the crash could just be the result of such a thing leading to a failed malloc().
  • by jelwell ( 2152 ) on Tuesday October 19, 2004 @06:27PM (#10570130)
    This is simply a misunderstanding of the article (no offense). The examples are specific outputs of a random machine. The outputs in question happen to kill the browser the file is named after. So by testing the html pages that are presented you've missed the beef of the article/posting, which is the random machine that came up with those specific html pages.

    The true test is to download the man's application mangleme.cgi, install it on a server, and then point Safari at it.

    I have done this and Safari (1.3 developer release) crashes quite quickly.

    So does IE 5.2 for the mac.

    As a side note: Netscape/Mozilla has had something similar to this, but not quite the same, for some time now, called Browser Buster. It did not generate random HTML, but it did continuously feed the browser real websites, chosen somewhat at random, until the browser crashed. I remember we used to have goals, like "last 24 hours on Browser Buster".

    Joseph Elwell.
  • by Tetch ( 534754 ) on Tuesday October 19, 2004 @09:50PM (#10571545) Journal
    > Given the arbitrary limits on this test, it appears to be designed specifically
    > to make IE look better than its competitors and prove some point rather
    > than be an objective investigation.

    It sounds like you have little idea who the author is, or you wouldn't make such a statement. Michal Zalewski is a well-respected security researcher, with impeccable credentials, and no particular love for Microsoft, who's made an undeniably valuable contribution in many areas of IT security.

    While he generally seems to work on Unix-like systems, he has also published work on M$ software security problems - e.g. http://www.bindview.com/Support/RAZOR/Advisories/2001/adv_mstelnet.cfm [bindview.com]
    http://news.softpedia.com/news/2/2004/April/7797.shtml [softpedia.com]
    http://cert.uni-stuttgart.de/archive/bugtraq/2000/06/msg00066.html [uni-stuttgart.de]

    A quick Google search will repay your time.

    Give the guy some credit - it seems he's uncovered a surprising lack of robustness in non-IE browsers - and admittedly an even more surprising degree of resilience in IE's handling of the HTML tag soup he played with ... strange but apparently true :-)

  • Re:Who's Who (Score:2, Informative)

    by spectecjr ( 31235 ) on Thursday October 21, 2004 @02:56AM (#10583758) Homepage
    Larry Osterman -- former Microsoft guy

    Make that Larry Osterman -- current microsoft guy.
