IE Shines On Broken Code
mschaef writes "While reading Larry Osterman's blog (he's a longtime Microsoftie, having worked on products dating back to DOS 4.0), I ran across this BugTraq entry on web browser security. Basically, the story is that Michal Zalewski started feeding randomly malformed HTML into Microsoft Internet Explorer, Mozilla, Opera, Lynx, and Links and watching what happened. Bottom line: 'All browsers but Microsoft Internet Explorer kept crashing on a regular basis due to NULL pointer references, memory corruption, buffer overflows, sometimes memory exhaustion; taking several minutes on average to encounter a tag they couldn't parse.' If you want to try this at home, he's also provided the tools he used in the BugTraq entry."
which version of IE was it? (Score:5, Informative)
Tried with Safari on OS X ... (Score:5, Informative)
Konqueror and bugs (Score:3, Informative)
I think this is a nice feature.
I wish that Konqueror had been tested. It's a good browser.
The power of open source (Score:2, Informative)
Larry Osterman is about to demonstrate one of the great values of open source. He's identified a set of malformed HTML and within a few days/weeks someone will have fixed it.
If this were a closed source / Microsoft browser, then there would have to be a complete release cycle before a non-security issue is resolved.
All software has defects; what sets open source apart is that access to the code lets someone fix issues rapidly.
Tested Konqueror (Score:5, Informative)
Re:Conspiracy Theory time... (Score:4, Informative)
RTFA. Larry didn't find the broken HTML, he just referenced an article [securityfocus.com] which did.
Re:What about VALID html? (Score:5, Informative)
Second, "crashing when invalid" as you and many others are alluding to is NOT a good idea. What if you had another tab open with email/urls/info you needed?
What if other software took this route? Invalid operands to open()? Time to crash. Invalid socket used in send()? Time to crash. Segfault in application? Kill the kernel processes!
It's a problem, it has to be fixed and there aren't two ways about it.
Tom
Re:Coding to Standards (Score:4, Informative)
Re:Tested Konqueror (Score:2, Informative)
Re:Excellent! (Score:5, Informative)
Here's the mozilla_die1.html code, and the mozilla_die2.html code. It looks like he came across places where either boundary checks or type checks are not in place.
Besides, he's had access to almost all the browser code, hasn't he?
I mean, these bugs are bad, but I'm sure if I had access to IE's code I could come up with a zillion bugs.
Re:Security Issues (Score:4, Informative)
I'm not sure what the behaviour of 4.0 Transitional and 4.0 Strict is supposed to be
It's kind of in the name. Transitional should best-guess. Strict should not.
Re:The reason for this is simple (Score:2, Informative)
valgrind
It's a Free Software alternative to Purify and works great!
Who's Who (Score:5, Informative)
Larry Osterman -- former Microsoft guy; someone forwarded him a post to Bugtraq.
Michal Zalewski -- absurdly brilliant [coredump.cx] security engineer out of Poland. Did the pioneering work on visualizing [wox.org] randomness [coredump.cx] of network stacks, passively identifying operating systems [coredump.cx] on networks, and way way more.
Nothing bad against Larry. But this is all Zalewski.
--Dan
IE Crashes On Valid HTML! (Score:5, Informative)
Never mind using random garbage to crash a browser; you can make IE6 crash with perfectly valid strict HTML.
Try this page [nildram.co.uk] in IE6 and then hover your pointer over the link. Crash!!!
results of testing mozilla on linux- NO CRASHES (Score:2, Informative)
I think there may be some FUD here...
Re:IE Crashes On Valid HTML! (Score:5, Informative)
No problems on Firefox 0.10.1 (Score:2, Informative)
Re:Tested Konqueror (Score:1, Informative)
Re:Is this for real? (Score:1, Informative)
Re:Excellent! (Score:5, Informative)
I've tried this in 1.0PR firefox on win32, and the crashes do occur there.
I've gotta say - this really looks like a great tool; a simple and effective way of finding some bugs!
--Eamon
Test your code at http://validator.w3.org/ (Score:3, Informative)
Re:results of testing mozilla on linux- NO CRASHES (Score:1, Informative)
Try the known examples that "work" here:
http://lcamtuf.coredump.cx/mangleme/galler
They work for me using the latest firefox on linux.
Re:An important security sidenote (Score:5, Informative)
Once you've lost control of your program, all bets are off. The only difference between crashing and taking control is exactly WHAT bad data you feed into the program. These browsers simply crashed because RANDOM data was being fed in. That random data could be changed to carefully-crafted executable code, and BAM, your harmless "crash" is a security exploit.
Re:An important security sidenote (Score:4, Informative)
It's true that *catching* bad input and deliberately aborting (hopefully with a somewhat reasonable error message) is good. According to the article, that's not what's going on... the browsers are NOT checking input, e.g. scanning into uninitialized buffer areas because they aren't finding an expected end marker, or a length is incorrect. So parent is exactly right... that kind of "buffer overrun" bug is exactly what can be exploited.
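To make the "catch it and abort" case concrete, here is a minimal sketch in Python of the two checks mentioned above: refusing to scan past the input when the end marker is missing, and refusing a declared length that runs past the end. The function names and the one-byte length prefix are assumptions for illustration only; they are not taken from any browser's code. In C the unchecked versions would read past the buffer, whereas Python would quietly truncate, so the explicit checks are the point here.

```python
def read_quoted(buf: bytes, start: int) -> bytes:
    """Return the bytes from start up to the closing double quote."""
    end = buf.find(b'"', start)
    if end == -1:
        # No end marker: reject the input instead of scanning onward.
        raise ValueError("unterminated quoted value")
    return buf[start:end]


def read_field(buf: bytes, offset: int) -> bytes:
    """Read a field stored as one length byte followed by that many bytes."""
    if offset < 0 or offset >= len(buf):
        raise ValueError("offset outside input")
    length = buf[offset]
    end = offset + 1 + length
    if end > len(buf):
        # The declared length is a lie: reject it rather than trust it.
        raise ValueError("declared length runs past end of input")
    return buf[offset + 1:end]
```

Either function fails with a clear error on bad input, which is the deliberate abort described above rather than an accidental crash.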
Re:An important security sidenote (Score:3, Informative)
You aren't a security expert, are you? Now, your first lesson in computer security is, write this a hundred times: *crashing* on malicious code is *GOOD*, while *running* malicious code is *BAD*.
HTML in a browser isn't code. It's data. Running any HTML as code is *BAD*.
The fact that it does crash some browsers indicates that they probably are trying to run part of it as code - probably because of buffer overruns and the like. The whole reason it crashes is that it's running the code. That's very bad. It's not a matter of "either run, OR crash".
A good job by Microsoft, and the rest have work to do.
That's odd... (Score:2, Informative)
Everything looks fine to me. No crashes. I also ran it against his CGI and the "die" gallery in the tarball. No problems, though it did stop for a few seconds a few times.
Now, he does mention that sometimes the crashes are due to memory exhaustion. I've got the suite running right now on another tab of the browser I'm typing in, and the memory usage is going up, but only very slowly; I think it's due to the logging in the JavaScript console that's happening due to the bad JavaScript that's running over there. After 15 minutes of constant reloading, it's up from 39 to 44 megs. So maybe if you reloaded bad JavaScript constantly for a couple of days, you might eventually run out of memory. Certainly if there's a memory leak, it should be fixed, but IMHO that's not a security hole, while crashing indicates the possibility of one.
Re:An important security sidenote (Score:3, Informative)
(1) You'll have to copy+paste the URL, as bugzilla doesn't like slashdottings.
Bug 265027 (Score:2, Informative)
Re:An important security sidenote (Score:5, Informative)
You might make sure that the sites you're having trouble with in Firefox are actually providing the same data they're giving IE before you assume it's a problem with the browser.
Re:The power of open source (Score:3, Informative)
Re:An important security sidenote (Score:1, Informative)
Re:An important security sidenote (Score:5, Informative)
Poll: WHO has experienced crashes? (Score:3, Informative)
So can those people who have tested his code write
PS: I'm here at work on Mac OS 9 and all browsers are pretty old, so I don't write anything...
Re:An important security sidenote (Score:1, Informative)
try
p
p^
except
end;
(if you speak Delphi).
Re:Tried with Safari on OS X ... (Score:2, Informative)
Re:This is a blessing in disguise (Score:5, Informative)
I'd like to second the AC's suggestion of taking these HTML test cases and constructing an Apache module that creates garbage HTML like this. The result would be a great contribution to all browsers.
The Mozilla project did have a test that sent the browser to random pages across the web, which exposed it to all sorts of garbage HTML, I'm sure, but generating randomly garbled HTML would probably be a more strenuous test.
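For anyone wondering what "generating randomly garbled HTML" might look like, here is a rough sketch in Python. It is not the code from the BugTraq post; the tag list, the attribute list, and the function names are all made up for illustration. The one design choice worth copying is the seed: the same seed always produces the same page, so a crash can be replayed.

```python
import random

# Hypothetical vocabulary; a real tool would use a much larger list.
TAGS = ["a", "b", "table", "td", "img", "form", "input", "style", "frameset"]
ATTRS = ["href", "src", "width", "colspan", "style", "name", "onclick"]


def random_value(rng: random.Random) -> str:
    """Return an attribute value picked to stress a parser."""
    kind = rng.randrange(4)
    if kind == 0:
        return "A" * rng.randint(1, 5000)                 # oversized string
    if kind == 1:
        return str(rng.choice([-1, 0, 2**31, 2**32 + 1]))  # boundary integers
    if kind == 2:
        # Arbitrary characters, which may include quotes and angle brackets.
        return "".join(chr(rng.randint(1, 255)) for _ in range(rng.randint(1, 64)))
    return "%n%s" * rng.randint(1, 8)                     # format-string probes


def mangled_html(seed: int, n_tags: int = 50) -> str:
    """Build a deliberately malformed page; the same seed gives the same page."""
    rng = random.Random(seed)
    parts = ["<html><body>"]
    for _ in range(n_tags):
        tag = rng.choice(TAGS)
        attrs = " ".join(
            f'{rng.choice(ATTRS)}="{random_value(rng)}"'
            for _ in range(rng.randint(0, 4))
        )
        parts.append(f"<{tag} {attrs}>")
        if rng.random() < 0.3:
            # Close some tag at random, usually not the one just opened.
            parts.append(f"</{rng.choice(TAGS)}>")
    return "".join(parts)  # the page is intentionally left unclosed
```

Serving mangled_html(n) for increasing n and logging n on each request would be enough to find out which page killed the browser.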
Don't try the Lynx one! (Score:2, Informative)
I tried the lynx (version 2.8.5rel.1) one, and um.. well.. Linux doesn't handle programs eating memory really quickly very well. I was getting mouse lag, and my system was really unresponsive. I hit ctrl-c about a thousand times, and managed to bring up KDE system guard and kill it.. it was using like 500MB of RAM. Nasty nasty stuff.
His examples don't work (crash browser) (Score:2, Informative)
I'm using Safari, on OS X (10.3.5)
Re:which version of IE was it? (Score:3, Informative)
Re:An important security sidenote (Score:4, Informative)
They can intentionally, just not accidentally.
ReadProcessMemory
WriteProcessMemory
CreateRe
(NX bit works only at AMD64 processors and above last time I checked)
Celeron D is now shipping with NX enabled. I don't know whether XP will take advantage of it.
Re:His examples do not really crash Firefox (Score:3, Informative)
Re:His examples do not really crash Firefox (Score:5, Informative)
If I only open mozilla_die 1 or 2 in a single tab in a single window and no other tabs are open, Firefox crashes immediately.
mozilla_die3 never crashes Firefox.
OSS does not automatically mean secure (Score:5, Informative)
Mozilla is just as sucky security-wise as the old non-mozilla Netscape (3.x 4.x). Whether it is OSS or not doesn't make it secure/insecure, it's the programmers that count. Look at Sendmail and Bind (and many other ISC software), security problems year after year for many years. Look at PHPNuke - security problems month after month for years. Look at OpenSSL and OpenSSH and Apache 2.x - not very good track records. Compare with Postfix and qmail, djbdns.
Most programmers should stick to writing their programs in languages where the equivalent of "spelling and grammar" errors don't cause execution of arbitrary attacker-code. Sure after a while some writers learn how to spell and their grammar improves but it sometimes takes years. For security you need _perfection_ in critical areas, and you need to be able to identify and isolate the critical areas _perfectly_ in your architecture.
To the ignorant people who don't get it: crashing is bad. A crash occurs when the (browser) process reads or writes data in areas it shouldn't be touching, or tries to execute code where it shouldn't be executing. This often occurs when the process somehow mistakenly executes _data_ supplied by the attacker/bug finder, or returns to addresses supplied by the attacker...
This sort of thing is what allows people to take over your browser, and screw up your data (and possibly take over your computer if you run the browser using an account with too many privileges).
So while the Firefox people get their code up to scratch, maybe people should reconsider IE - IE isn't so dangerous when configured correctly. Unfortunately, configuring it correctly is not that simple.
To make even unpatched IE browsers invulnerable to 95% of the IE problems just turn off Active Scripting and ActiveX for all zones except very trusted zones which will never have malicious data. Since I don't trust Microsoft's trusted zone (XP has *.microsoft.com as trusted even though it doesn't show up in the menus), I create a custom zone and make that MY trusted zone.
By all zones I mean you must turn those off for the My Computer zone as well - but that screws up Windows Explorer in the default view mode (which is unsafe anyway).
For more info read this: http://support.microsoft.com/default.aspx?k
To make the My Computer zone visible change:
(for computer wide policy)
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Wi
To: 0x00000001
(for just a particular user)
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windo
To: 0x00000001
If you don't want to edit the registry and make the My Computer zone visible, you can still control the My Computer Zone settings from the group policy editor (gpedit.msc) or the active directory policy editor.
You just have to know some Microsoft stuff. But hey, securing an OSS O/S and _keeping_ it secure (especially when you need to run lots of 3rd party software) also requires some in-depth knowledge.
Re:An important security sidenote (Score:2, Informative)
Re:Security Issues (Score:2, Informative)
While I understand the point you're making, C compilers generally do do something with invalid code. Constraint violations (sometimes referred to as syntax errors) require the compiler to output a diagnostic message, but it is free to continue translating the code. Witness how gcc 2 dealt with:
Then there's what's called undefined behavior. No diagnostic is required, but the code still violates the standard. It can compile, but the result is that anything could happen. For example:
So basically, C does what the OP is asking XHTML to do. Is this a good thing? I don't know. It makes compiler writers' jobs easier and gives less overhead, and if you want all that safety, there are always languages like Ruby.
Re:An important security sidenote (Score:5, Informative)
A stack canary is a form of protection against stack overflows. And yes, the idea is named after the canaries used in coal mines. To put it in simple terms, a normal stack during a function call might look like this:
When the buffer is overflowed, the attacker fills it with more data than it can hold. The extra data first fills the saved registers, then overwrites the return address. The attacker can simply point the return address back into the buffer, or find more diabolical means ("return into libc", a few others), to run his own code.
If a recent OS (first Linux, now Windows) is running on, say, an AMD64 system, then the entire stack is flagged with the NX (no execute) bit. If the attacker uses the normal technique of returning into the buffer, the processor will halt the program because it's trying to treat data as code without asking first. (This doesn't protect against return into libc attacks.)
However, on ordinary x86 processors like Pentium 4 or Athlon XP, there is no NX bit. So, Microsoft altered their compiler to insert stack canaries into every function. The previous stack diagram is changed to something similar to this:
Ideally, the canaries are chosen randomly each time a function is called. However, this is too slow in practice, since functions get called *a lot*, so a program will randomly choose a single canary number once at startup and reuse it.
Now the attacker can still overflow the buffer, but this time he has to overwrite the canary. If he already knows the canary, or guesses it correctly, everything works the same as in the case of an unprotected overflow. However, if he guesses wrong, the canary kicks in. To maintain the canary, there is some code inserted by the compiler at the start and end of every function. The start code inserts the canary into the stack, and the end code checks that the canary has not changed. If the canary changed, an error is triggered, and the program is halted before the function ever returns. This prevents the attacker's code from running if he doesn't know the canary number.
There are still some scenarios that aren't protected by a stack canary, but it is rather effective overall, and actually protects against a few scenarios that the NX bit doesn't cover. It doesn't help protect against heap overflows, though there's no reason heap canaries can't be used as well. (The heap is a lot harder to explain than the stack, but a lot of programs put some or all of their buffers in heap memory instead, and the heap can be attacked as well.)
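If it helps, here is a toy simulation of the canary check in Python. It is only a sketch under simplifying assumptions: the frame is reduced to [buffer][canary][return address] with the saved registers left out, and the names call_with_canary and StackSmashDetected are invented for illustration, not anything a real compiler emits.

```python
import secrets

BUF_SIZE = 8
# Chosen once at startup and reused for every call, as described above.
CANARY = secrets.token_bytes(4)


class StackSmashDetected(Exception):
    """Raised when the canary no longer matches at function exit."""


def call_with_canary(data: bytes, return_addr: bytes = b"RETN") -> bytes:
    # "Prologue": lay out the frame as [buffer][canary][return address].
    frame = bytearray(BUF_SIZE) + CANARY + return_addr
    # "Body": an unchecked copy that can run past the buffer.
    # (Clipped to the frame size only to keep the simulation simple.)
    n = min(len(data), len(frame))
    frame[:n] = data[:n]
    # "Epilogue": verify the canary before the return address is used.
    if bytes(frame[BUF_SIZE:BUF_SIZE + len(CANARY)]) != CANARY:
        raise StackSmashDetected("canary changed; halting before return")
    return bytes(frame[BUF_SIZE + len(CANARY):])
```

A short input returns b"RETN" untouched. An overflow that happens to contain the right canary, such as b"A" * 8 + CANARY + b"EVIL", gets its own return address back, which is the "attacker knows the canary" case. Any other overflow raises StackSmashDetected before the function returns.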
Re:An important security sidenote (Score:3, Informative)
Re:Security Issues (Score:2, Informative)
"Just to be clear, unparseable XHTML is not XHTML."
And broken HTML is not HTML.
The reason browsers try to parse broken HTML is not because the HTML spec requires them to do so (it doesn't, and it gives such documents no semantics). It's because neither early browsers nor page authors followed the specs strictly; early browsers would try to render malformed pages (either deliberately or through not explicitly rejecting such pages), and early page authors would (usually unwittingly) exploit this fact.
If the first HTML renderers had followed the HTML spec and no more then the web would not be the mess it is today.
XHTML doesn't really fix any of this; it resolves a small class of ambiguities that un-DTDed HTML hypothetically has (in HTML one needs to refer to the DTD to determine whether something of the form <img> is an empty element or a malformed element that lacks its closing tag (the DTD says whether things are empty or not); in XML (and hence XHTML) it's unambiguously an error because XML requires even empty elements to have closing tags, or use special shorthand). But it's not this that makes it easier to parse; that alone has negligible impact on ease of parsing.
Instead, it's the attitude that goes along with it--if it's not well-formed, reject it with an error message. There's no reason that the HTML spec couldn't be held in similarly high esteem.
Re:An important security sidenote (Score:3, Informative)
In the case of Mozilla it's really a resource and prioritization issue more than anything else: see http://it.slashdot.org/comments.pl?sid=126192&cid
Not that that's an excuse.
Re:I'd like to see a Safari test. (Score:3, Informative)
No, the reason Safari doesn't crash is that it stops reloading the page! Even though there's a meta refresh tag at the top, Safari ignores it after 3 refreshes.
If you keep reloading the page by hand, Safari will eventually crash.
So a bug in Safari actually prevents the bug checker from running.
Re:An important security sidenote (Score:3, Informative)
NO, no, no, no!! It is a BAD thing, because at the very minimum it's a sign of non-existent exception handling. You should never get a runtime error from bad input. In some cases, you create an infinite loop-- is there any excuse for that?
Yes. There is a perfect excuse for that - to fix it you have to solve the unsolvable halting problem from computer science, which I assume you are already aware of. Can a C compiler determine if the C code it is compiling will loop forever? No. Can an interpreted language like the Bourne shell figure out if the input shell script it is processing will result in an infinite loop? No - being interpreted instead of compiled doesn't let you fix the halting problem. Looked at this way, the HTML engine inside a browser is in fact actually a program interpreter, with HTML as the source code. Thus the only way to catch the halting problem is to deny possibly valid runs, as we all learned in CompSci. In this case, that's probably exactly what IE is doing (for example, in theory rendering a table of 10,000 columns is a finishable task and not an infinite loop, and therefore it would be wrong for an interpreter to deny the program the ability to do it. But in the case of a rendering engine for viewable content, it can safely assume that such a task would never work anyway, and cut it off at a max cap.)
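A "max cap" of this kind can be very small. The sketch below, in Python, clamps a colspan-style attribute before anything is allocated for it. The limit of 1000 and the name safe_colspan are assumptions picked for illustration; nothing here is known about what IE actually does.

```python
MAX_COLSPAN = 1000  # hypothetical limit, chosen only for this example


def safe_colspan(raw: str) -> int:
    """Turn an untrusted attribute string into a span between 1 and the cap."""
    try:
        value = int(raw.strip())
    except ValueError:
        # Not a number at all: fall back to the default span.
        return 1
    # Clamp negative, zero, and absurdly large requests into a safe range.
    return max(1, min(value, MAX_COLSPAN))
```

So safe_colspan("3") stays 3, while "100000", "-5" and "abc" all come back as something the layout code can handle. This denies some theoretically valid input, which is exactly the trade-off described above.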
And considering the nature of the crashes (one of the links caused Firefox 1.0PR to die with a windows memory error, shutting down ALL instances of firefox) this means that some memory was accessed that shouldn't have been,
This is not necessarily true. When some kinds of input trigger a crash when others don't, the cause MIGHT be a case where the input can stuff values into buffer overruns, but it doesn't have to be. The unusual input could trigger a conditional branch that is not normally run, and has a bug in it that crashes. The unusual input could cause a variable initialization to be skipped because of such a conditional check (such that it did cause a variable to be altered, but not in a way that the input could control). The unusual input could simply be a case of picking a bigger number than the program was expecting to have to handle, and thus causing a hardcoded loop somewhere to process too far through an array (in which case there is a buffer overflow, but not one that lets the user stuff whatever he likes into that overflow.) It could be a case of the program not being able to handle the large amount of memory it would need to (validly) perform the request (as in, "try to render this 100,000 column table."), and the crash could just be the result of such a thing leading to a failed malloc().
Safari crashes just like everything else. (Score:3, Informative)
The true test is to download the man's application mangleme.cgi, install it on a server, and then point Safari at it.
I have done this and Safari (1.3 developer release) crashes quite quickly.
So does IE 5.2 for the mac.
As a side note, Netscape/Mozilla has had something similar, but not quite the same, for some time now, called Browser Buster. It did not generate random HTML, but did continuously feed real websites, chosen somewhat at random, until the browser crashed. I remember we used to have goals, like "last 24 hours on browser buster".
Joseph Elwell.
Re:This "random" test is dangerously incomplete. (Score:2, Informative)
> to make IE look better than its competitors and prove some point rather
> than be an objective investigation.
It sounds like you have little idea who the author is, or you wouldn't make such a statement. Michal Zalewski is a well-respected security researcher, with impeccable credentials, and no particular love for Microsoft, who's made an undeniably valuable contribution in many areas of IT security.
While he generally seems to work on Unix-like systems, he has also published work on M$ software security problems - e.g.
http://www.bindview.com/Support/RAZOR/Advisories/2001/adv_mstelnet.cfm [bindview.com]
http://news.softpedia.com/news/2/2004/April/7797.shtml [softpedia.com]
http://cert.uni-stuttgart.de/archive/bugtraq/2000/06/msg00066.html [uni-stuttgart.de]
A quick google will repay your time.
Give the guy some credit - it seems he's uncovered a surprising lack of robustness in non-IE browsers - and admittedly an even more surprising degree of resilience in IE's handling of the HTML tag soup he played with ... strange but apparently true :-)
Re:Who's Who (Score:2, Informative)
Make that Larry Osterman -- current Microsoft guy.