Forgot your password?
typodupeerror
Operating Systems Software Technology

World's First Formally-Proven OS Kernel 517

Posted by Soulskill
from the wait-for-it dept.
An anonymous reader writes "Operating systems usually have bugs — the 'blue screen of death,' the Amiga Hand, and so forth are known by almost everyone. NICTA's team of researchers has managed to prove that a particular OS kernel is guaranteed to meet its specification. It is fully, formally verified, and as such it exceeds the Common Criteria's highest level of assurance. The researchers used an executable specification written in Haskell, C code that mapped to Haskell, and the Isabelle theorem prover to generate a machine-checked proof that the C code in the kernel matches the executable and the formal specification of the system." Does it run Linux? "We're pleased to say that it does. Presently, we have a para-virtualized version of Linux running on top of the (unverified) x86 port of seL4. There are plans to port Linux to the verified ARM version of seL4 as well." Further technical details are available from NICTA's website.
This discussion has been archived. No new comments can be posted.

World's First Formally-Proven OS Kernel

Comments Filter:
  • spec? (Score:5, Insightful)

    by polar red (215081) on Thursday August 13, 2009 @07:59AM (#29049945)

    but is the spec useable ? bugfree ?

    • Is the hardware itself bug free?!
      • by VoidCrow (836595)
        Perhaps not currently but, given the work on high level hardware compilation languages, it's only a matter of time before similar techniques are applied to hardware design.
      • by EdZ (755139)
        For older hardware with well-known errata, yes. Why do you think modern fighter aircraft and spacecraft run on 486-derivatives and the like?
    • Re:spec? (Score:4, Insightful)

      by jadrian (1150317) on Thursday August 13, 2009 @08:38AM (#29050277)
      To verify the software meets its specification the specification itself must formalised in the theorem prover. This in turn gives you the possibility of proving properties of the spec itself.
    • Re:spec? (Score:5, Informative)

      by JasterBobaMereel (1102861) on Thursday August 13, 2009 @08:45AM (#29050349)

      It only means it meets the spec, not that the spec is correct ...

      It does not mean that the faulty or erratic hardware cannot crash the system

      It does not mean that other programs cannot crash and lose your data ...

      It does not mean that buggy device drivers can make your system unusable

      It does not mean that the system is perfect, only that it will always do what it is supposed to ... which may not be what you want ...

    • but is the spec useable ? bugfree ?

      And does it match the user or system needs ? And how easily can it be modified to cover new needs ?

      • Re:spec? (Score:4, Insightful)

        by TheLink (130905) on Thursday August 13, 2009 @11:42AM (#29053189) Journal
        Exactly. That's why these formal verification stuff is rather useless for most cases I see.

        If you pass the customer a mix of water, flour, yeast, eggs and sugar and the customer says "That's not cake, it's not acceptable".

        And you then say - "But we meet the cake spec we agreed on, so by that definition it's cake, you have to pay us".

        Sure you can go sue the customer and force them to pay you the full sum, but unless most other people agree the customer has just been way too fussy, you might have fewer customers in the future.
  • Thank goodness (Score:2, Insightful)

    by 2.7182 (819680)
    People are starting to see the value of this. Also of programming in logic based languages like Haskell, ML etc.
    • Re: (Score:2, Insightful)

      As opposed to programming languages that aren't logic-based? What separates Haskell, Ocaml, Lisp, Scheme, etc. from others is that they are functional programming languages, as opposed to imperative-based languages like C.

      Both use logic, just in different ways.

    • Re:Thank goodness (Score:4, Insightful)

      by johannesg (664142) on Thursday August 13, 2009 @08:16AM (#29050071)

      People are starting to see the value of this. Also of programming in logic based languages like Haskell, ML etc.

      People have seen the value of this since the first days of programming. In fact, the value is so enormous that no one can afford it... And they have just finished proving that first few lines of code they wrote. In another five decades they hope to be able to have Notepad proven and ready to run so you can actually get some work done!

      • In another five decades they hope to be able to have Notepad proven and ready to run so you can actually get some work done!

        Progress has been slow, mainly because for most of the last 20 years, they've been hung up trying to formally prove that a newline character looks like a rectangle.

  • Amiga Hand? (Score:5, Informative)

    by ArcadeNut (85398) on Thursday August 13, 2009 @08:03AM (#29049981) Homepage

    Or do you mean the "Guru Meditation Error"?

    • by Canazza (1428553)

      I prefer SACF interrupts, much more exciting

  • The Amiga Hand? (Score:5, Interesting)

    by johannesg (664142) on Thursday August 13, 2009 @08:05AM (#29049995)

    The Amiga Hand was the boot screen, not an error screen. You're thinking of the famous Guru Meditation.

    Besides, who says that the Amiga kernel did _not_ meet the specifications? Have you read them? Does it mention "crash free" anywhere?

    The Haskell code is called a "specfication", but if it is Haskell code, surely it should count as a _program_ already? How can you prove that that program is bug-free? How about conceptual bugs?

    Was the toolchain verified? How about the hardware on which it runs?

    What overhead does this approach have? Are the benefits worth it?

    I'm not saying this is all bullshit, but it looks like me that they are pointing to one program, calling it a "specification", and then demonstrating machine translation / verification to a similar program. I'm not sure if I buy that methodology.

    • Re:The Amiga Hand? (Score:4, Interesting)

      by TheCycoONE (913189) on Thursday August 13, 2009 @08:17AM (#29050083)

      Well the typical bugs that affect C programs, like buffer overflows, using a dereferenced pointer, etc; along with common mistakes made in procedural programming in general like off-by-one errors are much less likely to come up in a functional language like Haskel. In a lot of cases Haskel code is simply a LOT shorter and easier to read than it's C/C++ counterpart which makes it much easier to find a problem; not much harder than finding the problem in a spec on paper.

      So, no I don't think it guarantees anything, but it's a lot better than C code on its own.

    • Re:The Amiga Hand? (Score:5, Informative)

      by Anonymous Coward on Thursday August 13, 2009 @08:31AM (#29050217)

      The missing word is formal.

      They use a formal specification [wikipedia.org], which is then formally verified [wikipedia.org].

      The overhead? It took something like 5 years for a 10,000 line program. The benefit is if the specification is right, the program should be right.

      Other questions are answered in the FAQ linked in the summary and this page [nicta.com.au].

    • Re:The Amiga Hand? (Score:5, Interesting)

      by Tom (822) on Thursday August 13, 2009 @08:43AM (#29050321) Homepage Journal

      How can you prove that that program is bug-free? How about conceptual bugs?

      Formal verification does not tackle conceptual bugs. What it does is prove that the implementation conforms to the specification. If your specification is false, then it is false, but the implementation will correctly implement the false behaviour. In other words, this checks whether the house and the building plan are identical. If the plan has a window where there shouldn't be one, then that window will be there, because it's on the plan.

      What overhead does this approach have? Are the benefits worth it?

      RTFA. The amount of work required is staggering (four years, 200,000 theorems to prove) but since it's a verification of code, not additional testing code, there is zero overhead when the system is running.

  • by Joce640k (829181) on Thursday August 13, 2009 @08:10AM (#29050015) Homepage

    "If you give me six lines of code written by the most diligent of programmers, I will surely find enough in them to crash the OS" - Cardinal Ritchielieu

  • by WiseOwl2001 (742135) on Thursday August 13, 2009 @08:11AM (#29050021)
    Even if we have a perfect kernel, it won't insulate us from bugs in the software running on top of that kernel, so do we really gain much? I guess for mission critical apps the answer could be yes... But for every-day computing?? On my desktop I have more trouble with Firefox crashing than I do the OS! (Yes I run linux).
    • by John Hasler (414242) on Thursday August 13, 2009 @08:15AM (#29050057) Homepage

      > Even if we have a perfect kernel, it won't insulate us from bugs in the
      > software running on top of that kernel, so do we really gain much?

      Since a kernel crash kills all your applications and background processes, kills your network connection, requires you to reboot, and can scribble anywhere on the disk, yes.

      • Re: (Score:2, Informative)

        by Fri13 (963421)

        Well, kernel bug brings down whole OS and all the applications running top of it. But the kernel is usually most secure and stable. Most errors comes from drivers.

        http://www.usenix.org/publications/login/2006-04/openpdfs/herder.pdf [usenix.org]

        And that is the problem with the monolithic OS's. One bug in the driver or any other part of the OS, brings it down. But even it is about monolithic OS, it can be as secure as microkernel-based OS's like Hurd, Minix or NT.

        And I believe that the star of the next 10 years will come

    • by Tom (822) on Thursday August 13, 2009 @08:47AM (#29050365) Homepage Journal

      Yes, it gains you a lot.

      Firefox crashing means your userland memory is fucked up and can't be trusted anymore. No problem, kill it, clean it and restart the application.

      A kernel crash leads to undefined behaviour on the ring 0 level. You don't want that, it's where root exploits live.

      Furthermore, we have a lot of really, really strong kernel-level security extensions, like SELinux, whose only two vulnerable spots are kernel-level exploits and weak security policies. If you can remove one of them, you've done a lot to improve security.

      • by Arlet (29997)

        Firefox crashing means your userland memory is fucked up and can't be trusted anymore. No problem, kill it, clean it and restart the application.

        If your filesystem task running on top of a proven kernel has a bug, it can still corrupt your disk.

  • by John Hasler (414242) on Thursday August 13, 2009 @08:11AM (#29050027) Homepage

    "Beware of bugs in the above code. I have only proven it correct, not tested it."

  • Provable? (Score:2, Interesting)

    by tedgyz (515156) *

    I thought any sufficiently complex system was impossible to prove correct.

    • Well, one thing you can do is cut and paste your code into the specification. Now your code perfectly matches the specification---and is coded as specified.

      They seem to be doing the same thing, except going about it some backwards way.

      • by mwvdlee (775178)

        Well, one thing you can do is cut and paste your code into the specification. Now your code perfectly matches the specification---and is coded as specified.

        That would be doing it backwards...

        They seem to be doing the same thing, except going about it some backwards way.

        ...and doing a backwards thing backwards would be the right way again.

    • Re:Provable? (Score:5, Insightful)

      by jamesh (87723) on Thursday August 13, 2009 @08:33AM (#29050225)

      I thought any sufficiently complex system was impossible to prove correct.

      Then obviously this OS is not sufficiently complex.

    • Re: (Score:2, Informative)

      by bramez (190835)

      only if you want to prove it automatically. this proof is constructed by hand, with some automatical steps in between, and then *verified* automatically. to manage the complexity of the proof, the proof is done in a more abstract formal specification language , then refined two times (haskel, C) . the correctness of the refinements are formally proven by the Isabelle system.

    • Well, there are some theoretical problems with this idea. The halting problem is trivial to prove insolvable, and it turns out that it's possible to generalise this proof and demonstrate that it's not possible to prove any complex properties of arbitrary programs. The key word here is arbitrary. You can, for example, trivially (algorithmically; the space / time requirements can be very large) solve the halting problem for any program that can be represented as a finite automaton. This includes any progr

  • Suddenly after that, the proven kernel will be brought to its knees when someone adds a driver for an old graphics card.

  • But... (Score:2, Funny)

    ... we will lost all the fun for Blue Screens and Kernel Panics!
  • Proven? (Score:5, Insightful)

    by mseeger (40923) on Thursday August 13, 2009 @08:28AM (#29050175)
    There is an old corollary that says, you cannot get from the informal to the formal by formal means. All they have proven is, that two specifications contain the same bugs. Both specification were formal (Haskell, C). This is the same as having Perl and Python code and you to prove they implement the same functionality. Neither is a proof, it is bug free (informal definition of bug, not if a bug is specified it isn't a bug).
    • by Viol8 (599362)

      This is something that people who bang on about formal proofs conveniently forget - all it does is move the bugs from the source code to the formal specification. And if the spec is detailed enough to be useful it effectively becomes code anyway so you might just as well write the actual code and debug *that* instead.

      • This is what executable specification languages, like Maude, actually do. I'm still not convinced by the approach, and I did my PhD in a department full of people working on formal specification, because it seems like they end up reinventing programming languages. That's not to say that they don't have some good ideas which can be used to extend programming languages, but starting by discarding 60 years of programming language research doesn't seem like a good model.
      • Re: (Score:3, Insightful)

        by Anonymous Coward

        That this moves the bugs to the formal specification is true, but that therefore you might just as well write the actual code is an invalid derivation. Formal specification languages are designed with the idea that one should be able to reason about them in mind (be it manually or with the help of automatic/interactive theorem proving, model checking). This typically leads to a language in which systems can be expressed on a higher level, because performance issues are not important: the specification doe

      • Re: (Score:3, Insightful)

        by ralphbecket (225429)

        I don't thinks that's fair: the spec. just says *what* properties the system should have; the implementation says *how* those properties are provided. Where you can do it, denotational semantics (the spec.) let you say a lot more than you can with operational semantics (the code).

        Of course, your argument also suffers from circularity: to debug code you have to have a spec. in the first place to tell you when the implementation is doing the wrong thing.

  • the kernel was so perfect that it thought it was god and it condemned the rest of the OS to hell
    • by jamesh (87723)

      the kernel was so perfect that it thought it was god and it condemned the rest of the OS to hell

      Windows is a good example of an OS that _thinks_ it is perfect. Fortunately Microsoft's idea of hell is my idea of heaven :)

  • by neongenesis (549334) on Thursday August 13, 2009 @08:35AM (#29050241)
    Do some research Guys...

    Honywell SCOMP Early 1980s Was intended to me a secure front-end to Multics.

    Verified by NSA et all as part of the first Orange-book A1 level certification.

    For the time it was a magnificent bit of work.

    • Re: (Score:3, Insightful)

      by TheRaven64 (641858)
      Thanks, I was surprised by this claim too. I remember hearing back in 2007 about a formally-verified kernel written in Ocaml that some people at Cambridge had written, and they weren't claiming that theirs was the first (although it may have been the first to run on commodity x86 hardware).
  • I was fairly sure the OS running on the MONDEX smartcards was formally proven.

  • This should once and for all silence the people who claim that it's impossible to formally verify any non-trivial code.

    Oh, what am I dreaming. They won't shut up, but maybe now they'll die out.

    • by Desler (1608317)
      No, what people claimed was that to formally verify any non-trivial code that it would take a long time to do. This announcement proves that. It took them 5 years to verify 10,000 lines of code for something that isn't even that complex of a piece of code. Come back to us when you can formally verify a sufficiently complex kernel that can actually be put into production in less time and we'll talk.
  • by pedantic bore (740196) on Thursday August 13, 2009 @08:44AM (#29050335)
    This is a nice accomplishment, but L4 is such a minimal kernel that some folks argue that it's not even a microkernel. It's a picokernel.

    It's a lot easier to get the kernel right when it only has twelve entry points...

    • Re: (Score:3, Informative)

      by noz (253073)

      This is a nice accomplishment, but L4 is such a minimal kernel that some folks argue that it's not even a microkernel. It's a picokernel.

      Microkernels provde:

      • address spaces
      • threads
      • IPC

      L4 checks these boxes. Nanokernel and picokernel are poorly defined, emotive terms. Not reasonable criticism.

  • Not very useful... (Score:2, Insightful)

    by jamesivie (805019)

    In addition to what's already been pointed out about formal vs. informal specifications,

    1. Running Linux on top of this is no more safe than running Linux directly on the hardware--it's Linux that crashes (though not very often!)

    2. Running this on a CPU that has not been formally proven is nearly useless because only part of the system has been proven, which results in an overall system that is unproven.

    • That depends. You can run multiple instances of Linux on it, for example, and have it isolate them. A privilege escalation vulnerability in Linux will only compromise that instance of Linux and (at least in theory) there should be no vulnerabilities in the microkernel that would allow the attacker to escape that instance. You could, for example, run an instance of Linux for normal use, an instance of Linux for internet banking, and an instance of Linux for normal web browsing and have them completely iso
  • by Jacques Chester (151652) on Thursday August 13, 2009 @08:48AM (#29050389)

    First off, as an Australian and a nerd, I am very proud.

    Now.

    Good news: there is now a formally verified microkernel. 8,700 lines of C and 600 odd lines of ARM assembly. Awesome.

    Bad news: it took 200,000 lines of manually-generated proof and approximately 25 person-years by PhDs to verify the aforementioned microkernel.

    Conclusion: formal verification of software is not going to take off any time soon.

    • At the speed commercial IT moves right now? No.

      Eventually? I think it would be in the best interest of everyone to work toward making this the normal operation in CS.

  • he would probably say:

    "Please be aware of bugs. I just proven this kernel to be correct, didn't actually run it" ;)

  • Great work, but... (Score:3, Informative)

    by Salamander (33735) <jeff AT pl DOT atyp DOT us> on Thursday August 13, 2009 @09:04AM (#29050581) Homepage Journal

    ...there are two caveats. One is that "executable specification in Haskell" is a bit of a dodge. It means that they have effectively verified the translation from Haskell to C, but if there are bugs in the Haskell then there will be bugs in the C as well (in fact this is part of what they've proven). They admit as much at the very top of http://ertos.nicta.com.au/research/l4.verified/proof.pml [nicta.com.au], but it still leaves the question of why this is any better than simply rewriting the microkernel in Haskell. If the benefit of a concise and abstract specification is lost, what's left?

    The second major caveat is that they had to leave out some features that they couldn't verify. Most significant is that they didn't verify multi-core - by which it seems they mean multi-processor, but "multi-core" is the much hipper word. Since even laptops are multi-processor now, this limits applicability until they take the next step. With all of these shortcuts, though, if you look at the list of what they did manage to verify - mostly pointer errors and overflows - it's not clear that it couldn't have been done with a more real-code-oriented tool such as the Stanford Checker or sparse.

  • The kernel has now been proven to implement buggy specifications precisely!

  • by Remus Shepherd (32833) <remus@panix.com> on Thursday August 13, 2009 @09:14AM (#29050677) Homepage

    As someone who does not work in IT, count me as surprised that not all OSes are tested this rigorously.

    • Re: (Score:3, Insightful)

      by maxwell demon (590494)

      Linux has more than ten million lines of code. Given that they needed 5 years for 12 persons to verify ten thousand lines of code, this means verifying the Linux kernel would give an estimated cost of 60,000 man-years. So even if they got a thousand people doing nothing else but verifying the Linux kernel, it would take then 60 years to finish.

    • Re: (Score:3, Insightful)

      by DerekLyons (302214)

      That more like "not only doesn't work in IT, but doesn't know of or understand the theory". No slam, just the facts.

      Seriously, the reason all OSes aren't tested this vigorously is that the time/amount of work needed to do so rises exponentially with the size/complexity of the OS. Not only does that imply the OS would take years to bring to market, but also that said OS would incredibly expensive. Worse yet, after all that time and expense, all you've actually proven is that your running code matc

  • I'm not a computer scientist, so this may be obvious, but how does the halting problem come into this? I mean, the summary suggests that you know this OS won't crash, but wouldn't a general method for proving that be solving the halting problem? I'm guessing that the answer is that they've just proved it meets the spec, and the spec cannot be proved not to crash. Or perhaps it's that they've proved that this program won't crash by a method that's not generally applicable to all problems, so they would

  • by autophile (640621) on Thursday August 13, 2009 @10:28AM (#29051895)
    I have written a truly marvelous bug-free operating system, however there is not enough space on this disk to include it here.
  • by chongo (113839) * on Thursday August 13, 2009 @09:04PM (#29060579) Homepage Journal
    This Slashdot article, referring to the so called "World's First Formally-Proven OS Kernel",was brought to my attention by a colleague who is aware of my experience with formally proven OS' in the 1980's. What follows is my response to the claim of being first, and the value of proving the correctness of an OS:

    I am aware of at least two instances of operating system kernels that were built in the late 1970's / early 1980's using formal proofs of correctness. I will talk about my experience with one of them.

    One of them, produced in the late 1970's was a kernel designed for a specialized environment. This kernel/OS was a reasonably functional kernel complete with multiprocessing, time-sharing, file systems, etc. Unfortunately while the formal proof for this kernel was solid, the axiomatic set on which this formal proof was based did not perfectly match its operating environment. This mismatch proved to be fatal to the OS security.

    This formally proven OS took years to create and prove its correctness. Those who developed and maintained the OS were very proud of their work. There were plans underway to create a commercial version of their work and to market it through a certain hardware vendor on which their OS ran.

    When I was a student intern working for the organization where that developed this OS worked, I worked in their OS environment from time to time. I came in from the outside where my OS experience was with a wide variety OS' such as MVS, NOS, KRONOS, TOPS-10, RSTS/E, VMS, Multics, and Unix (5th/6th/7th edition). I had enough experience in jumping into new OS environments that I felt comfortable as a user in this one, even though it was unusual.

    An important point to observe here is I was one of the first people who enter this OS environment from the outside. I was not steeped in the development world of the OS. I brought with me, ideas that differed from the OS developers. As a young student, I believed that the OS should work for me instead of getting in my way. To help come up to speed, I ported over my collection of OS-independent tools and soon began coding away on my assigned tasks.

    Then one day, working within my OS-independent tools, something very odd happened. By mistake, I did something that produced an unusual result. I quickly realized that something was very wrong because the result was "impossible" according to the formal proof of OS correctness. Under the rules set down by my employer I immediately contacted the appropriate security officer and the next thing I knew, I was in a long sequence of meetings with people trying to figure out what in the hell happened.

    In the first meeting after my mistake, I learned that I had been reassigned to a new team. I was assured that I was not being disciplined, far from it: I had made a number of people very happy and they moved paperwork mountains to move me into their team. This team was given a task of attempting to repeat my previous "mistake" as well as to discover if exploits that are more general were possible against this OS. We were assigned to work âoeundercoverâ as developers under test/QA installations using this OS. In the end, the team was successful in discovering a much more general form of the OS hole I accidentally found.

    What went wrong with the OS formal proof? Well the mapping from the formal logic world to the real world of hardware, physics, people, and the environment was flawed. In other words, when you added in the "real-world", the proof was incomplete. Attempts by the OS developers to expand their proof to address the "real-world" only produced a set of inconsistent theorems. I believe the OS project was abandoned after the OS developers failed multiple times to expand their formal proof to deal with âoereal-worldâ environments.

    During this experience I was introduced to two "Security Camps": One, "the absolutists" as they called themselves, included people who worked on this formally proven OS. The opposing camp called the

HOLY MACRO!

Working...