Google Native Client Puts x86 On the Web
Posted by
timothy
on Tue Dec 09, 2008 10:33 AM
from the which-can-then-be-virtualized-ad-infinitum dept.
t3rmin4t0r writes "Google has announced its Google native client, which enables x86 native code to be run securely inside a browser. With Java applets already dead and buried, this could mean the end of the new war between browsers and the various JavaScript engines (V8, Squirrelfish, Tracemonkey). The only question remains whether it can be secured (ala ActiveX) and whether the advantages carry over onto non-x86 platforms. The package is available for download from its Google code site. Hopefully, I can finally write my web apps in asm." Note: the Google code page description points out that this is not ready for production use: "We've released this project at an early, research stage to get feedback from the security and broader open-source communities." Reader eldavojohn links to a technical paper linked from that Google code page [PDF] titled "Native Client: A Sandbox for Portable, Untrusted x86 Native Code," and suggests this in-browser Quake demo, which requires the Native Client plug-in.
Related Stories
IT: Google NativeClient Security Contest 175 comments
An anonymous reader writes "You may remember Google's NativeClient project, discussed here last December. Don't be fooled into calling this ActiveX 2.0: rather than a model of trust and authentication, NaCl is designed to make dangerous code impossible by enforcing a set of rules at load time that guarantee hostile code simply cannot execute (PDF). NaCl is still in heavy development, but the developers want to encourage low-level security experts to take a look at their design and code. To this end Google has opened the NativeClient Security Contest, and will award prizes topping out at $2^13 to top bug submitters. If you're familiar with low level security, memory segmentation, accurate disassembly of hostile code, code alignment, and related topics, do take a look. Mac, Linux, and Windows are all supported."
doesn't sound too secure yet (Score:5, Insightful)
This is not a good thing: by definition x86 code is not portable across platforms.
Secure or not, it goes against the main founding principle of the web, which is portability. There are other ways to solve the performance issue, I thought just-in-time compilers were getting pretty close anyway (50% according to http://www.mobydisk.com/softdev/techinfo/speedtest/index.html [mobydisk.com]).
On the security side, I'll just quote Google's description: "modules may not contain certain instruction sequences". That doesn't sound like a robust way to detect malicious code.
http://fairsoftware.net/ [fairsoftware.net] where software developers share revenue from the apps they create together
Re:doesn't sound too secure yet (Score:5, Funny)
*Someone may have thought of this already.
Parent
Re:doesn't sound too secure yet (Score:5, Insightful)
Those that don't understand java are doomed to repeat it.
Parent
Re: (Score:3, Funny)
It's a Java system! I know this!
Re: (Score:3, Funny)
Yep, they're re-inventing the wheel, how cool is that?
Re: (Score:3, Interesting)
Worse - their example doesn't even make sense ...
Re:doesn't sound too secure yet (Score:4, Insightful)
Those that don't understand java are doomed to repeat it.
Given that Java started as a poor re-implementation of other VM based OO languages that just makes me want to weep.
Parent
Re: (Score:3, Insightful)
Isn't Java run in an emulated fashion on all platforms? Isn't that part of the 'slow' image that it cultivated in its early years, that it was too slow due to the emulation of the Java 'virtual machine'?
Is the problem here that this could mean some machines won't be as slow as others, or just that it's x86?
What exactly is the difference, outside of one having a much larger code base to 'exploit' and the potential for a huge speedup on machines that can natively handle x86 code?
Re:doesn't sound too secure yet (Score:5, Informative)
Parent
Re:doesn't sound too secure yet (Score:5, Interesting)
Pretty much all of Sun's offerings have HotSpot built in, which provides JIT compilation for the JVM. IBM's JVM, BEA, etc. all have JIT features. Google's Android has Java-like Dalvik, which is slow as balls and doesn't have JIT functionality.
Some ARM processors are capable of executing Java bytecode natively. The device developers have to pay for that feature though.
Really, it sounds like Google is poorly trying to reinvent Java. They've tried this with Android already and it doesn't work so hot from a performance standpoint.
Parent
Re:doesn't sound too secure yet (Score:5, Interesting)
Parent
Re:doesn't sound too secure yet (Score:4, Interesting)
1) They started the Android project before Java was open-sourced.
2) Sun has slightly different licenses for desktop and mobile use. The desktop license is GPL with a classpath exception (letting you write non-GPL java apps to run on the virtual machine), the mobile license is straight GPL. Google didn't want to force developers to only produce GPL apps for Android, so they could not use this.
See Stefano's blog [betaversion.org]
Parent
Re:doesn't sound too secure yet (Score:5, Informative)
An interpreter compiles each instruction every time it gets executed.
JIT compiles blocks of code only on first execution. Next time, the compiled code is already in memory.
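A toy sketch of the distinction drawn above, with "translation" standing in for the expensive step. Everything here is hypothetical: the two-instruction language, the `translate` helper, and the cache are illustrations only, and turning an instruction into a Python closure is not what a real JVM or JIT does.

```python
# Toy illustration only: "translation" turns a tiny (op, operand)
# instruction into a Python closure, not into real machine code.

translate_calls = 0  # counts how often the translation cost is paid


def translate(instr):
    """Turn one (op, operand) pair into a callable acting on an accumulator."""
    global translate_calls
    translate_calls += 1
    op, operand = instr
    if op == "add":
        return lambda acc: acc + operand
    if op == "mul":
        return lambda acc: acc * operand
    raise ValueError(f"unknown op: {op}")


def run_interpreted(block, acc):
    """Interpreter style: re-translate every instruction on every execution."""
    for instr in block:
        acc = translate(instr)(acc)
    return acc


_compiled_cache = {}  # block -> list of already-translated callables


def run_jit(block, acc):
    """JIT style: translate a block on first execution, then reuse the result."""
    compiled = _compiled_cache.get(block)
    if compiled is None:
        compiled = [translate(instr) for instr in block]
        _compiled_cache[block] = compiled
    for fn in compiled:
        acc = fn(acc)
    return acc


block = (("add", 2), ("mul", 3))  # hashable, so it can serve as a cache key

for _ in range(100):
    run_interpreted(block, 1)
interpreted_cost = translate_calls  # one translation per instruction per run

translate_calls = 0
for _ in range(100):
    run_jit(block, 1)
jit_cost = translate_calls  # one translation per instruction, first run only
```

Both runners compute the same result; only the number of translations differs, which is the point the parent post is making.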
Parent
Re:doesn't sound too secure yet (Score:4, Funny)
Parent
Re: (Score:3, Insightful)
Why not? It just means that the permissible instruction sequences are limited to a subset that can be statically analyzed and verified to be safe. The Java VM has similar verification algorithms that are run whenever untrusted code is first loaded.
It's true that this does not allow all x86 code to run; it's at least practically (an
Re:doesn't sound too secure yet (Score:5, Insightful)
One of the key differences is that Java code and data are separated to the point of paranoia. I cannot load a classfile as data and pass through execution to the native system. With the x86 instruction set, I can load a data file and execute a jump to a data segment without the code having passed through any sort of system loader. A VM would have to take this into account. Not to mention common issues of stack smashing, heap overflows, and other common memory tricks to execute unwanted code.
When you're managing native code, it only takes one slip-up to hand over the keys to the kingdom. That slip-up may be as simple as a two-byte exploit, but it's a slip-up nonetheless. One must be VERY careful with native code because there is no way to prove that it is safe to execute natively.
Hypervisor features in modern processors simplify the issue somewhat, but it is still not proven that hypervisors are without exploits. Not to mention the overhead of running dozens of simultaneous hypervisor environments.
Java and Javascript have it right. Java bytecode is provably correct because it targets an ideal machine. Thus the code can be translated into well-behaved native code with the linkage between data and code managed during or after translation. Javascript is just as good because it provides an abstract execution environment that must rely on exposed APIs to accomplish any interaction with the system. It is provably not possible (shy of an underlying flaw in the browser) for Javascript to break through its execution engine into a native runtime.
The two platforms may be paranoid, but when you're dealing with security on the scale of the World Wide Web, "better safe than sorry" is a good motto.
Parent
Re:doesn't sound too secure yet (Score:5, Informative)
Holy crap. AKAImBatman I usually enjoy your posts, but it's painfully clear nobody on this thread - including you - has actually read the paper.
If you had, you'd see that this system is secure. It's simple yet clever at the same time. By using a combination of x86 segmentation (which ironically you say is never used anymore!), alignment rules, static analysis and - crucially - masked jumps, it's possible to ensure that native code cannot synthesize unverified code in memory and then jump into it. If you can prevent arbitrary code synthesis, you can control what the program does. It's as simple as that.
Even though the verifier for this system is microscopic (compared to, say, a JVM), and so much more likely to be correct, NativeClient also includes a ptrace sandbox to provide an additional redundant level of protection.
I don't blame you, because until I read the paper I also believed this. Once you read it you'll slap your forehead and say, my god, it's so simple. Why didn't I think of that?
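For anyone who wants the masked-jump trick mentioned above in concrete terms, here is a rough sketch in Python rather than assembly. It assumes a 32-byte alignment unit ("bundle") and 32-bit addresses, which is my reading of the paper and should be treated as an assumption; the addresses and the `validated_starts` set are made up for illustration.

```python
BUNDLE = 32                          # assumed alignment unit in bytes
MASK = ~(BUNDLE - 1) & 0xFFFFFFFF    # clears the low 5 bits of a 32-bit address


def masked_target(addr):
    """Force an indirect-jump target onto a bundle boundary."""
    return addr & MASK


# Hypothetical validated code region: the verifier has confirmed that an
# instruction starts at each of these addresses, including every bundle
# boundary (0x1000 and 0x1020).
validated_starts = {0x1000, 0x1003, 0x1008, 0x1020, 0x1025}

# An attacker-chosen target that points into the middle of an instruction.
evil = 0x1004
landed = masked_target(evil)

# After masking, control can only arrive at a bundle boundary, and every
# bundle boundary is a known instruction start.
reaches_verified_code = landed in validated_starts
```

The mask does not make the jump go where the program wanted; it only guarantees the jump lands somewhere the verifier has already looked at, which is all the sandbox needs.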
Parent
Re:doesn't sound too secure yet (Score:5, Funny)
(I just love it when my browser runs unmanaged code full of unverified branch statements!)
Bah! Where's your sense of adventure?!?
Parent
Re:doesn't sound too secure yet (Score:4, Informative)
Why not? It just means that the permissible instruction sequences are limited to a subset that can be statically analyzed and verified to be safe. The Java VM has similar verification algorithms that are run whenever untrusted code is first loaded.
It's true that this does not allow all x86 code to run; it's at least practically (and probably theoretically) impossible to correctly determine whether or not a piece of code is safe, but as long as the VM errs on the side of caution, there shouldn't be any problems with this approach.
I will grant that this makes it unclear what the advantage is over (say) Java applets. What can this technology do that the Java VM couldn't? As far as I'm concerned, the failure of Java in the browser has more to do with the lack of a standard library for high-performance multimedia applications (think: Flash) than with shortcomings in the bytecode language.
All this means is that Google have created a VM in which the "bytecodes" happen to be executable on real hardware, but some of these "bytecodes" have to be intercepted and replaced at runtime with substitute code... this ought to sound familiar; this is what a software hypervisor does (e.g. VMware).
In other words, every man and his dog has jumped aboard the "I can write an x86 hypervisor" bandwagon, the difference being that Google have decided to take theirs and embed it into the browser rather than run it as a standalone app.
Interestingly enough, it took the momentum that VMware created to get Intel to correct some of the issues with its ISA to make it much easier to virtualise [wikipedia.org]; perhaps someone the size of Google can prod Intel into adding a third wave of virtualisation acceleration extensions to their ISA so as to make this idea safer* with low overhead.
*I think virtualisation is a useful thing (I make a living from consulting on it); however, I am unconvinced [wikipedia.org] that it is possible to truly [wikipedia.org] secure [kerneltrap.org] it.
Parent
Re:doesn't sound too secure yet (Score:5, Funny)
Knock Knock.
Who's there?
.
.
.
(long pause)
.
.
.
Java.
Parent
Re:doesn't sound too secure yet (Score:4, Interesting)
my "platform" would be every Linux and Windows machine I've ever run - all on x86
try this for a "Slow platform"
Athlon 64 x2 6000+, 4GB DDR2-800, 250GB SATA-II drive
JVM initialization is slow because the JVM weighs 9 million metric tons.
And initialization of the VM of any language is an important factor in its effective performance: even if its per-instruction performance once the VM is started is almost as good as native code, it will take a long time for that to outweigh the initial startup time.
Parent
Re: (Score:3, Interesting)
Yes all new technology is bad even R&D concepts. Dag Nabbit I want my ASCII (no freaking colors) 300BPS BBS back you know the ones where you need to put your phone headset into the modem. Back then everything was secure. The password was the telephone number that you dialed. Brute force attacks were expensive. And if the BBS had a Password protection you were secured to no end, where no one can get in who you didn't want.
Um the way things work with software is the program sends opt-codes to the CPU whi
Re:doesn't sound too secure yet (Score:5, Funny)
Um the way things work with software is the program sends opt-codes to the CPU which interns translates them to particular basic actions.
Ah ha!
So that is the secret!!
Cheap intern labor!
Parent
Re:doesn't sound too secure yet (Score:5, Interesting)
x86 code runs natively on 90% of the processors out there. Java or .NET bytecode runs natively on about 0% of them (Sun did have a Java chip once but it is long dead). So it is hardly any worse than the alternatives. There are many x86 emulators and some of them have reasonable performance.
If we were starting from scratch now, nobody would choose the barnacle-encrusted i386 instruction set as a way to distribute programs. But given the hardware and software that exists, it's not such a bad choice.
Of course, the way to do it is to define what instruction sequences are safe and allow only those. I assume that's what they are doing and 'modules may not contain certain instruction sequences' is just the one-line summary.
That said, you can make any instruction sequence you like using the assembler and run it on your Linux system, and it cannot break out of the process virtual machine to access hardware or memory belonging to other processes or the kernel. If it can, this would be a bug in Linux. So there is no reason why arbitrary instruction sequences couldn't be allowed in principle, if you let the operating system do the work of sandboxing the process. After all why reinvent the wheel?
Parent
Re:doesn't sound too secure yet (Score:4, Insightful)
x86 does not run natively on 90% of the processors out there. ARM beats it by a bit.
Parent
Re:doesn't sound too secure yet (Score:5, Informative)
x86 code runs natively on 90% of the processors out there. Java or .NET bytecode runs natively on about 0% of them (Sun did have a Java chip once but it is long dead). So it is hardly any worse than the alternatives. There are many x86 emulators and some of them have reasonable performance.
ARM Jazelle (in quite a number of the ARM revisions deployed all over the place) includes DBX for direct bytecode execution of Java. That includes the iPhone and loads of other stuff.
Parent
Re:doesn't sound too secure yet (Score:4, Informative)
From the article linked from the story, emphasis mine:
The release contains the experimental compilation tools and runtime so that you can write and run portable code modules that will work in Firefox, Safari, Opera, and Google Chrome on any modern Windows, Mac, or Linux system that has an x86 processor. We're working on supporting other CPU architectures (such as ARM and PPC) to make this technology work on the many types of devices that connect to the web today.
Reading Comprehension FTW!
Parent
The important (and finally valid!) question (Score:4, Funny)
Does it run Linux??
Re:The important (and finally valid!) question (Score:4, Funny)
So as if javascript isn't bad enough, now we're going to have the inevitable beowulf cluster running across the tabs of our browsers?
Parent
More importantly.... (Score:5, Funny)
Does Linux run on it?
(Prompting a possibly valid "In Soviet Russia" gag).
Parent
Two steps backward (Score:5, Insightful)
Because all that we need is to further promote an archaic instruction set that won't die because of all the pre-existing code compiled for it. An instruction set that was finally starting to loosen its grip as the industry worked toward more abstract solutions.
And with good reason!!! Plugin engines do not provide a very smooth browsing experience. You must wait for them to download and activate before you can start using the widget. Meanwhile, Javascript is designed for execution as the page is loading.
The heavyweight JVM was probably the worst offender, but look at Flash for another example of an engine that most developers would rather eliminate. While it was hip to create entire websites out of Flash for a while, the platform was very user-unfriendly and almost died out. Thanks to infighting over video standards however, Flash was able to hold on as a video delivery platform and even gained a margin of success as a web-gaming platform. (About the only area where Java Applets really shined back when they were popular.)
My personal opinion* is that this is a step in the wrong direction. Javascript engines are getting good. Damn good. I'd like to see more R&D poured into these engines and the underlying technologies [whatwg.org] rather than reinventing ActiveX and Java. If researchers wanted to invent a more efficient or usable browser language other than JS, I'd be all for it. But I don't run a browser to become a part of a compute farm. I run a browser to access web information and applications. Very little of which is compute-intensive enough to require a new execution engine over a more advanced set of APIs.
* ...and 50 cents won't buy you a cup of coffee anymore, so take it for what it's worth.
** As an aside, C/C++ is an incredibly complex build environment. Why anyone would want to continue subjecting developers to the angst of compiler differences, makefiles, configure scripts, and other irritants is beyond me. As is typical with such platforms, I can't even get the examples running on my machine. The run.py script dies with an "Invalid Argument" on line 42 and the nacl-gcc compiler fails with syntax errors a-plenty. I'm sure I'll figure it out eventually, but WHY oh why do we want to promote such a complicated method of compiling code?
Re: (Score:3, Interesting)
I really hope so. Does anyone actually enjoy programming JavaScript? No real objects, weak typing, etc. It's fine for small bits of code, but for larger apps? Ugh.
A "lite" version of Python would be nice. Shove (erm, specify) lots of interesting libraries into the browser itself, and let us use those.
Re: (Score:3)
What's the problem with JavaScript? I have written JS code with >20k lines already, and it was quite ok. Among the things that irritate me is this "var" nonsense (declaring a variable without var puts it in the global namespace!), but other than that, it was fine. Also, you are wrong, JS has real objects. And, weak typing can be a very powerful tool if used properly. Note that Python has weak typing too...
Re:Two steps backward (Score:4, Insightful)
That example has nothing to do with strong typing, but rather operator overloading. Both Java and Javascript understand "+" to be a string concatenation symbol when dealing with strings. Thus they attempt to coerce the values into strings. In Java's case, the resulting output looks something like this:
Javascript does have implicit type casting (e.g. '1' - 1 = 0), but this is a feature that can be found in quite a few strongly typed languages. (e.g. int x = 10; char val = 10; x += val) Javascript is actually STRONGER than C when it comes to typing. When I cast a variable to a new type in C, its original type information is redefined and/or completely lost. This can create problems when programmers start using (void *) pointers for everything. Javascript remembers the underlying type of a value at all times. Values are never modified or destroyed, but can be coerced to create a new value with an implicitly cast type.
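Since Python came up earlier in the thread, here is a small sketch of how "dynamic" and "weak" come apart, using Python as the example language. It is only an illustration of the terminology: Python lets a name be rebound to a value of another type, but it refuses the '1' - 1 coercion that Javascript performs. The `try_subtract` helper is a made-up name.

```python
# Dynamic typing: a name can be rebound to a value of a different type.
x = 10       # x refers to an int
x = "ten"    # now x refers to a str; no declaration is violated


def try_subtract(a, b):
    """Attempt a - b and report a type error instead of crashing."""
    try:
        return a - b
    except TypeError as exc:
        return f"TypeError: {exc}"


# Strong typing: Python will not silently coerce "1" to a number here,
# whereas the Javascript expression '1' - 1 evaluates to 0.
result = try_subtract("1", 1)

# Values of compatible types still work as expected.
ok = try_subtract(3, 1)
```

In both languages the value itself always knows its type; the difference is only in how eagerly the language converts between types on your behalf.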
Parent
Re:Two steps backward (Score:5, Interesting)
I do. And so can you. [yahoo.com] It's the C-based syntax that throws most programmers for a loop. Once you realize that the language is actually of a functional design similar to LISP, everything gets a lot easier.
Javascript has one of the most flexible Object systems I have seen in my 20+ years of programming. And its typing system is actually quite strong. Like another poster mentioned, it's dynamically typed not weakly typed. Which is an issue that fades into obscurity once you understand how to properly utilize the language.
Javascript (like most functional languages) is perfect for building large apps out of a massive number of small bits. Look up scoping in Javascript sometime and you'll understand that larger apps get built by having machines within machines within machines to go from simple tasks to ever more complex tasks. It is, in many ways, a more scalable solution than APIs and packaging. But it is different and therein lies the crux of its failure in the minds of many programmers.
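The "machines within machines" idea is closures built on top of closures. A minimal sketch follows, written in Python only to keep the examples in this thread in one language; the same shape carries over to Javascript functions. The names `make_counter` and `make_rate_limiter` are hypothetical.

```python
def make_counter(start=0):
    """Small machine: a counter whose state lives in the enclosing scope."""
    count = start

    def increment(step=1):
        nonlocal count
        count += step
        return count

    return increment


def make_rate_limiter(limit):
    """Larger machine built out of the smaller one."""
    hits = make_counter()  # private to this limiter; nothing else can touch it

    def allow():
        return hits() <= limit

    return allow


allow = make_rate_limiter(2)
results = [allow(), allow(), allow()]  # the third call exceeds the limit
```

No classes, packages, or global state are involved: each layer hides its state in a scope and exposes a single function to the layer above it.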
Parent
Re:Two steps backward (Score:5, Interesting)
The only problem you seem to have with Java plugins is the load time -- this is only resolved by Javascript because JS is pre-loaded by the browser at all times (in modern browsers at least).
If other plugins were to be marked as 'frequently used' by the plugin engine and loaded at runtime instead of page load-time, they'd obviously be just as responsive as Javascript (or more so, since Java is compiled to native code in many cases).
Making a browser that integrates Java in a reasonable way and makes it work just as seamlessly as Javascript was tried already (by Netscape) but it was before we had computers with enough RAM to handle it IMHO.
Parent
Re:Two steps backward (Score:5, Insightful)
The Java runtime was compiled into early browsers like Netscape. So the load time is not caused by the plugin itself. (Though that does play a role in the first activation.) The load time is the time it takes to download the complete application, dearchive the components, load the components into an interpreter or JIT, initialize the environment and/or APIs used, and finally present the application to the user.
Javascript fits in better with the way web browsers are designed, in that the browser executes each individual module during the page load. This makes page loading more asynchronous and thus a better experience for the web user. The web developer can still throw up a "loading" progress bar for applications that must preload, but they are the exception rather than the rule.
There is more to the issue than meets the eye. Besides the synchronous aspect I mentioned, the client Java runtime has also grown to meet the expansion in system memory and complexity. Which is a good thing from the perspective of writing rich applications for deployment on the server or desktop. It's a bad thing when we're talking about the time-sensitive environment of the web browser. If you want an ideal JVM for the browser, Sun is going to have to strip it down again and make the platform a better fit than it has been in the past. (A version that relies heavily on the DOM for APIs would be preferable.)
They're also going to have to work out a good method of solving the load problem. Even Flash allows for partial execution prior to the load being complete. (This is how most Flash games show a LOADING screen.) Java was not designed with this in mind and the platform shows it. There are ways a developer could work around it using dynamic class loading, but this requires a great deal of knowledge, effort, and skill on the part of the developer.
My own feeling is that it's best to let sleeping dogs lie. I love the Java platform, but it currently has a higher calling. Best to let it work where it excels and focus on the aspects of the browser that currently excel. (e.g. Javascript)
Parent
I can see the appeal; but it sucks. (Score:4, Insightful)
Did they really just... (Score:5, Funny)
The only question remains whether it can be secured (ala ActiveX)
HAHAHAHAHAHAHAHAHAHAHAHAHAHA *gasp* HAHAHAHAHAHAHAHAHAHAHAHHAAHHAHA *wipes eyes*
HAHAHAHAHAHAAHHAAHAHAHAHAHAHAH
"secured (ala ActiveX)" (Score:5, Insightful)
Talk about an oxymoron.
Beta (Score:3, Insightful)
Just slap a "Beta" on it and move on, that's the Google way, right?
Re: (Score:3, Insightful)
Nope the google way is:
1. slap "Ads by Google" on everything
2. ???
3. PROFIT !
Something similar from Microsoft (Score:5, Interesting)
You can find more details here [usenix.com].
GREAT IDEA (Score:5, Funny)
google x86 (Score:3, Funny)
Does this mean I can run old DOS games in a browser?
Silent Service II!
Re:google x86 (Score:4, Funny)
Parent
A remarkably bad idea. (Score:5, Insightful)
It will go far.
Re:A remarkably bad idea. (Score:4, Interesting)
"In Native Client we disallow such [self-modifying code] practices through a set of alignment and structural rules that, when observed, insure that the native code module can be disassembled reliably, such that all reachable instructions are identified during disassembly."
Ok, when I read the post I had to chuckle when I read the asm joke. I've been programming in asm for 16 years now and there are a few rules of thumb:
- if assembly is allowed then the only real security is executed by hardware.
- malware writers love a challenge like this.
Parent
What a stupid idea (Score:5, Insightful)
If you want crap like this, you would be a lot better off, by just exhuming Java applets.
I really hope this project dies a quiet death.
This is rather clever (Score:5, Informative)
This is a fascinating effort. Read the research paper. [googlecode.com]
This is really a little operating system, with 44 system calls. Those system calls are the same on Linux, MacOS (IA-32 version) and Windows. That could make this very useful - the same executable can run on all major platforms.
Note that you can't use existing executables. Code has to be recompiled for this environment. Among other things, the "ret" instruction has to be replaced with a different, safer sequence. Also, there's no access to the GPU, so games in the browser will be very limited. As a demo, they ported Quake, but the rendering is entirely on the main CPU. If they wanted to support graphics cross-platform, they could put in OpenGL support.
Executable code is pre-scanned by the loader, sort of like VMware. Unlike VMware, the hard cases are simply disallowed, rather than being interpreted. Most of the things that are disallowed you wouldn't want to do anyway except in an exploit.
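To make the "pre-scan and refuse the hard cases" idea concrete, here is a toy validator. It is a sketch under stated assumptions, not the paper's rule set: the instruction stream is already decoded into (mnemonic, length) pairs, the bundle size of 32 bytes is assumed, and the deny-list is hypothetical apart from "ret", which is mentioned above.

```python
BUNDLE = 32  # assumed alignment unit in bytes

# Hypothetical deny-list for illustration; only "ret" comes from the text above.
DISALLOWED = {"ret", "int", "syscall"}


def validate(instrs, base=0):
    """Return a list of problems found in a pre-decoded instruction stream.

    `instrs` is a list of (mnemonic, length_in_bytes) pairs standing in
    for the output of a real disassembler.
    """
    problems = []
    offset = base
    for mnemonic, length in instrs:
        if mnemonic in DISALLOWED:
            problems.append(f"{offset:#x}: '{mnemonic}' is not allowed")
        first_bundle = offset // BUNDLE
        last_bundle = (offset + length - 1) // BUNDLE
        if first_bundle != last_bundle:
            problems.append(f"{offset:#x}: '{mnemonic}' straddles a bundle boundary")
        offset += length
    return problems


# "pad" stands in for filler bytes a compiler would emit to reach a boundary.
good = [("mov", 5), ("add", 3), ("pad", 24), ("and", 3), ("jmp", 2)]
bad = [("mov", 5), ("ret", 1), ("pad", 24), ("call", 5)]

good_report = validate(good)  # nothing to complain about
bad_report = validate(bad)    # a forbidden instruction and a straddling one
```

Unlike a hypervisor, nothing here is patched or emulated: a module that fails the scan is simply refused, so the loader never has to reason about the hard cases at run time.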
This sandbox system makes heavy use of some protection machinery in IA-32 that's unused by existing operating systems. IA-32 has some elaborate segmentation hardware which allows constraining access at a fine-grained level. I once looked into using that hardware for an interprocess communication system with mutual mistrust, trying to figure out a way to lower the cost of secure IPC. There's a seldom-used "call gate" mechanism in IA-32 that almost, but not quite, does the right thing in doing segment switches at a call across a protection boundary. The Google people got cross-boundary calls to work with a "trampoline code" system that works more like a system call, transferring from untrusted to trusted code. This is more like classic "rings of protection" from Multics.
Note that this won't work for 64-bit code. When AMD extended IA-32 to 64 bits, they decided to leave out all the classic x86 segmentation machinery because nobody was using it. (I got that info from the architecture designer when he spoke at Stanford.) 64-bit mode is flat address space only.