Insights Into Google Compute Engine
snydeq writes "The Compute Engine announcement at Google I/O made it clear that Google intends to take Amazon EC2 head on. Michael Crandell, who has been testing out Compute Engine for some time now, divulges deeper insights into the nascent IaaS, which, although enticing, will have a long road ahead of it in eclipsing Amazon EC2. 'Even in this early stage, three major factors about Google Cloud stood out for Crandell. First was the way Google leveraged the use of its own private network to make its cloud resources uniformly accessible across the globe. ... Another key difference was boot times, which are both fast and consistent in Google's cloud. ... Third is encryption. Google offers at-rest encryption for all storage, whether it's local or attached over a network. 'Everything's automatically encrypted,' says Crandell, 'and it's encrypted outside the processing of the VM so there's no degradation of performance to get that feature.'"
The real question (Score:5, Interesting)
How long until Google cancels it?
Who is the troll? (Score:4, Insightful)
Google has a clear track record of yanking the rug out from under people who adopt their non-core products.
Unfortunately it's a valid concern.
Re:Who is the troll? (Score:5, Insightful)
Actually, it's not a valid concern.
Google shuts down projects that have no clear path to making money, like Wave, Buzz and Others [inc.com].
As far as I know none of these had any monetization mechanism other than pushing ads in your face.
Compute has a price schedule published right up front, and it's about twice the cost of the electricity to power a comparable computer, but with zero capital investment. Their data storage prices and bandwidth prices are also published, and are reasonable. You really couldn't afford to even put your legacy machines into production at these prices.
Clearly they expect this project to cover its own costs, and make use of excess capacity in their data centers.
Google can build a processor in-house cheaper than Dell or any white-box company. With a gazillion of them on hand, they can provision them fast, swap them in when there is trouble, and they do it day in and day out. So chances are they are simply reselling the in-house expertise they already have. None of this is going away any time soon, and they always need to maintain excess capacity for their own needs, so why not market that?
With a clear path to making money on this project baked in at the start, the only thing that would kill it is lack of customers. Hell I'm thinking of renting a couple cores just for playing around with.
Re: (Score:2)
With Google's current pricing they will likely lose money until they achieve enough regular income to cover the costs of acquiring hardware, powering it, and supporting it. I don't know how many customers it'll take before Google breaks even and starts making a profit, but it's probably not a small number. So even though Google has a plan to monetize this, there's no guarantee they'll make a profit (otherwise Google App Engine would be expanding), and no guarantee they'll keep dumping money in an unpro
Re: (Score:3)
there's no guarantee they'll make a profit (otherwise Google App Engine would be expanding),
Wait, they just introduced this last week, and you already know that it's NOT expanding? Clairvoyance?
The pricing is not all that dissimilar to EC2. It's less, but not orders of magnitude less.
This is not the sort of thing that requires a huge customer base up front. They have the luxury of throwing a few hundred virtual cores at this, or maybe a million. They have the hardware sitting around. Sunk costs already spent. If it turns out to be as self managing as most of Google's systems, it will require
Re: (Score:1)
I tend to think that there are more support costs associated with hosting than with website services. The user target is not the same, and people who pay will feel they need to be supported (ie, when . Not to mention that there is a much larger attack surface with their VM than with Gmail (Gmail being mostly run on the client, with known interactions and a limited set of inputs, vs a VM where anything could happen). There is also likely more pressure on Google (someone using Google Compute for spamming or warez
Re: (Score:2)
The user target is not the same,
Exactly.
The targeted users of this service actually have a clue. Unlike web hosting service users.
You get a bare bones Linux Virtual Machine with a remote console. The clueless will soon disappear.
Re:The real question (Score:4, Interesting)
How long until Google cancels it?
Not exactly trolling. Google has never done very well in the support and consistency department. Having to deal with the YouTube API has given me some insights into just what a PITA it is to deal with Google on an ongoing basis.
Google has never really been shy about pulling the plug on a project either. Which is what almost everything seems to be to them. At some point you have to let a project die, so I can't really blame them for being conservative with resources, but it does not help the people that started to rely on it.
All that being said though, I think it is a pretty good assumption this will be a paid offering, and as such will not have the plug pulled. Certainly, not before migration plans could be made.
Re: (Score:1)
Especially with products like deltacloud ( http://deltacloud.apache.org/ [apache.org] ), and the fact that there seems to be no specific API or product, i.e., it's just CentOS and Ubuntu servers. You can deploy your code everywhere, it doesn't use specific Google stuff (unless you want to, but that was already the case before).
I have yet to see how far this will go. They will surely have people because that's Google, but I wonder if they do not simply aim to attack Amazon directly at the important point, the purse.
Re: (Score:2)
Given that they announced a pricing structure, it's more than just an assumption.
That it is a paid-only offering, that it is not a blue-sky project like Wave but something that fits a well-known, established commercial market, are factors that suggest it isn't likely to get the plug pulled quickly (as is the apparently pretty serious pre-open-launch private partner work th
Re: (Score:2)
If you rent all the cores for a day that is more than $2 million - if somebody pays for that, they are going to keep them running ;)
Encryption detail? (Score:5, Interesting)
It's interesting them doing at-rest-encryption - now I wonder where the keys are stored and who has access to them?
Re: (Score:2, Funny)
No keys, they use ROT-52 for extra security while still enabling excellent throughput.
Re: (Score:2, Funny)
Little known fact: ROT-52 encrypted data is susceptible to both chosen plaintext attacks and decryption with the weaker ROT-26 standard.
Re: (Score:2, Funny)
That's why you should use several iterations of it. ROT-52 may be simple, but 16ROT-52 takes 16 times as long to decrypt.
Re: (Score:2)
But it's also completely forward compatible with the rock-solid ROT-910 standard.
Re:Encryption detail? (Score:4, Informative)
It's interesting them doing at-rest-encryption - now I wonder where the keys are stored and who has access to them?
That is exactly the right question. If the encrypted data is worked on in the "cloud", the encryption keys have to be accessible to Google's servers. It's possible to have a remote backup service where the service doesn't have the keys. (iDrive claims to be such a service.) But if the data is processed in the "cloud", the keys are in there somewhere.
Re:Encryption detail? (Score:5, Informative)
I tend to take claims iDrive makes with a grain of salt given their approach to "security" on the client machine. If, on a Windows iDrive installation, one looks at (for a typical installation) C:\Program Files\IDrive\UserName.ini, one finds a line of the form:
Encryption password=Vjku_Ku_Oa_Rcuuyqtf_CCCDDDEEE
Of course, not to worry, the password is well encrypted with a sophisticated algorithm. Yes. ROT-2 for alpha characters. Really.
So, this user's actual encryption password is: This_Is_My_Password_AAABBBCCC
I understand that some people want the convenience of not having to enter their encryption password (or, even, a password vault password) when using the service or at system boot or user logon, but there seems to be no way to 'opt out' of this convenience.
I assume the engineers at iDrive used ROT-2 as a joke instead of putting the encryption password in clear text. I'm not a humorless guy, but there's a few areas that I don't like joking about -- and security is one of them. Unfortunately, this unfunny joke decreases security because it slightly increases the chances that some users won't realize that their encryption password is sitting in (almost) cleartext on their local disk and they won't protect it well (most users, of course, would have no idea this file even exists).
Since iDrive seems to think that security is something to be "funny" and "cute" about, I question their general judgement on the topic. (Of course, it's possible that they are incompetent and don't do security reviews -- I suppose that's worse).
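To make the point above concrete, here is a minimal sketch of reversing the ROT-2 obfuscation on the example line quoted from the .ini file. The function name `rot_n` is purely illustrative and is not part of any iDrive tooling; the only assumption is the one stated in the comment, namely that alphabetic characters are shifted by two and everything else is left alone.

```python
def rot_n(text, shift):
    """Rotate ASCII letters by `shift` positions; leave other characters untouched."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


# Value taken from the example .ini line quoted in the comment above.
stored = "Vjku_Ku_Oa_Rcuuyqtf_CCCDDDEEE"

# Undo the ROT-2 shift by rotating two positions backwards.
recovered = rot_n(stored, -2)
print(recovered)
```

No key is involved at any point, which is why this counts as obfuscation rather than encryption. For the same reason, a shift that is a multiple of 26 (such as 26 or 52) returns the input unchanged.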
Re: (Score:1)
I tend to take claims iDrive makes with a grain of salt given their approach to "security" on the client machine. If, on a Windows iDrive installation, one looks at (for a typical installation) C:\Program Files\IDrive\UserName.ini, one finds a line of the form:
Encryption password=,,,,,,,,,,,,,
Of course, not to worry, the password is well encrypted with a sophisticated algorithm. Yes. ROT-2 for alpha characters. Really.
So, this user's actual encryption password is: **********
That's so weird. I just see commas when you posted the contents of your .ini file, and then performing a ROT-2 on it makes it all asterisks. How neat that it protects your data! Try it again with your bank password, friend.
Re:Encryption detail? (Score:5, Insightful)
I haven't seen any technical details yet, but I'd guess that the advantages of encryption would be (1) fewer people at Google will have access to the keys than to the data (2) an outside attacker who gets access to the raw data also needs to attack the key store (3) if by malice or mistake a disk is not properly wiped before being removed from the data center, it will be harder to get data off of it.
It's hard to see this as being worse than no encryption; even if it is easier to get the key than to get the encrypted data, you still need both to do anything with the data.
Re:Encryption detail? (Score:5, Informative)
I attended the tech details IO session (https://developers.google.com/events/io/sessions/gooio2012/313/ - as of this writing, the video isn't up yet), and they said the encryption keys don't leave the server where the data resides.
Re:Encryption detail? (Score:4, Interesting)
well.. so? they're on the server the data resides on all the same. wouldn't it be nicer if they weren't on that server, since if they're on the same server what's the fucking point??
Re:Encryption detail? (Score:5, Informative)
I'm the TL for Google Compute Engine and was the speaker at that talk. The answer is a little more subtle than that. We have two types of mountable disk -- ephemeral disk, which stays on the physical machine and never leaves it, and persistent disk, which outlives an instance and is written over the network.
For ephemeral disk, we generate the encryption key on the host machine and it only ever stays in memory. We are careful to control the code paths that see the key material.
For the persistent disk, by necessity, we need to manage the key as part of our overall virtual machine management infrastructure. We utilize some strongly audited and auditable systems to wrap the encryption keys and really lock down the users that have access to the unwrapping service. The name of the game here is to restrict the scope as much as possible.
BTW -- the video for the talk isn't up yet but I just shared the slides here: https://plus.google.com/110707185519531431463/posts/EfDCBjuPiPf [google.com].
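For readers unfamiliar with key wrapping, the general pattern described above for persistent disk can be sketched as follows. This is not Google's implementation: the class `KeyWrappingService`, the caller name, and the toy cipher are all hypothetical, and the SHA-256 based XOR stream below only stands in for a real cipher so the example runs with the standard library alone. It must not be used to protect real data.

```python
import hashlib
import secrets


def _keystream(key, nonce, length):
    """Toy keystream: SHA-256 over key, nonce and a counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key, plaintext):
    """XOR the plaintext with a fresh keystream; the nonce is prepended."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key, blob):
    """Split off the nonce, regenerate the keystream, and XOR it back out."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class KeyWrappingService:
    """Hypothetical stand-in for a locked-down, audited unwrapping service."""

    def __init__(self):
        self._master_key = secrets.token_bytes(32)  # never leaves the service
        self.audit_log = []

    def wrap(self, data_key):
        return toy_encrypt(self._master_key, data_key)

    def unwrap(self, wrapped_key, caller):
        self.audit_log.append(caller)  # every unwrap request is recorded
        return toy_decrypt(self._master_key, wrapped_key)


service = KeyWrappingService()

# A per-disk data key encrypts the disk contents.
disk_key = secrets.token_bytes(32)
disk_block = toy_encrypt(disk_key, b"persistent disk block")

# Only the wrapped form of the data key is stored alongside the disk metadata.
wrapped_disk_key = service.wrap(disk_key)

# Later, an authorized component asks the service to unwrap the key.
recovered_key = service.unwrap(wrapped_disk_key, caller="vm-manager")
plaintext = toy_decrypt(recovered_key, disk_block)
```

The design point is that whoever holds only the stored disk and the wrapped key cannot read the data; access is narrowed to whoever may call `unwrap`, and each call leaves an audit record. The ephemeral-disk case corresponds to skipping the wrapping step entirely and keeping `disk_key` only in memory.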
Re: (Score:2)
Wouldn't it just be better to allow the VMs to handle the persistent store keys themselves?
Like key passing at launch etc? That way only the owner of the VM and the VM itself would have access to
the persistent storage keys.
Re: (Score:2)
Still a no go, the thing with cloud solutions and encryption is that: You are ok with the Infrastructure owner knowing what gets into your machines (usually the http traffic) but you don't want them to know what state the machines compute/store. If the encryption is handled by the Infrastructure owner and not you it is as good as nonexistent.
This is a typical showoff move to get the idiots yelling: "Teh googl claud incrypts u d8a withaut prosissin penaltee"
Shame on you Google, first actors/models for the com
Re:Encryption detail? (Score:4, Informative)
It's interesting them doing at-rest-encryption - now I wonder where the keys are stored and who has access to them?
The Google Compute Engine FAQ sheds some light on these details: https://developers.google.com/compute/docs/faq#disks [google.com]
Can I retrieve ephemeral disk data if I have lost it?
No. All data written to ephemeral disk is encrypted with a key that is unique to the VM instance. By design, once a virtual machine terminates, all data on the ephemeral disk is lost.
Google LAN (Score:5, Interesting)
I found this to be an interesting piece of info
Even in this early stage, three major factors about Google Cloud stood out for Crandell. First was the way Google leveraged the use of its own private network to make its cloud resources uniformly accessible across the globe.
"When you create a Google Compute Engine account and use their resources," he said, "they provide a private network, a LAN of sorts that spans different regions. For example, if you set up an architecture to replicate a database from region A to region B, in the Google cloud, you don't need to traverse the public Internet to do it. You're using their private network."
How precisely that network is implemented (as its own private fiber or simply a very efficiently-routed VPN) is not disclosed by Google. But the key thing is that the whole structure is seen as a single network from a programming point of view. "This makes it easier if you're building cross-regional architectures," Crandell says. It's expected that Google will eventually expand Compute Engine to territories outside the United States.
- I really wonder if Google built (or bought) large swaths of private infrastructure that is otherwise outside of the Internet, does anybody know?
Here is why I am wondering about it - Google as an ISP would then avoid outside costs to move its data, it's all internal costs, this turns Google into its own 'Internet' of sorts, Google only Internet.
That's why web neutrality is a nonsense concept from my perspective - if companies can build their own infrastructure, they can compete with each other and offer their own content at better speeds, but then Google could be an ISP that uses both, Google 'Internet' and external backend, but then on its own 'Internet', the content available from Google could be delivered at a higher priority and faster (and cheaper, because it's internal costs, that can be managed easier).
By the way, there was a question in the story, asking why Google didn't provide this earlier. Well, maybe the tech wasn't ready or the business model wasn't there or maybe it's something to do with the government that wants to listen in on everything.
BTW., this is why such information should be made available, the speculation about the reasons for things like that could be worse than whatever the truth is.
Re:Google LAN (Score:4, Informative)
Google's been buying dark fibers since at least 2005 [cnet.com]. So, they likely do have the capacity to do this in a lot of areas.
Re: (Score:1)
That's why web neutrality is a nonsense concept from my perspective - if companies can build their own infrastructure, they can compete with each other and offer their own content at better speeds, but then Google could be an ISP that uses both, Google 'Internet' and external backend, but then on its own 'Internet', the content available from Google could be delivered at a higher priority and faster (and cheaper, because it's internal costs, that can be managed easier).
Prodigious thinking...maybe Google could call it something like Prodigy [wikipedia.org]?
Me, I see what's happening - the privatization of the internet - as "the glass house" dragging itself back into the world as zombies. Really bloodthirsty zombies.
Yes, they do own massive fiber (Score:5, Informative)
Google, in a very forward-thinking move, outright purchased massive quantities of laid fiber at rock-bottom prices after the telecom crash that followed the dot-com crash. There was quite a glut of capacity that nobody needed at the time and had no use for. They picked up years and years worth of bandwidth expansion without having to go through all the trouble and expense of actually laying that fiber.
Re:Google LAN (Score:5, Insightful)
The problem is that it's really only a handful of Google-sized companies who can do so. The worry with net neutrality is that the traditional ability of smaller players to participate will be eroded, if you can no longer buy access to the internet as a leaf node via an ISP, and then have your traffic treated equally once you're on the network.
Re: (Score:2)
I don't quite get the "single network from a programming point of view" part. No matter whether over a private or public network, there are likely significant latency differences between networking in a single data center, in a single city, or across the world. Surely you need to be aware of the difference when you are building large-scale applications?
Re: (Score:2)
if companies can build their own infrastructure, they can compete with each other and offer their own content at better speeds
The point behind net neutrality is that that is a pretty big "if". And very few companies are the size of Google.
Should some random podcaster be expected to build their own infrastructure in order to get their content out?
Why encryption? (Score:1)
Isn't that only useful when somebody has physical access to the device, in other words somebody stealing physical storage?
I assume the buildings are secure, right? So what's the use?
Re: (Score:2)
I assume the buildings are secure, right?
On what grounds do you assume that? Jurassic Park-style 20m tall high voltage barriers around them and a ground-to-air missile defense system?
Even then, I guess that some humans are allowed to enter the buildings without deathly harm. As soon as the human element is involved, security cannot be guaranteed.
Re: (Score:2)
On what grounds do you assume that? Jurassic Park-style 20m tall high voltage barriers around them and a ground-to-air missile defense system?
Unlikely, that sounds more like a Microsoft thing. I imagine the Googleplex is defended mostly by machine-gun turrets, moving platforms, and an all-seeing but emotionally unstable AI.
Re: (Score:2)
I imagine the Googleplex is defended mostly by machine-gun turrets, moving platforms, and an all-seeing but emotionally unstable AI.
No, that would be Valve HQ. Google just shows you random YouTube comments.
Uh (Score:2, Interesting)
Doesn't Amazon run on its own cloud services? Wouldn't that make the first point (FTFS) irrelevant by way of comparison?
Re:Uh (Score:5, Informative)
Re: (Score:2)
Er, did you just link to their cloud user guide and even state that it is free? Did you expect them to sell documentation and make money out of it?
Or did mods (myself included), miss a joke?
Re: (Score:2)
My favorite example would be buying the documentation for a Tandy 1000 TX to find out why it wouldn't go into protected mode only to discover that the mobo showed an undocumented chip codenamed "Midnight Blue" on the schematics.
how about I/O performance (Score:4, Interesting)
Speaking of "Google I/O", how is the I/O performance on Google's offering? Is it any better than the, err, "not great" performance of Amazon's EBS?
Re: (Score:3, Informative)
but again, it would be performance with the data encryption, compared to non-encrypting EBS.
This is what RightScale, an early Google Compute Engine customer, had to say regarding encryption and performance: http://blog.rightscale.com/2012/06/28/rightscale-joins-google-compute-engine-for-launch-day/ [rightscale.com]
One very aggressive innovation that Google Compute Engine brings to cloud computing is encryption of data at rest, both for local ephemeral drives as well as for network attached drives. In the case of the network attached disks the encryption happens on the host before it is put on the network, so it’s also encrypted in transit. The encryption is on a per-project basis (Google’s term for an account). This is a big deal for security conscious organizations, especially those having regulatory or other mandates to encrypt all data at rest. On other clouds one solution is to run a loop-back crypto driver, but that eats into the VM’s performance. I’ve been benchmarking the Google Compute Engine disk performance (more on that in a future post) and the encryption doesn’t seem to have a noticeable impact on performance. Pretty awesome.
Re: (Score:2)
what has a virtual machine to do with de-duplication? right, nothing at all.
I sure hope businesses and people are smarter/ (Score:4, Interesting)
Encryption (Score:2)
Notice it's encrypted by default. Not a real concern, other than downtime.
Re: (Score:3)
That would only be useful if they couldn't get to the key as well, which (apparently) isn't the case here - it sits on the same machine.
Re: (Score:3)
When you trust a US-based company with your data, you are a fool.
More to the point, they don't have any non-US zones [google.com], nor do they mention any plans at all to change this. OK, such a rollout is not trivial since it can't just be done by freeing up server space; they probably have to alter their corporate structure as well so as to limit the amount to which the non-US operations arms can be pressured by the US government via the parent. If you've got a legal requirement to keep your data out of the US, GCE is not for you. Amazon have had this addressed for years.
Re: (Score:3)
But if your service only needs a 4 core for 95% of the time, but it gets huge spikes those other 5%, you'd be wasting a ton of money keeping an 8-core twiddling its thumbs, while on EC2 you can scale just during those spikes, particularly if they're regular.
That's the whole point of "X-as-a-service" - getting resources on demand.
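The arithmetic behind that claim can be sketched in a few lines. The assumptions are mine and not from the thread's pricing: the spike needs exactly 8 cores, the baseline needs 4, and the price per core-hour is identical in both scenarios, so it cancels out of the comparison.

```python
def average_cores(base_cores, peak_cores, peak_fraction):
    """Time-weighted average number of cores billed when scaling on demand."""
    return base_cores * (1 - peak_fraction) + peak_cores * peak_fraction


# Scenario from the comment: 4 cores suffice 95% of the time, 8 cores for the other 5%.
always_on = 8  # cores paid for around the clock without scaling
elastic = average_cores(base_cores=4, peak_cores=8, peak_fraction=0.05)

# With one (hypothetical) price per core-hour, the saving is just a ratio of core counts.
savings = 1 - elastic / always_on

print(f"average cores billed when scaling: {elastic:.2f}")
print(f"saving versus an always-on 8-core machine: {savings:.1%}")
```

Under these assumptions the elastic setup bills an average of about 4.2 cores, a saving of roughly 47.5% over keeping 8 cores running permanently. Real bills would also depend on instance granularity and how quickly capacity can be added.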
Of course if you can, putting the baseload on real server and using IaaS to handle spikes is probably the cheapest option, but it's far from easy unless your application is really s
Re: (Score:2)
Big users rarely pay the published rates. When a service wants the business they will customize the contract.
The big advantage of the Cloud is that it is a service, hence not capitalized and not subject to the accounting rules for capital expenses.
You are forgetting to include costs on both sides for an enterprise professional services agreement. Also you did not include the SLA details. This typically makes all the difference in pricing; Amazon, Google or Rackspace.
Re: (Score:2)
The basis of your cost comparison appears to be comparing the cost of a server farm (and associated support infrastructure like internet connections, etc.) that is fully utilized 100% of the time vs. buying the same flat capacity from a cloud service whose main selling point is the ability to scale with demand.
Which, surprisingly enough, shows the fixed 100% utilized system (which is, of course, what the cloud provider has to have to sell the service in the first place) is less expensive than the cloud serv
Who Holds the Keys? (Score:2)
One comment that immediately jumped out at me was the use of External Encryption to lighten the load on the VMs. If that's the case, what method are they using and who holds the keys? If it isn't the customer who's paying for that cloud, then it's useless for building any kind of service on as the data can easily be snooped on by Google. In other words, kill the project now because it will never be profitable.
Uptime (Score:2)
Re: (Score:2)
As long as Google doesn't completely fall apart for one week every year, they've pretty much got Amazon cornered.
Not so sure about that. Now that Google has made massive changes to their policies [68forums.com], I think at this point Amazon clearly [amazon.com] has them beat [google.com].
Encryption in the cloud is worthless (Score:2, Insightful)
And these people are truly unethical claiming anything different. Encrypting something before you put it into the cloud is another story. But the only use for encryption at rest in the Google cloud would be if somebody were to steal disks from their data centers. Somehow I do not see that happening.
What they really intend is IMO to run a smoke-screen with regard to the fact that the cloud-provider is the real, major security risk and that no technological measures can help here, unless you do your own encry
And the difference is (Score:2)
I trust Google not to design in features that make it hard for me to leave, lie to me about how many CPU cycles or I/O I "spent" and just basically rip me off as much as they possibly can
http://blog.carlmercier.com/2012/01/05/ec2-is-basically-one-big-ripoff/ [carlmercier.com]
This is where founders matter. My idea of Google is geeks for the greater good of geekdom to the benefit of the long term good of society as a whole
Amazon founder, CEO and billionaire many times over Jeff Bezos "feared pirate Bezos"
http://www [pcmag.com]
They need smaller and cheaper instances (Score:1)
It may be cost-effective considering the CPU/memory to cost ratio, but the smallest option they're offering, with 3.75 "google compute units", is $104/month if you have a usage of 100% (like the typical web server). An Amazon small instance, for example, which is more than enough for the load of the majority of web servers, is $58/month, or $27.77/month if you get a reserved instance (I have one for a webserver with about 2000 visits/day). A lot of people are even running webservers on micro instances.
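A quick sketch of the comparison, using only the monthly figures quoted in this comment and taking them at face value (in particular, the reserved-instance number is assumed to already include any upfront fee). The variable and function names are illustrative.

```python
# Monthly prices in USD as quoted in the comment above, at 100% usage.
gce_smallest = 104.00
ec2_small_on_demand = 58.00
ec2_small_reserved = 27.77


def compare(monthly_price):
    """Return (price ratio, extra cost per year) of the GCE option versus `monthly_price`."""
    ratio = gce_smallest / monthly_price
    extra_per_year = (gce_smallest - monthly_price) * 12
    return ratio, extra_per_year


ratio_on_demand, extra_on_demand = compare(ec2_small_on_demand)
ratio_reserved, extra_reserved = compare(ec2_small_reserved)

print(f"vs EC2 small on-demand: {ratio_on_demand:.2f}x, ${extra_on_demand:.2f} more per year")
print(f"vs EC2 small reserved:  {ratio_reserved:.2f}x, ${extra_reserved:.2f} more per year")
```

On these numbers the smallest Compute Engine instance costs a bit under 1.8 times the on-demand EC2 small and a bit under 3.75 times the reserved one, which is the gap a small always-on web server would feel. The comparison says nothing about the difference in CPU and memory between the instance types.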