HTTP 2.0 Will Be a Binary Protocol 566

Posted by timothy
from the cram-your-stuff-in-this-box dept.
earlzdotnet writes "A working copy of the HTTP 2.0 spec has been released. Unlike previous versions of the HTTP protocol, this version will be a binary format, for better or worse. However, this protocol is also completely optional: 'This document is an alternative to, but does not obsolete the HTTP/1.1 message format or protocol. HTTP's existing semantics remain unchanged.'"

  • Makes sense (Score:3, Interesting)

    by thetagger (1057066) on Tuesday July 09, 2013 @01:29PM (#44227569)

    HTTP is the world's most popular protocol and it's bloated and slow to parse.

    • Re:Makes sense (Score:5, Insightful)

      by cheesybagel (670288) on Tuesday July 09, 2013 @01:36PM (#44227641)

      It might be bloated and slow. But it is also easily extendable and human readable.

      • by steelfood (895457)

        You can also make binary blobs human readable. Just have an official spec on how to translate binary into English.

      • Re:Makes sense (Score:5, Informative)

        by PhrostyMcByte (589271) <phrosty@gmail.com> on Tuesday July 09, 2013 @02:01PM (#44228027) Homepage

        It might be bloated and slow. But it is also easily extendable and human readable.

        Human readable yes, extendable no. Well, it's not extendable in any meaningful way. Even though it looks like it on a quick look, if you read the spec you quickly realize there really is no generic structure to a message -- you cannot parse an HTTP request if you do not fully understand it. Even custom headers like the commonly used X-Foo-Whatever are impossible to parse or even simply ignore, so implementations just use an unspecified de-facto parsing and pray to the web gods that it works.

        This makes HTTP parsers very complicated to write correctly, and even more so if you want to build a library that lets others extend HTTP. This isn't a text vs. binary issue, but simply a design flaw. Hopefully HTTP 2.0 fixes this.

        As they say, HTTP 1.1 isn't going anywhere -- this'll be a dual-stack web with 2.0 being used by new browsers and 1.1 still available for old browsers/people.

      • Re:Makes sense (Score:4, Insightful)

        by Darinbob (1142669) on Tuesday July 09, 2013 @03:00PM (#44228839)

        Human readable is irrelevant. Humans do not read this, computers do. Maybe I'm old or something, but I remember when programmers knew how to write programs to convert from binary to text, or were even able to read octal or hex directly, and so binary formats were no hindrance at all. Now though, even private internal saved state never seen by a human is done in XML for bizarre reasons.

        (Probably the programmer knows how to use the XML library and so when your only tool is a pneumatic jackhammer then every problem looks like a sidewalk.)

        • Re:Makes sense (Score:4, Insightful)

          by cartman (18204) on Tuesday July 09, 2013 @07:40PM (#44232263)

          I disagree. I'm an old enough programmer (in my 40s), I started my career working with proprietary binary formats, and I remember the good reasons why binary formats were abandoned. Where I work, the older someone is, the less likely they are to favor binary formats for structured data (this argument has come up a lot recently).

          I'll repeat one or two of the arguments in favor of not using proprietary binary formats.

          If you wish to save space, conserve bandwidth, etc., then binary formats are not a good way of accomplishing that. The best way of saving space and conserving bandwidth is to use compression, not a custom binary format! Binary formats are still very large compared to compressed XML, because binary formats still have uncompressed strings, ints with leading zeroes, repeating ints, and so on. If you wish to save space or conserve bandwidth, then you ought to use compression.

          If you use compression, though, then also using a binary format gains you nothing. Binary formats do not compress down any further than human-readable formats that encode the same information. You won't gain even a few extra bytes on average by using a binary format before compressing. It gains nothing to use a custom binary format if you compress, which you should do if you're concerned about space or bandwidth.

          Of course, compressed formats are binary formats. However, the compression formats you will use are extremely common, are easily identified from a text identifier at the beginning of the file, and have widespread decompressors available on almost all platforms. Gzip, Bzip2, and zip are installed by default on the MacBook Pro I got from work. They're almost everywhere. That is not the case for a custom binary format which you create. Also, compression can be turned on and off. If you wish to sniff packets for debugging, you can turn compression off for a while.

          Here's a different way of putting it. You can think of common compression algorithms (such as bzip2) as mechanisms for converting your files into the most compact binary representation available with no programming effort from you. It does not help those algorithms if you also try to do binary encoding yourself beforehand.
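
          One rough way to check the argument above on your own data is to encode the same records both ways and compress each. The following is only a sketch under stated assumptions: the record layout, the field names and the use of zlib are all made up for illustration, and the resulting sizes depend heavily on the data, so no particular outcome is claimed here.

```python
import json
import struct
import zlib

# Hypothetical records: (id, temperature reading, status string).
records = [(i, 20.0 + (i % 7) * 0.5, "OK" if i % 5 else "WARN") for i in range(1000)]

# Human-readable encoding: a single JSON document.
text_blob = json.dumps(
    [{"id": i, "temp": t, "status": s} for i, t, s in records]
).encode("utf-8")

# Hand-rolled binary encoding (an assumption, not any real format):
# 4-byte int, 8-byte float, status padded to 4 bytes = 16 bytes per record.
binary_blob = b"".join(
    struct.pack("<id4s", i, t, s.encode("ascii")) for i, t, s in records
)


def report(label, blob):
    """Print and return the raw and zlib-compressed sizes of one encoding."""
    compressed = zlib.compress(blob, 9)
    print(f"{label}: raw={len(blob)} bytes, compressed={len(compressed)} bytes")
    return len(blob), len(compressed)


text_sizes = report("json  ", text_blob)
binary_sizes = report("struct", binary_blob)
```

          Comparing the two "compressed" figures for a realistic sample of your own traffic is more informative than either number on this toy data.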

          There are a few weird exceptions where it's best to use binary formats. There are small embedded devices which lack the hardware to perform compression. Also, HTTP/2.0 might be an exception, because the data transmitted is usually less than 100 bytes, so adaptive compression wouldn't work well, and it wouldn't be possible to compress across HTTP requests because HTTP is (in theory) a stateless protocol.

          Now though, even private internal saved state never seen by a human is done in XML for bizarre reasons.

          There are reasons other than human-readability to use XML. Using XML means you gain an ecosystem of tools: all kinds of parsers, generators, code generators, validators, editors, pretty-printers in your IDE, network packet sniffers that can parse and pretty-print it, etc., on a wide variety of platforms. You lose all that if you roll your own binary format, for a gain of nothing if you compress the data in production.

          Also, private internal state is seen by a human on rare occasion. What happens if parsing the file fails? Someone will need to look at it.

    • Re: (Score:2, Insightful)

      by UltraZelda64 (2309504)

      Most popular protocol? What ever happened to TCP?

    • This binary protocol will be scrapped for JSON.
    • Re:Makes sense (Score:4, Interesting)

      by Trepidity (597) <delirium-slashdot@hacki s h . o rg> on Tuesday July 09, 2013 @01:57PM (#44227965)

      Not particularly bloated or slow to parse, especially on modern hardware. HTTP/2.0, which is basically a minorly tweaked version of Google SPDY, doesn't even claim speedups more than about 10%.

      • by Bengie (1121981)
        10% faster usually means about 10% less power for the part of the system that is now 10% faster. A large datacenter will save lots of money. Maybe not a large percentage of money, but probably more than one person's salary.
    • HTTP is the world's most popular protocol and it's bloated and slow to parse.

      HTTP is a simple protocol, simple enough to write a parser in an afternoon. You're probably confusing it with HTML, which is a different subject altogether.

      • Re: (Score:3, Interesting)

        by Anonymous Coward

        You can write a simple HTTP/1.0 parser, maybe. Try implementing HTTP/1.1 Pipelining some time.

        Also, most HTTP parsers don't obey MIME rules when parsing structured headers. Regular expressions are insufficient. The vast majority of HTTP libraries don't fully support the specification, even at the 1.0 level. But most don't notice because you never see those complex constructs, except perhaps from an attacker trying to get around a filter, where one implementation perceives X and the other Y.

        I've written an HT

  • by Anonymous Coward

    It's nice to have a link to the draft. But couldn't we have just a little more "what's new" than "it's binary"? This is Slashdot... filled with highly technical people. At least a rundown of the proposed changes would be very helpful in a discussion. The fact that they're proposing a binary protocol doesn't really matter to anyone except those who want to telnet to a port and read the protocol directly.

    • by gl4ss (559668) on Tuesday July 09, 2013 @01:51PM (#44227877) Homepage Journal

      It's nice to have a link to the draft. But couldn't we have just a little more "what's new" than "it's binary"? This is Slashdot... filled with highly technical people. At least a rundown of the proposed changes would be very helpful in a discussion. The fact that they're proposing a binary protocol doesn't really matter to anyone except those who want to telnet to a port and read the protocol directly.

      From a quick glance, multiple transfers and communication channels ("streams" in the draft's lingo) can be put through a single connection, cutting down on TCP connection negotiations.
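
      To make the idea of several streams sharing one connection concrete, here is a minimal sketch. The frame layout (a 4-byte stream id followed by a 2-byte payload length) and the helper names are assumptions chosen for illustration; this is not the frame format defined in the draft.

```python
import struct
from collections import defaultdict

# Hypothetical frame header: 4-byte stream id, 2-byte payload length.
HEADER = struct.Struct("!IH")


def encode_frame(stream_id, payload):
    """Prefix a payload chunk with its stream id and length."""
    return HEADER.pack(stream_id, len(payload)) + payload


def interleave(streams, chunk_size=4):
    """Round-robin chunks from several streams into one byte sequence."""
    offsets = {sid: 0 for sid in streams}
    out = bytearray()
    while offsets:
        for sid in list(offsets):
            start = offsets[sid]
            chunk = streams[sid][start:start + chunk_size]
            out += encode_frame(sid, chunk)
            offsets[sid] = start + len(chunk)
            if offsets[sid] >= len(streams[sid]):
                del offsets[sid]  # this stream is fully sent
    return bytes(out)


def demultiplex(wire):
    """Split the shared byte sequence back into per-stream payloads."""
    streams = defaultdict(bytearray)
    pos = 0
    while pos < len(wire):
        sid, length = HEADER.unpack_from(wire, pos)
        pos += HEADER.size
        streams[sid] += wire[pos:pos + length]
        pos += length
    return {sid: bytes(data) for sid, data in streams.items()}


streams = {1: b"GET /index.html", 3: b"GET /style.css", 5: b"GET /app.js"}
wire = interleave(streams)
assert demultiplex(wire) == streams
```

      Because every chunk carries its stream id, the receiver can reassemble each transfer independently, and a sender could in principle stop emitting frames for one stream without tearing down the connection.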

  • Makes it harder to troubleshoot by using telnet to send basic HTTP commands, and speedily develop applications with nuts and bolts tools over plaintext -- but the tradeoff is it can transfer a ton of data over one TCP socket, greatly simplifying the network layer of HTTP, and most certainly adding a lot of performance. For webserver admins, this will make life a lot easier.
    • HTTP already does that with pipelining. One connection, many files, optional compression on the body. The header is tiny.
      • by tepples (727027)

        One connection, many files

        HTTP/1.1 keep-alive and pipelining don't let the user agent cancel a (large) transfer in progress without closing and reopening the connection.

        The header is tiny.

        Unless things like Host:, User-agent:, and Cookie: need to be resent for each request.

    • by 0123456 (636235)

      Maybe if the average web page didn't contain 16MB of Javascript these days, you wouldn't need to worry so much about how much data you can send over one connection.
       

      • by robmv (855035)

        Don't worry, we will make JavaScript 2.0 binary too. With the number of compilers targeting JavaScript, I don't see this as a joke :(

        • by jandrese (485)
          Heck, I'm surprised there isn't a JavaScript virtual machine already in browsers that sites could pre-compile scripts for, especially with the advent of the webpage as an application. We're doing GL calls with a scripted language, for god's sake.

          While I'm sure modern browsers JIT compile javascript, it's amazing that we have to do that in the first place.
      • If you're worried about 16MB of JavaScript code, you've got bigger problems than whether the protocol is binary or not. And that suggests a new product feature: a browser plug-in that blocks 1) any content using binary, and 2) any content over a maximum size, say 1MB. Less is more.
    • by lgw (121541) on Tuesday July 09, 2013 @01:49PM (#44227847) Journal

      Makes it harder to troubleshoot by using telnet to send basic HTTP commands

      Since we're using a tool in the first place, it's just as easy to use a tool that understands the binary format. Back before open source toolchains had really caught on as a concept, human readable formats were a big plus, because proprietary tools could be hard to come by. Not really a concern these days, as long as the binary format is unencumbered.

      • by belphegore (66832)

        ...unless you're on an embedded platform for which you don't have a compiler, and maybe busybox might build in this fancy new binary HTTP client tool in a few decades, but it'll be another few decades after that before manufacturers enable it and ship it.

      • by organgtool (966989) on Tuesday July 09, 2013 @02:29PM (#44228421)

        Since we're using a tool in the first place, it's just as easy to use a tool that understands the binary format.

        Let's say that you use a test client to send commands to your custom server interface and there's a bug. Now you have to spend extra time to discover if the bug is in the test client or in your custom server.

        Back before open source toolchains had really caught on as a concept, human readable formats were a big plus, because proprietary tools could be hard to come by. Not really a concern these days, as long as the binary format is unencumbered.

        You have it backwards. Before open source caught on, binary formats were all the rage. They were proprietary and they were very prone to corruption. Once a document became corrupted, you were at the mercy of the developers to come up with a tool that could reverse the damage. When open source caught on, it pushed hard for using XML as the format for storing and transmitting information. Data corruption is much easier to spot with clear text and can even be easily fixed compared to binary data. In this respect, HTTP 2.0 is a complete step backwards.

  • by Freshly Exhumed (105597) on Tuesday July 09, 2013 @01:38PM (#44227681) Homepage

    Can't wait to use the new 046102 047111 005113 tag!

    • This is HTTP not HTML.

  • by phizi0n (1237812)

    Isn't the HTTP 2.0 spec based on Google's SPDY protocol? It is basically just HTTP 1.1 but with header compression and the ability to either send or hint that extra files will be needed.

  • by PCM2 (4486)

    I am not big on my networking protocols, but didn't the HTTP 2.0 group decide to base its work on Google's SPDY protocol? The two don't look the same to me, but some of the descriptions in this spec do look like reshuffled versions of the SPDY spec. What's the relationship between the two these days?

    • by Shimbo (100005)

      Draft 0 [ietf.org] was SPDY. It's usually the way that standards evolve from one proposal; cut and shut standards don't often work out.

  • by Animats (122034) on Tuesday July 09, 2013 @01:44PM (#44227775) Homepage

    The big change is allowing multiplexing in one stream. It's a lot like how Flash multiplexes streams.

  • by Anonymous Coward on Tuesday July 09, 2013 @01:53PM (#44227909)

    This is FAR from a done deal. The binary/ASCII question is being hotly debated.

  • by qbast (1265706) on Tuesday July 09, 2013 @01:59PM (#44227985)
    HTTP/2.0 defines stream multiplexing, framing, stream control, and prioritization, pretty much replicating TCP. What is the point of putting a TCP-like protocol on top of TCP?
  • What a clusterfuck (Score:4, Interesting)

    by l0ungeb0y (442022) on Tuesday July 09, 2013 @02:13PM (#44228205) Homepage Journal

    Seems it's going binary to have EVERYTHING be a stream, with frame based communications, different types of frames denoting different types of data and your "web app" responsible for detecting and handling these different frames. Now I get that there's a lot of demand for something more than Web Socket, and I know that non-Adobe video streaming such as HLS are pathetic, but this strikes me as terrible.

    Really, why recraft HTTP instead of recrafting browsers? Why not get Web Socket nailed down? Is it really that hard for Google and Apple to compete with Adobe that instead of creating their own Streaming Media Services they need HTTP2.0 to force every server to be a streaming media server?

    Adobe's been sending live streams from user devices to internet services and binary based data communication via RTMP for several years, but HTML5 has yet to deliver on the bandied about "Device API" or whatever it's called this week even though HTML5 pundits have been bashing on Flash for years.

    So if Adobe is really that bad and Flash sucks that much, why are we re-inventing HTTP to do what Flash has been doing for years?
    Why can't these players like Apple and Google do this with their web browsers, or is it because none of these players really wants to work together because no one really trusts each other?

    At the end of the day, we all know it's all just one big clusterfuck of companies trying to get in on the market Adobe has with video and the only way to make this happen in a cross-platform way is to make it the new HTTP standard. So instead of a simple text based protocol, we will now be saddled with streaming services that really aren't suited to the relatively static text content that comprises the vast majority of web content.

    But who knows, maybe I'm totally wrong and we really do need every web page delivered over a binary stream in a format not too different from what we see with video.

  • by ArhcAngel (247594) on Tuesday July 09, 2013 @02:37PM (#44228539)
    Why does it have to be Binary?

    That's so twentieth century.

    We should be using a Hexadecimal format!
  • Rationale (Score:5, Informative)

    by hebcal (25008) on Tuesday July 09, 2013 @03:07PM (#44228943) Journal

    The rationale for HTTP/2.0 is available in the http-bis charter. [ietf.org] Quoting the charter:

    As part of the HTTP/2.0 work, the following issues are explicitly called out for consideration:

    • A negotiation mechanism that is capable of not only choosing between HTTP/1.x and HTTP/2.x, but also for bindings of HTTP URLs to other transports (for example).
    • Header compression (which may encompass header encoding or tokenisation)
    • Server push (which may encompass pull or other techniques)

    It is expected that HTTP/2.0 will:

    • Substantially and measurably improve end-user perceived latency in most cases, over HTTP/1.1 using TCP.
    • Address the "head of line blocking" problem in HTTP.
    • Not require multiple connections to a server to enable parallelism, thus improving its use of TCP, especially regarding congestion control.
    • Retain the semantics of HTTP/1.1, leveraging existing documentation (see above), including (but not limited to) HTTP methods, status codes, URIs, and where appropriate, header fields.
    • Clearly define how HTTP/2.0 interacts with HTTP/1.x, especially in intermediaries (both 2->1 and 1->2).
    • Clearly identify any new extensibility points and policy for their appropriate use.
  • by mstefanro (1965558) on Tuesday July 09, 2013 @03:24PM (#44229187)

    Why won't they focus on what really matters? HTTP is still missing sane authentication techniques. Almost all websites require users to log in, so it should be a priority that a login mechanism be supported in HTTP, as opposed to being the responsibility of every web developer individually (don't you dare mention HTTP Basic Authentication). Relying on cookies alone to provide authentication is a joke. Not all websites can afford to buy certificates or absorb the performance penalty of HTTPS, and security over plain HTTP is practically non-existent.

    The HTTP protocol is very primitive and it has not evolved on par with the evolution of cryptography and security. Much better privacy, confidentiality, and authentication can be obtained at very little cost. Because most websites allow you to log in, the server and the client share a common piece of information that a third party does not, namely the client's password (or some digest of that). Because of this shared information, it should be far easier to provide security over HTTP (at least for existing users). I find it laughable that we still rely on a fixed string sent as plain text (the session cookie) to identify ourselves. Far better mechanisms exist (besides the expensive HTTPS) to guarantee some security, and it should not be the responsibility of the web developer to implement these mechanisms.

    At the very least, HTTP 2.0 should support password-authenticated key agreement and client authentication using certificates.

    While signing up can still be problematic in terms of security, logging in need not be. There's a huge difference between being exposed once and being exposed every time. If there was no Eve to monitor your registration on the website, then there should be no Eve that can harm you any longer. You have already managed to share some secret information with the server, there is no reason to expose yourself every time you log in by sending your password in plain text and then sending a session in plaintext every time you make a request. That is, a session which can be used by anyone.

    While it may be acceptable for HTTP1.1 to not address such security concerns, I would find it disturbing for HTTP2.0 not to address them either.
    To get an intuitive idea of how easy it can be to have "safer proofs" that you are indeed logged in, consider the following scenario: you have registered on the website without an attacker interfering; you would like to log in, so you perform an EKE with the server and you now have a shared secret; every time you send a request to that website, you send some cookie _authenticated = sha2(shared_key || incremental_request_number). Obviously, even if Eve sees that cookie, she cannot authenticate herself in the next request, because she does not know the shared_key and thus cannot compute the "next hash".
    This is just an example proof-of-concept technique to get an idea on how much better we can do. Many other cheap techniques can be employed. This one has the overhead of only one hash computation.
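
    The rolling-cookie scheme described above could be sketched roughly as follows. This assumes the shared key has already been established (the EKE step is not shown), and the names auth_cookie and Server, the 8-byte counter encoding, the choice of SHA-256 as the "sha2" and the hex cookie format are all hypothetical choices made for illustration, not part of any HTTP draft.

```python
import hashlib
import hmac


def auth_cookie(shared_key: bytes, request_number: int) -> str:
    """Compute sha2(shared_key || request_number) as a hex string."""
    counter = request_number.to_bytes(8, "big")  # assumed fixed-width encoding
    return hashlib.sha256(shared_key + counter).hexdigest()


class Server:
    """Tracks the next expected request number for one logged-in client."""

    def __init__(self, shared_key: bytes):
        self.shared_key = shared_key
        self.expected = 0

    def verify(self, cookie: str) -> bool:
        want = auth_cookie(self.shared_key, self.expected)
        # Constant-time comparison of the two hex strings.
        if hmac.compare_digest(want, cookie):
            self.expected += 1  # each cookie value is accepted only once
            return True
        return False


key = b"secret-agreed-during-login"  # placeholder for the EKE result
server = Server(key)

first = auth_cookie(key, 0)
assert server.verify(first)                # genuine request is accepted
assert not server.verify(first)            # a replayed cookie is rejected
assert server.verify(auth_cookie(key, 1))  # the next request works
```

    This follows the comment's formula literally. A real design would probably use a keyed construction such as HMAC instead of a bare hash of key plus counter, and would need a policy for lost or reordered requests, neither of which this sketch attempts.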

    Given that HTTP is a tremendously used protocol, it does make sense to make it as space-efficient as possible, being responsible for a large portion of the bandwidth. I do believe it matters on a large scale. However, given the arguments above, this should not be their priority.
