Google Launches Brotli, a New Open Source Compression Algorithm For the Web 215
Mark Wilson writes: As websites and online services become ever more demanding, the need for compression increases exponentially. Fans of Silicon Valley will be aware of the Pied Piper compression algorithm, and now Google has a more efficient one of its own. Brotli is open source and is an entirely new data format that offers 20-26 percent greater compression than Zopfli, another compression algorithm from Google. Just like Zopfli, Brotli has been designed with the internet in mind, with the simple aim of making web pages load faster. It is a "lossless compressed data format that compresses data using a combination of the LZ77 algorithm and Huffman coding, with efficiency comparable to the best currently available general-purpose compression methods". Compression is better than LZMA and bzip2, and Google says that Brotli is "roughly as fast" as zlib's Deflate implementation.
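For anyone who wants to poke at claims like these themselves, here is a minimal, hedged sketch comparing ratios. It assumes Python with the third-party brotli bindings installed (pip install brotli); zlib, bz2 and lzma ship with CPython, and sample.html stands in for any text-heavy file:

```python
# Rough comparison of compression ratios on a text sample.
# Results will vary heavily with the input data.
import bz2
import lzma
import zlib

import brotli  # third-party bindings to Google's library (assumed installed)

data = open("sample.html", "rb").read()  # hypothetical text-heavy file

candidates = {
    "zlib (Deflate, level 9)": zlib.compress(data, 9),
    "bzip2 (level 9)": bz2.compress(data, 9),
    "LZMA (xz preset 9)": lzma.compress(data, preset=9),
    "Brotli (quality 11)": brotli.compress(data, quality=11),
}

for name, blob in candidates.items():
    print(f"{name:>24}: {len(blob):>8} bytes "
          f"({len(blob) / len(data):.1%} of original)")
```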
What's the Weissman score? (Score:5, Funny)
What's the Weissman score?
Re: What's the Weissman score? (Score:1)
Lossless ha. Like the "c" missing from "Compresses" from the summary?
Auto-correction is losing us more meaning than any lossy JPEG compression ever did.
ompresses data (Score:5, Funny)
It is a "lossless compressed data format that ompresses data
... by discarding random bits and pieces of redundant occurrences of words.
Re:ompresses data (Score:5, Funny)
No no no... you don't understand. It's just THAT good of a ompression gorithm.
Re: (Score:3, Funny)
No no no... you don't understand. It's just THAT good of a ompression gorithm.
Well, if it ncreases xponentially, it must be good.
Ompression (Score:1)
No no no... you don't understand. It's just THAT good of a ompression gorithm.
Ompression is meta Compression, the type of compression performed by Omniscient beings. It's that good.
Re: (Score:2)
No no no... you don't understand. It's just THAT good of a ompression gorithm.
although i work with computers, I'm not up on the newest whiz-bang stuff these days. i thought ompression might be some fancy word like impression but meaning something else.
so anyways, fuck the little punks that come up with all these stupid names that have to sound all Web 2.0, like the 40 or whatever AWS names you have to learn to use Amazon.
Re: (Score:3, Funny)
That's lossy ompression.
Re: (Score:1)
It's not lossy. If it were, you wouldn't know how to reconstruct "ompression"; but I think you do. It's as if that sentence were run through an algorithm that detected some of the unnecessary letters in Enlish and omitted thm.
Re: (Score:3)
Why would it need to comit them?
Re: (Score:2)
The C is there in square brackets. Not sure why...
Re: (Score:1)
It's the cloud, man. You have to expect bits to go missing every once in a while.
A better idea (Score:5, Insightful)
If they want to make webpages load quicker, remove ads.
Re:A better idea (Score:4, Insightful)
If they want to make webpages load quicker, remove ads.
That would only be a linear improvement. They need to load exponentially faster.
Re: (Score:1)
Take away advertising, and at least in the short term much of the content goes away... data volume problem solved.
Re: (Score:2)
baahahahah
Re: (Score:3, Insightful)
7z is pretty slow. Modern internet connection speeds already range from 100 Mbps (mobile) up to 1 Gbps (fiber optics). Of course, developing countries are still at about the 50-100 Mbps level.
Thus something like LZ4 compresses much less, but the total transmission time is typically better. Who cares about the compression ratio? The total time to complete the transaction is what matters.
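That trade-off fits in a toy back-of-the-envelope model. All figures below are made-up illustrations, not benchmarks; total time here is download plus decompression in the worst (non-pipelined) case:

```python
# Toy model of the point above: total time matters, not ratio alone.

def total_seconds(original_mb, ratio, link_mbps, decompress_mbps):
    """Sequential worst case: download fully, then decompress."""
    compressed_mb = original_mb / ratio
    return compressed_mb * 8 / link_mbps + original_mb / decompress_mbps

payload = 100.0  # MB of uncompressed data (assumed)
for name, ratio, speed in [
    ("LZ4 (fast, weak)", 2.0, 2000.0),    # decompress MB/s, illustrative
    ("LZMA (slow, strong)", 4.0, 80.0),
]:
    for link in (10.0, 100.0, 1000.0):    # link speed in Mbit/s
        t = total_seconds(payload, ratio, link, speed)
        print(f"{name:>20} @ {link:>6.0f} Mbit/s: {t:6.2f} s")
```

On a slow link the stronger compressor wins; on gigabit the fast-but-weak one does, which is exactly the parent's point.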
Re:7zip (Score:5, Funny)
Let it be said that the USA resents being called a developing country.
Re: (Score:2)
So you should. I'm in Australia with a 100 Mbit residential connection, for god's sake! :P
Truly unlimited too, for $90 AUD/month, which is about $63 USD/month.
Re: (Score:1)
It is more complicated than that.
100 Mbit/s is roughly 10 MB/s. If the web page decompresses 1:4, it is enough if the decompressor runs at 40 MB/s; anything faster at the expense of compression ratio is going to make things slower (well, unless the decompressor reads from a compressed local cache soon and repeatedly). Transmission, decompression, and building the HTML tree all happen in parallel in the browser. For a 1 Gbit/s network, i.e., 100 MB/s of compressed data coming in, you need about a 400 MB/s decompression rate.
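For reference, the parent rounds down (100 Mbit/s is exactly 12.5 MB/s before protocol overhead); the requirement itself is a one-line formula, sketched here in Python:

```python
# In a pipelined browser, the decompressor only needs to keep up with
# (link throughput x compression ratio). Faster buys nothing unless the
# same compressed bytes are re-read from a local cache.

def required_decompress_mb_s(link_mbit_s: float, ratio: float) -> float:
    """MB/s of output the decompressor must sustain to avoid being the bottleneck."""
    return link_mbit_s / 8 * ratio  # Mbit/s -> MB/s, expanded by the ratio

print(required_decompress_mb_s(100, 4))   # 50.0 MB/s for a 100 Mbit link at 1:4
print(required_decompress_mb_s(1000, 4))  # 500.0 MB/s for gigabit
```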
Re: (Score:3)
Then it needs to start acting like a developed country. It can start by having some decent telecom regulation so that internet speeds are sufficient and inexpensive (if the US can't beat Romania, of all places, then it deserves to be called "third world"), and so that cellular coverage is as good as northern Finland's while being just as inexpensive.
Re: (Score:2)
7z is pretty slow. Modern internet connection speeds are already between 100 Mbps (mobile) to up to 1 Gbps (fiber optics).
But the compression does not have to be "on the fly". Large files can be compressed once, and then transmitted many times.
Re: (Score:1)
7-Zip isn't slow unless you tell it to be slow. You can tell it to be fast if you want to, and you can tell it to use a specific algorithm with whatever parameters you want.
Besides, all static content is compressed once and delivered many times.
Dynamic content only needs to be compressed as often as it is updated.
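"Compress once, deliver many" is just precompressing static assets at deploy time. A rough sketch of such a build step, assuming Python, a hypothetical ./static directory, and the third-party brotli package (servers like nginx can then serve the .gz/.br siblings directly):

```python
# Precompress static assets so the server never compresses on the fly.
import gzip
from pathlib import Path

import brotli  # third-party; pip install brotli (assumed)

STATIC_ROOT = Path("./static")  # hypothetical asset directory

for path in STATIC_ROOT.rglob("*"):
    if path.suffix not in {".html", ".css", ".js", ".svg"}:
        continue  # skip media that is already compressed (JPEG, PNG, ...)
    raw = path.read_bytes()
    # Write siblings like app.js.gz and app.js.br next to the originals.
    path.with_name(path.name + ".gz").write_bytes(
        gzip.compress(raw, compresslevel=9))
    path.with_name(path.name + ".br").write_bytes(
        brotli.compress(raw, quality=11))
```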
Re: (Score:1)
Of course developing countries are still at about 50-100 Mbps level.
Not even close. I think you're off by an order of magnitude. I was in Central America last year, and 4-6 Mbps was sold as "High Speed" internet.
Re: (Score:3)
Google's algorithm addresses those issues by having a fixed dictionary. Compression is faster because there is no need to compute the dictionary, transmission is faster because there is no need to transmit the dictionary, and decompression uses less memory and is faster because the dictionary is shared between multiple streams.
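Brotli's dictionary is baked into the format itself, but the general preset-dictionary idea is easy to sketch with zlib's zdict parameter, which stdlib Python exposes (the toy dictionary below is just an illustration, not Brotli's actual word list):

```python
# Preset-dictionary sketch: both sides agree on a dictionary up front,
# so it is never computed per-stream or transmitted.
import zlib

# Strings expected to recur in payloads (toy stand-in for Brotli's
# built-in list of common English/HTML/JS fragments).
preset = (b'<!DOCTYPE html><html><head><meta charset="utf-8">'
          b'<script type="text/javascript">function(){return ')

page = b'<!DOCTYPE html><html><head><meta charset="utf-8"><title>hi</title>'

plain = zlib.compress(page, 9)

co = zlib.compressobj(9, zlib.DEFLATED, zdict=preset)
with_dict = co.compress(page) + co.flush()

print(len(page), "raw,", len(plain), "plain,", len(with_dict), "with dict")

# The decompressor must supply the same dictionary:
do = zlib.decompressobj(zdict=preset)
assert do.decompress(with_dict) == page
```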
Re: (Score:3)
There are much better general-purpose lossless compression algorithms than what is in 7-Zip (LZMA2) or WinRAR. They are also much slower, like the PAQ series, which uses context modeling and arithmetic coding.
The challenge here is to do something that is also fast and memory efficient. In fact, we could improve the compression of Brotli very easily just by using arithmetic coding instead of Huffman coding, but the improvement probably wasn't worth the performance loss.
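The Huffman half of that trade-off is simple enough to sketch in a few lines; whole-bit code lengths are exactly the limitation arithmetic coding removes, since it can spend fractional bits per symbol. A minimal illustration (not Brotli's actual implementation):

```python
# Minimal Huffman sketch: every symbol costs a whole number of bits.
import heapq
from collections import Counter

def huffman_code_lengths(data: bytes) -> dict:
    """Map each byte value to its Huffman code length in bits."""
    freq = Counter(data)
    # Heap entries: (total weight, unique tiebreak, {symbol: depth}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

sample = b"abracadabra abracadabra"
lengths = huffman_code_lengths(sample)
total_bits = sum(lengths[b] for b in sample)
print(f"{total_bits} bits with Huffman vs {len(sample) * 8} bits raw")
```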
Better (Score:3, Funny)
Hey, Google! I have a compression algorithm that can compress any size to a single byte. I just need a little help with the decompress and we can really speed things up.
Binary Page Delivery (Score:2, Insightful)
Supposedly this is for a better user experience. The real reason is so that ad blockers will no longer work. It will no longer be web "pages"; they will be "apps". The walled garden will move to the web and have the doorway sealed shut.
Re: (Score:2)
I don't see what your problem with compression is.
Chrome already supports gzip, deflate and sdch
It's part of the "Accept-Encoding" header that gets sent to the server by your browser.
You can check yours here:
http://www.xhaus.com/headers [xhaus.com].
The server won't send a compressed version if your browser doesn't say it can decode it first.
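You can watch that negotiation from the client side with stdlib Python alone. The URL is just an example (any gzip-capable server will do), and note that urllib does not transparently decompress:

```python
# Send an Accept-Encoding header and see what the server decides to do.
import urllib.request

req = urllib.request.Request(
    "https://www.google.com/",
    headers={"Accept-Encoding": "gzip, deflate"})

with urllib.request.urlopen(req) as resp:
    print("Content-Encoding:", resp.headers.get("Content-Encoding"))
    # urllib does NOT decompress for you: the body here is the raw
    # (possibly gzip'd) bytes, so a real client must decode it itself.
    print("First bytes:", resp.read(16))
```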
Re: (Score:2)
This isn't insightful, it's just wrong. We already have binary page delivery if the browser and server agree to support gzip or whatever. It's just one more thing like that.
replace with unified app framework (Score:2)
if only we had some unified app framework so that you didn't have to install a new app for every site. you could give it a little text bar at the top where you would type in the name of the app's data you want, and instead of having to have the app installed, it would just load the data FROM the source the app loads from, and then render it. you might have to add in some markup language tags here and there to help with the formatting, but it should be doable. this would save a lot on internal storage on the device.
Re: (Score:2)
Sephiroth!
To make web pages load faster: (Score:5, Insightful)
Stop making my browser run 500 trips to DNS in order to run 500 trips to every ad server in the world.
Also, for the everloving sake of Christ, you don't need megabytes of scripts, or CSS, or any other shit loaded from 50 more random domains in order to serve up an article consisting of TEXT. One kilobyte of script will be permitted to set up a picture slideshow, if desired. /Get off my e-lawn
name and shame (Score:5, Interesting)
Re:name and shame (Score:5, Informative)
Google announced [blogspot.ca] a while ago that they would take into account page load speed in search ranking.
Re: (Score:2)
I couldn't figure out what you were talking about - then I realized you must have actually read the linked article.
Weird.
Most of us stopped doing that in 2002 or so.
Re: (Score:2)
Not just ad(vertisement)s! Also fancy and huge videos, animations, audio, and images. Sometimes I disable them to avoid slowdowns and messy designs, esp(ecially) during my slow-speed Internet sessions. Some web people don't even optimize anymore.
Re: (Score:2)
LOL, really? You must have a script that posts your shitty reply as soon as it sees the keyword.
This is a test: AdBlock
Maybe we should all add AdBlock to our signatures; maybe it would crash his script?
Re: (Score:2)
How do I block your ads for "hosts"?
Arnoldisation of Google (Score:2)
Brotli, Zopfli... is Google now run by Austrians, judging by the Austrian-sounding nicknames?
And lossless too? I'd prefer it if they lost the ads; then the compression wouldn't be needed.
Re: (Score:3)
FWIW... words ending with -li are usually Swiss. Austrians would use -le.
Re: (Score:2)
Swiss. It says that in the article.
For the web only, not much more (Score:5, Informative)
From the paper [gstatic.com]:
Unlike other algorithms compared here, brotli includes a static dictionary. It contains 13’504
words or syllables of English, Spanish, Chinese, Hindi, Russian and Arabic, as well as common
phrases used in machine readable languages, particularly HTML and JavaScript.
This means that brotli isn't a general-purpose algorithm; it's built for the web and not much more. I guess that future versions of the algorithm will include customized support for other, smaller languages, whose compression databases are only downloaded if you open a web page in that language.
Re: (Score:1)
That's not really true. See https://quixdb.github.io/squash-benchmark/ [github.io]. It compresses non-text data quite well.
Re: (Score:2)
So this means that it is an algorithm for compressing text. What I'm wondering about is how much actual performance we get, when most of the time spent loading pages is spent loading already heavily compressed content such as images.
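That concern is easy to demonstrate: entropy-coded media looks like random bytes to a second compressor. A rough stdlib sketch, with os.urandom standing in for JPEG/PNG payload bytes:

```python
# Recompressing already-compressed data buys almost nothing.
import os
import zlib

text = b"the quick brown fox jumps over the lazy dog " * 500
noise = os.urandom(len(text))  # stand-in for already-compressed media

for name, blob in (("text", text), ("incompressible", noise)):
    out = zlib.compress(blob, 9)
    print(f"{name:>15}: {len(blob)} -> {len(out)} bytes "
          f"({len(out) / len(blob):.1%})")
```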
1997 called (Score:3, Insightful)
It wants its bottleneck back. From what I can see it's plugins, scripts and adverts loading from fifty different sites that clog web pages, not large file sizes or what have you. Yes I get that compression is vital for an outfit like Google and they want to showcase what they've been doing but most websites don't have their traffic volume.
Re:1997 called (Score:4, Insightful)
Websites maybe, but web applications are a different story. The problem is that businesses make heavy use of web applications, and data lists can grow out of control quickly if you include the formatting required to make them look right. One could say that these web applications have bad data-presentation strategies, and I would argue that you'd probably be right in many cases. Unfortunately, not all problems can be solved with forced filters and paging.
Take /. for example. The HTML alone of this page (with very few comments at this point) is 200KB. If you add up the CSS and JS you are well above 1MB. The data alone would probably take only 100KB, but data that's hard to sift through is not fun data, hence the overhead.
Compression is a no-brainer if used properly. Bandwidth is limited and so is processing power. It's just a matter of deciding which one is more important at any given time.
Re: (Score:2)
I work on an enterprise Business Intelligence web application, and its payload is 5.0/5.4MB (compressed/uncompressed) in debug, and 1.6/3.4MB (compressed/uncompressed) minified in release. It's over 700 JavaScript files, 200+ images, lots of CSS, etc. (in debug; much of this is combined into one file when built for release). While it's a massive first payload, it successfully loads on tablets and phones as long as they have a decent connection (HSDPA+, LTE or WiFi).
So I suppose I would agree with that.
Re: (Score:2)
I should also mention that loading the debug version onto a tablet is actually impossible. All of the current top-end tablets and phones (iPad Air, Note 4, etc.) crash when the uncompressed, unminified, and uncombined files are transmitted to them. None of them can take it, except the Surface, if you consider that a tablet (I don't).
As a result, we've had to use emulators on machines with huge amounts of RAM and pray to God the error shows up in the emulator as well. Hopefully we can start taking advantage of this.
Re: (Score:2)
Chrome practically killed off plug-ins when it ditched the Netscape plug-in API earlier this year. Future versions will not run Flash adverts by default either. Google do appear to be taking some fairly aggressive steps to clean up the web and advertising.
I'm surprised Mozilla hasn't jumped on this bandwagon too.
Not as good as alternatives (Score:2)
AdBlock developed a webpage compression algorithm that works much better than that. Although it simultaneously functions as a virus scanner, removing many malicious scripts and even most social engineering attack vectors, it not only doesn't slow down your computer, but rather makes pages load faster!
Re: (Score:3)
Be careful, you might summon the flying spittle of APK.
Re: (Score:2)
So, have you figured out why privilege escalation is a bad thing yet?
Re: (Score:3)
The very fact that your software requires privilege escalation every time it updates indicates you don't understand security. Most people have no way to know what your software does when it updates; what is to stop you from having a Trojan load with your update? Nothing, whereas your "rival" AdBlock requires no escalation.
Re: (Score:2)
Bob and weave, bob and weave.
You just don't get security, do you? In your mind it is all about trust. Security isn't only a chain of trust; there is a verify step too, which is not possible with your application. But go ahead and keep being wrong; it is what you are good at.
Re: (Score:2)
My antivirus software requires privileges in order to install, not to update.
Re: (Score:2)
Antivirus doesn't use elevated privileges to update.
AdBlock doesn't use elevated privileges at all.
Your hosts-file software isn't a virus scanner, nor does it replace one.
Re: (Score:2)
Really?
https://www.sophos.com/en-us/s... [sophos.com]
How do you stop Conficker with your hosts file, or any of the numerous viruses that spread through network protocols on local networks?
http://www.thorschrock.com/200... [thorschrock.com]
It looks like you haven't changed much. Did it ever occur to you that maybe people respond the way they do because of your hostility?
Re: (Score:2)
It is nice to see you still don't understand how security works.
2. Spreads via Windows file sharing
Once on the network the virus can spread using the Microsoft exploit (above) or by accessing the file and admin shares on the network.
When it infects a computer it creates a file with a random name and a random extension within the System32 folder. A scheduled task (running as SYSTEM) will execute this file using rundll32.exe.
3. Spreads via removable media such as USB drives
When a removable drive is connected to an infected computer, the Conficker worm will
create a copy of itself in the RECYCLER\S-x-x-xx-xxxxxxxxxx-xxxxxxxxxx-xxxxxxxxx-xxxx folder on that drive (where x consists of random numbers)
drop the file autorun.inf in the root directory of the drive.
How will your hosts file stop either of these attacks? It can't, which is why I mention this particular virus, but there are many like it that you have no defense against if you use hosts files instead of virus scanners.
Good luck with your Pwning.
Re: (Score:2)
This is where you say your hosts file stops all threats:
* My program utterly DISPLACES THE NEED FOR ANTIVIRUS since it blocks you getting infected in the 1st place (negating the need for antivirus @ all) & then it also stops botnets from communicating with their C&C servers even IF you get an infestation...
How can it displace the need for antivirus when it can't stop most of the threats?
What I have done is secure numerous systems on the internet. I have followed many procedures, but not one of them suggests editing a system file such as hosts. You are the one overloading and abusing the hosts file, which wasn't meant for that purpose, not I. I don't have to defend what I do; I am not the one claiming to be a security expert.
Re: (Score:2)
I can't help if you are so forgetful. I didn't post AC claiming to be APK:
http://tech.slashdot.org/comme... [slashdot.org]
Re: (Score:2)
Again, I will explain it to you, since you seem to be mentally challenged. Administrator privileges used when installing a program (ONCE!) are not the same, from a security perspective, as administrator privileges required every time an update is needed.
And your hosts file does not replace AV. You said it did; now eat your words.
http://slashdot.org/comments.p... [slashdot.org]
http://slashdot.org/comments.p... [slashdot.org]
http://slashdot.org/comments.p... [slashdot.org]
I didn't put any words in your mouth; you did it yourself. Now eat them and go home. You have ZERO credibility.
Loading web pages faster? (Score:2)
Brotli has been designed with the internet in mind, with the simple aim of making web pages load faster.
Install AdBlock, and it'll be faster.
No need for compression when the ad servers are the ones who slow page loading to a crawl.
Re: (Score:2)
Oh, not that shitty spam again? Who are you? You sit at home all day, glued to Slashdot, and as soon as someone mentions "adblock" you cut & paste your shit?
And drop the bold on important words. Bold or CAPS on some words instantly rings my "scammer" alarm.
Exponential compression (Score:2)
As websites and online services become ever more demanding, the need for compression increases exponentially.
I hope not, because we're not going to get exponential increases in compression over what we have now.
Re: (Score:2)
I've found that misuse of the word "exponentially" by people who don't understand what it means is growing exponentially more irritating. ;)
Why is this labelled as a launch? (Score:2)
I was like "Haven't I seen this before?", and thinking that I had, because I work on Chrome, but then looked at:
https://en.wikipedia.org/wiki/... [wikipedia.org]
which says "This page was last modified on 27 February 2015, at 18:32."
Maybe it's just coming out of beta.
Re: (Score:2)
I was like "Haven't I seen this before?", and thinking that I had, because I work on Chrome, but then looked at: https://en.wikipedia.org/wiki/... [wikipedia.org] which says "This page was last modified on 27 February 2015, at 18:32."
Maybe it's just coming out of beta.
This is Google. Nothing ever comes out of beta.
As opposed to... (Score:4, Insightful)
It is a "lossless compressed data format..."
As opposed to what, a lossy compression formula for data?
Well hell, if you don't need the original data back I've got a compressor algorithm that'll reduce a 50GB file to 1 byte every time. Sometimes even less than a byte, like maybe 0.25 bytes. In fact it reduces the file to such a small size that just finding it again turns out to be a real challenge...
Re: (Score:1)
I could see a lossy algorithm for HTML / JS that eliminates lengthy tags and inefficient structures, and perhaps even performs code optimization on heavily JS-infested pages, while rendering identically to the original.
The result of course would look nothing like the source and couldn't easily be reconstructed.
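A toy version of that idea in Python: it changes the bytes but (mostly) not the rendering. A real minifier would have to be far more careful about <pre>, inline scripts, and other significant whitespace; this is just an illustration of the "lossy yet renders the same" concept:

```python
# Naive "lossy" HTML transform: output differs from the source but
# renders (mostly) identically. Would mangle <pre> blocks, etc.
import re

def naive_minify(html: str) -> str:
    html = re.sub(r"<!--.*?-->", "", html, flags=re.S)  # drop comments
    html = re.sub(r">\s+<", "><", html)   # whitespace between tags
    html = re.sub(r"\s+", " ", html)      # collapse remaining runs
    return html.strip()

src = """
<html>  <body>
    <!-- header -->
    <h1>  Hello   world  </h1>
</body>  </html>
"""
print(naive_minify(src))
```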
Re: (Score:2)
yes, exactly. Not everything requires the original data be recreated exactly, after all, just "similar enough". (images and audio, mostly; text and things represented as text usually fare poorly when parts are missing.)
Re: (Score:2)
It's easy to joke about, but in many cases lossy encoding/compression algorithms can be incredibly useful. Most tech people are already familiar with the fact that JPEG and MP3 are lossy encodings that produce "good enough" results for the vast majority of use cases while potentially reducing the data footprint by orders of magnitude, so I won't dwell on those.
But what many of us forget to consider (even though we're aware of the fact that it's done) is that code can also be encoded in lossy ways to great benefit.
Re: (Score:2)
It's easy to joke about, but in many cases lossy encoding/compression algorithms can be incredibly useful.
Yes yes, I know. But for something like a compressed database dump....no. Or medical imaging....no. Or forensic work.....no.
Pictures of cats....yes.
Re: (Score:2)
Quite right. And there will likely never be a general purpose lossy compression scheme, since the "loss" needs to be tailored to whatever the content is in order for the scheme to be useful.
When I was in grad school, our research group was working on a lossless compression algorithm tailored for web content (for internal use, since we had done the largest web crawl in academia at the time and needed a better way to store the data), much like this new one Google has. By relying on assumptions that hold true for web content, you can do significantly better than a general-purpose scheme.
Re: (Score:2)
or potentially even in database dumps (e.g. eliminating needlessly duplicated rows,
I don't see how an algorithm could ever really determine with any certainty which rows were "needlessly duplicated", as the reasons for duplication could vary so widely as to make it a guessing game. Maybe a row was there to test "needless" duplication.
Unless it was able to read the designer's mind or somehow positively determine for itself what the definition of "needless" was, this would be a disaster waiting to happen. Maybe a duplicated row was there for a good reason, just not one that's apparent.
Re: (Score:2)
Quoting myself, emphasis added:
(e.g. eliminating needlessly duplicated rows, performing cascading deletes on data that wasn't cleaned up properly before, etc., though again, it would need to be tailored to the case at hand)
I fully agree with you that there will never be a lossy compression scheme that works in the general case. And I agree with you as well that we can't come up with a lossy database compression scheme that can work in that generalized case. I was merely pointing out that there may be specific cases within that general case where lossy compression is possible.
Sorry if that comes across as me being obtuse. That wasn't my intent. I was merely trying to point out that domain knowledge is what makes lossy compression workable in those specific cases.
Simple tests (Score:1)
I grabbed the source, compiled it, and tested on the latest Linux kernel tarball. All compressors at 'best', so -9 for gzip, bzip2 and xz, and Brotli at 11 (I think that's the highest). Brotli took twice as long as xz -9, but used 1/3 of the memory of xz -9.
82M Sep 22 14:12 linux-4.2.1.tar.xz
87M Sep 22 14:46 linux-4.2.1.tar.bro
98M Sep 22 14:13 linux-4.2.1.tar.bz2
125M Sep 22 14:13 linux-4.2.1.tar.gz
602M Sep 22 14:14 linux-4.2.1.tar
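For anyone wanting to reproduce a comparison like this without the CLIs, here's a stdlib-only sketch (Brotli itself would need the third-party brotli module; the tarball path is taken from the post above, and reading the whole file into RAM is fine for a quick test):

```python
# Compare stdlib compressors at max level on one file, with timings.
import bz2
import gzip
import lzma
import time

data = open("linux-4.2.1.tar", "rb").read()  # path from the parent post

for name, fn in (("gzip -9", lambda d: gzip.compress(d, compresslevel=9)),
                 ("bzip2 -9", lambda d: bz2.compress(d, 9)),
                 ("xz -9", lambda d: lzma.compress(d, preset=9))):
    start = time.perf_counter()
    out = fn(data)
    elapsed = time.perf_counter() - start
    print(f"{name:>8}: {len(out) / 2**20:7.1f} MiB in {elapsed:6.1f} s")
```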
Re: (Score:1)
Twice as long to compress, to compress then decompress, or to decompress?
They seem to be only claiming fast expansion.
Not entirely lossless (Score:2)
"lossless compressed data format that ompresses data
You've lost a "c" already.
Confusing summary (Score:2)
IMHO, at first read, it sounds like it's saying it's more efficient than Pied Piper's (fictional) algorithm... Which of course is impossible, since Pied Piper's will compress everything.
Spinal Tap (Score:1)
Re: (Score:1, Insightful)
OOoops. Forgot to check the Anonymous checkbox again, sexconker? I think we know who the cow troll is. Now it is time to delete his account.
Re: (Score:1)
But why? Cows introduce a nice element of levity.
Re: (Score:2)
Why, do you think he's a mooooooooocher?
Re: (Score:2)
If you don't like the cow trolls, you're free to skip over them. As for me, I regard them as "mostly harmless", and I've even posted one myself [slashdot.org].
Re: (Score:1)
This is not about speed; this is about GOOGLE's bandwidth. Because they process so many transactions a second, they see cost savings even from small improvements.
Re: (Score:2)
I was thinking the same. The names sound very Swiss.