Google Publishes Zopfli, an Open-Source Compression Library
alphadogg writes "Google is open-sourcing a new general purpose data compression library called Zopfli that can be used to speed up Web downloads. The Zopfli Compression Algorithm, which got its name from a Swiss bread recipe, is an implementation of the Deflate compression algorithm that creates a smaller output size (PDF) compared to previous techniques, wrote Lode Vandevenne, a software engineer with Google's Compression Team, on the Google Open Source Blog on Thursday. 'The smaller compressed size allows for better space utilization, faster data transmission, and lower Web page load latencies. Furthermore, the smaller compressed size has additional benefits in mobile use, such as lower data transfer fees and reduced battery use,' Vandevenne wrote. The more exhaustive compression techniques achieve higher data density, but also make the compression a lot slower. This does not affect the decompression speed though, Vandevenne wrote."
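To illustrate the point that a harder-working Deflate encoder does not change what the decoder has to do, here is a minimal sketch. It does not use Zopfli itself: it assumes Python's standard `zlib` module at two effort levels as a stand-in, and the sample data is made up.

```python
import zlib

# Made-up repetitive sample standing in for a static web asset.
data = b"function add(a, b) { return a + b; }\n" * 200

fast = zlib.compress(data, 1)  # low-effort encoder setting
best = zlib.compress(data, 9)  # highest zlib effort (a stand-in, not Zopfli)

# One and the same decoder handles both streams and recovers identical bytes.
assert zlib.decompress(fast) == data
assert zlib.decompress(best) == data

print(len(data), len(fast), len(best))
```

The encoder is free to search as long as it likes for a smaller valid stream; the client only ever sees a standard Deflate stream.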
Wow, gzip -9 is very competitive for most usages (Score:5, Insightful)
Looking at the data presented in the pdf, it seems to me that gzip does a fantastic job for the amount of time it takes to do it.
The interesting bit is this: (Score:5, Insightful)
Re:Overhyped (Score:5, Insightful)
Actually, they state that the 3-8% better maximum compression than zlib takes 2-3 orders of magnitude longer to compress.
I can't imagine what kind of content you're hosting that'd justify 3 orders of magnitude compression time to gain 3% compression.
Static content that only has to be compressed once, yet is downloaded hundreds of thousands or millions of times. 3-8% is a pretty significant savings in that case.
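A rough back-of-the-envelope version of that argument, as a sketch. The file size, improvement, and download count below are hypothetical numbers chosen for illustration, not figures from the paper.

```python
def bytes_saved(compressed_size: int, improvement: float, downloads: int) -> float:
    """Total transfer bytes saved when each download shrinks by `improvement`."""
    return compressed_size * improvement * downloads

# Hypothetical: a 30 KB gzipped file, 5% smaller output, 10 million downloads.
saved = bytes_saved(30_000, 0.05, 10_000_000)
print(f"{saved / 1e9:.1f} GB saved")
```

The compression cost is paid once, while the saving scales linearly with the number of downloads.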
JavaScript libraries, for one thing (Score:5, Insightful)
In anything that is static enough that it will be downloaded many times in its lifetime, and not time sensitive enough that it needs to be instantly available when generated, very small gains in compression efficiency are worth paying very large prices in compression.
If you, for just one of many Google-relevant examples, host a fair number of popular JavaScript libraries [google.com] (used on both your own sites -- among the most popular in the world -- and vast numbers of third party sites that use your hosted versions) and commit, once you have accepted a particular stable version of a library, to hosting it indefinitely, you've got a bunch of assets that are going to be static for a very long time, and accessed very large numbers of times. The one-time cost to compress is going to be dwarfed by even a minuscule savings in transfer costs for those.
Re:Wow, gzip -9 is very competitive for most usage (Score:5, Insightful)
Why would you recompress static content every time it is accessed? For frequently-accessed, static content (like, for one example, the widely-used JavaScript libraries that Google hosts permanently), you compress it once, and then gain the benefit on every transfer.
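A minimal sketch of "compress once, serve many times". The `PrecompressedAssets` class and the file name are hypothetical, and `zlib` at level 9 stands in for whatever slow encoder you would actually run offline.

```python
import zlib

class PrecompressedAssets:
    """Compress each static asset once, then serve the cached bytes."""

    def __init__(self, level=9):
        self.level = level
        self._cache = {}
        self.compress_calls = 0

    def add(self, name, body):
        # Pay the (possibly slow) compression cost exactly once per asset.
        self._cache[name] = zlib.compress(body, self.level)
        self.compress_calls += 1

    def serve(self, name):
        # Every request reuses the stored result; nothing is recompressed.
        return self._cache[name]

assets = PrecompressedAssets()
assets.add("lib.js", b"var x = 1;\n" * 500)
responses = [assets.serve("lib.js") for _ in range(1000)]
print(assets.compress_calls, len(responses))
```

A thousand requests, one compression run. How slow the encoder is only matters at publish time.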
For dynamic content, you probably don't want to do this, but if you're Google, you can afford to spend money getting people to research the best tool for very specific jobs.
Re:Overhyped (Score:5, Insightful)
For example, assuming browsers incorporate the capability to decompress it, lowering the bandwidth of Youtube by ~3% is an achievement.
I don't know why people keep mentioning Youtube, since all videos are already compressed in such a way that pretty much no external compression is going to gain anything.
Although when compressing a video Zopfli might result in a smaller file compared to gzip, that doesn't mean either will be smaller than the original. All H.264 files should be using CABAC [wikipedia.org] after the motion, macroblock, psychovisual, DCT, etc. stages, and that pretty much means that the resulting files have as much entropy per bit as possible. At that point, nothing can compress them further.
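You can see the effect with a quick experiment. This is only a sketch: it assumes random bytes from `os.urandom` are a fair stand-in for entropy-coded video data (a real H.264 file also has container headers and metadata), and it uses `zlib` rather than Zopfli.

```python
import os
import zlib

random_bytes = os.urandom(100_000)            # stand-in for already-compressed data
text_bytes = b"the quick brown fox " * 5_000  # highly redundant data, same length

packed_random = zlib.compress(random_bytes, 9)
packed_text = zlib.compress(text_bytes, 9)

# Compare input and output sizes for each case.
print(len(random_bytes), len(packed_random))
print(len(text_bytes), len(packed_text))
```

The redundant text shrinks dramatically, while the random input gains nothing from a second pass and typically comes out slightly larger because of framing overhead.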
Re:Overhyped (Score:5, Insightful)
Word, when I'm downloading the latest pirated release of a 1080p movie
"word", and intend to download zipped h.264 files leads me to believe you are retarded.