LZ4 Compression Algorithm Gets Multi-Threaded Update (linuxiac.com)
Slashdot reader Seven Spirals brings news about the lossless compression algorithm LZ4:
The already wonderful performance of the LZ4 compressor just got better with multi-threaded additions to it's codebase. In many cases, LZ4 can compress data faster than it can be written to disk giving this particular compressor some very special applications. The Linux kernel as well as filesystems like ZFS use LZ4 compression extensively. This makes LZ4 more comparable to the Zstd compression algorithm, which has had multi-threaded performance for a while, but cannot match the LZ4 compressor for speed, though it has some direct LZ4.
From Linuxiac.com:
- On Windows 11, using an Intel 7840HS CPU, compression time has improved from 13.4 seconds to just 1.8 seconds — a 7.4 times speed increase.
- macOS users with the M1 Pro chip will see a reduction from 16.6 seconds to 2.55 seconds, a 6.5 times faster performance.
- For Linux users on an i7-9700k, the compression time has been reduced from 16.2 seconds to 3.05 seconds, achieving a 5.4 times speed boost...
The release supports lesser-known architectures such as LoongArch, RISC-V, and others, ensuring LZ4's portability across various platforms.
its, not it's (Score:4, Informative)
its codebase
FTFY. It's just like yours, mine, theirs, etc. No apostrophe needed!
Re: (Score:2)
Who's code base? Apostrophes are correctly used to indicate possession.
Re: (Score:2)
Re: (Score:2)
> Apostrophes are correctly used to indicate possession.
Sometimes yes, but not always.
Apostrophes are used to indicate both contraction and possession - in that order of precedence.
Example contractions:
"it's" = "it is"
"who's" = "who is"
Example possessions:
"its" indicates possession by "it"
"whose" indicates possession by "who" (a previously mentioned owner)
Using an apostrophe to indicate possession in these cases is an INCORRECT use of the apostrophe.
Re: its, not it's (Score:2)
TL;DR (Score:5, Funny)
Article too long. Can someone compress it for me?
Re: (Score:3)
Re: TL;DR (Score:5, Insightful)
LZ4 to Daqtagh bejqu' muDaq Hoch paQDI'norgh
I-ngaDHa'ghach yaH
Slashdot qel Seven Spirals LZ4 Daqtagh ja':
LZ4 mIvwa' bejqu' moHaq HIvje' vo' multi-Threaded che'laHghach pe'vIl 'IH je'
'ejchugh, LZ4 DaqlIj bejqu' qaSpu'DI' vuDwI' qImHa'. wa'DIch Qapbe' 'oH bejqu' Haq. Linux qawHaq je' Qo'noSmey lo'chu' 'IH ZFS bejqu' LZ4 je' 'op vo' qo' mIwvetlh Daqtagh bIt.
vo' Linuxiac.com:
- Windows 11, Intel 7840HS CPU, bortaS 13.4 lup je' neH 1.8 lup che' 7.4 logh qet
- macOS M1 Pro chip, lupvo' 16.6 neH 2.55 lup che' 6.5 logh puS
- Linux lo', i7-9700k, lupvo' 16.2 neH 3.05 lup che' 5.4 logh puS...
naDev lo'Ha'ghach yaHwIj lo' LaongArch, RISC-V je', Qapla' Daqtagh muDaq Hoch tlhoy'meH.
Re: (Score:2)
Re: (Score:2)
Re: (Score:3)
I assume it's a copycat of pigz; pigz vs. gz is the same as the new LZ4 vs. the old, I assume.
https://linuxhandbook.com/pigz... [linuxhandbook.com]
Be aware of default values and manage your threads and IO usage. /s
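For what it's worth, the core idea is easy to sketch: a gzip stream may contain multiple members, so chunks can be compressed independently and the results concatenated. This is a sketch of the concept only, not of pigz's actual internals (pigz's chunking and dictionary handling differ). Stdlib-only Python:

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

def parallel_gzip(data: bytes, chunk_size: int = 128 * 1024, workers: int = 4) -> bytes:
    # Split the input into fixed-size chunks; each becomes an independent gzip member.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # zlib releases the GIL while compressing, so threads really run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        members = pool.map(gzip.compress, chunks)
    # Concatenated gzip members form a single valid gzip stream.
    return b"".join(members)

data = b"example payload " * 50_000
packed = parallel_gzip(data)
assert gzip.decompress(packed) == data  # multi-member streams decode transparently
```

Independent members lose cross-chunk backreferences, so the result is slightly larger than single-stream gzip; that's the same trade-off discussed elsewhere in this thread.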
though it has some direct LZ4 (Score:3)
Re: (Score:3)
I'm going to bet it's that: zstd has the capability to create/use lz4 (and gzip, xz, and lzma) archives. (zstd --format=lz4 foo.lz4 file1.txt file2.txt ...)
No love to bz2 or brotli, though.
Loongarch? (Score:2)
Loonarch? Don't you mean pirated MIPS64?
Re: (Score:3)
I'm not sure I'd call it pirated, given the MIPS64 architecture owner open-sourced it (and released MIPS r6) in 2019. At least, probably not anymore (chronology probably matters, but...)
Even the IP owner of MIPS has moved on to RISC-V.
Re: (Score:2)
Did those MIPS patents expire yet or did the owners release them to the public domain?
Compression time for what data? (Score:5, Insightful)
Re:Compression time for what data? (Score:5, Informative)
It is looking exclusively for backreferences, or data which has previously appeared and has been repeated. It does not do any entropy or huffman encoding, does not do any audio sample or pixel prediction, or anything like that. It's backreferences only. LZ4 has a maximum distance of 64KB for its backreferences.
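To make "backreferences only" concrete, here is a toy sketch (hypothetical code, not LZ4's real token format or hash-table matcher): each output token is either a literal byte or a (distance, length) pair pointing back into the 64 KB window.

```python
WINDOW = 64 * 1024   # LZ4's maximum backreference distance
MIN_MATCH = 4        # shorter matches aren't worth encoding

def compress(data: bytes) -> list:
    out, i = [], 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Naive O(n^2) scan of the window; real LZ4 hashes 4-byte sequences.
        for j in range(max(0, i - WINDOW), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= MIN_MATCH:
            out.append(("match", best_dist, best_len))
            i += best_len
        else:
            out.append(("lit", data[i]))
            i += 1
    return out

def decompress(tokens) -> bytes:
    buf = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            buf.append(tok[1])
        else:
            _, dist, length = tok
            for _ in range(length):     # byte-wise copy handles overlapping
                buf.append(buf[-dist])  # matches (run-length-like behavior)
    return bytes(buf)

sample = b"to be or not to be, that is the question"
assert decompress(compress(sample)) == sample
```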
Re:Compression time for what data? (Score:5, Informative)
The kind of data being compressed isn't relevant for speed, due to the simplicity of the compressor. It is designed for on-the-fly compression and decompression, so its actual ability to compress data is very limited.
The compression ratio is rarely above two even for highly compressible content such as text. Just for example using LZ4 to compress the contents of its own README.md file:
Compressed 3058 bytes into 1769 bytes ==> 57.85% (compared to 1283 bytes for zip)
vs. compressing its own exe file:
Compressed 882789 bytes into 462399 bytes ==> 52.38% (compared to 309687 bytes for zip)
You're not using LZ4 if you want good compression, you are using LZ4 if you want compression which doesn't degrade performance during I/O activities, such as disk compression, RAM compression, etc.
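The measurement is easy to reproduce. An `lz4` Python package may not be available everywhere, so this sketch uses stdlib `zlib` as a stand-in; the point is only that the ratio depends on both the input and how hard the codec works:

```python
import zlib

def ratio(data: bytes, level: int) -> float:
    # Compressed size as a fraction of the original (lower is better).
    return len(zlib.compress(data, level)) / len(data)

text = b"The quick brown fox jumps over the lazy dog. " * 200
print(f"fast (level 1): {ratio(text, 1):.2%}")
print(f"slow (level 9): {ratio(text, 9):.2%}")
```

Fast modes (like LZ4) trade ratio for speed; the shape of that trade-off is what matters here, not the exact numbers.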
Re: Compression time for what data? (Score:2)
If you think it's so simple, why don't you write an improved version? 50% compression is very good for a general-purpose lossless compressor. If you want high compression you need data-specific algorithms, as one size does not fit all, and simple doesn't preclude a high compression ratio; e.g. run-length encoding can compress sparse data down to almost nothing.
Re: (Score:1)
If you think it's so simple, why don't you write an improved version?
What the fuck are you talking about. Where did I say it was simple or that one size fits all (I specifically said the opposite). If you have voices in your head, please talk directly to them rather than posting pointless non-arguments on Slashdot.
Re: (Score:2)
50% compression is very good for a general purpose lossless compressor.
You can't make blanket statements like that without citing the data set being used, because the compression factor depends on the input. Some data is entirely incompressible (literally 0%). Some data is highly compressible (99.99999...%). Just depends.
In this case, we can safely say that 50% is not that impressive, however, for the simple reason that the previous poster shared the performance of a standard zip (i.e. likely DEFLATE, a general-purpose, lossless approach) and the fact that it significantly outperformed LZ4.
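The two extremes are easy to demonstrate with stdlib zlib (the same behavior holds for LZ4 or any lossless codec):

```python
import os
import zlib

random_data = os.urandom(100_000)  # high entropy: essentially incompressible
zero_data = bytes(100_000)         # all zeros: compresses to almost nothing

# Incompressible input actually grows slightly (stored-block overhead).
print(len(zlib.compress(random_data)) / len(random_data))
# Highly redundant input shrinks to a tiny fraction of a percent.
print(len(zlib.compress(zero_data)) / len(zero_data))
```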
Re: (Score:2)
Web pages contain reams of HTML language tokens. Easy to compress those down to a few bits each.
Re: (Score:2)
Brotli is specifically made for web content like HTML. Just look at the preset dictionary and you see not only words, but also lots of likely code fragments for HTML, XML, CSS, and JavaScript.
Re: (Score:2)
Yes. I know. Hence why I mentioned it.
Re: (Score:2)
It's for most use cases, since this is a multi-threaded version of the compression algorithm.
I posted above:
https://hardware.slashdot.org/... [slashdot.org]
Use case example:
I have a machine with 2 physical CPUs. Each CPU has 24 cores, for a total of 48 threads that can really, physically run at the same time on the machine.
So in theory, any multi-threaded compression could be 48 times faster by splitting the work between all threads available compared to its single-threaded version which would never use more than one core.
But it's
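Worth noting that the 48x figure is a ceiling: whatever fraction of the job stays serial (reading input, writing output, stitching blocks back together) caps the real gain, per Amdahl's law. A quick sketch:

```python
def amdahl(parallel_fraction: float, workers: int) -> float:
    # Maximum speedup when only `parallel_fraction` of the work parallelizes.
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)

print(amdahl(1.00, 48))  # perfectly parallel: 48x
print(amdahl(0.95, 48))  # 5% serial overhead: only ~14.3x
```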
Re: Compression time for what data? (Score:2)
Re: (Score:2)
LZ4 is one of the algorithms that was common right before Zip mostly took over for general purpose compression. It's pretty similar to zip (deflate), good and bad on the same types of data zip is. LZ4 compressed files are a little bigger than zip/deflate compressed files, but they're faster to compress and decompress.
LZ4 was really common on the older cartridge based systems. Good enough compression, but the faster decompression speed than zip was beneficial on those CPUs.
Er… AMD, not Intel (Score:4, Informative)
Hello,
I was unfamiliar with the Intel 7840HS CPU mentioned in the article, and figured it was either some model for embedded systems, servers, or other computers not generally used by the public.
One quick search later, I found out it is an AMD CPU for laptops, specifically the AMD Ryzen 7 7840HS. Here are the specs for it: https://www.amd.com/en/product... [amd.com].
The changelog for the LZ4 release gives more information about the speed improvements: https://github.com/lz4/lz4/rel... [github.com]. It does not mention the manufacturers of the CPUs used in benchmarking, which is probably why it was misidentified in the article.
Regards,
Aryeh Goretsky
Re: (Score:2)
Yeah, I run a Ryzen 7840HS on my main pc, and a Ryzen 7840U on my handheld (ROG Ally), so I chuckled immediately when I read "Intel 7840HS". It's quite clear that editors just slap "Intel" on any x86-64 architecture cpu. The funny thing is that x86 used to be Intel, and all other x86 cpus were "Intel-compatible". But x86-64 is AMD (Linux even calls it AMD64 architecture), so all other x86-64 cpus are "AMD-compatible" now.
Re:Er, AMD, not Intel (Score:2)
Classic move by AMD to upstage Intel with a backwards-compatible x86 64-bit processor. Intel begrudgingly adopted it as their standard, too. It must have been humiliating for Intel especially after the massive failure of the 64-bit Itanium.
Re: (Score:2)
On the Linux side, pigz does a similar thing, spreading out the workload among multiple cores.
Of course, compression in general is always useful. For example, a lot of ZFS filesystems come with LZ4 compression turned on, which works in almost all cases, even improving [trunc.org] access times for SQLite databases, and in my experience, PostgreSQL backends as well. Having a faster LZ4 isn't going to hurt things.
Of course, what would be nice is if we can get LZMA/XZ/ZSTD compression to a similar tier. ZSTD compression
Re: (Score:2)
Using ZFS here, with gzip for hosts on spinning rust. Only slight gains are noticeable with regard to IO delays, so it's just that we didn't bother to switch it back yet.
ZFS can handle blocks compressed with many different algorithms in the same pool, so there is nothing to do after changing the compression algorithm. We have memory read caches of around 64GB, so only writes are a problem with ZFS and spinning rust.
Re: (Score:2)
If you use a L2ARC cache and a ZIL, even those are not really issues as well. For write intensive stuff, I've found a ZIL mirror (two SSDs, to ensure if one fails, the ZIL stays working) quite useful, just because the data hits the SSD landing zone, and the latency and seek time of the main pool's HDDs are less of a factor.
Re: (Score:2)
You always have a ZIL, the differentiator is when you put it on a SLOG.
Re: (Score:2)
True. In any case, having a log vdev, especially if one has a SSD, or even better (RIP) Optane, can greatly help writes. For example, if one has a machine handling writes from backups, having the SLOG be the landing zone means not having to handle a bunch of random I/O.
The best way I've seen this implemented was a card on a Supermicro motherboard that had a few gigs of battery backed up DRAM, about 4-8 gigs. This allowed the ZFS write to finish very quickly, especially when a bunch of random I/O was comi
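For anyone wanting to try this, attaching a mirrored log vdev looks roughly like the following (the pool name `tank` and the device paths are placeholders, not from the post):

```shell
# Add a mirrored SLOG so synchronous writes land on the SSDs first.
zpool add tank log mirror /dev/sdx /dev/sdy

# Confirm the log vdev shows up in the pool layout.
zpool status tank
```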
Re: (Score:2)
I remember several months ago running 7zip on a few gigs of data. It completed in a few seconds and I was confused, assuming something had gone wrong. Then I ran it again, looked in Task Manager, and was like... oh yeah, I have a shitload of cores now and the compression runs in parallel.
LZ4 is even faster where the parallel version will probably saturate most NVMe if you have enough cores.
Yep, pigz vs. gzip saturated IO bandwidth for us. We're back to one thread only, but still using pigz. Even 2 threads was unacceptable for our use case.
Re: (Score:2)
"Enough cores" being one on the decompression side, it's already generally bottlenecked by PCIe bandwidth to read data from the storage layer there.
Re: (Score:2)
7-zip is *not* PCIe bandwidth limited, it's still very much CPU bound in decompression and you get a significant speed improvement in multicore tests.
Re: (Score:2)
Did you check the size? When a compression algorithm uses more threads it splits the input data into equal parts, for each thread to work on, but the dictionaries aren't shared, and they cannot cross-reference similar data. If you have the time, it is usually not worth going for more than 2 threads with 7zip. And I guess it's the same with LZ4 as well.
Re:Multi-threaded craziness (Score:4, Informative)
And I guess it's the same with LZ4 as well.
You'd be guessing wrong. LZ4 works on 64 KB windows. Compression data is not shared between blocks. Provided you can read 16x 64 KB blocks faster than you can compress one of them, you get a speed boost on a 16-core CPU with zero difference in final file size.
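That block independence is easy to check with a stand-in codec (stdlib zlib here, since an lz4 module may not be installed): compressing fixed 64 KB blocks in a thread pool produces byte-identical output, and thus an identical total size, to doing it sequentially.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 64 * 1024  # independent blocks, analogous to LZ4's 64 KB windows

def blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def compress_serial(data: bytes) -> list:
    return [zlib.compress(b) for b in blocks(data)]

def compress_parallel(data: bytes, workers: int = 16) -> list:
    # Each block is compressed with no shared state, so the order and
    # content of the results match the serial version exactly.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, blocks(data)))

payload = b"fairly repetitive payload " * 100_000
assert compress_parallel(payload) == compress_serial(payload)
```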