feat: Add support for LZ4 compression #1181

andygrove · 2024-12-18T20:15:03Z

Which issue does this PR close?

Closes #1178

Builds on #1192, so that PR needs to be merged first.

Rationale for this change

LZ4 may provide faster compression than ZSTD, which could help with shuffle performance.

What changes are included in this PR?

How are these changes tested?

Dandandan · 2024-12-18T20:52:50Z

native/core/src/execution/shuffle/shuffle_writer.rs

+        }
+        CompressionCodec::Zstd(level) => {
+            let encoder = zstd::Encoder::new(output, *level)?;
+            let mut arrow_writer = StreamWriter::try_new(encoder, &batch.schema())?;


I am not really familiar with the code, but shouldn't the StreamWriter and encoder be created once per stream instead of once per batch?

andygrove · 2024-12-18

Yes, that is something I have been thinking about as well. We currently pay the cost of writing the schema for each batch, even though the schema is guaranteed to be the same for every batch.

andygrove · 2024-12-18

To add some more context here: we buffer rows per partition until we reach the desired batch size, and then need to serialize that batch to bytes that can be read as one block by CometBlockStoreShuffleReader.

Dandandan · 2024-12-19

Besides writing the schema each time, I guess it also adds overhead from flushing each time (less efficient buffering), lower compressibility, and so on?

andygrove · 2024-12-19

I filed #1186 for reusing the writer across many batches. It looks like a big perf win.
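To make the per-batch versus per-stream cost discussed above concrete, here is a minimal std-only sketch. `ToyStreamWriter` and both helper functions are hypothetical names invented for illustration; this is not the Arrow IPC `StreamWriter` API and it does not model compression. It only counts how many bytes go out when a schema header is written for every batch compared with once per stream.

```rust
// Toy model only: these types are hypothetical and are NOT the Arrow IPC API.

/// Writes a length-prefixed schema header once on creation,
/// then length-prefixed batches.
struct ToyStreamWriter {
    out: Vec<u8>,
}

impl ToyStreamWriter {
    fn new(schema: &[u8]) -> Self {
        let mut out = Vec::new();
        out.extend_from_slice(&(schema.len() as u32).to_le_bytes());
        out.extend_from_slice(schema);
        ToyStreamWriter { out }
    }

    fn write_batch(&mut self, batch: &[u8]) {
        self.out.extend_from_slice(&(batch.len() as u32).to_le_bytes());
        self.out.extend_from_slice(batch);
    }

    fn finish(self) -> Vec<u8> {
        self.out
    }
}

/// One writer per batch: the schema header is repeated for every batch.
fn bytes_writer_per_batch(schema: &[u8], batches: &[Vec<u8>]) -> usize {
    batches
        .iter()
        .map(|b| {
            let mut w = ToyStreamWriter::new(schema);
            w.write_batch(b);
            w.finish().len()
        })
        .sum()
}

/// One writer for the whole stream: the schema header is written once.
fn bytes_writer_per_stream(schema: &[u8], batches: &[Vec<u8>]) -> usize {
    let mut w = ToyStreamWriter::new(schema);
    for b in batches {
        w.write_batch(b);
    }
    w.finish().len()
}
```

In this toy model, N batches with an S-byte schema save (N - 1) * (S + 4) bytes when the writer is reused. A real stream would likely also benefit from the compressor keeping its state across batches, which this sketch does not capture.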

andygrove · 2024-12-19T00:16:17Z

Here are my findings from hacking on this today.

LZ4 provides two compression formats: LZ4 Block Format and LZ4 Frame Format.

Spark uses the Java library https://github.com/lz4/lz4-java and specifically uses LZ4BlockOutputStream, which writes its own streaming format rather than the LZ4 Frame format, as noted in its documentation:

/**
 * Streaming LZ4 (not compatible with the LZ4 Frame format).
 * This class compresses data into fixed-size blocks of compressed data.
 * This class uses its own format and is not compatible with the LZ4 Frame format.
 * For interoperability with other LZ4 tools, use {@link LZ4FrameOutputStream},
 * which is compatible with the LZ4 Frame format. This class remains for backward compatibility.
 * @see LZ4BlockInputStream
 * @see LZ4FrameOutputStream
 */
public class LZ4BlockOutputStream extends FilterOutputStream {

edit: Apache Commons Compress provides FramedLZ4CompressorInputStream and BlockLZ4CompressorInputStream, so I am testing those from CometBlockStoreShuffleReader instead of Spark's codec.
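One way to see why these formats are not interchangeable is to look at the leading bytes of a stream. The sketch below is an illustration, not code from this PR: `Lz4Container` and `sniff_lz4_container` are hypothetical names. It assumes the LZ4 Frame magic number 0x184D2204 (stored little-endian) and the ASCII marker `LZ4Block` that, to my understanding, lz4-java's LZ4BlockOutputStream writes at the start of each block. The raw LZ4 Block format carries no magic number, so it cannot be identified this way.

```rust
/// Magic number that starts an LZ4 Frame (0x184D2204, little-endian on the wire).
const LZ4_FRAME_MAGIC: &[u8] = &[0x04, 0x22, 0x4D, 0x18];

/// Marker assumed to start each block written by lz4-java's LZ4BlockOutputStream.
const LZ4_JAVA_BLOCK_MAGIC: &[u8] = b"LZ4Block";

#[derive(Debug, PartialEq)]
enum Lz4Container {
    /// Standard LZ4 Frame format.
    Frame,
    /// lz4-java's own block-stream format (not the LZ4 Frame format).
    JavaBlockStream,
    /// No recognizable header; could be a raw LZ4 block or something else.
    Unknown,
}

/// Guess the container format from the first bytes of `data`.
fn sniff_lz4_container(data: &[u8]) -> Lz4Container {
    if data.starts_with(LZ4_FRAME_MAGIC) {
        Lz4Container::Frame
    } else if data.starts_with(LZ4_JAVA_BLOCK_MAGIC) {
        Lz4Container::JavaBlockStream
    } else {
        Lz4Container::Unknown
    }
}
```

A reader that expects one container and receives another should fail fast on this check rather than attempt to decompress.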

andygrove · 2024-12-20T19:33:50Z

Switched to the lz4_flex crate. Benchmark results:

shuffle_writer/shuffle_writer: encode and compress (zstd)
                        time:   [209.27 µs 211.21 µs 213.47 µs]
shuffle_writer/shuffle_writer: encode and compress (lz4 frame)
                        time:   [162.58 µs 165.17 µs 167.76 µs]
shuffle_writer/shuffle_writer: encode and compress (lz4 block)
                        time:   [180.16 µs 183.93 µs 188.09 µs]
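For a sense of scale, the middle value of each bracketed triple above can be compared against the zstd result. The snippet below is only arithmetic on those reported numbers; the constant and function names are made up for illustration. It works out to roughly a 22% reduction in encode-and-compress time for LZ4 frame and roughly 13% for LZ4 block, relative to zstd, in this microbenchmark.

```rust
/// Middle values (in microseconds) taken from the benchmark output above.
const ZSTD_US: f64 = 211.21;
const LZ4_FRAME_US: f64 = 165.17;
const LZ4_BLOCK_US: f64 = 183.93;

/// Fractional reduction in time of `candidate_us` relative to `baseline_us`.
fn time_reduction(baseline_us: f64, candidate_us: f64) -> f64 {
    (baseline_us - candidate_us) / baseline_us
}

/// Reductions versus zstd for (LZ4 frame, LZ4 block).
fn lz4_reductions_vs_zstd() -> (f64, f64) {
    (
        time_reduction(ZSTD_US, LZ4_FRAME_US),
        time_reduction(ZSTD_US, LZ4_BLOCK_US),
    )
}
```

These figures cover only the write side (encoding plus compression) and say nothing about read-side cost or end-to-end query time.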

andygrove · 2024-12-20T20:19:40Z

The lz4_flex crate should get even faster once PSeitz/lz4_flex#175 is merged

andygrove · 2024-12-20T20:23:25Z

Current status:

✔️ I can run real benchmarks
✔️ shuffle write encoding + compression time is improved
🤕 Overall query time is more than 10x slower, so perhaps there is an issue with the reader side now

andygrove · 2024-12-23T17:01:42Z

LZ4 support is now part of #1192

Dandandan reviewed Dec 18, 2024

andygrove force-pushed the shuffle-lz4 branch from c15692d to b4e10cd Compare December 20, 2024 20:08

andygrove changed the title ~~[do not review] experimental support for lz4 compression (not working)~~ feat: Add support for LZ4 compression Dec 20, 2024

andygrove added 7 commits December 21, 2024 07:21

native decode

75ff0ef

move shuffle classes from commmon to spark

2a76eb9

some tests pass

f7c9407

reuse buffer

f218870

revert reuse buffer

64d6ab0

fix spark 4.0 build

ee1037d

Add support for LZ4 compression

291d8f1

andygrove force-pushed the shuffle-lz4 branch from 41602c1 to 291d8f1 Compare December 21, 2024 17:29

andygrove added 15 commits December 21, 2024 10:38

more efficient codec parsing

e0de83a

make more robust

8885115

upmerge

1551398

fully switch to native decode

76e0d71

Revert "fully switch to native decode"

66b7e2b

This reverts commit 76e0d71.

fix

6656520

fix

78d5445

fix

22d6884

save progress

d80cbcb

fix

2c3df14

revert encoding enum

3dccf7d

format

dec355f

prepare for review

d3aafef

upmerge

652fdeb

fix

bb55eaa

andygrove closed this Dec 23, 2024

andygrove deleted the shuffle-lz4 branch December 23, 2024 17:01
