How Apache Fluss Achieves True Pruning in Streaming Storage

April 22, 2026

PPMC member of Apache Fluss (Incubating)

TL;DR:

Apache Kafka's "column pruning" is actually pseudo-pruning. All fields still cross the network, and clients discard unwanted ones after the fact. Apache Fluss redesigns the storage format, server-side read path, and write-side batching strategy from the ground up with Arrow IPC columnar storage, zero-copy server-side pruning, and client-side pre-shuffle batching. The result: pruning 90% of columns yields a 10x read throughput improvement, with performance scaling linearly with the pruning ratio.
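The 10x figure can be sanity-checked with a back-of-the-envelope model. This is only a sketch, not a Fluss benchmark: it assumes read throughput is bounded by the bytes sent over the network and that all columns are equally wide (8 bytes here). The helper `bytes_on_wire` and the column names are hypothetical, introduced purely for illustration.

```python
def bytes_on_wire(column_widths, projected, server_side_pruning):
    """Bytes sent per row when a reader wants only the `projected` columns.

    Assumption: fixed-width columns and no batch or framing overhead.
    """
    if server_side_pruning:
        # True pruning: only the requested columns leave the server.
        return sum(column_widths[name] for name in projected)
    # Pseudo-pruning: every column crosses the network and the
    # client discards the unwanted ones afterwards.
    return sum(column_widths.values())


# Ten equally wide columns; the reader needs one, so 90% are pruned.
widths = {f"c{i}": 8 for i in range(10)}
projected = ["c0"]

client_side = bytes_on_wire(widths, projected, server_side_pruning=False)
server_side = bytes_on_wire(widths, projected, server_side_pruning=True)

# If throughput is network-bound, the speedup is the ratio of bytes moved.
speedup = client_side / server_side
print(client_side, server_side, speedup)
```

Under these assumptions, the reader that prunes on the server moves one tenth of the bytes, which is where a 10x ceiling for 90% pruning comes from. Real workloads have variable-width columns and per-batch overhead, so measured gains can differ from this ideal.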