Zstandard


Zstandard is a lossless data compression algorithm developed by Yann Collet at Facebook.
Zstd is the reference implementation in C. Version 1 of this implementation was released as free software.

Features

Zstandard was designed to give a compression ratio comparable to that of the DEFLATE algorithm while being faster, especially for decompression. It is tunable, with compression levels ranging from negative 5 (fastest) to 22 (slowest, but with the best compression ratio).
The zstd package includes parallel implementations of both compression and decompression. Starting from version 1.3.2, zstd optionally implements very-long-range search and deduplication, similar to rzip or lrzip.
Compression speed can vary by a factor of 20 or more between the fastest and slowest levels, while decompression is uniformly fast, varying by less than 20% between the fastest and slowest levels. The Zstandard command-line tool has an "adaptive" mode that varies the compression level depending on I/O conditions, mainly how fast it can write the output.
Zstd at its maximum compression level gives a compression ratio close to lzma, lzham, and ppmx, and performs better than lza or bzip2. Zstandard reaches the current Pareto frontier, as it decompresses faster than any other currently available algorithm with a similar or better compression ratio.
Dictionaries can have a large impact on the compression ratio of small files, so Zstandard can use a user-provided compression dictionary. It also offers a training mode, able to generate a dictionary from a set of samples. In particular, one dictionary can be loaded to process large sets of files with redundancy between files, but not necessarily within each file, e.g., log files.
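The effect of a shared dictionary on small inputs can be illustrated without a Zstandard binding. The sketch below is not Zstandard: it substitutes the preset-dictionary feature of DEFLATE, as exposed by Python's standard zlib module, and builds the dictionary by simply concatenating sample lines rather than by running a trainer. The log lines and the helper names compress_with_dict and decompress_with_dict are made up for illustration.

```python
import zlib


def compress_with_dict(data: bytes, dictionary: bytes) -> bytes:
    """Compress data with a preset dictionary (DEFLATE stand-in, not Zstandard)."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress data that was produced with the same preset dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# Hypothetical log lines: redundant between files, not within one file.
samples = [
    f"2020-01-{day:02d} 12:00:00 INFO request served path=/index.html status=200\n".encode()
    for day in range(1, 6)
]

# Naive "dictionary": the first four samples, concatenated.
dictionary = b"".join(samples[:4])
message = samples[4]

plain = zlib.compress(message, 9)
with_dict = compress_with_dict(message, dictionary)

print("original bytes:       ", len(message))
print("without dictionary:   ", len(plain))
print("with preset dictionary:", len(with_dict))
```

The same dictionary must be available to the decompressor, which is why one dictionary is typically shared across a whole set of similar files.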

Design

Zstandard combines a dictionary-matching stage with a large search window and a fast entropy-coding stage that uses both Finite State Entropy (FSE) and Huffman coding.
Because of the way that FSE carries over state between symbols, decompression involves processing symbols within the Sequences section of each block in reverse order.
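The last-in, first-out behaviour behind this reverse processing can be seen in a toy coder from the same family. The sketch below is not FSE and does not follow the Zstandard bitstream format: it assumes a simplified range-style asymmetric numeral system that keeps its whole state in one arbitrary-precision integer, with symbol frequencies taken directly from the message. All function names are illustrative.

```python
from collections import Counter


def build_table(message):
    """Return per-symbol frequency, cumulative start, and total count."""
    freq = Counter(message)
    cum = {}
    total = 0
    for sym in sorted(freq):
        cum[sym] = total
        total += freq[sym]
    return freq, cum, total


def encode(message, freq, cum, total):
    """Push symbols onto a single integer state, first symbol first."""
    state = 0
    for sym in message:
        f = freq[sym]
        state = (state // f) * total + cum[sym] + (state % f)
    return state


def decode(state, count, freq, cum, total):
    """Pop count symbols off the state; the last one encoded comes out first."""
    out = []
    for _ in range(count):
        slot = state % total
        sym = next(s for s in freq if cum[s] <= slot < cum[s] + freq[s])
        state = freq[sym] * (state // total) + slot - cum[sym]
        out.append(sym)
    return out


message = "abracadabra"
freq, cum, total = build_table(message)
state = encode(message, freq, cum, total)
decoded = "".join(decode(state, len(message), freq, cum, total))

print(decoded)                    # the message, back to front
print(decoded == message[::-1])
```

Because each decoding step exactly undoes the most recent encoding step, one side of the codec has to walk the data backwards for the output to come out in its original order.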

Usage

The Linux kernel has included Zstandard since November 2017 as a compression method for the btrfs and squashfs filesystems.
In 2017, Allan Jude integrated Zstandard into the FreeBSD kernel and used it to create a proof-of-concept OpenZFS compression method. It was subsequently integrated as a compressor option for core dumps.
The AWS Redshift and RocksDB databases include support for field compression using Zstandard.
In March 2018, Canonical tested the use of zstd as the default deb package compression method for the Ubuntu Linux distribution. Compared with xz compression of deb packages, zstd at level 19 decompresses significantly faster, at the cost of package files that are 6% larger. Debian developer Ian Jackson favored waiting several years before official adoption.
In 2018 the algorithm was published as RFC 8478, which also defines an associated media type "application/zstd", filename extension "zst", and HTTP content encoding "zstd".
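A consumer of "zst" files or "zstd"-encoded HTTP bodies can recognise the format from its first four bytes. The sketch below assumes the frame magic numbers given in the Zstandard frame format (0xFD2FB528 for a Zstandard frame, and 0x184D2A50 through 0x184D2A5F for skippable frames, all stored little-endian). It only inspects the header and does not validate or decompress anything; classify_frame is an illustrative name, not part of any library.

```python
import struct

ZSTD_MAGIC = 0xFD2FB528            # start of a Zstandard frame
SKIPPABLE_MAGIC_MIN = 0x184D2A50   # skippable frames use 0x184D2A50..0x184D2A5F
SKIPPABLE_MAGIC_MAX = 0x184D2A5F


def classify_frame(data: bytes) -> str:
    """Classify a byte string by its first four bytes (little-endian magic)."""
    if len(data) < 4:
        return "unknown"
    (magic,) = struct.unpack_from("<I", data)
    if magic == ZSTD_MAGIC:
        return "zstandard"
    if SKIPPABLE_MAGIC_MIN <= magic <= SKIPPABLE_MAGIC_MAX:
        return "skippable"
    return "unknown"


# On disk, the Zstandard magic number appears as the bytes 28 B5 2F FD.
print(classify_frame(bytes.fromhex("28b52ffd") + b"\x00"))
# A gzip header, for comparison.
print(classify_frame(b"\x1f\x8b\x08\x00"))
```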
Arch Linux added support for zstd as a package compression method in October 2019 with the release of the pacman 5.2 package manager, and in January 2020 switched from xz to zstd for the packages in the official repository. Arch uses zstd -c -T0 --ultra -20 -. With these settings, the combined size of all compressed packages increased by 0.8% and decompression became 1300% faster. Decompression memory increased by 50 MiB when using multiple threads, and compression memory also increased, scaling with the number of threads used.
Fedora added Zstandard support to RPM in May 2018, and used it for packaging its October 2019 release.
A full implementation of the algorithm, with an option to choose the compression level, is used in the .NSZ/.XCZ file formats developed by the homebrew community for the Nintendo Switch hybrid game console.

License

The reference implementation is published on GitHub under the BSD license. From version 1.0, it carried an additional Grant of Patent Rights.
From version 1.3.1, this patent grant was dropped and the license was changed to a BSD + GPLv2 dual license.