Citus Columnar - Jeff Davis - PGCon 2022

Discover Citus Columnar, a columnar storage option for PostgreSQL provided by the Citus extension, which improves scan speed and reduces storage requirements by storing data in columns with compression and filtering.

Key takeaways
  • Citus Columnar is a feature that improves scan speed and reduces storage requirements by storing data in columns rather than rows.
  • It uses compression and chunk group filtering to reduce I/O and storage requirements.
  • Related work in PostgreSQL gives an extension full control over its write-ahead log (WAL) records and how they are redone.
  • Columnar tracks each chunk group's minimum and maximum values separately in its own metadata, then uses that metadata to exclude chunk groups before reading them; changes are logged with generic WAL records.
  • A custom scan skips columns that the query does not need, leading to faster execution.
  • Index support makes unique constraints possible on a columnar table, but an index takes up the same amount of space as it would on a row table.
  • Columnar, including its compression, ships with Citus; the compression reduces storage requirements.
  • Citus is fully open source, and available to use.
  • PostgreSQL lets extensions define custom GUCs (configuration parameters), which gives extensions ergonomic configuration.
  • PostgreSQL 15 adds a custom WAL resource manager API, which opens the door to logical replication and logical decoding for extensions.
  • A custom WAL resource manager lets an extension such as Citus define its own write-ahead log record formats and the logic to redo them.
  • It uses PGTRANSFORM to transform a given table to improve scan speed and reduce storage requirements.
  • Columnar tables support indexes, such as B-tree and hash indexes.
  • Columnar compresses data with Zstandard (zstd) by default; this can be changed to LZ4 or another supported algorithm.
  • Columnar tables are append-only: rows are added with INSERT, or with COPY for bulk loading.
  • Tables can be partitioned by range (e.g., date ranges), and each partition can use either columnar or row storage.
  • It has some limitations: UPDATE and DELETE operations are not supported.
  • Custom scan is the planner and executor integration point that Citus Columnar uses to skip columns a query does not read.
  • The compression ratio can be measured directly; a simple example shows 8x compression.
  • It supports a wide range of use cases, including grouping and aggregations, and time-based partitioning.
  • Citus Columnar still has a few feature gaps, which are hoped to be closed in the future.
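
The column skipping and chunk group filtering described above can be illustrated with a small toy model. This is a minimal sketch in Python, not Citus code: the names ChunkGroup, build_chunk_groups and scan, the group size of 10, and the sample data are all assumptions made for illustration. It only shows the idea of keeping per-column minimum and maximum values per chunk group and consulting them before reading any data.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class ChunkGroup:
    """Hypothetical stand-in for a chunk group: values stored per column."""
    columns: Dict[str, List[int]]        # column name -> values for this group
    min_max: Dict[str, Tuple[int, int]]  # column name -> (min, max) metadata


def build_chunk_groups(rows: Sequence[Tuple[int, ...]],
                       column_names: Sequence[str],
                       group_size: int) -> List[ChunkGroup]:
    """Split rows into groups and store each group column by column."""
    groups = []
    for start in range(0, len(rows), group_size):
        batch = rows[start:start + group_size]
        columns = {name: [row[i] for row in batch]
                   for i, name in enumerate(column_names)}
        min_max = {name: (min(vals), max(vals))
                   for name, vals in columns.items()}
        groups.append(ChunkGroup(columns, min_max))
    return groups


def scan(groups: Sequence[ChunkGroup], projected: Sequence[str],
         filter_column: str, low: int, high: int):
    """Return projected columns for rows with low <= filter_column <= high.

    Also returns how many chunk groups were actually read.
    """
    result = []
    groups_read = 0
    for group in groups:
        group_min, group_max = group.min_max[filter_column]
        if group_max < low or group_min > high:
            continue  # excluded using metadata only; no column data is touched
        groups_read += 1
        for idx, value in enumerate(group.columns[filter_column]):
            if low <= value <= high:
                # Only the projected columns are read; others are never visited.
                result.append(tuple(group.columns[name][idx]
                                    for name in projected))
    return result, groups_read


# Sample data (assumed): 100 rows with columns id, value, bucket.
rows = [(i, i * 10, i % 3) for i in range(100)]
groups = build_chunk_groups(rows, ["id", "value", "bucket"], group_size=10)
matched, groups_read = scan(groups, ["value"], "id", 25, 34)
print(f"read {groups_read} of {len(groups)} chunk groups, {len(matched)} rows")
```

In this toy run the filter on id touches only the chunk groups whose minimum and maximum overlap the requested range, and the bucket column is never read at all. The filtering works well here because id is stored in sorted order; with randomly ordered data the ranges of most groups would overlap the filter and few groups could be skipped.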