Claire Giordano: Beginner's Guide to Partitioning vs. Sharding in Postgres (PGConf.EU 2023)

Learn the basics of partitioning and sharding in Postgres, including range, list, and hash partitioning, and how to use Citus for sharding, to create a highly scalable and performant database architecture.

Key takeaways
  • Partitioning in Postgres is a way to split large tables into smaller, more manageable pieces, allowing for improved query performance and easier maintenance.
  • Partitioning can be done using three methods: range, list, and hash.
  • Range partitioning is useful for time-based data: each partition holds a contiguous range of key values, such as one month or one quarter of rows.
  • List partitioning is useful for data that is divided into distinct categories, where partitions are created based on a specific list of values.
  • Hash partitioning is useful when the data has no natural range or list to split on: a hash of the partition key spreads rows evenly across a fixed number of partitions.
  • Partition pruning is a feature of Postgres that allows the database to automatically exclude partitions that are not relevant to a query, improving performance.
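  • Autovacuum, the Postgres background process that reclaims space from dead rows, works on each partition separately, so maintenance runs against smaller tables. Old data is typically removed by dropping or detaching a whole partition rather than by deleting rows.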
  • Sharding is a way to split a database across multiple nodes, allowing for improved scalability and performance.
  • Sharding can be row-based, where the rows of a table are assigned to nodes according to a distribution column, or schema-based, where each schema and all of its tables are placed together on one node.
  • Citus is an extension to Postgres that provides sharding capabilities.
  • Citus provides features such as shard rebalancing and object propagation to help manage sharded databases.
  • Partitioning and sharding can be used together to create a highly scalable and performant database architecture.
  • Partitioning and sharding can be used to improve query performance, reduce storage costs, and improve data maintenance.
  • Partitioning and sharding require careful planning and maintenance to ensure optimal performance and scalability.
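
The routing and pruning ideas in the takeaways above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Postgres or Citus implement them: the partition names (events_2023_q1, events_eu, and so on), the worker names, and the use of zlib.crc32 as the hash are all hypothetical choices made for illustration. Postgres uses its own hash functions, and Citus records shard placement in metadata rather than computing it with a simple modulo.

```python
from bisect import bisect_right
from datetime import date
import zlib

# Range partitioning: partition i covers [BOUNDS[i], BOUNDS[i + 1]).
# The lower bound is inclusive and the upper bound is exclusive.
RANGE_BOUNDS = [date(2023, 1, 1), date(2023, 4, 1), date(2023, 7, 1),
                date(2023, 10, 1), date(2024, 1, 1)]
RANGE_NAMES = ["events_2023_q1", "events_2023_q2",
               "events_2023_q3", "events_2023_q4"]

# List partitioning: each partition owns an explicit set of key values.
LIST_PARTITIONS = {"events_eu": {"DE", "FR", "GR"}, "events_us": {"US"}}


def route_range(day):
    """Return the range partition whose bounds contain `day`."""
    i = bisect_right(RANGE_BOUNDS, day) - 1
    if i < 0 or i >= len(RANGE_NAMES):
        raise ValueError(f"no partition covers {day}")
    return RANGE_NAMES[i]


def route_list(country):
    """Return the list partition that explicitly names `country`."""
    for name, values in LIST_PARTITIONS.items():
        if country in values:
            return name
    raise ValueError(f"no partition lists {country!r}")


def route_hash(key, modulus=4):
    """Return a hash partition; crc32 is a stand-in for the real hash."""
    return f"events_h{zlib.crc32(str(key).encode()) % modulus}"


def prune_range(start, end):
    """Partition pruning: keep only partitions overlapping [start, end)."""
    return [name
            for name, lo, hi in zip(RANGE_NAMES, RANGE_BOUNDS, RANGE_BOUNDS[1:])
            if lo < end and start < hi]


def shard_for(distribution_value, shard_count=32):
    """Row-based sharding: hash the distribution column to a shard id."""
    return zlib.crc32(str(distribution_value).encode()) % shard_count


def node_for(shard_id, nodes):
    """Assumed round-robin placement of shards on worker nodes."""
    return nodes[shard_id % len(nodes)]


if __name__ == "__main__":
    print(route_range(date(2023, 5, 17)))
    print(route_list("FR"))
    print(prune_range(date(2023, 5, 1), date(2023, 8, 1)))
    print(node_for(shard_for("tenant-7"), ["worker-1", "worker-2"]))
```

In this sketch, a query restricted to May through July touches only the second- and third-quarter partitions, which is the effect partition pruning has on a real query plan. The two-step shard_for and node_for split mirrors the idea that rows map to shards and shards map to nodes, so rebalancing can move shards between nodes without rehashing rows.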