Highly efficient interconnection for distributed PostgreSQL - Dmitry Ursegov & Teodor Sigaev - PGCon

Dmitry Ursegov and Teodor Sigaev present a highly efficient interconnection solution for distributed PostgreSQL, overcoming limitations of libpq and connection models, achieving low latency and high throughput.

Key takeaways
  • libpq and connection model have limitations that reduce overall efficiency, and the transport subsystem can still be improved.
  • The current transport model can lead to a large number of system calls, memcpy, and bulk transfers, which become a limit for point queries.
  • Sharding has unlimited scalability and is partially why it will be discussed mostly about sharding in this talk.
  • Partitioning and sharding both have limitations, but sharding can provide unlimited scalability.
  • The new transport provides low latency and high throughput interconnection for distributed PostgreSQL.
  • The architecture of the new transport includes multiplexing processes, shared memory queues, and GCP sockets.
  • The key aspect of the new transport is to minimize overhead of data transfers and optimize performance.
  • To optimize performance, databases adopt several techniques, including sharding, replication, and partitioning.
  • Sharding requires adopting data model and query planning, and has unlimited scalability.
  • Replication provides a database with a read-only copy, but can be limited by CPU resources.
  • Partitioning can provide efficient execution for workloads with large volumes of requests or size of transferred data.
  • The new transport can scale to 100 nodes and more, and can provide 30% more performance when integrated with Postgres FTW infrastructure.
  • The patches for Postgres FTW and sources of multiplexer extension will be available during the next month.