ElixirConf 2023 - Andrew Bennett - Erlang Dist Filtering and the WhatsApp Runtime System
Andrew Bennett presents Erlang Dist Filtering and its role in the WhatsApp runtime system, a highly distributed network of 30,000 nodes, and describes the resulting improvements in performance and resilience.
- The WhatsApp runtime system is a highly distributed system with 30,000 nodes, and it uses Erlang's distribution ("dist") protocol to send messages between nodes.
- To improve performance and reduce single points of failure, Erlang Dist Filtering was developed to filter messages before they are sent to nodes.
- The Erlang Dist Filtering project is implemented as a NIF (native implemented function) that intercepts and rewrites inbound messages to enable dist filtering.
- The project also includes loggers, which are stateful and do not preserve signal ordering.
- The WhatsApp runtime system is not designed to be secure against untrusted peers; it assumes a trusted environment.
- The system uses a distributed architecture, with a full mesh of connections between nodes.
- The Erlang Dist Filtering project includes handlers, which are lossless and preserve signal ordering.
- The project is still in development, but it has already improved performance by reducing the number of dist operations.
- The WhatsApp runtime system uses a variety of strategies to reduce the impact of node failures, including automated restarts and distributed logging.
- The system also uses a unique way of dealing with senders and receivers, which allows it to handle high volumes of traffic.
- The loggers in the system are used to monitor and debug node failures, and to provide a paper trail for investigating issues.
- The system is designed to be highly available, with multiple nodes and automated restarts.
- Overall, the Erlang Dist Filtering project has already had a significant impact on the performance and scalability of the WhatsApp runtime system.
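
To make the filtering idea in the points above more concrete, here is a minimal conceptual sketch in Python. It is not the project's NIF or its API. It assumes inbound dist control messages have already been decoded into tuples whose first element is the operation code, as described in the Erlang distribution protocol documentation. The allowlist policy, the `filter_inbound` name, and the placeholder payload strings are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Tuple

# Operation codes as listed in the Erlang distribution protocol documentation.
LINK = 1
SEND = 2
EXIT = 3
REG_SEND = 6
MONITOR_P = 19
SPAWN_REQUEST = 29

# Hypothetical policy (an assumption, not the project's real rules):
# allow ordinary messaging and links/monitors, reject remote spawn requests.
ALLOWED_OPS = frozenset({LINK, SEND, EXIT, REG_SEND, MONITOR_P})


def filter_inbound(messages: Iterable[tuple]) -> Tuple[List[tuple], List[tuple]]:
    """Split decoded control messages into (kept, dropped).

    Arrival order is preserved within each list, so the messages that
    pass the filter are seen in the same order they were received.
    """
    kept: List[tuple] = []
    dropped: List[tuple] = []
    for msg in messages:
        op = msg[0]  # first tuple element is the operation code
        if op in ALLOWED_OPS:
            kept.append(msg)
        else:
            dropped.append(msg)
    return kept, dropped


# Placeholder strings stand in for real pids, atoms, and references.
inbound = [
    (SEND, "", "to_pid"),
    (SPAWN_REQUEST, "req_id", "from_pid"),
    (REG_SEND, "from_pid", "", "to_name"),
]
kept, dropped = filter_inbound(inbound)
```

In this sketch the dropped list plays the role of a paper trail: a real deployment would presumably hand such messages to a logger rather than keep them in memory, but that wiring is outside what the summary describes.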