Ara Pulido – Kubernetes at Datadog Scale
Datadog's journey to scale Kubernetes to over 1,000 nodes across multiple clouds, including their approach to networking, policy enforcement, and developer experience.
- Datadog has over a thousand nodes per cluster, using multiple clouds.
- They started migrating to Kubernetes, initially using IPv6 and Cilium for networking.
- They chose Cilium over Istio for host-to-host encryption due to its simplicity and flexibility.
- Datadog’s engineers contributed to Cilium, making it a more reliable choice.
- They implemented the Vertical Pod Autoscaler to reduce node count and save costs.
- Direct port routing was chosen over IPVS or iptables for pod networking.
- They used Rego for policy enforcement with Gatekeeper, allowing them to validate and mutate requests.
- Datadog’s approach to policy enforcement is to expose as much Kubernetes as possible to developers.
- They prioritize developer experience, using APIs to make Kubernetes more accessible.
- They believe in the importance of extending Kubernetes through custom resource definitions.
- Datadog uses Kubernetes as the foundation for its own internal platform, focusing on simplicity and extensibility.
- They are committed to making Kubernetes API-driven, using APIs to enable automation tools.
- Gatekeeper is an easy-to-use policy enforcement tool, which makes it suitable for new users.
- Datadog’s journey with Kubernetes has been a six-year-long process of learning and adaptation.
- They emphasize the importance of understanding Kubernetes internals to build a scalable and maintainable system.
- Datadog’s success with Kubernetes is attributed to its adoption of managed Kubernetes, as well as its own engineering efforts.
- They are hiring engineers who are experienced in Kubernetes and container networking.
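
To make the "validate and mutate requests" point above concrete, here is a minimal Python sketch of the two admission steps. It is not Rego, not Gatekeeper's API, and not Datadog's actual policy: the required `team` label, the memory-limit rule, and the `example.com/managed-by` annotation are all hypothetical, chosen only to show the difference between rejecting a request and patching it.

```python
from copy import deepcopy

# Hypothetical policy: every Pod must carry a label naming its owning team.
REQUIRED_LABELS = {"team"}


def validate(pod: dict) -> list[str]:
    """Return violation messages; an empty list means the request is admitted."""
    violations = []
    labels = pod.get("metadata", {}).get("labels", {})
    for label in sorted(REQUIRED_LABELS - set(labels)):
        violations.append(f"missing required label: {label}")
    for container in pod.get("spec", {}).get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        if "memory" not in limits:
            name = container.get("name", "?")
            violations.append(f"container {name} has no memory limit")
    return violations


def mutate(pod: dict) -> dict:
    """Return a patched copy of the Pod with a default annotation applied."""
    patched = deepcopy(pod)  # leave the incoming request untouched
    annotations = patched.setdefault("metadata", {}).setdefault("annotations", {})
    # Hypothetical annotation; setdefault keeps any value the user already set.
    annotations.setdefault("example.com/managed-by", "platform")
    return patched


if __name__ == "__main__":
    pod = {
        "metadata": {"labels": {}},
        "spec": {"containers": [{"name": "app"}]},
    }
    for message in validate(mutate(pod)):
        print(message)
```

Mutation runs before validation in this sketch, which mirrors the order Kubernetes applies admission webhooks: defaults are filled in first, and the final object is what gets checked.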