Pere Urbon-Bayes – Apache Kafka: advice from the trenches or how to successfully fail!

An Apache Kafka expert shares practical, real-world advice on adopting and maintaining a Kafka cluster, covering common pitfalls and best practices for configuration, performance, and monitoring.

Key takeaways
  • Always monitor your Kafka cluster: correctly configured clients keep retrying, so failures can go unnoticed without monitoring.
  • Use quotas to prevent a single client from consuming all of a broker's resources.
  • Ensure you have a replication factor of at least 3 for partitions.
  • Use the max.in.flight.requests.per.connection setting to prevent message ordering issues when retries are enabled.
  • Keep your retries enabled to handle network failures.
  • Use log compaction (run by the log cleaner threads) to compact topics and discard superseded records.
  • Use the replica fetcher threads to copy data from leaders to followers.
  • Understand the importance of the leader and follower roles in Kafka.
  • Use the consumer groups abstraction to organize consumers and handle rebalancing.
  • Use the offset to track the last message processed by a consumer.
  • Use the producer acks setting to control when messages are considered committed.
  • Ensure you have a sufficient number of nodes in your Kafka cluster.
  • Use the auto-commit feature to automatically commit consumer offsets.
  • Use the partition reassignment tool to move partitions to different brokers.
  • Use Kafka topics to store and retrieve data.
  • Use ZooKeeper to manage and coordinate the Kafka cluster.
  • Use the security features to authenticate and authorize access to Kafka.
  • Use the observability features to monitor and troubleshoot Kafka.
  • Use the metrics to track and analyze Kafka performance.
  • Use the logs to track and analyze Kafka errors and issues.
  • Use the rebalancing feature to rebalance partitions and ensure even distribution.
  • Use Kafka version 0.11 or later for better performance and features.
  • Use the Kafka configuration settings to customize and optimize performance.
  • Use the Kafka tools and utilities to manage and maintain the cluster.
  • Use the Kafka documentation and resources to learn and troubleshoot.
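
The producer-side takeaways above (acknowledgments, retries, and in-flight requests) interact with one another, so a small sketch may help. The configuration keys below are real Kafka producer settings, but the helper names producer_config and ordering_at_risk are hypothetical, the chosen values are illustrative rather than recommendations from the talk, and the fallback values used for missing keys are assumptions of this sketch (actual defaults vary by client version). The check encodes one commonly cited rule: with retries enabled and more than one in-flight request per connection, a retried batch can land after a later one unless the idempotent producer (available since Kafka 0.11) is enabled.

```python
# Hypothetical helpers for illustration; only the configuration keys
# are real Kafka producer settings.

def producer_config(idempotent: bool = False) -> dict:
    """Return an illustrative producer configuration as key/value pairs."""
    config = {
        "acks": "all",  # wait for all in-sync replicas to acknowledge a write
        "retries": 5,   # retry transient failures such as network errors
        # One in-flight request keeps ordering intact even when retries happen.
        "max.in.flight.requests.per.connection": 1,
    }
    if idempotent:
        # The idempotent producer preserves ordering with up to 5
        # in-flight requests per connection.
        config["enable.idempotence"] = True
        config["max.in.flight.requests.per.connection"] = 5
    return config


def ordering_at_risk(config: dict) -> bool:
    """Flag configurations where a retried batch could be reordered."""
    # Fallbacks for missing keys are assumptions of this sketch.
    retries = int(config.get("retries", 0))
    in_flight = int(config.get("max.in.flight.requests.per.connection", 5))
    idempotent = bool(config.get("enable.idempotence", False))
    return retries > 0 and in_flight > 1 and not idempotent


if __name__ == "__main__":
    for name, cfg in [
        ("single in-flight", producer_config()),
        ("idempotent", producer_config(idempotent=True)),
        ("retries with 5 in-flight",
         {"retries": 3, "max.in.flight.requests.per.connection": 5}),
    ]:
        print(name, "-> ordering at risk:", ordering_at_risk(cfg))
```

The resulting dictionary could be passed to a Kafka client library of your choice; the sketch deliberately avoids importing one so that it runs on its own.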