Kafka for .NET Developers - Ian Cooper - NDC Oslo 2024

Ian Cooper

Learn the fundamentals of Apache Kafka for .NET developers, covering core concepts, reliability patterns, the Confluent client SDK, and essential tools in the ecosystem.

Key takeaways
  • Kafka is a distributed, append-only log system originally created at LinkedIn for data lake ingestion, now widely used for messaging and event streaming

  • Messages in Kafka are organized into topics, and each topic is split into partitions, where:

    • Each partition has a leader and followers for redundancy
    • Messages are immutable once written
    • Ordering is only guaranteed within a single partition
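
The partition and ordering points above can be made concrete with a small simulation. This is a minimal sketch, not Kafka itself: the `Topic` class, the `partition_for` helper, the partition count of 3, and the use of CRC32 as the hash are all assumptions made for illustration (real clients choose their own partitioner). It shows that messages with the same key land in the same partition, which is why their relative order is preserved.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition. CRC32 is a stand-in for a real client's hash."""
    return zlib.crc32(key) % num_partitions


class Topic:
    """Toy topic: one append-only list per partition (hypothetical helper)."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        """Append to the key's partition and return (partition, offset)."""
        partition = partition_for(key, len(self.partitions))
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1


topic = Topic(num_partitions=3)
events = [
    (b"order-1", b"created"),
    (b"order-2", b"created"),
    (b"order-1", b"paid"),
    (b"order-1", b"shipped"),
]
placements = [topic.append(key, value) for key, value in events]

# Where did every "order-1" event end up?
order_1 = [place for (key, _), place in zip(events, placements) if key == b"order-1"]
```

All three `order-1` events share one partition with increasing offsets. Events for different keys may sit in different partitions, where no ordering between them is implied.
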
  • Key concepts for producers:

    • Messages consist of a key and value
    • Producer writes are asynchronous by default
    • Call Flush() before shutting down so that buffered messages are actually sent
    • The acks setting controls delivery guarantees (leader-only vs all in-sync replicas)
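
To illustrate why an asynchronous producer needs a flush, here is a toy model. `BufferedProducer` is a hypothetical class, not the Confluent client: it sends only when `flush()` is called, which exaggerates the buffering (a real client also sends in the background) so the effect is easy to see. The acks setting is not modelled here.

```python
class BufferedProducer:
    """Toy producer: produce() only buffers; flush() performs the 'send'."""

    def __init__(self, broker_log):
        self._broker_log = broker_log  # stands in for a partition on the broker
        self._pending = []             # accepted by produce() but not yet sent

    def produce(self, key, value, on_delivery=None):
        """Queue a message and return immediately, without waiting for the broker."""
        self._pending.append((key, value, on_delivery))

    def flush(self):
        """Send everything still buffered, fire callbacks, return the count sent."""
        sent = 0
        while self._pending:
            key, value, on_delivery = self._pending.pop(0)
            self._broker_log.append((key, value))
            offset = len(self._broker_log) - 1
            if on_delivery is not None:
                on_delivery(key, offset)  # the delivery report for this message
            sent += 1
        return sent


broker_log = []
reports = []
producer = BufferedProducer(broker_log)
producer.produce(b"order-1", b"created", on_delivery=lambda k, o: reports.append((k, o)))
producer.produce(b"order-1", b"paid", on_delivery=lambda k, o: reports.append((k, o)))

in_broker_before_flush = len(broker_log)  # still 0: nothing has left the buffer
sent = producer.flush()
```

If the process exited before `flush()`, both messages would be lost even though `produce()` returned without error.
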
  • Consumer patterns:

    • Consumers operate in consumer groups to scale processing
    • Within a consumer group, each partition is read by only one consumer at a time
    • Consumers track their position using offsets
    • Each consumer processes its messages on a single thread to preserve ordering
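
The one-consumer-per-partition rule can be sketched as a simple assignment function. This is an illustrative round-robin only; `assign_round_robin` and the consumer names are hypothetical, and real group coordination uses configurable assignment strategies and rebalancing that are not shown.

```python
def assign_round_robin(partitions, consumers):
    """Give each partition to exactly one consumer in the group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Six partitions spread over three consumers: two partitions each.
three_consumers = assign_round_robin(range(6), ["c1", "c2", "c3"])

# More consumers than partitions: the extra consumer has nothing to read.
four_on_three = assign_round_robin(range(3), ["c1", "c2", "c3", "c4"])
```

The second case shows a practical consequence: the partition count caps how far a consumer group can scale, since consumers beyond that number sit idle.
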
  • Schema management:

    • Schema Registry provides centralized schema storage
    • Supports Avro, Protobuf and JSON Schema formats
    • Handles schema evolution and compatibility
    • The first 5 bytes of each message hold schema metadata: a magic byte followed by a 4-byte schema ID
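
The 5-byte prefix can be sketched as follows, assuming the Confluent wire format of a zero magic byte followed by a 4-byte big-endian schema ID. The `frame` and `unframe` helpers and the schema ID 42 are hypothetical; in practice the Schema Registry serializers add and strip this header for you.

```python
import struct

MAGIC_BYTE = 0
HEADER = struct.Struct(">bI")  # 1-byte magic + 4-byte big-endian schema ID = 5 bytes


def frame(schema_id, payload):
    """Prefix a serialized payload with the 5-byte schema header."""
    return HEADER.pack(MAGIC_BYTE, schema_id) + payload


def unframe(message):
    """Split a framed message into (schema_id, payload)."""
    if len(message) < HEADER.size:
        raise ValueError("message shorter than the 5-byte header")
    magic, schema_id = HEADER.unpack_from(message)
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER.size:]


framed = frame(42, b"serialized-body")
schema_id, body = unframe(framed)
```

A consumer uses the schema ID to fetch the writer's schema from the registry before decoding the payload, which is what makes schema evolution workable.
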
  • Reliability considerations:

    • Manual vs automatic commit of offsets (committing only after processing avoids losing messages)
    • Delivery reports for producer acknowledgements
    • Idempotent producers to prevent duplicates
    • Outbox pattern for reliable integration
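
The outbox pattern listed above can be sketched with an in-memory SQLite database. This is a minimal illustration under stated assumptions: the `orders` and `outbox` tables, the `place_order` and `relay` functions, and the `publish` callback (standing in for a Kafka producer) are all hypothetical names.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
db.execute(
    "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT,"
    " topic TEXT, key TEXT, value TEXT, sent INTEGER DEFAULT 0)"
)


def place_order(order_id):
    """Write the business row and the outgoing message in ONE transaction."""
    with db:  # commits on success, rolls back if either insert fails
        db.execute("INSERT INTO orders VALUES (?, 'placed')", (order_id,))
        db.execute(
            "INSERT INTO outbox (topic, key, value) VALUES (?, ?, ?)",
            ("orders", order_id, "placed"),
        )


def relay(publish):
    """Publish unsent rows in order; mark each sent only after publish succeeds."""
    rows = db.execute(
        "SELECT seq, topic, key, value FROM outbox WHERE sent = 0 ORDER BY seq"
    ).fetchall()
    for seq, topic, key, value in rows:
        publish(topic, key, value)  # if this raises, the row stays unsent for a retry
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE seq = ?", (seq,))
    return len(rows)


published = []
place_order("order-1")
place_order("order-2")
relayed = relay(lambda topic, key, value: published.append((topic, key, value)))
remaining = db.execute("SELECT COUNT(*) FROM outbox WHERE sent = 0").fetchone()[0]
```

Because a crash between publishing and marking a row as sent causes a re-publish on the next run, this gives at-least-once delivery, so downstream consumers should tolerate duplicates.
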
  • .NET specific details:

    • Confluent .NET client is the main SDK
    • Async/await support for producing (ProduceAsync); the consumer's Consume call is blocking
    • SerDes handle serialization/deserialization
    • Message pump pattern for consuming
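
The message pump idea, combined with manual offset commits, might look like the following language-neutral simulation. It is not the Confluent .NET API: `PartitionConsumer`, `run_pump`, and `flaky_handler` are hypothetical, and the loop stops when the log is exhausted so the example terminates (a real pump runs until cancelled).

```python
class PartitionConsumer:
    """Toy consumer for one partition that tracks a committed offset."""

    def __init__(self, log, committed=0):
        self._log = log
        self.committed = committed  # where a restarted consumer would resume
        self._position = committed  # next offset to hand out in this session

    def poll(self):
        """Return (offset, value) for the next message, or None when caught up."""
        if self._position >= len(self._log):
            return None
        offset = self._position
        self._position += 1
        return offset, self._log[offset]

    def commit(self, offset):
        """Record that everything up to and including `offset` is processed."""
        self.committed = offset + 1


def run_pump(consumer, handle):
    """Single-threaded pump: poll, handle, then commit, one message at a time."""
    while True:
        record = consumer.poll()
        if record is None:
            break
        offset, value = record
        handle(value)            # if this raises, the offset is never committed
        consumer.commit(offset)


log = ["created", "paid", "shipped"]
handled = []


def flaky_handler(value):
    """Fail the first time 'paid' is seen to simulate a crash mid-processing."""
    if value == "paid" and "paid-attempted" not in handled:
        handled.append("paid-attempted")
        raise RuntimeError("simulated crash while handling 'paid'")
    handled.append(value)


first = PartitionConsumer(log)
try:
    run_pump(first, flaky_handler)
except RuntimeError:
    pass  # the process "dies" here with only offset 0 committed

restarted = PartitionConsumer(log, committed=first.committed)
run_pump(restarted, flaky_handler)
```

Committing only after the handler returns means the failed message is redelivered after the restart rather than skipped, at the cost of possible duplicate processing.
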
  • Ecosystem includes many tools:

    • Kafka Connect for integrations
    • KSQL (now ksqlDB) for stream processing
    • UI tools for management
    • ZooKeeper is being replaced by KRaft