What Is This OpenTelemetry Thing? • Martin Thwaites • GOTO 2024

Learn how OpenTelemetry is revolutionizing observability with traces, metrics & logs. Explore key components, benefits, and challenges of this new industry standard.

Key takeaways
  • OpenTelemetry has become the de facto standard for observability. It is one of the most active projects in the CNCF and replaces two previously competing standards, OpenTracing and OpenCensus, which merged to form it

  • Key components include:

    • Traces: Groups of spans showing request flow and causality
    • Metrics: Time series data with attributes (labels) used for aggregation
    • Logs: Point-in-time structured data
    • Collector: Central component for processing and routing telemetry data
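
To make "groups of spans showing request flow and causality" concrete, here is a minimal sketch of a trace as a set of spans linked by parent IDs. The `Span` dataclass and `causal_chain` helper are hypothetical illustrations, not the OpenTelemetry SDK's types, and the span names are invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Illustrative span record; not the OpenTelemetry SDK's Span class."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    name: str
    attributes: dict = field(default_factory=dict)


def causal_chain(spans, span_id):
    """Follow parent links from one span up to the root; return names root-first."""
    by_id = {s.span_id: s for s in spans}
    chain = []
    current = by_id.get(span_id)
    while current is not None:
        chain.append(current.name)
        current = by_id.get(current.parent_id) if current.parent_id else None
    return list(reversed(chain))


# One trace: all spans share a trace_id, and parent_id records causality.
spans = [
    Span("t1", "a", None, "GET /checkout"),
    Span("t1", "b", "a", "charge-card"),
    Span("t1", "c", "b", "POST payment-gateway"),
]
print(" -> ".join(causal_chain(spans, "c")))
```

Because every span carries the same trace ID plus a pointer to its parent, a backend can rebuild the full request path from spans that arrive independently from different services.
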
  • The OpenTelemetry Collector provides critical benefits:

    • Centralized configuration
    • Security and access control
    • Data filtering and redaction
    • Vendor-agnostic data routing
    • Reduced egress costs
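
The Collector's role (receive, process, export) can be sketched in a few lines of plain Python. This is only a conceptual model under stated assumptions: the real Collector is a separate service configured with receivers, processors, and exporters, and the `redact` and `run_pipeline` functions, the `user.password` key, and the two in-memory "backends" below are all hypothetical.

```python
import re

# Simple email-like pattern used for illustration only.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def redact(record):
    """Processor step: drop a sensitive key and mask email-like string values."""
    cleaned = {}
    for key, value in record.items():
        if key == "user.password":  # hypothetical sensitive attribute
            continue
        if isinstance(value, str):
            value = EMAIL.sub("[REDACTED]", value)
        cleaned[key] = value
    return cleaned


def run_pipeline(records, processors, exporters):
    """Apply every processor in order, then fan each record out to every exporter."""
    for record in records:
        for process in processors:
            record = process(record)
        for export in exporters:
            export(record)


# Two lists stand in for two different vendor backends.
backend_a, backend_b = [], []
run_pipeline(
    [{"name": "login", "user.email": "ana@example.com", "user.password": "hunter2"}],
    processors=[redact],
    exporters=[backend_a.append, backend_b.append],
)
print(backend_a)
```

Filtering happens once, in one place, before any data leaves the network, which is where the centralized configuration, redaction, and multi-backend routing benefits come from.
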
  • Auto-instrumentation libraries provide quick startup with minimal code changes, while manual instrumentation allows deeper customization and context

  • Sampling is essential for cost control:

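    • Head sampling: Simple, but the keep-or-drop decision is made when the trace starts, before errors or latency are known, so context is lost
    • Tail sampling: More complex, but the decision is made after the trace completes, so full trace context is preserved
    • Sampling rates must balance storage costs against debugging needs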
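
The difference between the two strategies can be sketched as follows. This is a simplified illustration, not the OpenTelemetry SDK or Collector sampler: `head_sample`, `tail_sample`, the 10% ratio, and the synthetic traces are assumptions chosen for the example.

```python
import hashlib


def head_sample(trace_id, ratio):
    """Decide at trace start from the trace ID alone; the outcome is not yet known."""
    digest = hashlib.sha256(trace_id.encode()).digest()
    # Map the first 8 bytes of the hash onto [0, 2**64) and keep the lowest `ratio` share.
    return int.from_bytes(digest[:8], "big") < ratio * 2**64


def tail_sample(spans, ratio):
    """Decide after the trace completes: always keep errors, sample the rest."""
    if any(span.get("error") for span in spans):
        return True
    return head_sample(spans[0]["trace_id"], ratio)


# 1,000 hypothetical single-span traces; every 100th one failed.
traces = [[{"trace_id": f"trace-{i}", "error": i % 100 == 0}] for i in range(1000)]
errors = [t for t in traces if t[0]["error"]]

head_kept = sum(head_sample(t[0]["trace_id"], 0.1) for t in errors)
tail_kept = sum(tail_sample(t, 0.1) for t in errors)
print(f"error traces kept: head={head_kept}/10, tail={tail_kept}/10")
```

Tail sampling keeps every failed trace because it sees the whole trace first, at the price of buffering spans until the trace finishes. Head sampling needs no buffering but keeps failed traces only at the same rate as everything else.
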
  • Semantic conventions provide standardized naming and attributes across different languages and frameworks
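
As a small illustration of why shared names matter, the sketch below maps two services' ad hoc attribute names onto keys from the OpenTelemetry HTTP semantic conventions (`http.request.method`, `http.response.status_code`, `url.path`). The ad hoc names, the `TO_SEMCONV` table, and the `normalize` helper are hypothetical; in practice instrumentation libraries emit the standard names directly.

```python
# Hypothetical per-service attribute names mapped to semantic convention keys.
TO_SEMCONV = {
    "method": "http.request.method",
    "verb": "http.request.method",
    "status": "http.response.status_code",
    "statusCode": "http.response.status_code",
    "path": "url.path",
}


def normalize(attributes):
    """Rename known ad hoc keys to their standard names; leave other keys untouched."""
    return {TO_SEMCONV.get(key, key): value for key, value in attributes.items()}


# Two services describing the same request with different ad hoc names.
service_one = {"method": "GET", "status": 200, "path": "/cart"}
service_two = {"verb": "GET", "statusCode": 200, "path": "/cart"}

print(normalize(service_one) == normalize(service_two))
print(normalize(service_two))
```

Once both services use the same keys, a single query or dashboard works across them regardless of language or framework.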

  • The protocol and data model are vendor-agnostic, allowing easy switching between backends and multi-vendor setups

  • Key challenges include:

    • Documentation needs improvement
    • Development can be slow due to committee processes
    • Running in async/serverless environments
    • Managing high cardinality data
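
A rough way to see why high cardinality is hard to manage: for a metric, the worst-case number of time series is the product of the distinct values of each attribute. The sketch below computes that upper bound; the `max_series` helper and the attribute values are illustrative assumptions, not measurements.

```python
from math import prod


def max_series(attribute_values):
    """Upper bound on time series: product of distinct values per attribute."""
    return prod(len(values) for values in attribute_values.values())


# Low-cardinality attributes: a handful of methods and status codes.
low = {
    "http.request.method": ["GET", "POST", "PUT", "DELETE"],
    "http.response.status_code": [200, 400, 404, 500, 503],
}

# Adding a per-user attribute multiplies the bound by the number of users.
high = {**low, "user.id": range(100_000)}

print(f"without user.id: {max_series(low):,} series")
print(f"with user.id:    {max_series(high):,} series")
```

A single unbounded attribute dominates the total, which is why such fields are usually better suited to traces and logs than to metric attributes.
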
  • Benefits include:

    • Reduced vendor lock-in
    • Standardized telemetry across systems
    • Better debugging capabilities
    • Cost control through centralized management