Practical OpenTelemetry in JavaScript/TypeScript - Martin Thwaites - NDC Oslo 2024

Learn practical OpenTelemetry implementation in JS/TS: from auto-instrumentation and sampling strategies to production deployment and frontend considerations.

Key takeaways
  • OpenTelemetry is now the de facto standard for observability, superseding the older OpenTracing and OpenCensus projects that merged to form it

  • Three main signals in OpenTelemetry:

    • Traces (shows causality and system flow)
    • Metrics (aggregated numerical data)
    • Logs (point-in-time structured data)
  • Key practices for instrumentation:

    • Use semantic conventions where they exist
    • Be intentional about naming and attributes
    • Avoid dynamic property names
    • Don’t include PII or sensitive data
    • Use constants files for attribute names
    • Document your instrumentation choices
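The advice about constants files and avoiding dynamic property names could look roughly like the sketch below. The attribute names and the `setAppAttribute` helper are hypothetical, and `SpanLike` is a local stand-in for the one span method used here (shaped like `setAttribute` in the OpenTelemetry API) so the example has no dependencies.

```typescript
// Hypothetical attribute names for an imaginary checkout service.
// One shared module keeps keys consistent and free of typos.
const ATTR = {
  CART_ITEM_COUNT: "app.cart.item_count",
  CART_TOTAL_CENTS: "app.cart.total_cents",
  CUSTOMER_TIER: "app.customer.tier",
} as const;

// Union of the allowed key strings, derived from the constants above.
type AttrKey = (typeof ATTR)[keyof typeof ATTR];
type AttrValue = string | number | boolean;

// Local stand-in (an assumption) for the span method this sketch needs.
interface SpanLike {
  setAttribute(key: string, value: AttrValue): unknown;
}

// Only keys from ATTR compile, which rules out dynamically built names.
function setAppAttribute(span: SpanLike, key: AttrKey, value: AttrValue): void {
  span.setAttribute(key, value);
}
```

A call such as `setAppAttribute(span, ATTR.CART_ITEM_COUNT, 3)` type-checks, while an ad hoc key like `"cart_" + id` is rejected at compile time.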
  • Sampling is crucial for managing data volume:

    • Head sampling (immediate decision)
    • Tail sampling (delayed decision)
    • Aim for statistical significance rather than 100% capture
    • Balance between data fidelity and cost
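To make the head versus tail distinction concrete, here is a minimal dependency-free sketch. It is not the OpenTelemetry SDK's sampler: the function names, the `FinishedSpan` shape, and the choice to derive the decision from the last 8 hex characters of the trace ID are all assumptions made for illustration.

```typescript
// Head sampling: decide immediately, using only the trace ID.
// The same trace ID always gives the same answer, so services agree.
function shouldSample(traceId: string, ratio: number): boolean {
  if (ratio >= 1) return true;
  if (ratio <= 0) return false;
  // Read the last 8 hex characters as an unsigned 32-bit number.
  const low32 = parseInt(traceId.slice(-8), 16);
  return low32 < ratio * 0x100000000;
}

// Hypothetical summary of a span that has already finished.
interface FinishedSpan {
  durationMs: number;
  isError: boolean;
}

// Tail sampling: decide after the trace is complete. Keep every trace
// that errored or was slow, and a baseline ratio of everything else.
function keepTrace(
  traceId: string,
  spans: FinishedSpan[],
  slowMs: number,
  baselineRatio: number,
): boolean {
  const interesting = spans.some((s) => s.isError || s.durationMs >= slowMs);
  return interesting || shouldSample(traceId, baselineRatio);
}

// Re-weight a sampled count to estimate the true total.
function estimateTotal(sampledCount: number, ratio: number): number {
  return sampledCount / ratio;
}
```

With a ratio of 0.25, each kept trace stands in for roughly four real ones, which is why a well-chosen sample can still support statistically meaningful conclusions.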
  • Auto-instrumentation provides quick value:

    • Works out of the box for common libraries
    • Can be enhanced with custom instrumentation
    • Good starting point for debugging
    • May need tuning for production use
  • For metrics and spans:

    • Create abstractions around meters
    • Use explicit property names
    • Focus on high-cardinality data
    • Consider performance impact of attributes
    • Group related metrics in classes
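One way to read "create abstractions around meters" and "group related metrics in classes" is sketched below. `CheckoutMetrics`, the instrument names, and the attribute key are hypothetical. The `MeterLike`, `CounterLike`, and `HistogramLike` interfaces are local stand-ins modelled on the `createCounter`, `createHistogram`, `add`, and `record` methods of the OpenTelemetry metrics API, declared here so the sketch runs without any packages.

```typescript
type AttrValue = string | number | boolean;
type Attrs = Record<string, AttrValue>;

// Local stand-ins (assumptions) for the meter methods this sketch uses.
interface CounterLike {
  add(value: number, attributes?: Attrs): void;
}
interface HistogramLike {
  record(value: number, attributes?: Attrs): void;
}
interface MeterLike {
  createCounter(name: string): CounterLike;
  createHistogram(name: string): HistogramLike;
}

// Groups the checkout metrics in one class so callers never touch
// instrument names or attribute keys directly.
class CheckoutMetrics {
  private readonly orders: CounterLike;
  private readonly duration: HistogramLike;

  constructor(meter: MeterLike) {
    this.orders = meter.createCounter("app.checkout.orders");
    this.duration = meter.createHistogram("app.checkout.duration_ms");
  }

  // Explicit parameters instead of a free-form attribute bag.
  recordOrder(paymentMethod: "card" | "invoice", durationMs: number): void {
    const attributes: Attrs = { "app.payment.method": paymentMethod };
    this.orders.add(1, attributes);
    this.duration.record(durationMs, attributes);
  }
}
```

Application code then calls `metrics.recordOrder("card", 120)` and the class decides which instruments and attributes are involved, which keeps naming consistent and makes the instrumentation easy to fake in tests.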
  • Web/Frontend instrumentation:

    • Still experimental in OpenTelemetry
    • Some vendors provide solutions
    • Need standards for filtering sensitive data
    • Consider cross-origin and security implications
  • Local development:

    • Use console exporters for quick feedback
    • Don’t need production backends locally
    • OTLP protocol makes switching backends easy
    • Good for debugging instrumentation issues
  • Production considerations:

    • Expect a 10-20 second delay before data shows up in your backend
    • Configure sampling appropriate to scale
    • Monitor cardinality and attribute volume
    • Use environment variables for configuration
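Configuration through environment variables might be read along these lines. The variable names come from the OpenTelemetry environment variable specification, but `readTelemetryConfig`, the `TelemetryConfig` shape, and the fallback values are assumptions of this sketch; the SDKs normally read these variables themselves.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical config shape used only by this sketch.
interface TelemetryConfig {
  serviceName: string;
  tracesExporter: string; // e.g. "otlp" in production, "console" locally
  otlpEndpoint: string;
  sampler: string;
  samplerRatio: number;
}

function readTelemetryConfig(env: Env): TelemetryConfig {
  const rawRatio = env["OTEL_TRACES_SAMPLER_ARG"];
  const ratio =
    rawRatio === undefined || rawRatio.trim() === "" ? 1 : Number(rawRatio);
  return {
    // Fallbacks below are this sketch's choices, not guaranteed SDK defaults.
    serviceName: env["OTEL_SERVICE_NAME"] ?? "unknown_service",
    tracesExporter: env["OTEL_TRACES_EXPORTER"] ?? "otlp",
    otlpEndpoint: env["OTEL_EXPORTER_OTLP_ENDPOINT"] ?? "http://localhost:4318",
    sampler: env["OTEL_TRACES_SAMPLER"] ?? "parentbased_always_on",
    // Ignore values outside 0..1 or non-numeric input.
    samplerRatio: Number.isFinite(ratio) && ratio >= 0 && ratio <= 1 ? ratio : 1,
  };
}
```

In Node.js this would be called as `readTelemetryConfig(process.env)`. Setting `OTEL_TRACES_EXPORTER=console` on a laptop and leaving it unset in production switches the destination without a code change.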
  • Best practices:

    • Add context intentionally
    • Use span events instead of logs where possible
    • Consider span links for async operations
    • Keep attributes focused and meaningful
    • Document and standardize across teams
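The suggestion to consider span links for async operations can be illustrated with a queue consumer that processes a batch of messages from different producers. Everything in this sketch is hypothetical: `QueueMessage`, `linksForBatch`, and the `app.message.id` attribute are invented for illustration, and `SpanContextLike` and `LinkLike` are local stand-ins shaped like the span context and link objects in the OpenTelemetry API.

```typescript
type AttrValue = string | number | boolean;

// Local stand-ins (assumptions) for a span context and a span link.
interface SpanContextLike {
  traceId: string;
  spanId: string;
  traceFlags: number;
}
interface LinkLike {
  context: SpanContextLike;
  attributes?: Record<string, AttrValue>;
}

// Hypothetical message that carries the producer's span context.
interface QueueMessage {
  id: string;
  producerContext: SpanContextLike;
}

// Build one link per distinct producer span, so the batch-processing
// span points back at every trace that contributed a message.
function linksForBatch(messages: QueueMessage[]): LinkLike[] {
  const seen = new Set<string>();
  const links: LinkLike[] = [];
  for (const message of messages) {
    const { traceId, spanId } = message.producerContext;
    const key = traceId + ":" + spanId;
    if (seen.has(key)) continue; // same producer span already linked
    seen.add(key);
    links.push({
      context: message.producerContext,
      attributes: { "app.message.id": message.id },
    });
  }
  return links;
}
```

The resulting array is intended to be passed as the `links` option when starting the consumer's span. A link fits here better than a parent and child relationship because one batch span relates to many producer traces at once.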