Matthieu Caneill - dbt-score: a linter for your dbt model metadata | PyData Amsterdam 2024

Learn how dbt-score, an open-source linting tool, helps maintain quality and consistency in large dbt projects through metadata validation and customizable rules.

Key takeaways
  • dbt-score is an open-source linting tool for dbt model metadata, developed at Picnic and released under the MIT license

  • The tool helps enforce consistency and quality checks across large dbt projects with hundreds or thousands of data models

  • Key features include:

    • Linting dbt YAML metadata files
    • Configurable severity levels for rules
    • Custom rule creation using Python
    • Score-based evaluation of models
    • Integration with dbt documentation/catalogs
    • CI/CD friendly with machine-readable output
  • Main use cases:

    • Enforcing documentation standards
    • Checking for required properties (owners, descriptions)
    • Validating data quality tests exist
    • Ensuring security and access controls
    • Custom business rule validation
  • Benefits:

    • No database connection required (metadata-only checks)
    • Easy to integrate into existing Python/dbt workflows
    • Extensible through custom rules
    • Helps maintain consistency across large teams
    • Helps keep low-quality models from reaching production when run as a CI check
  • Installation and usage:

    • Installs as a companion library to dbt
    • Simple command-line interface (dbt-score lint)
    • Configurable via pyproject.toml
    • Can be run anywhere Python/dbt runs
  • Customization options:

    • Rule skipping for specific models
    • Severity level configuration
    • Custom rule namespaces
    • Project-specific thresholds
    • Custom scoring weights
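
To make the ideas of custom rules, severity levels, rule skipping, and score-based evaluation more concrete, here is a minimal sketch in plain Python. It does not use dbt-score itself and is not its API: the names Model, Rule, has_description, has_owner, and score_model are all hypothetical, and the scoring formula (severity-weighted share of passed rules, scaled to 10) is an assumption chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Model:
    """Hypothetical stand-in for the metadata a linter reads from dbt YAML."""
    name: str
    description: str = ""
    meta: dict = field(default_factory=dict)


@dataclass
class Rule:
    """A named check with a severity that acts as its scoring weight."""
    name: str
    severity: int
    check: Callable[[Model], Optional[str]]  # returns a message on violation


def has_description(model: Model) -> Optional[str]:
    # Documentation standard: every model needs a non-empty description.
    if not model.description.strip():
        return "Model lacks a description."
    return None


def has_owner(model: Model) -> Optional[str]:
    # Required property: an owner must be declared in the model's meta.
    if "owner" not in model.meta:
        return "Model has no owner in meta."
    return None


RULES = [
    Rule("has_description", severity=2, check=has_description),
    Rule("has_owner", severity=3, check=has_owner),
]


def score_model(
    model: Model, rules: List[Rule], skip: frozenset = frozenset()
) -> Tuple[float, List[Tuple[str, str]]]:
    """Return (score out of 10, violations), ignoring rules listed in skip."""
    active = [r for r in rules if r.name not in skip]
    total = sum(r.severity for r in active)
    if total == 0:
        return 10.0, []
    violations = []
    for r in active:
        message = r.check(model)
        if message is not None:
            violations.append((r.name, message))
    failed = {name for name, _ in violations}
    passed = sum(r.severity for r in active if r.name not in failed)
    return round(10 * passed / total, 1), violations


orders = Model(name="orders", description="One row per customer order.")
score, violations = score_model(orders, RULES)
print(score, violations)
```

In this sketch, a CI job could compare the returned score against a project-specific threshold and fail the build when a model falls below it. Only metadata is inspected, so no database connection is involved.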