Confused Learning: Supply Chain Attacks through Machine Learning Models

Learn how ML models are exploited for supply chain attacks via Lambda layers and metadata files. Discover key attack vectors, detection gaps, and defensive strategies for ML environments.

Key takeaways
  • Machine learning models in several serialization formats can carry malware; Keras/TensorFlow models are particularly exposed through Lambda layers and metadata files

  • Supply chain attacks through ML models require no special ML expertise: basic Python knowledge and the ability to operate a C2 framework are sufficient

  • ML environments are high-value targets due to direct access to business crown jewels (data), broad permissions, and low security visibility

  • Common attack vectors include:

    • Public model repositories like Hugging Face
    • Organization registration and social engineering
    • Poisoned models in development/testing environments
    • Lambda layer code execution
    • Metadata file manipulation
  • Current detection capabilities are limited:

    • No standardized model evaluation process
    • Lack of consistent model documentation
    • Traditional AV struggles with large model files
    • Few purpose-built security tools
  • Defensive recommendations:

    • Environmental hardening of ML pipelines
    • Implementing proper access controls and logging
    • Using static analysis tools for model inspection
    • Avoiding pickle-based models
    • Establishing model evaluation procedures
  • Model infection rates are relatively low (roughly 1.7% of models contained code), but the impact can be severe because of privileged access and persistence

  • Improved security tooling is needed, including:

    • Better static analysis capabilities
    • Standardized model cards
    • DFIR tooling specific to ML environments
    • YARA/Semgrep rules for model scanning
  • ML teams often prioritize experimentation over security, which weakens security controls and widens the attack surface

  • Supply chain attacks through ML models can be more persistent and stealthy than traditional phishing attacks
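
The Lambda layer vector listed in the takeaways can be checked for statically, without loading the model. The following is a minimal Python sketch, not a complete scanner. It assumes the model is saved in the Keras v3 `.keras` format, which is a zip archive whose `config.json` describes the architecture; legacy HDF5 and SavedModel files would need different handling. The names `find_lambda_layers` and `scan_keras_archive` are hypothetical, chosen only for this illustration.

```python
import json
import zipfile
from typing import Any, List, Optional


def find_lambda_layers(node: Any, found: Optional[List[str]] = None) -> List[str]:
    """Recursively collect the names of layers whose class_name is "Lambda".

    `node` is a parsed Keras config (nested dicts and lists). Nothing is
    deserialized or executed; this only reads the JSON structure.
    """
    if found is None:
        found = []
    if isinstance(node, dict):
        if node.get("class_name") == "Lambda":
            layer_config = node.get("config")
            if isinstance(layer_config, dict):
                found.append(str(layer_config.get("name", "<unnamed>")))
            else:
                found.append("<unnamed>")
        # Keep walking: models can be nested inside other models.
        for value in node.values():
            find_lambda_layers(value, found)
    elif isinstance(node, list):
        for item in node:
            find_lambda_layers(item, found)
    return found


def scan_keras_archive(path_or_file) -> List[str]:
    """Read config.json from a .keras zip archive and list its Lambda layers.

    Assumption: the archive follows the Keras v3 layout with a top-level
    config.json. A KeyError is raised if that member is missing.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        config = json.loads(archive.read("config.json"))
    return find_lambda_layers(config)
```

A non-empty result only signals that the model carries a Lambda layer and deserves manual review before it is loaded; it says nothing about whether the embedded code is malicious.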
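
The recommendations to use static analysis and to avoid pickle-based models rest on the fact that unpickling can import and call arbitrary Python objects. As a rough illustration, the sketch below uses the standard-library `pickletools` module to list the opcodes that import or invoke callables, without ever unpickling the data. The opcode sets and the function name `risky_opcodes` are assumptions made for this example; purpose-built scanners apply far more thorough rules.

```python
import pickletools
from typing import Any, List, Tuple

# Opcodes that pull a named object into the pickle VM (assumed set).
IMPORT_OPCODES = {"GLOBAL", "STACK_GLOBAL", "INST"}
# Opcodes that call or instantiate something already on the stack (assumed set).
CALL_OPCODES = {"REDUCE", "OBJ", "NEWOBJ", "NEWOBJ_EX"}
RISKY_OPCODES = IMPORT_OPCODES | CALL_OPCODES


def risky_opcodes(data: bytes) -> List[Tuple[int, str, Any]]:
    """Return (byte offset, opcode name, argument) for each risky opcode.

    pickletools.genops only parses the opcode stream; it does not execute
    it, so this is safe to run on untrusted bytes.
    """
    findings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in RISKY_OPCODES:
            findings.append((pos, opcode.name, arg))
    return findings
```

Legitimate pickles of class instances also use these opcodes, so a hit is a prompt to inspect what is being imported, not proof of compromise. A pickle of plain containers and numbers should produce no findings at all.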