Cloud? No Thanks! I’m Gonna Run GenAI on My AI PC [PyCon DE & PyData Berlin 2024]


Discover how to run AI models locally on your PC using Intel's Neural Processing Unit & OpenVINO toolkit. Learn about data privacy, cost savings & optimal performance.

Key takeaways
  • Intel Core Ultra processors now include an integrated NPU (Neural Processing Unit) alongside the CPU and GPU, dedicated to AI workloads

  • OpenVINO serves as a central toolkit for AI inference optimization, supporting multiple frameworks (PyTorch, TensorFlow, ONNX) and hardware backends

  • AI PC advantages include:

    • Data privacy (no cloud dependency)
    • Cost efficiency (no subscription fees)
    • Local processing control
    • Better latency for real-time applications
  • Three main ways to use OpenVINO:

    • Direct OpenVINO API
    • PyTorch 2.0 backend
    • ONNX runtime integration
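    The first route, the direct OpenVINO Python API, can be sketched as below. To keep the example self-contained, it builds a tiny one-operation model in memory rather than loading a converted network file; the model is purely a placeholder, and in practice you would use `core.read_model()` on a converted model:

    ```python
    import numpy as np
    import openvino as ov
    from openvino.runtime import opset13 as ops

    # Placeholder model: a single ReLU op built in memory.
    # In real use: model = core.read_model("model.xml")
    x = ops.parameter([1, 4], ov.Type.f32, name="x")
    tiny = ov.Model([ops.relu(x)], [x], "tiny_relu")

    core = ov.Core()
    compiled = core.compile_model(tiny, "CPU")  # explicit device target

    # Run inference: negative inputs are clamped to zero by ReLU
    out = compiled(np.array([[-1.0, 0.0, 2.0, -3.0]], dtype=np.float32))[0]
    ```

    The PyTorch 2.0 route wraps an existing `nn.Module` via `torch.compile(model, backend="openvino")`, and the ONNX route plugs OpenVINO in as an ONNX Runtime execution provider; both leave your existing training code untouched.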
  • The NPU provides power-efficient AI execution, drawing roughly 12–19 W versus around 40 W for comparable CPU/GPU operation

  • Auto-device selection feature automatically chooses optimal hardware (CPU/GPU/NPU) for specific workloads
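    A minimal sketch of auto-device selection: compiling against the "AUTO" pseudo-device lets the runtime choose among whatever devices the machine actually exposes (the one-op model here is again just a stand-in for a real network):

    ```python
    import numpy as np
    import openvino as ov
    from openvino.runtime import opset13 as ops

    core = ov.Core()
    # Lists the inference devices present on this machine, e.g. ["CPU", "GPU", "NPU"]
    print(core.available_devices)

    # Stand-in model; any converted network would do
    x = ops.parameter([1, 2], ov.Type.f32, name="x")
    model = ov.Model([ops.relu(x)], [x], "demo")

    # "AUTO" defers the CPU/GPU/NPU choice to OpenVINO's device-selection logic
    compiled = core.compile_model(model, "AUTO")
    result = compiled(np.array([[1.5, -0.5]], dtype=np.float32))[0]
    ```

    This falls back gracefully: on a machine with only a CPU, "AUTO" still compiles and runs, which keeps the same code portable across AI PCs and ordinary laptops.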

  • Different AI tasks can run simultaneously on different processors:

    • Conventional AI (object detection, pose estimation) on CPU/NPU
    • Generative AI (text-to-image, LLMs) on GPU
  • Model optimization through OpenVINO (e.g., weight quantization) can sharply reduce memory requirements — in the talk's example, a 25 GB model shrinks to about 4 GB
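    The memory savings come largely from storing weights in fewer bits; a back-of-the-envelope calculation (the parameter count below is illustrative, not a figure from the talk) shows why going from 32-bit floats to 4-bit integers cuts the footprint by roughly 8x:

    ```python
    params = 6.7e9  # illustrative ~7B-parameter LLM
    GIB = 2**30

    fp32_gib = params * 4 / GIB    # 4 bytes per fp32 weight
    int4_gib = params * 0.5 / GIB  # 0.5 bytes per int4 weight

    print(f"fp32: {fp32_gib:.1f} GiB, int4: {int4_gib:.1f} GiB")
    # ~25 GiB down to ~3 GiB, in line with the 25 GB -> 4 GB figure above
    ```

    Activations, KV cache, and quantization metadata add overhead on top of raw weights, which is why real-world numbers land a bit above the pure-weights estimate.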

  • Open source implementation allows deployment across various hardware platforms (Intel, ARM) and operating systems

  • Supports both traditional AI (prediction-based) and generative AI workloads locally without cloud dependencies