Cloud? No Thanks! I’m Gonna Run GenAI on My AI PC [PyCon DE & PyData Berlin 2024]


Discover how to run AI models locally on your PC using Intel's Neural Processing Unit & OpenVINO toolkit. Learn about data privacy, cost savings & optimal performance.

Key takeaways
  • Intel Core Ultra processors now include an integrated NPU (Neural Processing Unit) alongside the CPU and GPU, dedicated to AI workloads

  • OpenVINO serves as a central toolkit for AI inference optimization, supporting multiple frameworks (PyTorch, TensorFlow, ONNX) and hardware backends

  • AI PC advantages include:

    • Data privacy (no cloud dependency)
    • Cost efficiency (no subscription fees)
    • Local processing control
    • Better latency for real-time applications
  • Three main ways to use OpenVINO:

    • Direct OpenVINO API
    • PyTorch 2.0 backend
    • ONNX runtime integration
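    The first route, the direct OpenVINO Python API, can be sketched as below. To keep the example self-contained, it builds a tiny one-operation model in memory rather than loading a converted network file; the model is purely a placeholder, and in practice you would use `core.read_model()` on a converted model:

    ```python
    import numpy as np
    import openvino as ov
    from openvino.runtime import opset13 as ops

    # Placeholder model: a single ReLU op built in memory.
    # In real use: model = core.read_model("model.xml")
    x = ops.parameter([1, 4], ov.Type.f32, name="x")
    tiny = ov.Model([ops.relu(x)], [x], "tiny_relu")

    core = ov.Core()
    compiled = core.compile_model(tiny, "CPU")  # explicit device target

    # Run inference: negative inputs are clamped to zero by ReLU
    out = compiled(np.array([[-1.0, 0.0, 2.0, -3.0]], dtype=np.float32))[0]
    ```

    The PyTorch 2.0 route wraps an existing `nn.Module` via `torch.compile(model, backend="openvino")`, and the ONNX route plugs OpenVINO in as an ONNX Runtime execution provider; both leave your existing training code untouched.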
  • The NPU provides power-efficient AI execution, drawing roughly 12–19 W versus around 40 W for comparable CPU/GPU operation

  • Auto-device selection feature automatically chooses optimal hardware (CPU/GPU/NPU) for specific workloads
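    A minimal sketch of auto-device selection: compiling against the "AUTO" pseudo-device lets the runtime choose among whatever devices the machine actually exposes (the one-op model here is again just a stand-in for a real network):

    ```python
    import numpy as np
    import openvino as ov
    from openvino.runtime import opset13 as ops

    core = ov.Core()
    # Lists the inference devices present on this machine, e.g. ["CPU", "GPU", "NPU"]
    print(core.available_devices)

    # Stand-in model; any converted network would do
    x = ops.parameter([1, 2], ov.Type.f32, name="x")
    model = ov.Model([ops.relu(x)], [x], "demo")

    # "AUTO" defers the CPU/GPU/NPU choice to OpenVINO's device-selection logic
    compiled = core.compile_model(model, "AUTO")
    result = compiled(np.array([[1.5, -0.5]], dtype=np.float32))[0]
    ```

    This falls back gracefully: on a machine with only a CPU, "AUTO" still compiles and runs, which keeps the same code portable across AI PCs and ordinary laptops.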

  • Different AI tasks can run simultaneously on different processors:

    • Conventional AI (object detection, pose estimation) on CPU/NPU
    • Generative AI (text-to-image, LLMs) on GPU
  • Model optimization through OpenVINO (e.g., weight quantization) can sharply reduce memory requirements — in the talk's example, a 25 GB model shrinks to about 4 GB
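    The memory savings come largely from storing weights in fewer bits; a back-of-the-envelope calculation (the parameter count below is illustrative, not a figure from the talk) shows why going from 32-bit floats to 4-bit integers cuts the footprint by roughly 8x:

    ```python
    params = 6.7e9  # illustrative ~7B-parameter LLM
    GIB = 2**30

    fp32_gib = params * 4 / GIB    # 4 bytes per fp32 weight
    int4_gib = params * 0.5 / GIB  # 0.5 bytes per int4 weight

    print(f"fp32: {fp32_gib:.1f} GiB, int4: {int4_gib:.1f} GiB")
    # ~25 GiB down to ~3 GiB, in line with the 25 GB -> 4 GB figure above
    ```

    Activations, KV cache, and quantization metadata add overhead on top of raw weights, which is why real-world numbers land a bit above the pure-weights estimate.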

  • Open source implementation allows deployment across various hardware platforms (Intel, ARM) and operating systems

  • Supports both traditional AI (prediction-based) and generative AI workloads locally without cloud dependencies