From Text to Flaws: vulnerabilities in applications with Generative AI and LLMs - Paul Molin

Security Ai

Learn how attackers exploit LLM vulnerabilities like prompt injection and data leakage, and discover key defense techniques to build more secure AI applications.

Key takeaways

LLMs are both powerful and gullible - they can be manipulated through carefully crafted prompts while appearing to follow instructions
Key vulnerabilities in LLM applications:
- Prompt injections allowing attackers to manipulate application behavior
- Indirect data leakage through summarization tasks
- Code execution risks when LLMs generate executable code
- Multi-modal vulnerabilities through hidden text in images
- Information extraction from custom GPTs
Primary defense techniques:
- Dual LLMs pattern - using separate privileged and quarantined instances
- Preflight prompts to validate inputs
- Vector embeddings to detect malicious prompts
- Escaping and sanitizing user inputs
- Canary tokens for detecting data exfiltration
Best practices:
- Limit LLM access to sensitive tools/APIs
- Validate and sanitize all user inputs
- Use established libraries for input handling
- Implement monitoring and logging
- Design for minimum blast radius
Additional challenges:
- Cost implications of security measures
- Difficulty distinguishing malicious from legitimate inputs
- Balancing security with functionality
- Handling multimodal inputs safely
- Managing context length limitations
LLM applications require considering both traditional web security and novel AI-specific attack vectors

From Text to Flaws: vulnerabilities in applications with Generative AI and LLMs - Paul Molin

More talks