We can't find the internet
Attempting to reconnect
Something went wrong!
Hang in there while we get back on track
From Text to Flaws: vulnerabilities in applications with Generative AI and LLMs - Paul Molin
Learn how attackers exploit LLM vulnerabilities like prompt injection and data leakage, and discover key defense techniques to build more secure AI applications.
-
LLMs are both powerful and gullible - they can be manipulated through carefully crafted prompts while appearing to follow instructions
-
Key vulnerabilities in LLM applications:
- Prompt injections allowing attackers to manipulate application behavior
- Indirect data leakage through summarization tasks
- Code execution risks when LLMs generate executable code
- Multi-modal vulnerabilities through hidden text in images
- Information extraction from custom GPTs
-
Primary defense techniques:
- Dual LLMs pattern - using separate privileged and quarantined instances
- Preflight prompts to validate inputs
- Vector embeddings to detect malicious prompts
- Escaping and sanitizing user inputs
- Canary tokens for detecting data exfiltration
-
Best practices:
- Limit LLM access to sensitive tools/APIs
- Validate and sanitize all user inputs
- Use established libraries for input handling
- Implement monitoring and logging
- Design for minimum blast radius
-
Additional challenges:
- Cost implications of security measures
- Difficulty distinguishing malicious from legitimate inputs
- Balancing security with functionality
- Handling multimodal inputs safely
- Managing context length limitations
-
LLM applications require considering both traditional web security and novel AI-specific attack vectors