We can't find the internet
Attempting to reconnect
Something went wrong!
Hang in there while we get back on track
Words as weapons: The dark arts of Prompt Engineering by Jeroen Egelmeers
Learn about AI security risks and prompt injection attacks, including social engineering of models, bypassing restrictions, and best practices for secure AI implementation.
-
Prompt injection can trick AI models by inserting hidden instructions in text, images or system prompts to bypass guardrails and restrictions
-
Social engineering tactics work on AI models similar to humans - models can be manipulated through emotional appeals and misdirection
-
Many companies are implementing AI systems without proper security considerations, like using LLMs to automatically process invoices or scan CVs without human oversight
-
System prompts and guardrails can be bypassed through techniques like:
- Using ASCII art or white text to hide restricted words
- Confusing the model by rephrasing banned topics
- Overflowing context windows with large amounts of text
- Injecting contradictory instructions
-
Custom GPTs and public AI interfaces pose security risks as they may leak sensitive information or be manipulated through prompt injection
-
Critical security practices when using AI:
- Always have human oversight/verification
- Don’t automate sensitive processes entirely
- Carefully validate AI system outputs
- Consider data privacy when using public AI tools
- Implement proper guardrails and restrictions
-
The rapid evolution of AI technology means security measures need constant updating as new vulnerabilities are discovered
-
Companies should thoroughly test AI systems for potential exploits before deployment in production
-
Proper prompt engineering knowledge is essential for both implementing AI safely and defending against adversarial prompts
-
Educational understanding of adversarial prompting helps developers build more secure AI systems