Deploying read/write Autonomous Agents to the public internet creates a massive threat vector. If an LLM has access to a SQL database and a user executes a prompt injection attack, the results can be catastrophic. This playbook details our layered Zero-Trust defense architecture for Agentic GenAI.
1. Anatomy of a Prompt Injection Attack
A standard web application is protected by parameterized queries. LLMs, however, operate on natural language. An attacker can use social engineering against the machine itself to "forget previous instructions" and force the execution of unintended actions.
2. The Layered Defense Architecture
Relying purely on "system prompts" to stop injections is a failing strategy. An LLM's stochastic nature means it can be bypassed. We implement strict defense-in-depth methodologies.
3. Multi-Model Guardrails
3.1 Input Filtering (The Shield)
Before the core logic LLM sees a prompt, it is routed through an "Input Guardrail" model. This is generally a smaller, faster model (like Llama 3 8B or a dedicated classifier) trained explicitly to detect jailbreak attempts, toxic phrasing, and prompt overrides. If the guardrail flags the prompt, the request is immediately dropped before costly core inference occurs.
3.2 Output Data Loss Prevention (DLP)
Similarly, all output generated by the Agent must pass through an Output Guardrail. This model scans the generated text via RegEx and neural classifiers to detect Data Exfiltration. If the system accidentally generates un-masked PII, Credit Card Data, or internal API keys, the Output Guardrail aggressively redacts it before it reaches the client.
4. IAM Constraints and Database Sandboxing
Even if the LLM goes rogue, its blast radius must be surgically constrained.
- Role-Based Access Control (RBAC): The LangChain agent running in AWS ECS operates under an IAM Role with strict least-privilege policies. It cannot provision infrastructure.
- Database Sandboxing: Agents executing natural language to SQL queries are restricted to highly specific views within the database. They operate under a read-only DB user role. Destructive mutations (INSERT, UPDATE, DELETE) are explicitly blocked at the database engine level, not just the LLM level.