Research Paper

Multi-Layer Prompt Injection Detection for Production AI Systems

Bart Luttels · March 2026 · LTech Consultancy
  • F1 score: 0.998
  • Latency: 23ms
  • RAM: 355MB
  • Languages: 48+

As AI agents gain access to sensitive data and real-world actions, prompt injection attacks have emerged as a critical security risk. This paper presents a multi-layer detection pipeline designed for production deployment, achieving state-of-the-art accuracy with minimal resource requirements.

Our approach achieves an F1 score of 0.998 with a median latency of 23ms, running on CPU with only 355MB of RAM. This makes it practical for real-world deployment without GPU infrastructure.

Production-Ready Detection

Unlike academic solutions requiring GPU clusters, our pipeline runs efficiently on standard CPU hardware. This enables deployment as middleware, browser extensions, or embedded filters without infrastructure overhead.

  • Four independent detection layers with complementary strengths
  • Neural component with fewer than 30 million parameters
  • Support for 48+ languages out of the box
  • Deterministic, reproducible results
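
To illustrate the idea of independent layers with complementary strengths, here is a minimal sketch. It is not the paper's pipeline: the two layers shown (`pattern_layer` and `hidden_char_layer`), the regular expression, and the any-layer-flags decision rule are all assumptions chosen for illustration, and a real deployment would combine more layers, including the neural component.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LayerResult:
    layer: str
    flagged: bool
    reason: str = ""


# Hypothetical pattern: a single well-known override phrase.
OVERRIDE_PATTERN = re.compile(
    r"ignore (all |any )?(previous|prior|above) instructions"
)


def normalize(text: str) -> str:
    """Fold compatibility characters and case so simple obfuscation is undone."""
    return unicodedata.normalize("NFKC", text).casefold()


def pattern_layer(text: str) -> LayerResult:
    """Illustrative layer 1: match a known override phrase after normalization."""
    match = OVERRIDE_PATTERN.search(normalize(text))
    reason = match.group(0) if match else ""
    return LayerResult("pattern", match is not None, reason)


def hidden_char_layer(text: str) -> LayerResult:
    """Illustrative layer 2: flag invisible format characters (Unicode category Cf).

    This is a crude heuristic; legitimate text can also contain such characters.
    """
    hidden = [ch for ch in text if unicodedata.category(ch) == "Cf"]
    reason = f"{len(hidden)} format character(s)" if hidden else ""
    return LayerResult("hidden_chars", bool(hidden), reason)


Layer = Callable[[str], LayerResult]
DEFAULT_LAYERS: List[Layer] = [pattern_layer, hidden_char_layer]


def detect(text: str, layers: List[Layer] = DEFAULT_LAYERS) -> List[LayerResult]:
    """Run every layer independently; no layer sees another layer's output."""
    return [layer(text) for layer in layers]


def is_injection(text: str, layers: List[Layer] = DEFAULT_LAYERS) -> bool:
    """Assumed decision rule: block if any single layer flags the input."""
    return any(result.flagged for result in detect(text, layers))
```

Because each layer is a pure function of the input text, the same input always yields the same verdict, which is one simple way to obtain deterministic, reproducible results.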

Real-World Attack Vectors

The paper examines prompt injection in practical contexts: hidden instructions in documents, malicious email content, compromised web pages, and adversarial inputs targeting AI coding assistants.

We demonstrate that configuration-based defenses (system prompts, allowlists) are insufficient against motivated attackers, necessitating dedicated detection infrastructure.

Integration Patterns

The paper discusses practical deployment as:

  • Pre-processing hooks for Claude Code and similar tools
  • Transparent proxy for OpenAI-compatible APIs
  • Browser extension for GitHub Copilot protection
  • Inlet filter for Open WebUI deployments
  • Middleware for custom LLM gateway architectures
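
As a rough sketch of the middleware pattern, the function below checks each message of an OpenAI-style chat payload before passing it on. The names `guard_request`, `InjectionBlocked`, `detector`, and `forward` are hypothetical and not taken from the paper or from any specific gateway; the payload shape (a `messages` list of `role`/`content` dictionaries) is assumed.

```python
from typing import Any, Callable, Dict


class InjectionBlocked(Exception):
    """Raised when a message is flagged before reaching the model."""


def guard_request(
    payload: Dict[str, Any],
    detector: Callable[[str], bool],
    forward: Callable[[Dict[str, Any]], Any],
) -> Any:
    """Scan every string message in the payload, then forward it unchanged.

    `detector` is any callable that returns True for suspected injections;
    `forward` is whatever sends the request to the upstream model.
    """
    for index, message in enumerate(payload.get("messages", [])):
        content = message.get("content")
        # Only plain-string content is inspected in this sketch.
        if isinstance(content, str) and detector(content):
            role = message.get("role", "unknown")
            raise InjectionBlocked(f"message {index} (role={role}) was flagged")
    return forward(payload)
```

Raising an exception instead of silently dropping the message leaves the policy decision (reject, log, or ask for human review) to the surrounding gateway.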

@article{luttels2026promptguard,
  title={Multi-Layer Prompt Injection Detection for Production AI Systems},
  author={Luttels, Bart},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2026}
}