What this site is for
AI Attacks covers offensive AI security from a working practitioner's perspective. Here's what we publish.
AI Attacks exists to cover offensive AI security with the same rigor a working AI red teamer would expect — and the same honesty about what does and doesn’t land in production.
What we publish:
Technical writeups of working attacks. Prompt injection variants, jailbreak techniques and the model behaviors they exploit, indirect injection through retrieved content, multi-modal attack chains, agent and tool-use abuse. Where possible, reproducible PoCs against open models. Closed models get attack patterns and behavioral analysis.
Adversarial ML, applied. Membership inference, model extraction, evasion attacks, training-data extraction, backdoors — focused on what’s exploitable in deployed systems, not theoretical bounds.
Red team methodology. Scoping AI engagements, building attack libraries, communicating findings to a model team that doesn’t speak security and a security team that doesn’t speak ML.
Tooling reviews. Honest takes on the offensive AI security tooling landscape — Garak, PyRIT, promptmap, the LLM-specific scanners — and what each is actually good for.
What we don’t publish:
- Press release rewrites
- Listicles
- Anything we can’t source to primary material
Bylines are pseudonymous. The work is the point. Tips, attack reports, and corrections to the editor.
Real content starts shortly.
AI Attacks — in your inbox
Practitioner-grade AI red team techniques and tooling. — delivered when there's something worth your inbox.
No spam. Unsubscribe anytime.
Related
OWASP Top 10 LLM Explained: Every Entry, What It Means, and What to Fix
The OWASP Top 10 for LLM Applications 2025 is the canonical vulnerability taxonomy for production AI systems. Here is every entry, what it means in practice, and the highest-return mitigations.
Evasion Attacks on Production Classifiers: Malware, Spam, and Fraud
Deployed ML classifiers in malware, spam, and fraud detection face evasion attacks where the attacker has a clear payoff. How the attacks work against real systems, why black-box transfer is the practical threat, and what actually raises the cost of evasion.
Poisoning Web-Scale Training Sets: Split-View and Frontrunning
You don't need to control a model's training pipeline to poison it — you only need to control content the crawler will fetch. How split-view and frontrunning poisoning work against web-scale datasets, and the integrity controls that defend the pipeline.