AI Attacks
site

What this site is for

AI Attacks covers offensive AI security from a working practitioner's perspective. Here's what we publish.

By Editorial ·

AI Attacks exists to cover offensive AI security with the same rigor a working AI red teamer would expect — and the same honesty about what does and doesn’t land in production.

What we publish:

Technical writeups of working attacks. Prompt injection variants, jailbreak techniques and the model behaviors they exploit, indirect injection through retrieved content, multi-modal attack chains, agent and tool-use abuse. Where possible, reproducible PoCs against open models. Closed models get attack patterns and behavioral analysis.

Adversarial ML, applied. Membership inference, model extraction, evasion attacks, training-data extraction, backdoors — focused on what’s exploitable in deployed systems, not theoretical bounds.

Red team methodology. Scoping AI engagements, building attack libraries, communicating findings to a model team that doesn’t speak security and a security team that doesn’t speak ML.

Tooling reviews. Honest takes on the offensive AI security tooling landscape — Garak, PyRIT, promptmap, the LLM-specific scanners — and what each is actually good for.

What we don’t publish:

  • Press release rewrites
  • Listicles
  • Anything we can’t source to primary material

Bylines are pseudonymous. The work is the point. Tips, attack reports, and corrections to the editor.

Real content starts shortly.

Subscribe

Practitioner-grade AI red team techniques and tooling. — delivered when there's something worth your inbox.

No spam. Unsubscribe anytime.