Tag
#adversarial-ml
6 posts tagged adversarial-ml.
- adversarial-ml
Adversarial Examples Against Vision Models in 2025
Where physical-world adversarial patches and digital attacks stand against modern vision models — what still works, what's been hardened, and where the research frontier is.
- attack-patterns
Jailbreaking Multimodal Models: Visual Prompt Injection Attacks
How attackers use images, typography, and adversarial visual inputs to bypass safety guardrails in GPT-4V, Claude, and Gemini — and why multimodal inputs fundamentally expand the jailbreak attack surface.
- adversarial-ml
Model Extraction via Black-Box Query Attacks
How attackers reconstruct private model weights and decision boundaries through query-only access — the techniques, the economics, and what extracted models are actually used for.
- attack-patterns
Supply Chain Attacks on AI Models: Poisoning and Backdoors
How attackers compromise AI models before they reach production — through malicious fine-tuning, dataset poisoning, serialization exploits, and the unique risks of public model registries like Hugging Face Hub.
- adversarial-ml
Model Inversion and Membership Inference: Extracting LLM Data
How membership inference attacks determine whether specific data was used to train a model, and how model inversion techniques reconstruct private training examples from gradient signals and output distributions.
- adversarial-ml
Training Data Poisoning and Backdoor Attacks on LLMs
A technical deep dive into how adversaries manipulate training datasets and introduce hidden backdoors into LLMs, covering poisoning mechanics, stealthy trigger design, and why standard evaluations miss these attacks.