Multi-Agent Poisoning


NYU Shanghai · Jan 2026 – Present · with Shenhua Shi & Lorenzo Xiao · advised by Prof. Hua Shen

From Poisoned Agents to Violated Users: Cascading Attacks and Autonomy Boundary Violations in Multi-Agent LLM Systems. Targeting NeurIPS / EMNLP 2026.

Intro

Multi-agent LLM systems (MAS) are landing in production: autonomous dev pipelines, clinical decision assistants, financial advisors. When one sub-agent is compromised, the damage doesn't stop at a wrong answer; it can cascade downstream and end up overriding the user's authority, with budgets exceeded, consent skipped, and irreversible actions taken.

Research goals

Move the safety conversation from point-failure (one agent breaks) to cascading propagation (the whole pipeline).
Introduce Autonomy Boundary Violations (ABVs) as a failure mode distinct from capability failures and harmful content.
Build a unified framework that links attack propagation → user-facing violations → defense evaluation.

Research questions

RQ1: How do attack vectors differ in how they propagate through the MAS pipeline?
RQ2: Under what conditions does a compromised sub-agent cause the system to violate user authority?
RQ3: Which defenses account for propagation, not just entry-point hardening?

Plan

A PRISMA-lite literature review plus a controlled experimental sweep across four MAS topologies (sequential, star, mesh, hierarchical) and four attack vectors (prompt injection, memory poisoning, tool-output spoofing, role impersonation). The sweep measures attack success rate (ASR), persistence, propagation depth, and ABV rate; candidate defenses are evaluated in a final phase.
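
To make the shape of the sweep concrete, here is a minimal Python sketch that enumerates the 4 × 4 grid of topology and attack-vector conditions and computes a purely structural bound on how far a compromise could travel in each topology. Everything beyond the topology and attack names is an assumption for illustration: the four-agent size, the edge wiring in `build_topology`, and the reading of "propagation depth" as the maximum hop count reachable from the compromised agent are hypothetical, not the project's actual setup or metric definition.

```python
from collections import deque
from itertools import product

TOPOLOGIES = ("sequential", "star", "mesh", "hierarchical")
ATTACKS = ("prompt_injection", "memory_poisoning",
           "tool_output_spoofing", "role_impersonation")


def build_topology(name, n=4):
    """Return a directed adjacency dict {agent: [downstream agents]}.

    The wiring below is an assumed, simplified reading of each topology.
    """
    agents = range(n)
    if name == "sequential":
        # 0 -> 1 -> 2 -> ... (a simple chain)
        return {i: ([i + 1] if i + 1 < n else []) for i in agents}
    if name == "star":
        # Agent 0 is the hub; spokes send their results back to the hub.
        return {0: list(range(1, n)), **{i: [0] for i in range(1, n)}}
    if name == "mesh":
        # Every agent can message every other agent.
        return {i: [j for j in agents if j != i] for i in agents}
    if name == "hierarchical":
        # Binary tree: agent i delegates to agents 2i+1 and 2i+2.
        return {i: [c for c in (2 * i + 1, 2 * i + 2) if c < n] for i in agents}
    raise ValueError(f"unknown topology: {name}")


def structural_depth(graph, source):
    """Max shortest-path hop count from `source` to any reachable agent (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


# One experimental condition per (topology, attack vector) pair.
conditions = list(product(TOPOLOGIES, ATTACKS))

if __name__ == "__main__":
    for topo in TOPOLOGIES:
        depth = structural_depth(build_topology(topo), source=0)
        print(f"{topo}: structural depth from agent 0 = {depth}")
    print(f"{len(conditions)} conditions in the sweep")
```

The structural depth is only an upper bound on what an attack could reach; measured propagation depth would come from observing which downstream agents actually act on the poisoned content in each run.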
