AI Agents vs Humans: Who Wins at Web Hacking in 2026? - wiz.io
AI agents, including Claude Sonnet 4.5, GPT-5, and Gemini 2.5 Pro, solved 9 of 10 lab challenges that simulated real-world web application vulnerabilities, and did so at minimal cost. The successful exploits included authentication bypass, IDOR, stored XSS, S3 bucket takeover, and AWS IMDS SSRF, highlighting the agents' capacity for multi-step reasoning and rapid pattern recognition.
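To make one of the listed vulnerability classes concrete, here is a minimal sketch of an IDOR (insecure direct object reference) and its usual fix. It is not taken from the report's lab challenges; the `INVOICES` data, the user names, and both function names are hypothetical and exist only for illustration.

```python
# Hypothetical in-memory records; not drawn from the report's challenges.
INVOICES = {
    101: {"owner": "alice", "total": 40},
    102: {"owner": "bob", "total": 75},
}


def get_invoice_vulnerable(current_user: str, invoice_id: int) -> dict:
    """IDOR: trusts the client-supplied ID and never checks ownership."""
    return INVOICES[invoice_id]


def get_invoice_checked(current_user: str, invoice_id: int) -> dict:
    """Returns the record only if it exists and belongs to the caller."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice["owner"] != current_user:
        # Same error for "missing" and "not yours" so IDs cannot be probed.
        raise PermissionError("invoice not found or not owned by caller")
    return invoice
```

In the vulnerable version, a user logged in as "alice" can read invoice 102 simply by changing the ID in the request. The checked version rejects that lookup because the owner does not match the caller.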
Source: Original Report ↗