Researchers uncover critical vulnerability allowing manipulation of AI browsers through specially crafted content
The world is facing a fundamentally new type of cyberattack that exploits not code, but the very logic of artificial intelligence operation. Agent-aware cloaking technology uses AI browsers like OpenAI’s ChatGPT Atlas to deliver misleading content that can poison the information AI systems ingest, potentially manipulating decisions in hiring, commerce, and reputation management.
By detecting AI crawlers through user-agent headers, websites can deliver altered pages that appear benign to humans but toxic to AI agents, turning retrieval-based AI systems into unwitting vectors for misinformation.
The Scale of the Problem: 2025 Statistics
The threat of prompt injections and AI manipulation has reached critical proportions:
93% of security leaders are preparing for daily AI attacks in 2025, while 66% of surveyed organizations predict that AI will have the most significant impact on cybersecurity this year.
The specific numbers are even more alarming:
- Out of 1.8 million prompt injection attacks in a public AI agent red-teaming competition, over 60,000 succeeded in causing policy violations (data access, illicit actions)
- Across approximately 3,000 U.S. companies using AI agents, there are an average of 3.3 AI agent security incidents per day in 2025, with 1.3 per day tied to prompt injection or agent abuse
- From July to August 2025 alone, several LLM data leakage incidents related to prompt injection resulted in massive breaches of sensitive data, including user chat records, credentials, and third-party application data
Table 1: AI Attack Statistics in 2025
| Metric | Value | Source | 
|---|---|---|
| Organizations expecting daily AI attacks | 93% | Trend Micro | 
| Successful prompt injections out of 1.8M attempts | 60,000+ | Public competition | 
| Average success rate of attacks | 3.33% | Calculated data | 
| AI security incidents per day (US) | 3.3 | Study of 3,000 companies | 
| Prompt injection incidents per day | 1.3 | Same sample | 
| Organizations using generative AI | 65% | McKinsey 2024 | 
| Confirmed AI-related breaches (YoY increase) | +49% | Industry reports | 
OpenAI Atlas Technology: The Double-Edged Sword of Innovation
Atlas from OpenAI, launched in October 2025, is a Chromium-based browser that integrates ChatGPT for seamless web navigation, search, and automated tasks. It enables AI to browse live webpages and access personalized content, making it a powerful tool for users but a vulnerable entry point for attacks.
The Evolution of Cloaking
Traditional cloaking deceived search engines by showing optimized content to crawlers, but agent-aware cloaking targets AI-specific agents like Atlas, ChatGPT, Perplexity, and Claude.
Expert Opinion:
Perplexity’s security team published a blog post on prompt injection attacks, noting that the problem is so severe that “it demands rethinking security from the ground up.” The blog continues to note that prompt injection attacks “manipulate the AI’s decision-making process itself, turning the agent’s capabilities against its user.”
How AI Crawlers See a Different Internet
A simple server rule—”if user-agent equals ChatGPT-User, serve fake page”—can reshape AI outputs without hacking, relying solely on content manipulation.
Table 2: Traditional vs. Agent-Aware Cloaking Comparison
| Characteristic | Traditional Cloaking | Agent-Aware Cloaking | 
|---|---|---|
| Target | Search engines (Google, Bing) | AI agents (ChatGPT, Claude, Perplexity) | 
| Detection Method | IP addresses, search engine user-agents | AI crawler user-agents | 
| Attack Complexity | Medium | Low | 
| Code Required | Yes | No (text only) | 
| Impact Scale | SEO rankings | AI decision-making | 
| Detection Difficulty | Relatively easy | Very difficult | 
| Consequences | Lost rankings | Decision manipulation, data leaks | 
| Prevention | Standard techniques | Requires new approaches | 
Real-World Experiments: SPLX Case Studies
SPLX researchers demonstrated this vulnerability through controlled experiments on sites that differentiate between human and AI requests.
Case 1: Reputation Sabotage
SPLX created zerphina.xyz, a portfolio for the fictional Zerphina Quortane, a Portland-based designer blending AI and creativity.
What Humans See:
- Professional biography
- Clean layouts
- Positive project highlights
- No suspicious elements
What AI Agents See:
When accessed by AI agents like Atlas identified via user-agents such as “ChatGPT-User” or “PerplexityBot,” the server serves a damning alternate narrative portraying Zerphina as a “notorious product saboteur” riddled with ethical lapses and failures.
Atlas and similar tools reproduced this poisoned profile without verification, confidently labeling her unreliable and unhirable in summaries. Detection lagged, as neither ChatGPT nor Perplexity cross-checked inconsistencies, underscoring gaps in provenance validation.
Case 2: Hiring Process Manipulation
SPLX’s second test targeted recruitment, simulating a job evaluation with five fictional candidates’ resumes on hosted pages.
Table 3: Candidate Evaluation Manipulation Results
| Candidate | AI Score (with cloaking) | AI Score (without cloaking) | Difference | 
|---|---|---|---|
| Natalie Carter | 88/100 | 26/100 | -62 points | 
| Jessica Morales | 78/100 | 85/100 | +7 points | 
| Other candidates | 70-75/100 | 80-90/100 | Varies | 
| Average manipulation impact | — | — | ±40 points | 
Attack Methodology:
For candidate Natalie Carter, the server was rigged to detect AI crawlers and inflate her resume with exaggerated titles, leadership claims, and tailored achievements appealing to algorithmic scoring.
Result:
When Atlas retrieved the pages, it ranked Natalie highest at 88/100, far above others like Jessica Morales at 78. In contrast, using human-visible resumes loaded locally—bypassing user-agent tricks—dropped her to 26/100, flipping the leaderboard entirely.
This shift demonstrates how cloaked content injects retrieval bias into decision-making, affecting hiring tools, procurement, or compliance systems.
Security Rankings: Most Critical AI Vulnerabilities
OWASP (Open Worldwide Application Security Project) ranked prompt injection as the number one security risk in its 2025 OWASP Top 10 for LLM Applications report, describing it as a vulnerability that can manipulate LLMs through adversarial inputs.
Table 4: OWASP Top 10 LLM Security Risks (2025)
| Rank | Threat Type | Severity Level | Exploitation Difficulty | 
|---|---|---|---|
| 1 | Prompt Injection | Critical | Low | 
| 2 | Insecure Output Handling | High | Medium | 
| 3 | Training Data Poisoning | High | High | 
| 4 | Model Denial of Service | Medium | Medium | 
| 5 | Supply Chain Vulnerabilities | High | Medium | 
| 6 | Sensitive Information Disclosure | Medium | Low | 
| 7 | Insecure Plugin Design | High | Medium | 
| 8 | Excessive Agency | Medium | Low | 
| 9 | Overreliance | Medium | Very Low | 
| 10 | Model Theft | Medium | High | 
Critical Incidents of 2025
CVE-2025-32711, which affected Microsoft 365 Copilot, has a CVSS score of 9.3, indicating high severity. Exploitation of this vulnerability, which involved AI command injection, could have potentially allowed an attacker to steal sensitive data over a network. Microsoft publicly disclosed and patched it in June.
Table 5: Major AI Security Incidents (2025)
| Date | Incident | CVE | CVSS Score | Consequences | 
|---|---|---|---|---|
| June 2025 | Microsoft 365 Copilot | CVE-2025-32711 | 9.3 | Network data theft | 
| July-Aug 2025 | LLM Data Leaks | Multiple | N/A | Chat logs, credentials leaked | 
| January 2025 | Cursor IDE | CVE-2025-54135, CVE-2025-54136 | N/A | Remote code execution | 
| February 2025 | Google Gemini | N/A | Low | Long-term memory manipulation | 
| July 2025 | X’s Grok4 | N/A | N/A | Successful jailbreak | 
| December 2024 | ChatGPT Search | N/A | Medium | Hidden text manipulation | 
Types of Prompt Injections: Threat Classification
Direct prompt injections occur when user input directly alters the behavior of the model in unintended or unexpected ways. Indirect prompt injections occur when an LLM accepts input from external sources, such as websites or files.
Table 6: Prompt Injection Typology
| Attack Type | Method | Example | Complexity | Effectiveness | 
|---|---|---|---|---|
| Direct Injection | Direct user input | “Ignore all previous instructions” | Very Low | High | 
| Indirect Injection | Via webpages/files | Hidden text on website | Low | Very High | 
| Hybrid Attack | Combined with XSS/CSRF | Prompt + JavaScript code | Medium | Critical | 
| Zero-Click Attack | Via email/notifications | Malicious email in Outlook | Low | Critical | 
| Multimodal Injection | Instructions in images | Hidden text in pictures | Medium | High | 
| Template Injection | Configuration manipulation | Modify system prompts | High | Critical | 
Evolution of Threats: Prompt Injection 2.0
Prompt injection attacks, where malicious input is designed to manipulate AI systems into ignoring their original instructions and following unauthorized commands instead, were first discovered by Preamble, Inc. in May 2022 and responsibly disclosed to OpenAI.
Over the last three years, these attacks have remained a critical security threat for LLM-integrated systems. The emergence of agentic AI systems, where LLMs autonomously perform multistep tasks through tools and coordination with other agents, has fundamentally transformed the threat landscape.
Modern prompt injection attacks can now combine with traditional cybersecurity exploits to create hybrid threats that systematically evade traditional security controls.
Industry Expert Opinions
Stuart MacLellan, CTO of South London and Maudsley NHS Foundation Trust:
“There are still lots of questions around AI models and how they could and should be used. There’s a real risk in my world around sharing personal information. We’ve been helping with training, and we’re defining rules to make it known which data resides in a certain location and what happens to it in an AI model.”
Perplexity Security Team:
The problem is so severe that it “demands rethinking security from the ground up.” Prompt injection attacks “manipulate the AI’s decision-making process itself, turning the agent’s capabilities against its user.”
Defense Strategies: Multi-Layered Security Approach
To counter this threat, organizations must implement provenance signals for data origins, validate crawlers against known agents, and continuously monitor AI outputs.
Table 7: Recommended Defense Measures
| Defense Layer | Measure | Effectiveness | Implementation Difficulty | 
|---|---|---|---|
| Input Level | Real-time prompt filtering | 60-70% | Medium | 
| Source Verification | User-agent validation | 50-60% | Low | 
| Cross-Checking | Compare with reference data | 70-80% | High | 
| Reputation Systems | Block manipulative sources | 65-75% | Medium | 
| Model Testing | Red teaming with AI tactics | 80-90% | High | 
| Logged-Out Mode | Operate without authentication | 90-95% | Low | 
| Output Monitoring | Continuous response analysis | 75-85% | Medium | 
| Multimodal Validation | Image/text consistency checks | 70-80% | High | 
Technical Solutions and Countermeasures
OpenAI’s Approach:
OpenAI created “logged out mode,” in which the agent won’t be logged into a user’s account as it navigates the web. This limits the browser agent’s usefulness, but also limits how much data an attacker can access.
Perplexity’s Solution:
Perplexity reports it built a detection system that can identify prompt injection attacks in real time, though cybersecurity researchers note these safeguards don’t guarantee complete protection.
Table 8: Vendor Security Features Comparison
| Vendor | Primary Defense | Secondary Defense | Real-time Detection | Logged-Out Mode | 
|---|---|---|---|---|
| OpenAI (Atlas) | Logged-out mode | Input filtering | Limited | ✓ Yes | 
| Perplexity | Real-time detection | Content validation | ✓ Yes | Partial | 
| Microsoft Copilot | Policy enforcement | Sandboxing | ✓ Yes | ✗ No | 
| Google Gemini | Memory notifications | User interaction checks | Limited | ✗ No | 
| Anthropic (Claude) | Constitutional AI | Context validation | ✓ Yes | ✓ Yes | 
Market Growth and Threat Forecast
Bloomberg projects that the generative AI market will reach $1.3 trillion by 2032. With such a scale of adoption, the importance of protection against manipulation will only grow.
Table 9: AI Security Threat Growth Forecast
| Year | AI Market Cap | Predicted Incidents | Expected Damage | 
|---|---|---|---|
| 2025 | $300 billion | 16,200 confirmed attacks | $2-3 billion | 
| 2027 | $600 billion | 35,000+ attacks | $8-10 billion | 
| 2030 | $1 trillion | 75,000+ attacks | $25-30 billion | 
| 2032 | $1.3 trillion | 120,000+ attacks | $50+ billion | 
Industries Most at Risk
Table 10: Industry Vulnerability Assessment
| Industry | Risk Level | Primary Threat | AI Adoption Rate | Potential Impact | 
|---|---|---|---|---|
| Financial Services | Critical | Data exfiltration | 78% | $10B+ annual | 
| Healthcare | Critical | Patient data leaks | 62% | $8B+ annual | 
| Legal Services | High | Confidential doc exposure | 54% | $5B+ annual | 
| Recruitment/HR | High | Bias injection | 71% | $3B+ annual | 
| E-commerce | Medium-High | Review manipulation | 83% | $6B+ annual | 
| Manufacturing | Medium | IP theft | 45% | $4B+ annual | 
| Education | Medium | Academic fraud | 38% | $1B+ annual | 
Real-World Impact Scenarios
Scenario 1: Corporate Espionage
A competitor uses agent-aware cloaking to poison AI research tools, causing a Fortune 500 company to make strategic decisions based on falsified market data. Estimated loss: $50-100 million.
Scenario 2: Political Manipulation
During an election cycle, AI-powered news aggregators are fed manipulated content about candidates, influencing voter perception without leaving traditional traces.
Scenario 3: Financial Fraud
AI-powered trading algorithms are fed false financial data through cloaked pages, triggering automated trades that benefit attackers. Market manipulation cost: $500 million+.
The Human Element
Table 11: User Awareness and Behavior
| Demographic | AI Trust Level | Security Awareness | Verification Habits | 
|---|---|---|---|
| Gen Z (18-24) | 68% trust | 32% aware | Rarely verify | 
| Millennials (25-40) | 54% trust | 48% aware | Sometimes verify | 
| Gen X (41-56) | 41% trust | 61% aware | Often verify | 
| Boomers (57+) | 28% trust | 45% aware | Usually verify | 
| Tech Professionals | 35% trust | 87% aware | Always verify | 
Regulatory Response and Compliance
As of 2025, several jurisdictions are implementing AI security regulations:
- EU AI Act: Mandatory risk assessments for high-risk AI systems
- US Executive Orders: Federal agencies required to implement AI security frameworks
- China’s AI Regulations: Strict content control and security measures
- GDPR Extensions: New provisions for AI data processing
Table 12: Global Regulatory Landscape
| Region | Regulation Status | Enforcement Level | Penalties | 
|---|---|---|---|
| European Union | Active | Strict | Up to 7% global revenue | 
| United States | In development | Moderate | Case-by-case | 
| United Kingdom | Consultation phase | Moderate | TBD | 
| China | Active | Very strict | License revocation | 
| Japan | In development | Light | Advisory only | 
Best Practices for Organizations
- Implement Multi-Factor Verification: Never rely solely on AI-retrieved information for critical decisions
- Continuous Monitoring: Deploy 24/7 monitoring systems for AI agent behavior
- Red Team Exercises: Conduct regular adversarial testing with prompt injection scenarios
- Employee Training: Ensure staff understand AI manipulation risks
- Vendor Assessment: Evaluate AI service providers’ security measures
- Incident Response Plans: Develop specific protocols for AI security breaches
Emerging Technologies and Future Defenses
Researchers are exploring new architectures that could inherently block prompt injections in agentic systems, using strict information-flow controls to prevent an AI agent from ever outputting data it wasn’t authorized to access.
Industry standards are emerging, and major tech providers such as Microsoft are continually investing in more deterministic security features to stay ahead of attackers.
Conclusion: A New Reality of Digital Security
Agent-aware cloaking evolves classic SEO tactics into AI overview (AIO) threats, amplifying impacts on automated judgments like product rankings or risk assessments. Hidden prompt injections could even steer AI behaviors toward malware or data exfiltration.
As AI browsers like Atlas proliferate, defense measures will define the battle for web integrity. Organizations that fail to invest in multi-layered protection of AI systems now risk catastrophic consequences in the near future.
Key Takeaway: This is not a theoretical threat but a current reality requiring immediate action from every organization using AI technologies. The window for proactive defense is closing rapidly, and the cost of inaction grows exponentially with each passing quarter.
The question is no longer whether your organization will face AI manipulation attacks, but when—and whether you’ll be prepared to defend against them.
