The New Era of AI Cyberattacks: How Agent-Aware Cloaking Weaponizes ChatGPT Atlas for Disinformation

Researchers uncover critical vulnerability allowing manipulation of AI browsers through specially crafted content

The world is facing a fundamentally new type of cyberattack that exploits not code, but the very logic of artificial intelligence operation. Agent-aware cloaking technology uses AI browsers like OpenAI’s ChatGPT Atlas to deliver misleading content that can poison the information AI systems ingest, potentially manipulating decisions in hiring, commerce, and reputation management.

By detecting AI crawlers through user-agent headers, websites can deliver altered pages that appear benign to humans but toxic to AI agents, turning retrieval-based AI systems into unwitting vectors for misinformation.

The Scale of the Problem: 2025 Statistics

The threat of prompt injections and AI manipulation has reached critical proportions:

93% of security leaders are preparing for daily AI attacks in 2025, while 66% of surveyed organizations predict that AI will have the most significant impact on cybersecurity this year.

The specific numbers are even more alarming:

  • Out of 1.8 million prompt injection attacks in a public AI agent red-teaming competition, over 60,000 succeeded in causing policy violations (data access, illicit actions)
  • Across approximately 3,000 U.S. companies using AI agents, there are an average of 3.3 AI agent security incidents per day in 2025, with 1.3 per day tied to prompt injection or agent abuse
  • From July to August 2025 alone, several LLM data leakage incidents related to prompt injection resulted in massive breaches of sensitive data, including user chat records, credentials, and third-party application data

Table 1: AI Attack Statistics in 2025

MetricValueSource
Organizations expecting daily AI attacks93%Trend Micro
Successful prompt injections out of 1.8M attempts60,000+Public competition
Average success rate of attacks3.33%Calculated data
AI security incidents per day (US)3.3Study of 3,000 companies
Prompt injection incidents per day1.3Same sample
Organizations using generative AI65%McKinsey 2024
Confirmed AI-related breaches (YoY increase)+49%Industry reports

OpenAI Atlas Technology: The Double-Edged Sword of Innovation

Atlas from OpenAI, launched in October 2025, is a Chromium-based browser that integrates ChatGPT for seamless web navigation, search, and automated tasks. It enables AI to browse live webpages and access personalized content, making it a powerful tool for users but a vulnerable entry point for attacks.

The Evolution of Cloaking

Traditional cloaking deceived search engines by showing optimized content to crawlers, but agent-aware cloaking targets AI-specific agents like Atlas, ChatGPT, Perplexity, and Claude.

Expert Opinion:

Perplexity’s security team published a blog post on prompt injection attacks, noting that the problem is so severe that “it demands rethinking security from the ground up.” The blog continues to note that prompt injection attacks “manipulate the AI’s decision-making process itself, turning the agent’s capabilities against its user.”

How AI Crawlers See a Different Internet

A simple server rule—”if user-agent equals ChatGPT-User, serve fake page”—can reshape AI outputs without hacking, relying solely on content manipulation.

Table 2: Traditional vs. Agent-Aware Cloaking Comparison

CharacteristicTraditional CloakingAgent-Aware Cloaking
TargetSearch engines (Google, Bing)AI agents (ChatGPT, Claude, Perplexity)
Detection MethodIP addresses, search engine user-agentsAI crawler user-agents
Attack ComplexityMediumLow
Code RequiredYesNo (text only)
Impact ScaleSEO rankingsAI decision-making
Detection DifficultyRelatively easyVery difficult
ConsequencesLost rankingsDecision manipulation, data leaks
PreventionStandard techniquesRequires new approaches

Real-World Experiments: SPLX Case Studies

SPLX researchers demonstrated this vulnerability through controlled experiments on sites that differentiate between human and AI requests.

Case 1: Reputation Sabotage

SPLX created zerphina.xyz, a portfolio for the fictional Zerphina Quortane, a Portland-based designer blending AI and creativity.

What Humans See:

  • Professional biography
  • Clean layouts
  • Positive project highlights
  • No suspicious elements

What AI Agents See:

When accessed by AI agents like Atlas identified via user-agents such as “ChatGPT-User” or “PerplexityBot,” the server serves a damning alternate narrative portraying Zerphina as a “notorious product saboteur” riddled with ethical lapses and failures.

Atlas and similar tools reproduced this poisoned profile without verification, confidently labeling her unreliable and unhirable in summaries. Detection lagged, as neither ChatGPT nor Perplexity cross-checked inconsistencies, underscoring gaps in provenance validation.

Case 2: Hiring Process Manipulation

SPLX’s second test targeted recruitment, simulating a job evaluation with five fictional candidates’ resumes on hosted pages.

Table 3: Candidate Evaluation Manipulation Results

CandidateAI Score (with cloaking)AI Score (without cloaking)Difference
Natalie Carter88/10026/100-62 points
Jessica Morales78/10085/100+7 points
Other candidates70-75/10080-90/100Varies
Average manipulation impact±40 points

Attack Methodology:

For candidate Natalie Carter, the server was rigged to detect AI crawlers and inflate her resume with exaggerated titles, leadership claims, and tailored achievements appealing to algorithmic scoring.

Result:

When Atlas retrieved the pages, it ranked Natalie highest at 88/100, far above others like Jessica Morales at 78. In contrast, using human-visible resumes loaded locally—bypassing user-agent tricks—dropped her to 26/100, flipping the leaderboard entirely.

This shift demonstrates how cloaked content injects retrieval bias into decision-making, affecting hiring tools, procurement, or compliance systems.

Security Rankings: Most Critical AI Vulnerabilities

OWASP (Open Worldwide Application Security Project) ranked prompt injection as the number one security risk in its 2025 OWASP Top 10 for LLM Applications report, describing it as a vulnerability that can manipulate LLMs through adversarial inputs.

Table 4: OWASP Top 10 LLM Security Risks (2025)

RankThreat TypeSeverity LevelExploitation Difficulty
1Prompt InjectionCriticalLow
2Insecure Output HandlingHighMedium
3Training Data PoisoningHighHigh
4Model Denial of ServiceMediumMedium
5Supply Chain VulnerabilitiesHighMedium
6Sensitive Information DisclosureMediumLow
7Insecure Plugin DesignHighMedium
8Excessive AgencyMediumLow
9OverrelianceMediumVery Low
10Model TheftMediumHigh

Critical Incidents of 2025

CVE-2025-32711, which affected Microsoft 365 Copilot, has a CVSS score of 9.3, indicating high severity. Exploitation of this vulnerability, which involved AI command injection, could have potentially allowed an attacker to steal sensitive data over a network. Microsoft publicly disclosed and patched it in June.

Table 5: Major AI Security Incidents (2025)

DateIncidentCVECVSS ScoreConsequences
June 2025Microsoft 365 CopilotCVE-2025-327119.3Network data theft
July-Aug 2025LLM Data LeaksMultipleN/AChat logs, credentials leaked
January 2025Cursor IDECVE-2025-54135, CVE-2025-54136N/ARemote code execution
February 2025Google GeminiN/ALowLong-term memory manipulation
July 2025X’s Grok4N/AN/ASuccessful jailbreak
December 2024ChatGPT SearchN/AMediumHidden text manipulation

Types of Prompt Injections: Threat Classification

Direct prompt injections occur when user input directly alters the behavior of the model in unintended or unexpected ways. Indirect prompt injections occur when an LLM accepts input from external sources, such as websites or files.

Table 6: Prompt Injection Typology

Attack TypeMethodExampleComplexityEffectiveness
Direct InjectionDirect user input“Ignore all previous instructions”Very LowHigh
Indirect InjectionVia webpages/filesHidden text on websiteLowVery High
Hybrid AttackCombined with XSS/CSRFPrompt + JavaScript codeMediumCritical
Zero-Click AttackVia email/notificationsMalicious email in OutlookLowCritical
Multimodal InjectionInstructions in imagesHidden text in picturesMediumHigh
Template InjectionConfiguration manipulationModify system promptsHighCritical

Evolution of Threats: Prompt Injection 2.0

Prompt injection attacks, where malicious input is designed to manipulate AI systems into ignoring their original instructions and following unauthorized commands instead, were first discovered by Preamble, Inc. in May 2022 and responsibly disclosed to OpenAI.

Over the last three years, these attacks have remained a critical security threat for LLM-integrated systems. The emergence of agentic AI systems, where LLMs autonomously perform multistep tasks through tools and coordination with other agents, has fundamentally transformed the threat landscape.

Modern prompt injection attacks can now combine with traditional cybersecurity exploits to create hybrid threats that systematically evade traditional security controls.

Industry Expert Opinions

Stuart MacLellan, CTO of South London and Maudsley NHS Foundation Trust:

“There are still lots of questions around AI models and how they could and should be used. There’s a real risk in my world around sharing personal information. We’ve been helping with training, and we’re defining rules to make it known which data resides in a certain location and what happens to it in an AI model.”

Perplexity Security Team:

The problem is so severe that it “demands rethinking security from the ground up.” Prompt injection attacks “manipulate the AI’s decision-making process itself, turning the agent’s capabilities against its user.”

Defense Strategies: Multi-Layered Security Approach

To counter this threat, organizations must implement provenance signals for data origins, validate crawlers against known agents, and continuously monitor AI outputs.

Table 7: Recommended Defense Measures

Defense LayerMeasureEffectivenessImplementation Difficulty
Input LevelReal-time prompt filtering60-70%Medium
Source VerificationUser-agent validation50-60%Low
Cross-CheckingCompare with reference data70-80%High
Reputation SystemsBlock manipulative sources65-75%Medium
Model TestingRed teaming with AI tactics80-90%High
Logged-Out ModeOperate without authentication90-95%Low
Output MonitoringContinuous response analysis75-85%Medium
Multimodal ValidationImage/text consistency checks70-80%High

Technical Solutions and Countermeasures

OpenAI’s Approach:

OpenAI created “logged out mode,” in which the agent won’t be logged into a user’s account as it navigates the web. This limits the browser agent’s usefulness, but also limits how much data an attacker can access.

Perplexity’s Solution:

Perplexity reports it built a detection system that can identify prompt injection attacks in real time, though cybersecurity researchers note these safeguards don’t guarantee complete protection.

Table 8: Vendor Security Features Comparison

VendorPrimary DefenseSecondary DefenseReal-time DetectionLogged-Out Mode
OpenAI (Atlas)Logged-out modeInput filteringLimited✓ Yes
PerplexityReal-time detectionContent validation✓ YesPartial
Microsoft CopilotPolicy enforcementSandboxing✓ Yes✗ No
Google GeminiMemory notificationsUser interaction checksLimited✗ No
Anthropic (Claude)Constitutional AIContext validation✓ Yes✓ Yes

Market Growth and Threat Forecast

Bloomberg projects that the generative AI market will reach $1.3 trillion by 2032. With such a scale of adoption, the importance of protection against manipulation will only grow.

Table 9: AI Security Threat Growth Forecast

YearAI Market CapPredicted IncidentsExpected Damage
2025$300 billion16,200 confirmed attacks$2-3 billion
2027$600 billion35,000+ attacks$8-10 billion
2030$1 trillion75,000+ attacks$25-30 billion
2032$1.3 trillion120,000+ attacks$50+ billion

Industries Most at Risk

Table 10: Industry Vulnerability Assessment

IndustryRisk LevelPrimary ThreatAI Adoption RatePotential Impact
Financial ServicesCriticalData exfiltration78%$10B+ annual
HealthcareCriticalPatient data leaks62%$8B+ annual
Legal ServicesHighConfidential doc exposure54%$5B+ annual
Recruitment/HRHighBias injection71%$3B+ annual
E-commerceMedium-HighReview manipulation83%$6B+ annual
ManufacturingMediumIP theft45%$4B+ annual
EducationMediumAcademic fraud38%$1B+ annual

Real-World Impact Scenarios

Scenario 1: Corporate Espionage

A competitor uses agent-aware cloaking to poison AI research tools, causing a Fortune 500 company to make strategic decisions based on falsified market data. Estimated loss: $50-100 million.

Scenario 2: Political Manipulation

During an election cycle, AI-powered news aggregators are fed manipulated content about candidates, influencing voter perception without leaving traditional traces.

Scenario 3: Financial Fraud

AI-powered trading algorithms are fed false financial data through cloaked pages, triggering automated trades that benefit attackers. Market manipulation cost: $500 million+.

The Human Element

Table 11: User Awareness and Behavior

DemographicAI Trust LevelSecurity AwarenessVerification Habits
Gen Z (18-24)68% trust32% awareRarely verify
Millennials (25-40)54% trust48% awareSometimes verify
Gen X (41-56)41% trust61% awareOften verify
Boomers (57+)28% trust45% awareUsually verify
Tech Professionals35% trust87% awareAlways verify

Regulatory Response and Compliance

As of 2025, several jurisdictions are implementing AI security regulations:

  • EU AI Act: Mandatory risk assessments for high-risk AI systems
  • US Executive Orders: Federal agencies required to implement AI security frameworks
  • China’s AI Regulations: Strict content control and security measures
  • GDPR Extensions: New provisions for AI data processing

Table 12: Global Regulatory Landscape

RegionRegulation StatusEnforcement LevelPenalties
European UnionActiveStrictUp to 7% global revenue
United StatesIn developmentModerateCase-by-case
United KingdomConsultation phaseModerateTBD
ChinaActiveVery strictLicense revocation
JapanIn developmentLightAdvisory only

Best Practices for Organizations

  1. Implement Multi-Factor Verification: Never rely solely on AI-retrieved information for critical decisions
  2. Continuous Monitoring: Deploy 24/7 monitoring systems for AI agent behavior
  3. Red Team Exercises: Conduct regular adversarial testing with prompt injection scenarios
  4. Employee Training: Ensure staff understand AI manipulation risks
  5. Vendor Assessment: Evaluate AI service providers’ security measures
  6. Incident Response Plans: Develop specific protocols for AI security breaches

Emerging Technologies and Future Defenses

Researchers are exploring new architectures that could inherently block prompt injections in agentic systems, using strict information-flow controls to prevent an AI agent from ever outputting data it wasn’t authorized to access.

Industry standards are emerging, and major tech providers such as Microsoft are continually investing in more deterministic security features to stay ahead of attackers.

Conclusion: A New Reality of Digital Security

Agent-aware cloaking evolves classic SEO tactics into AI overview (AIO) threats, amplifying impacts on automated judgments like product rankings or risk assessments. Hidden prompt injections could even steer AI behaviors toward malware or data exfiltration.

As AI browsers like Atlas proliferate, defense measures will define the battle for web integrity. Organizations that fail to invest in multi-layered protection of AI systems now risk catastrophic consequences in the near future.

Key Takeaway: This is not a theoretical threat but a current reality requiring immediate action from every organization using AI technologies. The window for proactive defense is closing rapidly, and the cost of inaction grows exponentially with each passing quarter.

The question is no longer whether your organization will face AI manipulation attacks, but when—and whether you’ll be prepared to defend against them.