Risk and Compliance in the Age of AI: Key Findings

Introduction

AI adoption has moved from pilot programs to core business infrastructure faster than most risk teams can track. McKinsey's 2025 State of AI report found that organizational AI use jumped from 55% in 2023 to 78% by mid-2024 — a pace that has left governance programs well behind.

That gap isn't a knowledge problem — executives know AI creates risk. What's missing is coverage: active monitoring, documented controls, and enforcement that actually reaches the AI systems running in production.

This post synthesizes key findings from recent AI risk and compliance research, identifies where organizations are falling short, and outlines what a credible response requires now.

The findings cover four areas: the confidence-coverage gap, the agentic AI attack surface, persistent failures around bias and transparency, and regulatory pressure that is already outrunning most internal programs.

Key Takeaways

Most organizations have governance policies on paper but lack runtime controls covering their actual AI deployments.
Agentic AI systems represent a threat surface that firewalls and DLP tools were never designed to see.
Bias, opacity, and training data exposure remain the most common compliance failures, despite years of awareness.
The EU AI Act is already in force and extraterritorial; waiting for U.S. federal clarity is not a compliant posture.
Runtime enforcement is where AI risk management has to land: controls that act at the moment of inference, not after a policy review.

Finding #1: The Confidence-Coverage Gap Is Widening

Executives Know the Risk — But Programs Aren't Keeping Up

The gap between perceived and actual AI security coverage is documented and wide. IBM's IBV research found that 96% of executives believe adopting generative AI makes a security breach likely within three years. A separate IBM IBV study found that only 24% of current generative AI projects are secured.

Near-universal concern, minimal secured coverage — that's the confidence-coverage gap in numbers, and it's getting harder to close.

McKinsey's data shows how few organizations review AI outputs before use: only 27% reviewed all generative AI content, and a similar share reviewed 20% or less. The WEF found that just 16% of enterprises are prepared for AI-enabled reinvention, while 74% report challenges adopting AI at scale.

[Figure: AI confidence-coverage gap statistics showing organizational readiness versus actual AI security]

What "Coverage" Actually Means

Having a governance policy or a data privacy program is not coverage. Coverage means:

Active monitoring at the model and agent level
Documented risk assessments per deployment
Enforcement controls that act before harm occurs
A centralized inventory that shows what AI systems are running and who owns them

Most organizations have a policy. Few have all four of these. That gap between policy and execution is precisely where exposure accumulates, and it is why organizations fall behind before they realize it.
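To make the four coverage criteria concrete, here is a minimal sketch of an inventory record that flags which criteria a deployment is missing. The class, field names, and sample systems are hypothetical and chosen only to mirror the list above; a real inventory would live in a governance or asset-management system.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical record shape; the fields mirror the four coverage criteria.
@dataclass
class AISystemRecord:
    name: str
    owner: Optional[str]         # None means nobody is accountable yet
    runtime_monitoring: bool     # active monitoring at the model/agent level
    risk_assessment_done: bool   # documented risk assessment for this deployment
    enforcement_controls: bool   # controls that act before harm occurs

def coverage_gaps(record: AISystemRecord) -> List[str]:
    """Return the coverage criteria this deployment does not yet meet."""
    gaps = []
    if not record.runtime_monitoring:
        gaps.append("runtime monitoring")
    if not record.risk_assessment_done:
        gaps.append("risk assessment")
    if not record.enforcement_controls:
        gaps.append("enforcement controls")
    if not record.owner:
        gaps.append("named owner")
    return gaps

# Illustrative inventory with made-up systems.
inventory = [
    AISystemRecord("support-chatbot", "cx-team", True, True, True),
    AISystemRecord("invoice-agent", None, False, True, False),
]

for rec in inventory:
    print(rec.name, coverage_gaps(rec) or "covered")
```

A system counts as covered only when the list of gaps is empty, which is the point of the section: a risk assessment on file does not offset missing monitoring or a missing owner.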

Why the Gap Keeps Growing

The organizational dynamic driving this is consistent: product and engineering teams deploy AI fast, while risk, legal, and compliance teams are brought in after the fact. Each unmonitored model or agent added without a risk assessment widens the gap.

The business consequences are concrete. The FTC banned Rite Aid from using AI facial recognition for five years after finding the company deployed the technology without reasonable safeguards — a reminder that enforcement doesn't wait for organizations to finish their governance roadmaps.

Finding #2: Agentic AI Has Introduced an Attack Surface Organizations Aren't Ready For

What Makes Agentic AI Structurally Different

Earlier AI deployments were relatively contained: a user sends a prompt, a model returns a response. Agentic AI operates on a fundamentally different model. These systems pursue goals autonomously — calling external APIs, querying retrieval systems via RAG, chaining decisions across multiple agents, and executing transactions without human review at each step.

OWASP describes this directly: agentic AI integration has expanded the autonomy, scale, and capabilities of AI systems, and the associated risks have expanded with them. Gartner's 2025 survey of 360 IT application leaders found 75% of organizations had piloted or deployed some AI agents, while only 15% were considering, piloting, or deploying fully autonomous agents — a gap that leaves most deployments without controls built for how these systems actually behave.

The Specific Threat Categories

The OWASP LLM Top 10 for 2025 identifies the attack categories most relevant to agentic deployments:

Prompt injection (LLM01) — hijacking agent instructions mid-task through adversarial inputs in user data or retrieved content
Data and model poisoning (LLM04) — corrupting retrieval stores so agents act on manipulated knowledge
Excessive agency (LLM06) — agents taking actions beyond their intended scope, calling unauthorized tools or triggering unintended transactions
Vector and embedding weaknesses (LLM08) — retrieval poisoning that corrupts what an agent "knows" from its RAG layer

[Figure: OWASP LLM Top 10 agentic AI threat categories, from prompt injection to retrieval poisoning]

These threats are structurally invisible to rule-based security stacks. Firewalls inspect network traffic; DLP tools scan known data patterns; code scanners check static files. None of them see what an agent decides to do at runtime.
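A control that does see agent decisions has to sit in the path of the tool call itself. The sketch below shows one simple form of that idea for the excessive agency category: a per-agent allowlist checked before any tool runs. The agent names, tool names, and the `authorize` function are assumptions made for illustration, not part of OWASP's guidance or any specific product.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical per-agent allowlists; a real deployment would load these from policy config.
ALLOWED_TOOLS = {
    "billing-agent": {"lookup_invoice", "send_receipt"},
    "support-agent": {"search_kb"},
}

@dataclass
class ToolCall:
    agent_id: str
    tool: str
    arguments: dict

def authorize(call: ToolCall) -> Tuple[bool, str]:
    """Decide, before execution, whether a tool call is within the agent's scope."""
    # Unknown agents get an empty allowlist, so they are denied by default.
    allowed = ALLOWED_TOOLS.get(call.agent_id, set())
    if call.tool in allowed:
        return True, "tool is on the agent's allowlist"
    return False, f"{call.tool!r} is not permitted for {call.agent_id!r}"

# A support agent attempting a refund is stopped before the tool runs.
ok, reason = authorize(ToolCall("support-agent", "issue_refund", {"amount": 500}))
print(ok, reason)
```

An allowlist only addresses which tools an agent may call. It does nothing about injected instructions or poisoned retrieval content, which need their own checks on inputs and retrieved documents.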

The Accountability Problem

When an agent takes an unauthorized action autonomously, attributing responsibility is harder than in traditional software — and regulators are starting to ask the question. EU AI Act Article 86 gives affected persons a right to explanation for high-risk AI decisions with legal or significant effects. Most organizations currently have no mechanism to provide that explanation for agent-driven decisions.

That's the gap PromptHalo's Runtime Security is designed to close. Rather than retrofitting general-purpose security tools, it operates inline on every inference, tool call, and agent-to-agent handoff — generating decision-level audit trails that map directly to regulatory requirements like the EU AI Act, without requiring model retraining or code rewrites.

Finding #3: Traditional Risk Categories Are Still Failing — Bias, Opacity, and Data Exposure

Bias: Still Happening, Now Explicitly Regulated

The Amazon recruiting tool case from 2015 remains the canonical example: a model trained on a decade of historical resumes learned to penalize applications that included the word "women's" and downgraded graduates of two all-women's colleges. Amazon scrapped the tool, but the underlying dynamic — bias embedded in training data surfacing as discriminatory outputs — has not gone away.

What has changed is the regulatory context. Colorado's SB24-205 explicitly defines algorithmic discrimination as unlawful and requires developers and deployers of high-risk AI systems to conduct impact assessments, provide documentation, and notify consumers. The EU AI Act Article 10 requires data governance measures to detect and mitigate bias for high-risk systems.

Enforcement is real and accelerating:

The FTC banned Rite Aid from using facial recognition technology for five years
The Dutch Data Protection Authority fined Clearview AI €30.5 million for illegal facial recognition data collection
Colorado and EU regulators have formalized impact assessment requirements that create ongoing audit obligations

For regulated enterprises, these cases define the enforcement floor — not the ceiling.

The Transparency Problem Regulators Are Penalizing

Most AI systems, particularly those built on large language models, cannot easily explain why they produced a given output. This makes defending AI-driven decisions to regulators, customers, or auditors structurally difficult.

EU AI Act Article 13 requires high-risk systems to be transparent enough for deployers to interpret and use their output appropriately. Article 86 grants explanation rights for high-risk decisions. If a model makes a credit decision, an underwriting call, or a hiring recommendation, an organization needs to be able to reconstruct why, at the decision level and not just in aggregate.

Training Data: An Underestimated Liability

Reuters reported in August 2025 that Anthropic had settled a class action brought by U.S. authors alleging copyright infringement in AI training. Many organizations lack clear legal bases for their training data, a liability that exists before a model ever serves a single inference.

That upstream exposure is largely outside the reach of runtime controls. PromptHalo's scope is runtime behavioral monitoring, not training data governance — but runtime is precisely where most compliance programs have the largest unaddressed gap, and where enforcement exposure can be reduced fastest.

Finding #4: Regulatory Pressure Is Outpacing Organizational Readiness

The Regulatory Landscape Is Already Dense

The regulatory picture in 2025 is no longer "emerging." It's active:

EU AI Act: In force, with fines up to €35 million or 7% of global annual turnover for prohibited AI practices under Article 99
EU extraterritoriality: Article 2 applies to any provider whose AI system output is used in the EU, regardless of where the company is headquartered
U.S. state laws: Utah (effective May 2024), Colorado (SB24-205, with updated requirements under SB26-189), California (SB 53 signed September 2025), and Texas (HB149) are all active or advancing
NYC Local Law 144: Enforcement on automated employment decision tools began July 2023
Federal activity: Stanford AI Index 2025 found U.S. federal agencies introduced 59 AI-related regulations in 2024, more than double the 2023 count

[Figure: 2025 AI regulatory landscape overview covering the EU AI Act, U.S. state laws, and federal activity]
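The Article 99 ceiling for prohibited practices is easier to reason about with numbers. For undertakings, Article 99(3) applies whichever of the two amounts is higher, a detail the summary above does not spell out. The function name and sample turnover figures below are illustrative only, and the sketch ignores the separate treatment of SMEs and the lower tiers for other violations.

```python
def max_fine_prohibited_practice(global_turnover_eur: float) -> float:
    """Upper bound of the fine: the higher of EUR 35 million or 7% of turnover."""
    return max(35_000_000.0, 0.07 * global_turnover_eur)

# Made-up turnover figures to show where the percentage branch takes over.
for turnover in (100_000_000, 500_000_000, 2_000_000_000):
    ceiling = max_fine_prohibited_practice(turnover)
    print(f"turnover EUR {turnover:,} -> ceiling EUR {ceiling:,.0f}")
```

The two branches meet at EUR 500 million in global annual turnover. Below that, the fixed EUR 35 million figure is the ceiling; above it, the 7% figure is.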

The Readiness Gap

Most organizations are either waiting for U.S. federal clarity or treating compliance as a future project. Both approaches carry real exposure. The EU AI Act is extraterritorial: a company whose AI system output is used in the EU falls within its scope, regardless of where it's headquartered.

Gartner predicts AI regulatory violations will drive a 30% increase in legal disputes for tech companies by 2028. Gartner also found that organizations running regular AI system assessments are three times more likely to achieve high generative AI value. Compliance investment isn't just defensive — it produces measurable operational returns.

The Documentation Trap

Those returns depend on getting the foundations right — and documentation is one of the most overlooked foundations. Regulators aren't only asking whether AI systems cause harm. They're asking for proof that organizations assessed the risk before deployment. The EU AI Act requires:

Technical documentation before placing high-risk AI on the market (Article 11)
Conformity assessments (Article 43)
Fundamental rights impact assessments for specified deployers (Article 27)
Post-market monitoring (Article 72)

A system that never causes harm is still non-compliant without this documentation trail.

What These Findings Mean: Shifting to a Runtime-First AI Risk Program

Policy documentation and pre-deployment reviews are necessary — but they are not sufficient. AI risk management now requires runtime enforcement: controls that act at the moment of inference or agent decision, not six weeks later at the quarterly governance review.

What a Mature AI Risk Program Looks Like

Organizations building credible AI risk programs in 2025 are moving toward:

A centralized model and agent inventory — governance starts with visibility
Continuous drift and behavioral anomaly monitoring that catches production changes before they become incidents
Automated compliance checks mapped to specific regulatory frameworks, not generic policy statements
Human-in-the-loop escalation paths for high-stakes decisions requiring accountability trails
Append-only audit logs that can be replayed for regulators, not reconstructed after the fact

[Figure: Five components of a mature AI risk program, from inventory to audit logs]
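As one possible shape for the drift monitoring item, the sketch below computes a population stability index (PSI) over the mix of agent decision outcomes. PSI is a technique swapped in here for illustration; the outcome categories, sample proportions, and the 0.25 alert threshold (a commonly cited rule of thumb) are all assumptions to tune per deployment.

```python
import math
from typing import List

def population_stability_index(baseline: List[float], current: List[float]) -> float:
    """PSI over pre-binned proportions; each list should sum to roughly 1."""
    if len(baseline) != len(current):
        raise ValueError("baseline and current must have the same number of bins")
    eps = 1e-6  # avoids log(0) when a bin is empty
    psi = 0.0
    for b, c in zip(baseline, current):
        b, c = max(b, eps), max(c, eps)
        psi += (c - b) * math.log(c / b)
    return psi

# Hypothetical share of agent decisions per outcome: allow, escalate, block.
baseline = [0.90, 0.08, 0.02]
current = [0.70, 0.18, 0.12]

ALERT_THRESHOLD = 0.25  # assumed rule-of-thumb cutoff for a significant shift
score = population_stability_index(baseline, current)
print(f"PSI={score:.3f}", "ALERT" if score > ALERT_THRESHOLD else "ok")
```

A rising score says only that behavior has shifted from the baseline, not why. It is a trigger for investigation or human escalation, not a verdict.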

The Audit Trail Imperative

For regulated industries, every AI decision touching a customer, a financial transaction, or a compliance workflow needs a tamper-evident, decision-level log. NIST AI RMF MEASURE 2.4 speaks to this directly, calling for the functionality and behavior of AI systems to be monitored in production. SR 11-7 requires ongoing model monitoring and documentation for financial services firms.

PromptHalo's compliance-ready audit logs capture every decision with its reason, the acting agent or passport identity, session and tenant context, and timestamp. The log is append-only and tamper-evident — once written, it cannot be modified or removed.

For security and compliance teams in regulated industries, that means a replayable evidence trail that holds up to regulatory scrutiny — exportable for compliance reporting and usable in post-incident investigation without reconstruction or gaps.
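For readers unfamiliar with how a log can be tamper-evident, a generic hash chain is one common approach: each entry stores the hash of the previous one, so any later edit breaks verification. The sketch below is an assumption-laden illustration of that general technique and is not a description of how PromptHalo implements its logs. The class, method, and field names are hypothetical, loosely following the fields described above.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder hash that precedes the first entry

def _digest(entry: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

class AuditLog:
    """In-memory hash chain; a real log would also need write-once storage."""

    def __init__(self):
        self._entries = []

    def append(self, agent_id, decision, reason, session_id, tenant_id):
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
        body = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent_id": agent_id,
            "decision": decision,
            "reason": reason,
            "session_id": session_id,
            "tenant_id": tenant_id,
            "prev_hash": prev_hash,
        }
        entry = {**body, "hash": _digest(body)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check that each entry links to the previous one."""
        prev_hash = GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True

log = AuditLog()
log.append("billing-agent", "block", "tool not on allowlist", "sess-1", "tenant-a")
log.append("billing-agent", "allow", "within policy", "sess-1", "tenant-a")
print(log.verify())
```

A hash chain makes tampering detectable, not impossible. Preventing modification or removal in the first place depends on where the log is stored and who can write to it.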

Frequently Asked Questions

What is the biggest AI compliance risk organizations face today?

The most critical gap is between having governance policies on paper and having actual runtime controls in place, particularly for agentic AI systems making autonomous decisions. Existing security tools were not built to monitor agent actions, tool calls, or retrieval queries — which means most compliance programs have blind spots exactly where exposure is greatest.

How is agentic AI different from traditional AI when it comes to risk and compliance?

Unlike static models that generate responses to prompts, agentic AI takes autonomous actions: calling APIs, querying retrieval systems, and handing off tasks between agents. This creates attack vectors like prompt injection, tool call abuse, and retrieval poisoning that traditional GRC frameworks and security stacks were never designed to address.

What frameworks should organizations use to manage AI risk in 2025?

NIST AI RMF, the EU AI Act's risk-based requirements, and ISO/IEC 42001 are the most widely adopted starting points. Organizations in regulated industries should also align to OWASP LLM Top 10 for model-specific attack coverage, and sector-specific guidance like SR 11-7 for financial services.

What is the difference between AI governance and AI compliance?

Governance is the internal strategy and policies an organization sets for responsible AI use — proactive and self-directed. Compliance is alignment with external regulatory or voluntary frameworks, reactive to external requirements. Both are necessary, and neither substitutes for the other.

How do regulated industries like financial services approach AI risk differently?

Financial services firms face layered obligations: SR 11-7 model risk guidance, AML requirements, consumer protection laws, and now AI-specific rules. This means AI risk programs must include documented model validation, continuous monitoring, and auditable decision trails for every AI system touching customer or transaction data.

How can organizations prove AI compliance to regulators without slowing down innovation?

The key is building compliance into the deployment layer rather than treating it as an after-the-fact review. Automated audit logs, inline monitoring, and pre-deployment red-teaming produce stronger regulatory evidence than reconstructing the record later — and they don't require slowing down deployment cycles.