Agentic AI in Claims: Autonomy Levels and Guardrails

Jun 30, 2026

Claims Automation

TL;DR – Most claims teams are about to hand more decisions to AI in 2026. This is a practical model for deciding how much autonomy to grant, claim type by claim type, and the four guardrails that keep speed from turning into risk.

Agentic AI in insurance claims has crossed from pilot to production. Systems no longer merely flag files or score fraud; they take in loss reports, verify coverage, order evidence, calculate reserves, and in some cases settle claims end-to-end without a human touchpoint. This shift from AI-as-tool to AI-as-actor creates a governance gap that traditional model risk frameworks were not designed to close.

So here is the question landing on every claims leader’s desk right now: how much should you let AI decide on its own, and how do you prove that every action stays inside the regulations?

The answer is to scale autonomy with explicit guardrails as evidence and trust accumulate.

What you will learn:

Understand what agentic AI in claims is and how it differs from basic automation.
Decide which claims are ready for more AI autonomy based on claim type, complexity, and risk.
Discover the guardrails that keep AI fast, compliant, and auditable.
Learn why human-in-the-loop design is still essential for high-stakes decisions.
Review the 2026 regulatory changes shaping AI governance in claims.
Find out how to scale AI claims automation without increasing operational risk.

What Agentic AI Autonomy Means in Claims

Agentic AI plans and carries out multi-step work toward a goal, calling tools and making decisions along the way, rather than answering one prompt at a time. Autonomy is the degree to which that agent acts without a human approving each step.

22%

of insurers plan to have an agentic AI solution in production by year-end 2026

65%

of insurers are scaling AI agents for claims processing in 2026

40%+

of agentic AI projects will be canceled by the end of 2027

An insurer can deploy genuinely agentic technology and still run it at very low autonomy, with a person signing off on everything. The capability and the permission are separate dials.

The adoption curve is steep. Celent found that 22% of insurers plan to have an agentic AI solution in production by year-end 2026, with claims and underwriting expected to lead. Industry reporting puts the momentum higher still, with 65% of insurers planning to scale AI agents for claims processing in 2026.

Gartner expects more than 40% of agentic AI projects to be canceled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. The number that matters for a claims executive is the gap between those plans and production reality, because closing it is usually a governance problem more than a model problem.

The practical move is to treat autonomy as a setting you raise deliberately, claim type by claim type, with proof at every step.

The Four Levels of AI Claims Maturity, and Why Each One Needs Guardrails

The four levels of AI claims maturity are Assist, Collaborate, Lead, and Autonomous. They describe how much of the claims operation you have handed to AI, from AI that retrieves and drafts while the adjuster decides, up to AI that runs narrow, high-frequency claims straight through.

Together they mark where your organization sits on its AI journey, that is, how far you have taken automation across the claims operation as a whole. We walk through the full journey from AI-assisted to agentic AI claims in Five Sigma’s automation-first maturity path.

Level 1 – Assist. The AI retrieves, summarizes, and drafts. It answers “what’s in this file?” and pulls the policy language, the prior notes, the missing documents. The adjuster makes every decision. This is the safest entry point and the most common today.
Level 2 – Collaborate. The AI recommends the next action and pre-fills the work: a reserve range, a coverage position, a draft letter, a fraud flag for review. The adjuster reviews and approves each step. The value here is speed and consistency, with a human still in the loop on every call.
Level 3 – Lead. The AI runs a defined claim type end to end. It executes the routine steps, keeps the file moving, and surfaces the work that needs a person, while escalating anything outside its limits. The adjuster supervises by exception and authorizes regulated actions like payments.
Level 4 – Autonomous. The AI handles a narrow, well-understood class of low-complexity, high-frequency claims straight through, inside hard limits. Oversight shifts to monitoring, sampling, and audit rather than file-by-file review. This level belongs only where the rules are unambiguous, the dollar amounts are capped, and the audit trail is airtight.

Wherever you sit on that curve, the requirement is the same. Whether your AI is collecting data, moving across your systems to reach every claim’s information, analyzing it, drafting summaries and recommendations for the adjuster, or running a claim end to end from FNOL to payment, every one of those tools needs its own guidance and guardrails.

Claims hold some of the most sensitive data your organization touches. An AI tool should get access on the same terms a human adjuster does, and carry the same responsibility for what it reads and acts on. That data should never leave the organization, train an outside model, or be used for anything beyond the claim it serves. The rules that define what a human adjuster may see and do should apply to AI in the same way. A person sets the line for every AI tool or agent: what it may do on its own, and what has to go to human oversight. Every AI decision runs inside boundaries a human designed, and none happens without them.

“From the moment a customer lodges a claim through to final settlement, our claims team is supported by AI to streamline the claim process and reduce delays, while ensuring every decision is overseen by experienced claims professionals.”

Dean van Es, CEO, Fast Cover

Why Existing Governance Frameworks Fall Short

Existing AI governance frameworks fall short because they were designed to oversee AI that informs human decisions, not AI that makes them. So they break the moment an agent starts executing decisions on its own.

When the AI itself becomes the decision-maker and executor, three structural failures emerge:

Accountability diffusion: Multi-step agentic workflows disperse ownership across configuration teams, model vendors, and orchestration layers. The NAIC’s Big Data and Artificial Intelligence Working Group flagged this at its March 2026 Spring Meeting as a material unresolved risk.
Cascading errors: A misclassification in step two of a five-step workflow compounds through every subsequent step. Unlike a single bad model output, the error may execute thousands of times before detection.
Audit trail complexity: Tracing a consequential decision across multiple autonomous agents requires a fundamentally different logging architecture than documenting a single model’s inputs and outputs.

Why Guardrails Decide Whether Autonomy Scales

Guardrails are the regulatory, policy, and technical boundaries that constrain what an agent is allowed to do, and they are what lets you raise autonomy without raising risk. Autonomy without guardrails is a liability; guardrails without autonomy is just a slower manual process. You need both, calibrated together.

The regulatory boundary is no longer optional. The NAIC Model Bulletin on the use of AI systems by insurers asks carriers to maintain governance, documentation, and demonstrable human oversight of AI-driven decisions, and to test for unfair discrimination. Nearly half of U.S. states have adopted the bulletin, which obliges insurers that use AI to maintain a written AI governance program and documented controls, and to notify consumers when AI affects a decision. In Europe, the EU AI Act classifies insurance claims and underwriting AI as high-risk, with obligations applying from 2 August 2026, bringing logging, human oversight, and conformity requirements with it.

Modern guardrails combine the regulatory compliance environment, the insurer’s own configurable policies and SOPs, role-based access, full audit trails, and human checkpoints at the decision points you choose. The agent operates with flexibility inside those boundaries instead of following a static script, which is what makes higher autonomy both safer and more useful than the rule engines it replaces.

The Four Guardrails That Make Autonomy Safe in Claims

Detecting AI-generated evidence is a claim-wide orchestration problem. No single check is enough: examine every artifact for authenticity, cross-check it against the full claim and external data, and route flagged inconsistencies to investigators with a documented audit trail and a human in the loop.

Moving up the dial only works if the controls move with it. Four guardrails matter most, and each maps directly to what regulators and auditors will ask you to prove.

Role-based access. AI should operate inside the same permission boundaries as the person who invokes it, not above them. It should reach only the data its task requires, and respect the line of business and sub-organization scope of the claim.
Audit trail. Every AI action needs to be logged and traceable, the same way a human action would be. When a regulator runs a market conduct exam, they expect documented evidence of how an AI-driven decision was reached.
Policy enforcement. Automation should never be able to perform an action a user would not be allowed to perform. Critical actions such as payments, approvals, and configuration changes should require explicit human authorization. And the system has to honor your internal rules, not just generic ones, because your authority limits, escalation paths, and reserve practices are specific to your book.
Regulatory compliance. The controls above need to live inside a documented compliance posture: alignment with the NAIC Model Bulletin and the EU AI Act where it applies, testing for unfair discrimination, data isolation so one customer’s data never trains a general model or leaks to another, a SOC 2 Type 2 attestation, and compliance with privacy rules such as GDPR, HIPAA, and CCPA. Deloitte’s guidance points the same direction: continuous oversight, board-level accountability, and independent model audits.
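
As a rough illustration of the role-based access and policy enforcement guardrails, the Python sketch below derives an agent's permissions from those of the person who invokes it, so the agent can never hold more than that person does. Every name here (`User`, `effective_permissions`, `can_act`, the permission strings, the org unit labels) is hypothetical and does not describe any particular product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    permissions: frozenset  # what this person may do, e.g. {"read_claim"}
    org_units: frozenset    # sub-organizations this person may work in


def effective_permissions(user: User, task_scope: frozenset) -> frozenset:
    """The agent gets only what the user holds AND what the task requires."""
    return user.permissions & task_scope


def can_act(user: User, task_scope: frozenset, action: str, claim_org_unit: str) -> bool:
    """Allow an action only inside the user's org scope and effective permissions."""
    if claim_org_unit not in user.org_units:
        return False
    return action in effective_permissions(user, task_scope)


# Hypothetical adjuster and task definition, for illustration only.
adjuster = User(
    name="adjuster-1",
    permissions=frozenset({"read_claim", "draft_letter"}),
    org_units=frozenset({"auto-east"}),
)
summarize_task = frozenset({"read_claim", "issue_payment"})
```

With these values, `can_act(adjuster, summarize_task, "read_claim", "auto-east")` is allowed, while `"issue_payment"` is refused because the adjuster does not hold that permission, and `"draft_letter"` is refused because the task does not need it. Taking the intersection is one simple way to express "inside the same permission boundaries, not above them".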

This is also why so many programs stall. Adoption is wide but shallow. By one Sedgwick measure cited across the industry, only about 12% of insurers say they have fully mature AI capability and just 7% have achieved scalable success. The teams that scale are the ones that build the guardrails first.

“From first notice of loss through settlement, we can deliver a faster, more transparent claims experience while maintaining the compliance, fairness, and human oversight our clients expect.”

Michael P’Ng, Co-Founder, Quartz Claims

How to Move Up the Autonomy Curve Without Losing Control

Raise autonomy first where the work is high-frequency and rule-based, prove it with audit data, then expand. The return is real when the sequence is disciplined. BCG reports that agentic AI handling claims end to end has cut claim handling time by 40% and lifted net promoter scores by 15 points. Allianz ran its agentic AI system for food spoilage claims in Australia and reported an 80% reduction in claim processing and settlement time.

Discipline matters more than ambition. Autonomy that isn’t wired into your SOPs, your core system, and your audit process stays a demo.

How Clive™ AI Applies Guardrails at Every Level

This is the path Five Sigma built Clive™ AI on. Clive is our agentic AI for claims, and it runs on top of any existing claims system or natively inside our platform.

Clive operates within the same permission profiles as the person using it, so it never accesses data beyond an authorized scope, and it respects organization and sub-organization boundaries. AI-generated outputs are presented for review, and regulated or financial actions require explicit human authorization. Clive does not independently approve, execute, or finalize a payment. Every AI-assisted interaction is logged under the same monitoring and audit controls as the rest of the platform, which gives you the traceability an exam demands. Customer data is logically isolated and is not used to train any general large language model.

The part that matters most for an automation-first rollout: Clive enforces your policies, not only ours. Permission profiles, approval roles, and authority limits are configured to your operating model, so the autonomy you grant at each level stays inside the rules your organization already runs on.

“Clive quickly analyzes complex data, allowing our team to focus on key decisions. It complements human expertise, enhancing efficiency and improving outcomes. Our adjusters appreciate Clive’s helpful summaries and insights.”

Mark Habersack, Executive Director of Risk Management, Resorts World Las Vegas

The Agentic AI Guardrails Checklist for Insurance Claims

Use this checklist to pressure-test an agentic AI claims deployment before you raise autonomy. The 36 controls below group into eight areas, from autonomy scoping to policyholder rights. Each one maps to something a regulator, auditor, or your own risk committee will ask you to prove.

Autonomy Scoping & Authority Limits

	Check
✓	Define a claim-type eligibility matrix: which claim types are cleared for which autonomy level (Assist, Collaborate, Lead, or Autonomous)
✓	Set dollar-threshold authority tables by line of business; agent must escalate above threshold
✓	Document an action taxonomy: every external action classified as autonomous, human-approved, or prohibited
✓	Enforce least-agency principle: agents hold minimum permissions; use short-lived credentials
✓	Maintain a prohibited action list: no coverage denials on protected characteristics, no total loss declarations or litigation reserves without human review
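
The scoping controls above can be sketched as a small decision gate in Python. This is a minimal sketch under stated assumptions: the four levels come from this article, but the eligibility matrix, dollar limits, action names, and the `decide` function are all hypothetical placeholders that an insurer would replace with its own tables.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    ASSIST = 1
    COLLABORATE = 2
    LEAD = 3
    AUTONOMOUS = 4


# Hypothetical eligibility matrix: highest level each claim type is cleared for.
ELIGIBILITY = {
    "glass": AutonomyLevel.AUTONOMOUS,
    "property": AutonomyLevel.LEAD,
    "bodily_injury": AutonomyLevel.COLLABORATE,
}

# Hypothetical dollar authority limits by line of business.
AUTHORITY_LIMIT = {"auto": 1_500.0, "property": 5_000.0}

# Hypothetical action taxonomy: autonomous, human_approved, or prohibited.
ACTION_CLASS = {
    "request_documents": "autonomous",
    "update_reserve": "autonomous",
    "issue_payment": "human_approved",
    "declare_total_loss": "human_approved",
    "deny_on_protected_characteristic": "prohibited",
}


def decide(claim_type: str, line_of_business: str, action: str, amount: float = 0.0) -> str:
    """Return "proceed", "escalate", or "block" for a proposed agent action."""
    # Actions missing from the taxonomy are treated as prohibited.
    action_class = ACTION_CLASS.get(action, "prohibited")
    if action_class == "prohibited":
        return "block"
    # Claim types missing from the matrix default to the lowest level.
    level = ELIGIBILITY.get(claim_type, AutonomyLevel.ASSIST)
    if action_class == "human_approved" or level < AutonomyLevel.LEAD:
        return "escalate"
    # Lines of business without a limit get a limit of zero.
    if amount > AUTHORITY_LIMIT.get(line_of_business, 0.0):
        return "escalate"
    return "proceed"
```

The defaults are deliberately conservative: anything the tables do not mention is blocked or escalated rather than allowed, which is one way to read the least-agency principle.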

Human-in-the-Loop (HITL) Design

	Check
✓	Map claim complexity to HITL tier: glass/APD → L4 autonomous; property B&C/WC → L3 one-click approval; BI/liability/disputes → mandatory human review
✓	Code four escalation trigger types: authority, evidence, consequence, sensitivity
✓	Require complete escalation packets — no thin handoffs; agent passes full context and recommended next step
✓	Deploy on-the-loop portfolio dashboards monitoring STP rate, override rate, escalation rate daily
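
One way to code the four escalation trigger types and a complete escalation packet is sketched below in Python. The checklist names the trigger types but does not define them, so the conditions used here are assumptions: authority as an amount above the limit, evidence as missing items, consequence as an adverse outcome, and sensitivity as any flag set on the claim. `ClaimState`, `escalation_triggers`, and `escalation_packet` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClaimState:
    claim_id: str
    proposed_action: str
    amount: float
    authority_limit: float
    missing_evidence: list = field(default_factory=list)
    is_adverse: bool = False  # assumed meaning: denial or reduced payment
    sensitive_flags: list = field(default_factory=list)  # e.g. "litigation"


def escalation_triggers(state: ClaimState) -> list:
    """Return the trigger types that fire, using the assumed definitions above."""
    triggers = []
    if state.amount > state.authority_limit:
        triggers.append("authority")
    if state.missing_evidence:
        triggers.append("evidence")
    if state.is_adverse:
        triggers.append("consequence")
    if state.sensitive_flags:
        triggers.append("sensitivity")
    return triggers


def escalation_packet(state: ClaimState, recommended_next_step: str) -> Optional[dict]:
    """Build the full handoff for a human, or None when nothing triggers."""
    triggers = escalation_triggers(state)
    if not triggers:
        return None
    return {
        "claim_id": state.claim_id,
        "triggers": triggers,
        "proposed_action": state.proposed_action,
        "amount": state.amount,
        "missing_evidence": list(state.missing_evidence),
        "sensitive_flags": list(state.sensitive_flags),
        "recommended_next_step": recommended_next_step,
    }
```

The packet carries the reasons for escalation together with the context and the recommended next step, so the reviewer does not have to reconstruct why the file arrived.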

Explainability & Audit Trail

	Check
✓	Generate decision-time, tamper-evident records: inputs used, rules applied, tools invoked, final determination
✓	Log every agent action with version, timestamp, credential, and output — store in immutable storage
✓	Produce plain-language adverse action explanations for any unfavorable claimant decision
✓	Link traces across multi-agent workflows via a common claim/session identifier
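
A common way to make a log tamper-evident is to chain records by hash, so that editing any earlier record invalidates everything after it. The Python sketch below illustrates that idea only; the field names and the `append_record` and `verify_chain` helpers are hypothetical, inputs are assumed to be JSON-serializable, and a real deployment would still need immutable storage underneath.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder previous-hash for the first record


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_record(log: list, claim_id: str, agent_version: str,
                  action: str, inputs: dict, output: str) -> dict:
    """Append one agent action, linked to the hash of the previous record."""
    body = {
        "claim_id": claim_id,  # common identifier for linking traces
        "agent_version": agent_version,
        "action": action,
        "inputs": inputs,
        "output": output,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    record = {**body, "hash": _digest(body)}
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Recompute every hash and link; any edited record breaks the chain."""
    prev = GENESIS
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev or record["hash"] != _digest(body):
            return False
        prev = record["hash"]
    return True
```

Because each record stores the claim identifier, records written by different agents in the same workflow can be filtered and read as one trace.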

Security (OWASP ASI Top 10)

	Check
✓	Sanitize all inbound claimant documents against prompt injection
✓	Deploy an AI firewall with schema restrictions and a tool-call allowlist per agent
✓	Use short-lived credentials and attribute-based access control
✓	Validate RAG sources and context freshness; no stale policy or claim data in agent retrieval
✓	Implement circuit breakers at each workflow stage with rollback capability
✓	Maintain a kill switch with graceful agent suspension for drift or compromise
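
The allowlist, circuit breaker, and kill switch items can be combined in a thin wrapper around tool calls. The Python sketch below is illustrative only: `GuardedAgent`, `ToolCallBlocked`, and the failure threshold are hypothetical, and rollback is left out because it depends on the system being called.

```python
class ToolCallBlocked(Exception):
    """Raised when a call is refused by the allowlist or the kill switch."""


class GuardedAgent:
    def __init__(self, name: str, allowed_tools: set, max_consecutive_failures: int = 3):
        self.name = name
        self.allowed_tools = set(allowed_tools)
        self.max_consecutive_failures = max_consecutive_failures
        self.consecutive_failures = 0
        self.suspended = False  # kill switch state

    def suspend(self) -> None:
        """Manual kill switch: refuse all further tool calls."""
        self.suspended = True

    def call(self, tool_name: str, tool, *args, **kwargs):
        """Run a tool only if the agent is active and the tool is allowlisted."""
        if self.suspended:
            raise ToolCallBlocked(f"{self.name} is suspended")
        if tool_name not in self.allowed_tools:
            raise ToolCallBlocked(f"{tool_name} is not on the allowlist for {self.name}")
        try:
            result = tool(*args, **kwargs)
        except Exception:
            # Circuit breaker: trip after repeated consecutive failures.
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.max_consecutive_failures:
                self.suspended = True
            raise
        self.consecutive_failures = 0
        return result
```

A refused call does not count as a tool failure, so an agent probing outside its allowlist is blocked without tripping the breaker; whether that should also raise an alert is a policy choice.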

Bias, Fairness & Anti-Discrimination

	Check
✓	Run pre-deployment bias tests on race, gender, age, geography — measure demographic parity and disparate impact
✓	Audit proxy variables (e.g., zip code) and document mitigations
✓	Schedule quarterly bias monitoring of denial rates, settlement amounts, and cycle time by segment
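
Demographic parity and disparate impact can both be computed from per-segment approval rates. The Python sketch below shows one common formulation: the parity difference is the gap between the highest and lowest rate, and the disparate impact ratio is the lowest rate divided by the highest. The function names are hypothetical, and the 0.8 screening threshold is the widely used four-fifths rule of thumb, assumed here for illustration rather than taken from any insurance regulation.

```python
from collections import defaultdict


def approval_rates(decisions) -> dict:
    """decisions: iterable of (segment, approved) pairs; returns rate per segment."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for segment, ok in decisions:
        totals[segment] += 1
        approved[segment] += int(ok)
    return {segment: approved[segment] / totals[segment] for segment in totals}


def demographic_parity_difference(rates: dict) -> float:
    """Gap between the best-treated and worst-treated segment."""
    return max(rates.values()) - min(rates.values())


def disparate_impact_ratio(rates: dict) -> float:
    """Lowest approval rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no segment was approved, so none is favored
    return min(rates.values()) / highest


def flags_disparate_impact(rates: dict, threshold: float = 0.8) -> bool:
    """Assumed screening rule: flag for review when the ratio falls below threshold."""
    return disparate_impact_ratio(rates) < threshold
```

For example, if segment A is approved on 8 of 10 claims and segment B on 6 of 10, the ratio is roughly 0.75 and the deployment would be flagged for review under the assumed threshold. The same functions can be rerun on denial rates or settlement outcomes each quarter.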

Governance & Accountability

	Check
✓	Designate a single named executive owner for each deployed agentic system
✓	Establish a cross-functional AI Governance Committee (Claims, Legal, IT/Security, Actuarial, Risk)
✓	Maintain an agent registry: name, version, autonomy level, authorized actions, data scope, owner; update it within 48 hours of any change
✓	Apply model change governance: material changes (>5% decision distribution shift) trigger re-review
✓	Hold vendors to contractual audit rights over agent behavior; insurer retains full regulatory accountability
✓	Test the AI incident response plan annually via tabletop exercise
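
The checklist does not say how a decision distribution shift is measured, so the Python sketch below assumes one reasonable choice: total variation distance between the share of each decision outcome before and after a change. The function names and the sample outcome labels are hypothetical; only the 5% threshold comes from the checklist.

```python
from collections import Counter


def decision_distribution(decisions) -> dict:
    """Share of each outcome (e.g. approve, escalate, deny) in a list of decisions."""
    counts = Counter(decisions)
    total = sum(counts.values())
    return {outcome: count / total for outcome, count in counts.items()}


def total_variation_distance(p: dict, q: dict) -> float:
    """Half the sum of absolute differences in outcome shares (0 = identical)."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)


def needs_re_review(baseline, candidate, threshold: float = 0.05) -> bool:
    """Flag a model change as material when the shift exceeds the threshold."""
    shift = total_variation_distance(
        decision_distribution(baseline), decision_distribution(candidate)
    )
    return shift > threshold
```

Running both model versions over the same sample of historical claims keeps the comparison about the model rather than about a change in the claim mix.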

Monitoring & Continuous Validation

	Check
✓	Deploy an agent observability dashboard: tool-call patterns, latency, error rates, retrieval outliers
✓	Monitor for model drift in claim types, settlement amounts, and STP/escalation rates
✓	Require shadow mode (parallel AI + human) before any new agent type goes live
✓	Run quarterly governance reviews (bias, escalation, overrides) and annual full re-authorization
✓	Document and train staff on manual fallback procedures — do not allow institutional knowledge to atrophy

Policyholder Rights & Disclosure

	Check
✓	Disclose AI involvement to claimants when agents influenced claims decisions
✓	Provide a right to human review for any AI-influenced adverse decision, with a documented SLA
✓	Ensure AI agents self-identify as non-human in all direct claimant communications

Compliance Deadlines at a Glance

Deadline	Requirement
June 30, 2026	Colorado AI Act: risk policies, impact assessments, consumer disclosure, incident reporting
August 2, 2026	EU AI Act: high-risk AI conformity assessment, Fundamental Rights Impact Assessment, EU database registration
September 2026	NAIC AI Evaluation Tool pilot concludes — market conduct examination standards expected

Key takeaways

Agentic AI in claims has moved from pilot to production, and capability and permission are separate dials: you can run powerful technology at low autonomy until the evidence supports more.
Scale autonomy claim type by claim type, starting with high-frequency, rule-based work, and prove each step with audit data before expanding.
The gap between plans and production is usually a governance problem more than a model problem.
Four guardrails make autonomy safe: role-based access, audit trail, policy enforcement, and regulatory compliance.
The regulatory clock is real for 2026: the NAIC Model Bulletin (adopted by nearly half of U.S. states), the EU AI Act (high-risk obligations from August 2), and the Colorado AI Act (June 30) all require documented oversight.

Frequently asked questions

What is agentic AI autonomy in claims?

It’s the degree to which an AI agent acts without a human approving each step. Agentic AI plans and executes multi-step work toward a goal, calling tools and making decisions, rather than answering one prompt at a time. Capability and the permission to act are separate settings.

How much autonomy should insurers give AI in claims?

As much as the evidence supports, raised deliberately by claim type. Start with high-frequency, rule-based claims at low autonomy, prove safety with audit data, then expand. Gartner expects over 40% of agentic AI projects to be canceled by the end of 2027, usually for weak governance rather than weak models.

Does giving AI more autonomy mean losing control of claims?

No. Autonomy and control scale together when guardrails are calibrated to each level. Every AI decision runs inside boundaries a human designed, and regulated actions like payments still require explicit human authorization.

What guardrails does agentic AI need in claims?

There are four basic guardrails: role-based access (the AI sees only what its task requires), an audit trail (every action logged and traceable), policy enforcement (no action a person couldn’t take, with human sign-off on payments), and regulatory compliance (NAIC, EU AI Act, bias testing, data isolation, SOC 2 Type 2).

Which AI regulations apply to insurance claims in 2026?

The NAIC Model Bulletin (adopted by nearly half of U.S. states) requires written governance and human oversight. The EU AI Act treats claims AI as high-risk, with obligations from August 2, 2026. The Colorado AI Act adds disclosure and impact-assessment duties from June 30, 2026.

Related resources

Blog: From AI-assisted to agentic AI claims: where the adjuster’s role moves
PR: Google issues a Clive AI Case Study with Five Sigma and LangChain
Data sheet: Five Sigma AI & Automated Processing Safeguards
Product overview: Clive™: The Multi-Agent AI Claims Expert
