BHSI Risk Classification System

Production-facing underwriting workflow that compresses manual directors and officers (D&O) research into a structured, AI-assisted risk review.

Designed and deployed a production risk classification system for Berkshire Hathaway Specialty Insurance (BHSI). The system uses multi-agent LLM orchestration and hybrid rule-based logic to automate D&O policy assessments, reducing manual underwriting review from hours to minutes.

Key Outcomes

95% reduction in manual review time

Faster first-pass risk decisions for underwriters

Repeatable structured output across risk factors

Problem and Context

Directors and officers underwriting required analysts to gather financial, legal, and market context from multiple sources before making a risk recommendation.

The workflow was slow, inconsistent across reviewers, and difficult to scale when submission volume increased.

Approach and Architecture

The system combines agent-based research, retrieval over structured and unstructured data, and rule-based risk scoring, so model output stays grounded in underwriting logic.

Instead of generating a single free-form summary, the pipeline produces explicit risk factors, supporting evidence, and a normalized recommendation payload.
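
As a rough illustration of what such a payload might look like, the sketch below models risk factors, evidence, and a recommendation as plain Python dataclasses. All class names, field names, and the three-level risk scale are assumptions made for this example; the project's actual schema is not shown in this write-up.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class RiskLevel(str, Enum):
    """Hypothetical three-level scale; the real system may use another."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class Evidence:
    source: str   # assumed label such as "news" or "filing"
    excerpt: str  # short passage supporting the risk factor


@dataclass
class RiskFactor:
    name: str
    level: RiskLevel
    evidence: list[Evidence] = field(default_factory=list)


@dataclass
class Recommendation:
    company: str
    overall: RiskLevel
    factors: list[RiskFactor] = field(default_factory=list)

    def to_json(self) -> str:
        # RiskLevel mixes in str, so json serializes members as plain strings.
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    rec = Recommendation(
        company="Example Corp",
        overall=RiskLevel.MEDIUM,
        factors=[
            RiskFactor(
                name="litigation",
                level=RiskLevel.HIGH,
                evidence=[Evidence(source="news", excerpt="Placeholder excerpt.")],
            )
        ],
    )
    print(rec.to_json())
```

Keeping evidence attached to each factor, rather than in one shared list, is one way to let a reviewer trace every rating back to the material that supports it.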

System Diagrams

Static diagrams included with the project to show architecture, workflow, and data movement at a glance.

Figure: BHSI system architecture diagram. End-to-end underwriting system layout across ingestion, orchestration, retrieval, and decision outputs.

Implementation Details

FastAPI coordinates the underwriting workflow while background tasks fan out research requests across financial, news, and document sources.
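
The fan-out pattern can be sketched with plain asyncio, leaving out the FastAPI wiring. The fetcher functions, their return shapes, and the `collect_evidence` helper below are hypothetical stand-ins, not the project's code. The sketch shows one possible approach: run the sources concurrently with `asyncio.gather` and record per-source failures instead of aborting the whole review.

```python
import asyncio
from typing import Awaitable, Callable, Optional

Fetcher = Callable[[str], Awaitable[dict]]


async def fetch_financials(company: str) -> dict:
    await asyncio.sleep(0.01)  # stands in for a real network call
    return {"company": company, "items": []}


async def fetch_news(company: str) -> dict:
    await asyncio.sleep(0.01)
    return {"company": company, "items": []}


async def fetch_documents(company: str) -> dict:
    await asyncio.sleep(0.01)
    return {"company": company, "items": []}


# Assumed source names, mirroring the three source types mentioned above.
DEFAULT_FETCHERS: dict[str, Fetcher] = {
    "financial": fetch_financials,
    "news": fetch_news,
    "documents": fetch_documents,
}


async def collect_evidence(
    company: str, fetchers: Optional[dict[str, Fetcher]] = None
) -> dict:
    """Run all fetchers concurrently and separate results from failures."""
    fetchers = fetchers or DEFAULT_FETCHERS
    # return_exceptions=True keeps one failing source from cancelling the rest.
    results = await asyncio.gather(
        *(fetch(company) for fetch in fetchers.values()),
        return_exceptions=True,
    )
    evidence, failures = {}, {}
    for name, result in zip(fetchers, results):
        if isinstance(result, Exception):
            failures[name] = repr(result)
        else:
            evidence[name] = result
    return {"evidence": evidence, "failures": failures}


if __name__ == "__main__":
    print(asyncio.run(collect_evidence("Example Corp")))
```

Returning failures alongside evidence lets downstream scoring see which sources were missing rather than silently treating them as empty.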

Gemini-powered reasoning sits behind guardrailed prompts and structured schemas so downstream scoring and UI layers can consume predictable fields.

Async orchestration for parallel evidence collection

BigQuery-backed enrichment and retrieval

Hybrid score calculation layered on top of LLM outputs
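
One way to picture a hybrid score layered on LLM outputs is shown below. This is a minimal sketch under stated assumptions: the point values, the two rules, the fact keys (`open_securities_suits`, `restated_financials`), and the label thresholds are all invented for illustration and do not reflect the actual underwriting rules.

```python
# Assumed mapping from model-assigned levels to points.
LEVEL_POINTS = {"low": 1.0, "medium": 2.0, "high": 3.0}


def hybrid_score(llm_factors: dict[str, str], facts: dict) -> dict:
    """Average the LLM factor levels, then apply explicit rule adjustments."""
    if not llm_factors:
        raise ValueError("at least one LLM risk factor is required")

    base = sum(LEVEL_POINTS[level] for level in llm_factors.values()) / len(llm_factors)

    # Deterministic rules (hypothetical) that raise the score from hard facts,
    # recorded by name so a reviewer can see why the score moved.
    adjustments = []
    if facts.get("open_securities_suits", 0) > 0:
        adjustments.append(("open_securities_suits", 0.5))
    if facts.get("restated_financials", False):
        adjustments.append(("restated_financials", 0.5))

    # Cap at the top of the scale so rules cannot push past "high".
    score = min(3.0, base + sum(delta for _, delta in adjustments))

    if score >= 2.5:
        label = "high"
    elif score >= 1.5:
        label = "medium"
    else:
        label = "low"

    return {"base": base, "adjustments": adjustments, "score": score, "label": label}


if __name__ == "__main__":
    print(
        hybrid_score(
            {"governance": "low", "litigation": "medium"},
            {"open_securities_suits": 2},
        )
    )
```

Returning the base score and the named adjustments separately keeps the rule contribution visible, which fits the goal of an output that reviewers can challenge.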

Results and Tradeoffs

The project turned a multi-hour analyst workflow into a minutes-long assisted review, making the platform useful as a first-pass risk triage system.

The biggest product win was not speed alone but an output shape that underwriting teams could review and challenge, rather than treating the model as a black box.

95% reduction in manual review time

Automated risk classification with evidence-backed summaries

Lessons and Next Steps

Insurance workflows need explainability as much as model quality. Structured evidence and rule visibility matter more than polished prose.

For production readiness, the next step would be stronger observability around agent failures, retrieval quality, and overrides by human reviewers.
