BHSI Risk Classification System
Production-facing underwriting workflow that compresses manual D&O research into a structured AI-assisted risk review.
Designed and deployed a production risk classification system for Berkshire Hathaway Specialty Insurance, using multi-agent LLM orchestration and hybrid rule-based logic to automate D&O policy assessments and reduce manual underwriting review from hours to minutes.
Problem and Context
Directors and officers underwriting required analysts to gather financial, legal, and market context from multiple sources before making a risk recommendation.
The workflow was slow, inconsistent across reviewers, and difficult to scale when submission volume increased.
Approach and Architecture
The system combines agent-based research, retrieval over structured and unstructured data, and hybrid rules for risk scoring so model output remains grounded in underwriting logic.
Instead of generating a single free-form summary, the pipeline produces explicit risk factors, supporting evidence, and a normalized recommendation payload.
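A minimal sketch of what such a payload and a rule-based scoring step could look like. The field names, rule weights, severity scale, and tier thresholds below are illustrative assumptions, not the values used in the deployed system.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class RiskFactor:
    name: str            # e.g. "litigation" (hypothetical factor name)
    severity: int        # assumed scale: 1 (low) to 5 (high)
    evidence: list       # excerpts or citations supporting the factor


@dataclass
class Recommendation:
    company: str
    score: float         # normalized to the range 0..1
    tier: str            # "low", "medium", or "high"
    factors: list = field(default_factory=list)


# Hypothetical rule weights; a real system would source these from underwriting guidelines.
RULE_WEIGHTS = {"litigation": 0.5, "financial_distress": 0.3, "governance": 0.2}
DEFAULT_WEIGHT = 0.1


def score_factors(factors):
    """Weighted average of factor severities, normalized to 0..1."""
    total = 0.0
    weight_sum = 0.0
    for factor in factors:
        weight = RULE_WEIGHTS.get(factor.name, DEFAULT_WEIGHT)
        total += weight * (factor.severity - 1) / 4
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


def tier_for(score):
    """Map a normalized score to a tier using assumed thresholds."""
    if score >= 0.66:
        return "high"
    if score >= 0.33:
        return "medium"
    return "low"


def build_recommendation(company, factors):
    """Combine extracted factors into one normalized payload."""
    score = round(score_factors(factors), 3)
    return Recommendation(company=company, score=score, tier=tier_for(score), factors=factors)
```

Calling `asdict(build_recommendation(...))` yields a plain dictionary that a scoring service or UI could serialize, with every factor carrying its own evidence list so a reviewer can trace the score back to its inputs.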
System Diagrams
Static diagrams included with the project to show architecture, workflow, and data movement at a glance.
Implementation Details
FastAPI coordinates the underwriting workflow while background tasks fan out research requests across financial, news, and document sources.
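The fan-out pattern can be sketched with plain `asyncio`, independent of the web framework. The fetcher functions, the `SOURCES` registry, and the timeout value are hypothetical stand-ins; the actual source integrations are not described here.

```python
import asyncio


async def fetch_financials(company):
    await asyncio.sleep(0)  # stand-in for a real network call
    return {"source": "financial", "company": company, "items": []}


async def fetch_news(company):
    await asyncio.sleep(0)  # stand-in for a real network call
    return {"source": "news", "company": company, "items": []}


async def fetch_documents(company):
    await asyncio.sleep(0)  # stand-in for a real network call
    return {"source": "documents", "company": company, "items": []}


# Hypothetical registry of research sources, keyed by a short name.
SOURCES = {
    "financial": fetch_financials,
    "news": fetch_news,
    "documents": fetch_documents,
}


async def fan_out_research(company, timeout_s=10.0):
    """Query every source concurrently and keep going if one of them fails."""
    names = list(SOURCES)
    tasks = [asyncio.wait_for(SOURCES[name](company), timeout=timeout_s) for name in names]
    # return_exceptions=True keeps one failing source from cancelling the others.
    results = await asyncio.gather(*tasks, return_exceptions=True)

    findings, errors = {}, {}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            errors[name] = repr(result)
        else:
            findings[name] = result
    return {"company": company, "findings": findings, "errors": errors}
```

In a FastAPI service, a coroutine like `fan_out_research` could be scheduled from a route with `BackgroundTasks.add_task`, so the request returns immediately while research continues. Recording per-source errors separately, rather than failing the whole run, lets a reviewer see which evidence is missing.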
Gemini-powered reasoning sits behind guardrailed prompts and structured schemas so downstream scoring and UI layers can consume predictable fields.
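One way to enforce predictable fields is to validate the model's raw JSON before anything downstream sees it. The sketch below uses only the standard library; the field names, allowed recommendation values, and confidence range are assumptions for illustration, and the model call itself is omitted.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {
    "risk_factors": list,
    "recommendation": str,
    "confidence": (int, float),
}
ALLOWED_RECOMMENDATIONS = {"accept", "refer", "decline"}


class SchemaError(ValueError):
    """Raised when model output does not match the expected shape."""


def parse_model_output(raw):
    """Parse raw model text as JSON and check it against the assumed schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SchemaError(f"not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise SchemaError("top-level value must be a JSON object")

    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise SchemaError(f"missing field: {name}")
        if not isinstance(data[name], expected):
            raise SchemaError(f"wrong type for field: {name}")

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(data["confidence"], bool):
        raise SchemaError("confidence must be a number")
    if data["recommendation"] not in ALLOWED_RECOMMENDATIONS:
        raise SchemaError("unknown recommendation value")
    if not 0.0 <= data["confidence"] <= 1.0:
        raise SchemaError("confidence must be between 0 and 1")

    # Drop any extra keys so consumers only ever see the known fields.
    return {name: data[name] for name in REQUIRED_FIELDS}
```

Rejecting malformed output at this boundary means a scoring or UI layer never has to guess at missing or mistyped fields; a caller can catch `SchemaError` and retry the prompt or route the submission to manual review.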
Results and Tradeoffs
The project turned a multi-hour analyst workflow into a minutes-long assisted review, making the platform useful as a first-pass risk triage system.
The biggest product win was not speed alone but an output shape that underwriting teams could review and challenge, rather than a black-box model verdict they had to take on trust.
Lessons and Next Steps
Insurance workflows need explainability as much as model quality. Structured evidence and rule visibility matter more than polished prose.
To harden the system further, the next step would be stronger observability around agent failures, retrieval quality, and how often human reviewers override the model's recommendation.