Deep Research Workflow
Production-style multi-agent research system that routes work by query complexity, uses async orchestration, and adds automated evaluation loops to improve research quality while reducing LLM API cost.
Problem and Context
Single-pass LLM research assistants tend to be expensive, brittle, and hard to trust for analytical work that spans multiple sources.
The challenge was to design a system that could vary its depth based on the prompt while still validating quality before delivery.
Approach and Architecture
The workflow uses an explicit router and planner to decide whether a request needs lightweight synthesis or a deeper multi-agent research path.
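As a rough illustration of the routing decision described above, the sketch below scores a query and picks one of two paths. It is not the project's router: the cue words, the threshold, and the names Route, Plan, score_complexity, and plan are all hypothetical, and a real system might use an LLM classifier instead of keyword counting.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    LIGHT = "lightweight_synthesis"
    DEEP = "multi_agent_research"


# Hypothetical cue words that hint at multi-source analytical work.
DEEP_CUES = ("compare", "versus", "trend", "forecast", "evaluate", "sources")


@dataclass
class Plan:
    route: Route
    subtasks: list[str]


def score_complexity(query: str) -> int:
    """Crude stand-in for a classifier: count cue words, add one for long prompts."""
    text = query.lower()
    score = sum(cue in text for cue in DEEP_CUES)
    if len(text.split()) > 25:
        score += 1
    return score


def plan(query: str, threshold: int = 2) -> Plan:
    """Pick a route, then expand deep requests into placeholder subtasks."""
    if score_complexity(query) < threshold:
        return Plan(Route.LIGHT, [query])
    steps = ("gather evidence", "cross-check sources", "synthesize")
    return Plan(Route.DEEP, [f"{step}: {query}" for step in steps])
```

Keeping the decision in one small function makes the threshold a single tunable value rather than something buried in prompts.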
Evaluation is treated as a first-class step rather than an afterthought, allowing the system to retry or refine when the answer quality is below threshold.
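The retry-or-refine behavior can be sketched as a bounded loop around a generator and an evaluator. This is a minimal sketch under assumptions: the function name research_with_evaluation, the 0.8 threshold, the three-attempt cap, and the callable signatures are illustrative and not taken from the project.

```python
from typing import Callable, Optional


def research_with_evaluation(
    query: str,
    generate: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str, str], tuple[float, str]],
    threshold: float = 0.8,
    max_attempts: int = 3,
) -> tuple[str, float, int]:
    """Regenerate until the evaluator's score reaches the threshold or attempts run out.

    Returns the best draft seen, its score, and the number of attempts used.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    best_draft, best_score = "", float("-inf")
    feedback: Optional[str] = None  # the first attempt has no evaluator feedback
    for attempt in range(1, max_attempts + 1):
        draft = generate(query, feedback)
        score, feedback = evaluate(query, draft)
        if score > best_score:
            best_draft, best_score = draft, score
        if score >= threshold:
            break
    return best_draft, best_score, attempt
```

Capping the attempts keeps the cost of the loop bounded, and returning the best draft rather than the last one avoids shipping a regression when a later retry scores lower.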
System Diagrams
Static diagrams are included with the project to show architecture, workflow, and data movement at a glance.
Implementation Details
Async execution enables parallel search and evidence gathering, which materially reduces latency on the heavier research paths.
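A minimal sketch of the concurrent fan-out, using asyncio.gather from the standard library. The search_source coroutine is a placeholder that sleeps instead of calling a real search API, and the source names are made up for illustration.

```python
import asyncio


async def search_source(source: str, query: str) -> dict:
    """Placeholder for a real search or fetch call; the sleep simulates network latency."""
    await asyncio.sleep(0.1)
    return {"source": source, "query": query, "snippet": f"result from {source}"}


async def gather_evidence(query: str, sources: list[str]) -> list[dict]:
    """Run one search per source concurrently and drop the ones that raised."""
    tasks = [search_source(source, query) for source in sources]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return [r for r in results if not isinstance(r, Exception)]


if __name__ == "__main__":
    evidence = asyncio.run(gather_evidence("battery costs", ["web", "news", "papers"]))
    print([item["source"] for item in evidence])
```

Because the awaits overlap, the three simulated searches finish in roughly the time of one rather than three. Passing return_exceptions=True lets one failing source degrade the result instead of cancelling the whole batch.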
Pydantic schemas keep agent handoffs typed so downstream stages can reason over predictable payloads instead of parsing free-form text.
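To make the typed handoff concrete, here is a small sketch of what such schemas might look like. The model names Evidence and ResearchFindings, their fields, and the parse_findings helper are assumptions for illustration; the project's actual schemas are not shown in this write-up.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class Evidence(BaseModel):
    source_url: str
    claim: str
    confidence: float = Field(ge=0.0, le=1.0)  # reject scores outside [0, 1]


class ResearchFindings(BaseModel):
    query: str
    evidence: List[Evidence]


def parse_findings(payload: dict) -> ResearchFindings:
    """Validate a raw agent payload before the next stage consumes it."""
    return ResearchFindings(**payload)


if __name__ == "__main__":
    try:
        parse_findings({"query": "q", "evidence": [{"claim": "missing fields"}]})
    except ValidationError as exc:
        print(f"rejected malformed handoff: {len(exc.errors())} error(s)")
```

A malformed payload fails at the boundary between agents, so the downstream stage never has to guess at the shape of what it received.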
Results and Tradeoffs
The main improvement came from matching the workflow depth to the task instead of forcing every request through the same expensive pipeline.
The evaluation loop made the system more production-like because it surfaced quality as an operational concern rather than a manual review task.
Lessons and Next Steps
Agent systems become easier to evolve when their interfaces are explicit and typed. The orchestration layer matters as much as the prompts.
A future iteration would add richer telemetry per stage so routing and evaluation thresholds can be tuned from observed behavior instead of static heuristics.
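One possible shape for that per-stage telemetry is a context manager that records a duration plus whatever fields the stage attaches. This is a hypothetical helper, not part of the current system: StageTelemetry, its in-memory storage, and the field names are assumptions.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTelemetry:
    """Collects per-stage durations and custom fields in memory (hypothetical helper)."""

    def __init__(self):
        self.records = defaultdict(list)

    @contextmanager
    def stage(self, name: str):
        start = time.perf_counter()
        info = {}
        try:
            yield info  # the stage can attach fields such as a route or a score
        finally:
            info["seconds"] = time.perf_counter() - start
            self.records[name].append(info)

    def mean(self, name: str, field: str) -> float:
        """Average a recorded field for one stage; NaN if nothing was recorded."""
        values = [r[field] for r in self.records[name] if field in r]
        return sum(values) / len(values) if values else float("nan")
```

A stage would wrap its work in "with telemetry.stage('evaluate') as info:" and set info["score"], after which the observed score distribution could inform where the evaluation threshold sits.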