Claude Haiku and Sonnet run across 4 production agents: email personalization, reply classification, market context synthesis, and lead scoring.
HOW WE USE IT
Anthropic Claude is the only paid AI dependency in the Avo stack. The policy is explicit: cloud LLMs for decisions, never local AI for anything trading-adjacent. Local model inference (Ollama, llama.cpp) was evaluated and rejected for two reasons: quality degradation on nuanced tasks was measurable, and the operational overhead of keeping a local model current with new releases was not worth the cost savings at current volume.
Four production agents use Claude in active deployment. The cold email personalization agent uses Claude Haiku to generate individualized pitches from scraped website content. In the May 2026 batch, 500 prospects were processed: the agent fetched each website, extracted key phrases via structured output, and generated a 3-4 sentence personalized email in the voice specified in the system prompt. Results were written to ClickHouse (alien.personalized_emails) with confidence scores. High-confidence emails (353 leads, based on successful website fetch) went to production queue; low-confidence (147 leads) to a manual review queue.
The reply classification agent uses Claude Haiku to classify inbound email replies into: interested, not-interested, out-of-office, referral, and unsubscribe. This classification feeds the lead state machine in PostgreSQL, triggering appropriate follow-up sequences.
Example workflow: building a structured-output extraction agent for a client's data pipeline. 1. Define the output schema as a Zod object (TypeScript) or Pydantic model (Python) that matches what the downstream system expects. 2. Pass the schema to Claude via tool_use with the schema as the input_schema. Claude returns a structured JSON blob that matches the schema rather than prose. 3. Validate the tool_use response against the schema before writing to the database. If validation fails (Claude occasionally returns partial fields), log the raw response and push the lead to a manual review queue rather than crashing. 4. Enable prompt caching by placing the static portion of the system prompt (persona, output instructions, examples) in the first content block with cache_control: {type: ephemeral}. Only the variable portion (the prospect's website content) changes per call. 5. Monitor cache hit rate via the usage.cache_read_input_tokens field in the API response. Target above 85% on batch runs. Below that, the variable content is bleeding into the cached block. 6. For batch processing, use the Anthropic Batch API (up to 10,000 requests per batch) to reduce cost by 50% versus synchronous calls. Write results to a staging table and reconcile after the batch completes.
The market context synthesizer (argus-context-synthesizer) uses Claude Haiku to generate 2-3 sentence contextual narratives when a signal fires. It synthesizes 5 data layers: macro regime, sector rotation, cross-asset signals, historical precedent, and known catalysts. The output is a human-readable explanation of why the signal is firing now, differentiated from competitors who return raw numbers.
The lead scoring V2 system uses Claude Sonnet for the firmographic enrichment step, extracting AUM estimates and investment focus from SEC IAPD descriptions and website copy.
Prompt caching is enabled on all agents where the system prompt exceeds 1,024 tokens. The email personalization agent's system prompt is approximately 1,800 tokens and is cache-prefilled on every batch run, reducing per-call cost by approximately 80% on the prompt portion.
Production numbers
4
Production agents
500
Emails personalized (batch)
Haiku
Primary model
~80%
Prompt cache savings
We upgraded from a static 6-factor lead score to a three-tier behavioral composite integrating email engagement, AUM, and headcount, projecting 5 to 10% conversion uplift.
3,889 Tier A leads (V1)
Read case study →
AI / Machine LearningWe generated 500 personalized cold email pitches using Claude Haiku and a Rust web scraper for $1.40 total, achieving 34% open rate versus 11% for category-level templates.
500 Leads processed
Read case study →
Start a project
Most projects ship in under two weeks. Start with a free 30-minute discovery call.
Start a project →