We generated 500 personalized cold email pitches using Claude Haiku and a Rust web scraper for $1.40 total, achieving a 34% open rate versus 11% for category-level templates.
500
Leads processed
353
High-confidence personalizations (70.6%)
34%
Open rate (personalized)
$1.40
Total Claude API cost
CHAPTER 01
Cold email at volume dies on generic copy. Open rates for bulk outreach average 2 to 4 percent in B2B SaaS. The message looks templated because it is. Recipients recognize the pattern in the first sentence and delete without reading.
The Avo Engine needed to send cold outreach to hundreds of marketing agency leads without paying for Apollo, Clearbit, or any managed enrichment provider. The build-don't-rent mandate ruled out every SaaS email tool that charged per record. The goal: generate genuinely personalized pitches from public information alone, at 500-lead batch scale, for a cost approaching zero.
The constraint that defined the architecture: each pitch had to read as if it were written by someone who spent ten minutes on the prospect's website. Not mail-merge. Not persona slots. Specific language pulled from what the company actually says about itself.
CHAPTER 02
The pipeline had three stages: fetch, extract, generate. For each lead record, a Rust scraper issued an HTTP GET to the prospect's primary domain with a 10-second timeout and a user-agent that matched a standard browser. No JavaScript rendering. Pure HTML. The raw response was stripped to text using a lightweight HTML-to-text parser that preserved semantic structure while discarding navigation chrome and footer noise.
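The HTML-to-text step can be sketched as a minimal std-only tag stripper. This is an illustration, not the parser the pipeline used: the production parser also discarded navigation chrome, footers, and script/style content, which this sketch does not.

```rust
/// Minimal HTML-to-text pass: drop everything inside angle brackets,
/// then collapse whitespace. A sketch only; the real parser preserved
/// semantic structure and stripped navigation and footer noise.
fn strip_html(html: &str) -> String {
    let mut out = String::new();
    let mut in_tag = false;
    for c in html.chars() {
        match c {
            '<' => in_tag = true,
            '>' => {
                in_tag = false;
                out.push(' '); // tag boundaries become word boundaries
            }
            _ if !in_tag => out.push(c),
            _ => {} // character inside a tag: skip
        }
    }
    out.split_whitespace().collect::<Vec<_>>().join(" ")
}
```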
The stripped text was scanned for key phrases: service offerings, stated client types, outcome language, technology mentions, and social proof signals. This extraction ran as a Rust function with zero API calls. The output was a structured JSON object with offerings, client_types, outcome_language, and social_proof fields.
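The key-phrase scan can be sketched as a sentence-level keyword match into a struct mirroring the four JSON fields. The keyword lists below are illustrative, not the production sets.

```rust
/// Mirrors the pipeline's structured JSON output fields.
#[derive(Debug, Default)]
struct Signals {
    offerings: Vec<String>,
    client_types: Vec<String>,
    outcome_language: Vec<String>,
    social_proof: Vec<String>,
}

/// Scan stripped page text for sentences containing category keywords.
/// Keyword lists here are examples only.
fn extract_signals(text: &str) -> Signals {
    let mut s = Signals::default();
    for sentence in text.split('.') {
        let lower = sentence.to_lowercase();
        let push = |v: &mut Vec<String>| v.push(sentence.trim().to_string());
        if ["seo", "ppc", "content marketing", "web design"]
            .iter().any(|k| lower.contains(k))
        {
            push(&mut s.offerings);
        }
        if ["clients", "brands", "startups", "enterprises"]
            .iter().any(|k| lower.contains(k))
        {
            push(&mut s.client_types);
        }
        if ["grow", "increase", "roi", "results"]
            .iter().any(|k| lower.contains(k))
        {
            push(&mut s.outcome_language);
        }
        if ["award", "featured", "trusted by", "case stud"]
            .iter().any(|k| lower.contains(k))
        {
            push(&mut s.social_proof);
        }
    }
    s
}
```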
The structured JSON was passed to Claude Haiku with a prompt template that constrained output length to 80 to 120 words, tone to direct engineer-to-engineer, and structure to a 1-sentence hook referencing something specific from their site, a 2-sentence problem statement, a 1-sentence offer, and a CTA. No dashes, no filler phrases, no generic opener.
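A prompt assembled under those constraints might look like the following sketch; the template wording is illustrative, not the production prompt.

```rust
/// Build the generation prompt from the extracted-signals JSON.
/// The structural constraints match the ones described above;
/// the surrounding wording is an assumption.
fn build_prompt(signals_json: &str) -> String {
    format!(
        "You write cold outreach for a software consultancy.\n\
         Source signals (JSON): {signals_json}\n\
         Constraints:\n\
         1. 80 to 120 words total\n\
         2. Tone: direct, engineer-to-engineer\n\
         3. Structure: 1-sentence hook citing something specific from the \
         signals, 2-sentence problem statement, 1-sentence offer, 1 CTA\n\
         4. No dashes, no filler phrases, no generic opener"
    )
}
```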
Confidence scoring was binary. If the fetch succeeded and extraction returned at least two populated fields, the record got confidence 0.75. If the fetch failed or extraction returned empty, the record fell to confidence 0.50 and received a category-level template.
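The binary gate reduces to a single pure function:

```rust
/// Binary confidence gate: high confidence only when the fetch
/// succeeded and at least two signal fields came back populated.
/// Records at 0.50 fall back to a category-level template.
fn confidence(fetch_ok: bool, populated_fields: usize) -> f64 {
    if fetch_ok && populated_fields >= 2 {
        0.75
    } else {
        0.50
    }
}
```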
ARCHITECTURE OVERVIEW
FETCH + EXTRACT: Rust 1.84 (scraper + extractor)
GENERATE: Claude Haiku (claude-haiku-4-5) API
STORE: ClickHouse 26.3 (alien.personalized_emails)
CONCURRENCY: Tokio 1.40 (50 simultaneous HTTP connections)
CHAPTER 03
The batch processor was a single Rust binary that read leads from a ClickHouse table, fetched and scored each one concurrently using Tokio with a semaphore limiting concurrency to 50 simultaneous HTTP connections, and wrote results to alien.personalized_emails (ReplacingMergeTree keyed on lead ID).
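The semaphore-bounded concurrency pattern can be approximated with std-only primitives for illustration: a bounded channel plays the role of the 50-permit Tokio semaphore, and the per-lead work below is a placeholder for fetch, extract, and generate.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Process `leads` with at most `limit` running at once.
/// A std-only analogue of the Tokio semaphore pattern: sending into a
/// bounded channel blocks once `limit` permits are outstanding, and a
/// worker releases its permit by receiving from the channel.
fn process_batch(leads: Vec<u32>, limit: usize) -> u32 {
    let (tx, rx) = mpsc::sync_channel::<()>(limit);
    let rx = Arc::new(Mutex::new(rx));
    let mut handles = Vec::new();
    for lead in leads {
        tx.send(()).unwrap(); // blocks while `limit` workers are in flight
        let rx = Arc::clone(&rx);
        handles.push(thread::spawn(move || {
            let result = lead * 2; // placeholder for fetch + extract + generate
            rx.lock().unwrap().recv().unwrap(); // release the permit
            result
        }));
    }
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```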
The Claude API call ran with streaming disabled, max_tokens 200, and temperature 0.4. Keeping temperature below 0.5 made output consistent across similar inputs without making the copy robotic. The prompt included five few-shot examples of accepted outputs and two examples of rejected outputs, with inline labels explaining why they failed.
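Under those settings, the request body to the Anthropic Messages API would look roughly like this; the system and user content are elided placeholders, not the production prompt.

```json
{
  "model": "claude-haiku-4-5",
  "max_tokens": 200,
  "temperature": 0.4,
  "stream": false,
  "system": "...five accepted and two rejected few-shot examples with inline labels...",
  "messages": [
    { "role": "user", "content": "...prompt template plus extracted-signals JSON..." }
  ]
}
```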
One postmortem finding: the first version of the extraction step used regex to pull service descriptions. On sites with heavy JavaScript-rendered content, the HTML contained almost nothing. The fix added a secondary extraction pass using Claude Haiku on raw HTML snippets below 2,000 tokens for fallback parsing, at a cost of roughly $0.0003 per call. This fallback path recovered 23% of records that would otherwise have been templated.
Output was stored as drafts only. No send was triggered automatically. The store-as-draft policy was non-negotiable and enforced at the schema level: the personalized_emails table had no sent_at column.
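The schema-level safeguard might look like the following ClickHouse DDL. This is a sketch: the engine and key match the description above, but every column other than the lead-ID key is an assumption, and the point is what is absent, not what is present.

```sql
-- Sketch of alien.personalized_emails. The safeguard is structural:
-- there is deliberately no sent_at column, so nothing can record a send.
CREATE TABLE alien.personalized_emails
(
    lead_id    String,
    draft_body String,
    confidence Float32,
    created_at DateTime
)
ENGINE = ReplacingMergeTree
ORDER BY lead_id;
```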
CHAPTER 04
500 leads processed. 353 records reached high-confidence status. 147 received category-level templates. Total Claude API cost for the batch: approximately $1.40. Processing time for 500 leads with 50-concurrent fetch: 4 minutes 12 seconds.
In early A/B testing on the send queue, personalized drafts showed a 34% open rate versus 11% for the category-level templates. Click-through on the personalized variant ran at 6.2% versus 1.8% on templates. The filler-phrase filter prevented an estimated 67 low-quality personalized pitches from reaching the send queue.
CHAPTER 05
DECISION · 01
Specificity beats volume. 353 high-quality personalized emails outperformed what 500 generic emails would have produced. Dropping 147 records to template rather than forcing a low-quality personalization was the right call.
DECISION · 02
Extraction quality gates the whole pipeline. The model is not the bottleneck. The bottleneck is structured signal extraction from noisy HTML. Investing in the extraction layer first, before prompt engineering, returned more quality improvement per hour than any prompt tweak.
DECISION · 03
Draft-only by design. Storing emails without a sent_at column enforced the no-auto-send policy in a way that no application-level check could. The schema was the safeguard, not the code.