We built a five-layer parallel context engine that synthesizes macro, sector, correlation, historical, and catalyst data into a two-to-three-sentence market narrative within 1.5 seconds of signal emission.
1–1.5 sec — Synthesis latency (p95)
500 ms — Context fetch p95 (5 parallel)
$0.02 — API cost per synthesis
70% — Cache hit rate (60-min TTL)
CHAPTER 01
Anomaly detection in financial markets faces a problem that anomaly detection in most other domains does not: the same numerical signal can be bullish, bearish, or meaningless depending on context. A VIX spike to 28 during an orderly market uptrend means something different from a VIX spike to 28 during a Fed decision week. Detecting the anomaly is the easy part. Telling a customer what it means, in context, in 30 seconds, without requiring them to consult three additional data sources, is the hard part.
The Argus platform detected signals across global equities, crypto, and commodities using a combination of statistical anomaly scoring and pattern matching. An AAPL anomaly at 9.2/10 appeared in the user interface. The customer's first question was invariably: okay, but why? And should I act on it? Answering that question required synthesizing macro regime data, sector performance, cross-asset correlation structure, historical precedent, and near-term catalysts, all from different data sources with different update cadences.
The competitive gap was structural. Bloomberg Terminal required a user to manually query FRED for macro data, run a sector screen in a separate panel, look up the earnings calendar in a third module, and construct the narrative themselves. The design target was: when a high-confidence signal fires, automatically generate a 2 to 3 sentence narrative that answers "why is this firing?" and "what does the context suggest?", delivered within 1.5 seconds of signal emission.
CHAPTER 02
The contextual reasoning system was designed as a Rust binary executing 5 parallel asynchronous context fetches using Tokio. Each fetch targeted a different data layer: macro regime from FRED series and the VIX via the ClickHouse macro_data table; sector context from XLF, XLV, XLI, XLK, and XLY performance in bars_1d; cross-asset correlation from the correlation engine; historical precedent from a 5-year signal archive; and catalysts from the earnings calendar and the forward window of macro events.
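The five-way fan-out can be sketched with scoped std threads standing in for Tokio tasks; the layer names and simulated timings below are illustrative stand-ins for the real ClickHouse and correlation-engine queries, not the production code:

```rust
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for one context-layer fetch; the real system
// issues an async ClickHouse or engine query per layer via Tokio.
fn fetch_layer(name: &str, cost_ms: u64) -> (String, String) {
    thread::sleep(Duration::from_millis(cost_ms)); // simulate I/O latency
    (name.to_string(), format!("{name}: ok"))
}

fn fetch_all_layers() -> Vec<(String, String)> {
    let layers = [
        ("macro_regime", 120u64),
        ("sector_context", 90),
        ("cross_asset_correlation", 110),
        ("historical_precedent", 150),
        ("catalysts", 80),
    ];
    // All five fetches run concurrently, so total wall time approaches
    // the slowest layer (~150 ms here), not the sum (~550 ms).
    thread::scope(|s| {
        let handles: Vec<_> = layers
            .iter()
            .map(|(name, ms)| s.spawn(move || fetch_layer(name, *ms)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    for (name, payload) in fetch_all_layers() {
        println!("{name} -> {payload}");
    }
}
```

In the production Tokio version the same shape falls out of spawning five tasks and awaiting them together, which is what keeps the combined budget near the slowest layer.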
The 5 fetches ran in parallel with a combined budget of approximately 500 milliseconds at p95. Results were aggregated into a structured JSON payload and passed to Claude Haiku with a template prompt of approximately 1,000 to 1,500 tokens. Total latency from signal emission to narrative delivery targeted 1 to 1.5 seconds. Cache hit cases returned in under 50 milliseconds.
The anomaly scoring pipeline upstream used a multi-source ensemble. Statistical signals contributed approximately 60% of the composite score. Cross-asset signals contributed 25%. Historical pattern match strength contributed 15%. Real-time synthesis ran only for signals scoring above 8.5 on the 10-point scale.
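The ensemble weighting and the synthesis gate reduce to a weighted sum plus a threshold check; a minimal sketch (function names are ours, not the pipeline's):

```rust
/// Composite anomaly score from the upstream ensemble:
/// statistical 60%, cross-asset 25%, historical pattern match 15%.
/// All inputs and the output are on the 0-10 scale.
fn composite_score(statistical: f64, cross_asset: f64, historical: f64) -> f64 {
    0.60 * statistical + 0.25 * cross_asset + 0.15 * historical
}

/// Real-time Haiku synthesis runs only for high-confidence signals.
fn needs_realtime_synthesis(score: f64) -> bool {
    score > 8.5
}

fn main() {
    let score = composite_score(9.4, 8.8, 7.5);
    // 0.60*9.4 + 0.25*8.8 + 0.15*7.5 = 5.64 + 2.20 + 1.125 = 8.965
    println!(
        "composite = {score:.3}, synthesize = {}",
        needs_realtime_synthesis(score)
    );
}
```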
ARCHITECTURE OVERVIEW
[Architecture diagram: ingest → features → train → serve pipeline. Components: Rust 1.84 (argus-context-synthesizer), Claude Haiku API, ClickHouse 26.3, Redis 7.2; model versions v1 / v2 / v3. Production predictions feed back into the training set on a continuous retraining cadence.]
CHAPTER 03
Historical precedent matching used a nearest-neighbor search over the signal archive stored in ClickHouse. For each new high-confidence signal, the query retrieved the 5 to 10 most similar historical signals based on symbol, regime at time of signal, signal type, and composite score. For each historical match, the system retrieved the 5-day and 20-day forward returns to construct the precedent statement.
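The precedent ranking can be sketched as a similarity score over archive records; the record shape and similarity weights below are illustrative assumptions, not the actual ClickHouse schema or query:

```rust
// Hypothetical record shape for the 5-year signal archive.
#[derive(Clone)]
struct ArchivedSignal {
    symbol: String,
    regime: String,
    signal_type: String,
    score: f64,
    fwd_return_5d: f64,
    fwd_return_20d: f64,
}

/// Rank archive entries by similarity to a new signal: exact matches on
/// symbol / regime / signal type, then proximity of composite score.
fn top_precedents(
    archive: &[ArchivedSignal],
    symbol: &str,
    regime: &str,
    signal_type: &str,
    score: f64,
    k: usize,
) -> Vec<ArchivedSignal> {
    let mut scored: Vec<(f64, &ArchivedSignal)> = archive
        .iter()
        .map(|s| {
            let mut sim = 0.0;
            if s.symbol == symbol { sim += 3.0; }
            if s.regime == regime { sim += 2.0; }
            if s.signal_type == signal_type { sim += 2.0; }
            sim -= (s.score - score).abs(); // closer scores rank higher
            (sim, s)
        })
        .collect();
    scored.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
    scored.into_iter().take(k).map(|(_, s)| s.clone()).collect()
}

fn main() {
    let archive = vec![
        ArchivedSignal { symbol: "AAPL".into(), regime: "risk_on".into(),
            signal_type: "vol_spike".into(), score: 9.1,
            fwd_return_5d: 0.021, fwd_return_20d: 0.034 },
        ArchivedSignal { symbol: "MSFT".into(), regime: "risk_off".into(),
            signal_type: "vol_spike".into(), score: 8.7,
            fwd_return_5d: -0.012, fwd_return_20d: 0.004 },
    ];
    let best = top_precedents(&archive, "AAPL", "risk_on", "vol_spike", 9.2, 1);
    println!("best match: {} ({:+.1}% 5d, {:+.1}% 20d)",
        best[0].symbol, best[0].fwd_return_5d * 100.0, best[0].fwd_return_20d * 100.0);
}
```

The forward-return fields of the matched rows are what feed the precedent statement in the narrative.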
The Claude Haiku synthesis prompt was structured as a fill-in-the-template call rather than an open-ended generation. The macro regime block, sector context block, correlation divergence block, historical precedent block, and catalyst block were each pre-formatted as structured text before insertion into the prompt. The prompt enforced a 3-sentence output limit and required the final sentence to be a directional bias statement: NEUTRAL, BULLISH, or BEARISH with a time horizon. Synthesis cost averaged $0.02 per call. At an estimated 30,000 signals per month requiring real-time synthesis, monthly API cost before caching was approximately $600.
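The fill-in-the-template structure amounts to a single format call over the five pre-formatted blocks; the headings and instruction wording below are assumptions for illustration, not the production prompt:

```rust
/// Assemble the synthesis prompt from the five pre-formatted context
/// blocks. Block contents arrive as structured text from the fetch layer.
fn build_prompt(
    macro_block: &str,
    sector_block: &str,
    correlation_block: &str,
    precedent_block: &str,
    catalyst_block: &str,
) -> String {
    format!(
        "You are a market context synthesizer.\n\
         MACRO REGIME:\n{macro_block}\n\
         SECTOR CONTEXT:\n{sector_block}\n\
         CORRELATION DIVERGENCE:\n{correlation_block}\n\
         HISTORICAL PRECEDENT:\n{precedent_block}\n\
         CATALYSTS:\n{catalyst_block}\n\
         Write at most 3 sentences. The final sentence must state a \
         directional bias (NEUTRAL, BULLISH, or BEARISH) with a time horizon."
    )
}

fn main() {
    let prompt = build_prompt(
        "VIX 28, Fed decision week",
        "XLK +1.2% on the day",
        "AAPL diverging from QQQ",
        "8 similar signals, avg +2.1% 5-day forward return",
        "earnings in 3 days",
    );
    println!("{} chars", prompt.len());
}
```

Constraining the model to fill a template rather than free-generate is what keeps the output within the 1,000 to 1,500 token prompt budget and the fixed sentence limit.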
Redis caching used a compound key of symbol + signal_type + date_hour. Rounding the timestamp to the hour ensured repeated queries for the same symbol within a 60-minute window returned the cached narrative. Estimated cache hit rate was 70% at 30,000 signals per month, reducing actual Haiku API calls to approximately 9,000 per month.
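The hour-bucketed compound key can be sketched as follows; the `ctx:` prefix and field order are assumptions, but the hour truncation is the mechanism described above:

```rust
/// Compound cache key: symbol + signal_type + timestamp rounded down to
/// the hour, so repeat lookups inside the 60-minute TTL window collide
/// on the same key and hit the cached narrative.
fn cache_key(symbol: &str, signal_type: &str, unix_secs: u64) -> String {
    let hour_bucket = unix_secs - (unix_secs % 3600); // truncate to hour
    format!("ctx:{symbol}:{signal_type}:{hour_bucket}")
}

fn main() {
    // Two queries 10 minutes apart inside the same hour share a key.
    let a = cache_key("AAPL", "vol_spike", 1_700_000_000);
    let b = cache_key("AAPL", "vol_spike", 1_700_000_000 + 600);
    assert_eq!(a, b);
    println!("{a}");
}
```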
CHAPTER 04
The contextual synthesis latency target of 1.5 seconds at p95 was validated in load testing against the ClickHouse query patterns used by each layer. Macro regime fetch ran at 100 to 150 milliseconds cold. Cross-asset correlation fetch ran at 100 to 150 milliseconds cold with 70% hit rate. Historical precedent ran at 200 to 300 milliseconds cold with 80% cache hit rate at 24-hour TTL. Parallel execution produced a combined p95 fetch latency of approximately 500 milliseconds. Adding Claude Haiku synthesis produced a total p95 delivery latency of 1.0 to 1.5 seconds.
Anomalies above 8.5 score converted to directionally correct outcomes in approximately 64% of historical cases, compared to 51% for signals in the 6.0 to 8.5 range.
CHAPTER 05
DECISION · 01
The original anomaly detection design attempted to use a local language model for narrative synthesis. At 14B parameters, synthesis latency exceeded 45 seconds per call, breaking the real-time delivery budget by a factor of 30. The decision to use Claude Haiku via the Anthropic API solved both the latency and quality problems: synthesis latency dropped to 500 to 1,000 milliseconds, with qualitatively richer output.
DECISION · 02
The 5-layer parallel fetch architecture was chosen over sequential fetch because the dominant latency was the slowest individual layer, not the sum. Running all 5 layers sequentially would have produced p95 fetch latency exceeding the total 1.5-second budget.
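The max-versus-sum argument can be checked directly; the precedent and correlation figures are the cold p95 numbers cited above, while the sector and catalyst timings are illustrative assumptions:

```rust
/// Parallel fetch latency is bounded by the slowest layer;
/// sequential fetch latency is the sum of all layers.
fn latency_bounds(layer_p95_ms: &[u64]) -> (u64, u64) {
    let parallel = *layer_p95_ms.iter().max().unwrap();
    let sequential: u64 = layer_p95_ms.iter().sum();
    (parallel, sequential)
}

fn main() {
    // macro, sector, correlation, precedent, catalysts (cold p95, ms)
    let layers = [150u64, 120, 150, 300, 100];
    let (parallel, sequential) = latency_bounds(&layers);
    println!("parallel ~{parallel} ms vs sequential ~{sequential} ms");
}
```

With these figures the sequential path alone consumes over half the 1.5-second budget before Haiku synthesis even starts, which is why the fan-out was non-negotiable.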
DECISION · 03
The decision to tier synthesis by score was driven by cost and latency constraints, not by information value. A customer receiving a 6.5-score signal on a symbol they hold would benefit from contextual narrative. But serving that narrative in real-time for all signals above 6.0 would require approximately 4 times more Haiku API calls.