Kafka evaluated for the signal transport layer. Redis Streams chosen for Avo's own pipeline. Avo builds Kafka integrations for clients who already run it.
HOW WE USE IT
Apache Kafka is the dominant event streaming platform in the enterprise. It handles durable, ordered, replayable event logs across multiple consumer groups with strong delivery guarantees. Any client running significant event volume (10,000+ events per second sustained), operating on AWS MSK or Confluent Cloud, or needing cross-datacenter replication is a Kafka shop. Avo knows the platform well enough to build on top of it and to make the honest call on when it is the right choice versus cheaper alternatives.
For Avo's own signal transport pipeline, we evaluated Kafka and chose Redis Streams instead. The reason is operational surface area. Kafka requires a broker cluster (minimum 3 nodes for replication), ZooKeeper or KRaft for coordination, and ongoing tuning of partition count, replication factor, and consumer group lag. For a single-server pipeline running under 1,000 events per minute, that is engineering overhead with no throughput benefit. Redis Streams gives us at-least-once delivery, consumer groups, offset tracking, and replay from a service that is already running for caching and rate limiting. We run zero additional infrastructure for it.
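The delivery guarantees described above (consumer groups, at-least-once delivery, explicit acknowledgement, replay) map onto a handful of Redis commands. A sketch of the pattern as a redis-cli session; the stream, group, and consumer names are hypothetical:

```
# Producer appends an event to the stream (Redis assigns the entry ID)
XADD signals:events * kind regime payload "<json>"

# One-time setup: create a consumer group that reads from the stream start
XGROUP CREATE signals:events pipeline 0

# Consumer claims up to 100 new entries on behalf of the group
XREADGROUP GROUP pipeline worker-1 COUNT 100 STREAMS signals:events >

# Acknowledge an entry only after processing succeeds
XACK signals:events pipeline 1700000000000-0

# Entries read but never acked stay pending and can be inspected or replayed
XPENDING signals:events pipeline
```

Entries that a crashed worker read but never acknowledged remain in the group's pending list, which is what gives the at-least-once guarantee without any additional infrastructure.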
For client work, Avo builds Kafka integrations when the client already operates a Kafka cluster or when their event volume makes Redis Streams the wrong choice. The most common scenario is a client whose backend team already runs Kafka for application events (user signups, order placements, clickstream), and they want to wire a new service into that existing bus.
Example workflow: connecting a new analytics microservice to a client's existing Kafka cluster.

1. Add the kafka-rust crate (or rdkafka via librdkafka bindings) to the service Cargo.toml.
2. Configure the consumer with group.id set to the new service name, auto.offset.reset = earliest for initial replay of historical events, and enable.auto.commit = false for explicit offset control.
3. Consume from the topic in a Tokio task. On each batch, process the events and write results to ClickHouse.
4. Only commit offsets after the ClickHouse insert acknowledges. If the insert fails, the next restart reprocesses from the last committed offset.
5. Store the consumer group offset in Kafka (not externally) to avoid the dual-write consistency problem.
6. Monitor consumer lag via the kafka-lag-exporter sidecar. Alert if lag exceeds 5 minutes of throughput equivalent.
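The commit-after-write discipline in steps 4 and 5 is the part worth getting exactly right. A minimal std-only Rust sketch of the offset logic, with `MockConsumer` and `MockSink` as hypothetical stand-ins for the rdkafka consumer and the ClickHouse client:

```rust
// Commit-after-write sketch: the sink write is acknowledged before the
// offset advances, so a failed insert leaves the offset untouched and the
// batch is replayed on restart (at-least-once delivery).

struct MockConsumer {
    committed: u64, // last committed offset, stored in Kafka in production
}

struct MockSink {
    fail: bool, // simulate a ClickHouse insert failure
    rows: Vec<String>,
}

impl MockSink {
    fn insert(&mut self, batch: &[String]) -> Result<(), String> {
        if self.fail {
            return Err("insert failed".into());
        }
        self.rows.extend_from_slice(batch);
        Ok(())
    }
}

/// Write the batch first; commit the offset only after the write succeeds.
fn process_batch(
    consumer: &mut MockConsumer,
    sink: &mut MockSink,
    batch: &[String],
    next_offset: u64,
) -> Result<(), String> {
    sink.insert(batch)?;              // 1. durable write, may fail
    consumer.committed = next_offset; // 2. commit only after the ack
    Ok(())
}

fn main() {
    let mut consumer = MockConsumer { committed: 0 };

    let mut ok_sink = MockSink { fail: false, rows: Vec::new() };
    process_batch(&mut consumer, &mut ok_sink, &["e1".into(), "e2".into()], 2).unwrap();
    assert_eq!(consumer.committed, 2);

    // A failed insert must not advance the offset.
    let mut bad_sink = MockSink { fail: true, rows: Vec::new() };
    assert!(process_batch(&mut consumer, &mut bad_sink, &["e3".into()], 3).is_err());
    assert_eq!(consumer.committed, 2);
}
```

Reversing the two steps (commit, then write) silently drops the batch on an insert failure; keeping the offset inside Kafka, as step 5 says, avoids having to make this two-step sequence atomic across systems.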
Tradeoffs to think through honestly. Kafka's exactly-once semantics require idempotent producers and transactional consumers. Implementing this correctly adds 2 to 3 weeks of engineering. If the client can tolerate at-least-once processing (most analytical pipelines can, with deduplication at write time), skip the transaction overhead entirely. Schema evolution is the other consistent pain point: without a schema registry enforcing compatibility, a producer adding a required field breaks all consumers immediately. Avo defaults to optional fields and envelope versioning to avoid this. The operational cost of a Kafka cluster (patching, broker restarts, partition rebalancing) is real and should be priced into any engagement that requires Avo to own the infrastructure.
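Write-time deduplication, the cheap alternative to Kafka transactions mentioned above, can be as simple as keying inserts on a stable event ID. A std-only Rust sketch; the `DedupWriter` type and event shape are illustrative, not a real API:

```rust
use std::collections::HashSet;

/// At-least-once delivery means the same event can arrive twice after a
/// replay. Dropping duplicates by a stable event ID at write time gives
/// effectively-once results without transactional consumers.
struct DedupWriter {
    // In production this would be a ReplacingMergeTree sort key in
    // ClickHouse or a TTL'd set, not an unbounded in-memory HashSet.
    seen: HashSet<String>,
    written: Vec<String>,
}

impl DedupWriter {
    fn new() -> Self {
        DedupWriter { seen: HashSet::new(), written: Vec::new() }
    }

    /// Returns true if the event was written, false if it was a duplicate.
    fn write(&mut self, event_id: &str, payload: &str) -> bool {
        // HashSet::insert returns false when the ID was already present.
        if !self.seen.insert(event_id.to_string()) {
            return false; // replayed event, already stored
        }
        self.written.push(payload.to_string());
        true
    }
}

fn main() {
    let mut w = DedupWriter::new();
    assert!(w.write("evt-1", "first"));
    assert!(!w.write("evt-1", "first")); // replay after restart: dropped
    assert!(w.write("evt-2", "second"));
    assert_eq!(w.written.len(), 2);
}
```

The same idea applies regardless of store: pick an ID that survives replay (producer-assigned, not consume-time generated) and make the write idempotent on it.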
Production numbers
Min event rate to justify Kafka: >10K/sec
Avo internal transport: Redis Streams
Client-side use case: existing Kafka clusters
Offset commit pattern: after write, not before
Argus generated regime signals, novelty anomalies, and trend-following calls from a 1,400-feature engine built on AVX2 SIMD.
Macro endpoint: 3.2x faster (1,476ms → 465ms warm)
We built a 723M-row market data pipeline ingesting 10 exchanges simultaneously at under 50ms tick-to-storage latency.
Total rows stored: 723M+
Start a project
Most projects ship in under two weeks. Start with a free 30-minute discovery call.