Python handles IBKR integration via ib_insync, ML model training with XGBoost and LightGBM, and Claude API agents for outreach.
HOW WE USE IT
Python 3.12 handles three distinct workloads in the Avo stack: IBKR broker integration, ML model training, and Claude API agent orchestration. It is explicitly not used for anything latency-sensitive.
The IBKR connection runs via ib_insync 0.9.86 against IB Gateway 10.37 on port 4002 (paper) or 4001 (live). IB Gateway runs headless via Xvfb :99, managed by IBC for auto-login, and restarts automatically at 23:55 UTC daily to match IBKR's mandatory maintenance cycle. The connection from Python is a single TCP socket: no session timeouts, no competing-session bugs that plague the Client Portal Gateway. ib_insync's event loop integrates cleanly with asyncio, so position updates and order fills can be awaited rather than polled.
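A minimal sketch of that event-driven pattern, assuming IB Gateway is reachable on localhost (the ports are the ones above; the `clientId` and handler are illustrative):

```python
import asyncio


def gateway_port(mode: str) -> int:
    # Ports from this setup: 4002 = IB Gateway paper trading, 4001 = live.
    return {"paper": 4002, "live": 4001}[mode]


async def watch_fills(mode: str = "paper") -> None:
    # Deferred import so the port helper is usable without ib_insync installed.
    from ib_insync import IB

    ib = IB()
    await ib.connectAsync("127.0.0.1", gateway_port(mode), clientId=1)
    # Fills arrive as events over the single TCP socket; nothing is polled.
    ib.execDetailsEvent += lambda trade, fill: print(fill.execution)
    while ib.isConnected():
        await asyncio.sleep(1)
```

Run with `asyncio.run(watch_fills("paper"))` against a live Gateway session; the daily 23:55 UTC restart will drop the socket, so a production loop would wrap this in a reconnect.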
ML training uses LightGBM and XGBoost (Python 3.12, scikit-learn 1.4) against feature tables exported from ClickHouse. The regime detection model was trained on 270M+ minute bars. Training runs as a nightly Python batch job on the Hetzner server; the serialized model is loaded into the Rust argus-regime binary via a C ABI bridge for inference. Python computes the model; Rust runs it in production. This split keeps inference latency predictable (no Python GIL, no GC pauses) while preserving the ML ecosystem for training.
Example workflow: training a new signal model for a client's data product.
1. Export the feature table from ClickHouse as a Parquet file using clickhouse-connect. Filter to the relevant date range and symbol universe.
2. Load in pandas, split into train (80%) and validation (20%) sets by time, never by random shuffle (avoids look-ahead leakage).
3. Train a LightGBM classifier with early stopping on the validation AUC. Use SHAP values to identify the top 10 predictive features before finalizing the model.
4. Serialize the trained model with joblib. Write a C-compatible inference function so the model can be called from Rust via ctypes-style FFI without spawning a Python subprocess.
5. Load the .so in Rust using unsafe extern "C" blocks. Gate the unsafe block with a safety comment documenting the invariants.
6. Run the inference benchmark in the Rust test suite. Target under 2ms per inference call including the FFI overhead.
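Steps 2 through 4 of the workflow above can be sketched as follows. This is a hypothetical illustration, not the production script: the column names (`ts`, `label`) and hyperparameters are assumptions.

```python
def time_split_index(n_rows: int, train_frac: float = 0.8) -> int:
    """Cutoff index for a chronological train/validation split.

    Rows must already be sorted by timestamp; taking the first
    train_frac of rows as training data avoids the look-ahead
    leakage a random shuffle would introduce.
    """
    return int(n_rows * train_frac)


def train_signal_model(parquet_path: str, model_path: str) -> None:
    # Heavy imports are deferred so the split helper stands alone.
    import joblib
    import lightgbm as lgb
    import pandas as pd

    df = pd.read_parquet(parquet_path).sort_values("ts")
    cut = time_split_index(len(df))
    features = [c for c in df.columns if c not in ("ts", "label")]
    train, valid = df.iloc[:cut], df.iloc[cut:]

    model = lgb.LGBMClassifier(n_estimators=2000, learning_rate=0.05)
    model.fit(
        train[features], train["label"],
        eval_set=[(valid[features], valid["label"])],
        eval_metric="auc",
        # Stop when validation AUC hasn't improved for 100 rounds.
        callbacks=[lgb.early_stopping(stopping_rounds=100)],
    )
    joblib.dump(model, model_path)
```

The split-by-index approach only works because the frame is sorted by `ts` first; that ordering is the whole leakage guarantee.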
Lead scoring V2 uses XGBoost plus logistic regression with SEC IAPD firmographic features and engagement signals from Instantly webhooks. The training script is a 400-line Python file that pulls from PostgreSQL (engagement_metrics), ClickHouse (behavioral signals), and PostHog (website visits) to build the training set.
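A sketch of how the three sources might be joined and the two model outputs blended. Everything here is an assumption: the join key `email`, the equal blend weight, and the helper names are illustrative, not the production schema.

```python
def build_training_frame(pg_df, ch_df, ph_df):
    """Join engagement_metrics (PostgreSQL), behavioral signals
    (ClickHouse), and website visits (PostHog) into one frame.

    The shared 'email' key is an assumption about the schemas.
    """
    return (
        pg_df.merge(ch_df, on="email", how="left")
             .merge(ph_df, on="email", how="left")
             .fillna(0)
    )


def composite_score(xgb_prob: float, logit_prob: float, w: float = 0.5) -> float:
    # Blend of the XGBoost and logistic regression probabilities.
    # Equal weighting is a placeholder; the production weights
    # aren't stated in the write-up.
    return w * xgb_prob + (1 - w) * logit_prob
```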
The Claude API agents for cold email personalization run in Python using the Anthropic SDK. Each agent fetches a prospect's website, extracts key phrases, and generates a short personalized pitch. 500 leads were processed in a single batch; results stored in ClickHouse (alien.personalized_emails).
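The per-lead call can be sketched with the Anthropic SDK as below. The prompt template and model name are assumptions, not the production values; the client reads `ANTHROPIC_API_KEY` from the environment.

```python
def build_pitch_prompt(company: str, phrases: list[str]) -> str:
    # Illustrative template, not the production prompt.
    bullets = "\n".join(f"- {p}" for p in phrases)
    return (
        f"Write a two-sentence cold-email opener for {company}. "
        f"Reference these phrases from their website:\n{bullets}"
    )


def personalize(company: str, phrases: list[str]) -> str:
    # Deferred import so build_pitch_prompt works without the SDK installed.
    from anthropic import Anthropic

    client = Anthropic()
    msg = client.messages.create(
        model="claude-sonnet-4-20250514",  # model choice is an assumption
        max_tokens=200,
        messages=[
            {"role": "user", "content": build_pitch_prompt(company, phrases)}
        ],
    )
    return msg.content[0].text
```

Batching 500 leads is then a loop (or an `asyncio` gather with the async client) over `personalize`, with results written to ClickHouse.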
Production numbers
Python version: 3.12
IBKR port: 4002 (paper)
Training data (LightGBM): 270M+ bars
Email leads processed: 500
We rebuilt the signal scoring pipeline from scratch, fixing look-ahead contamination and adding a top-decile filter that produced 72.2% win rate on selected signals.
72.2% Win rate (top-decile signals)
Read case study →
AI / Machine Learning
We found a 50-percentage-point win rate spread between market regimes, fixed a regime classifier that was routing by symbol name instead of market structure, and built a live suppression system for anti-patterns.
62.1% Win rate in choppy regime
Read case study →
AI / Machine Learning
We upgraded from a static 6-factor lead score to a three-tier behavioral composite integrating email engagement, AUM, and headcount, projecting a 5 to 10% conversion uplift.
3,889 Tier A leads (V1)
Read case study →
AI / Machine Learning
We rebuilt the backtesting engine with point-in-time cursors and separate ingestion timestamps, collapsing the backtest-to-live delta from 37 percentage points to 1.4 points.
37→1.4pp Backtest-to-live delta (biased → clean)
Read case study →
AI / Machine Learning
We replaced fixed 5% position sizing with calibrated half-Kelly plus drawdown scaling, improving Sharpe from 0.79 to 1.34 and cutting maximum drawdown from 18.3% to 9.7%.
1.34 Sharpe ratio, 4-week (was 0.79)
Read case study →
Platforms
We debugged 65 compounding bugs across seven subsystems of a live trading engine, fixed a score overflow that silently blocked all dark_matter_rs signals, and cut Redis memory from 11.8GB to 7.15GB.
65 Bugs fixed in one session
Read case study →
Start a project
Most projects ship in under two weeks. Start with a free 30-minute discovery call.
Start a project →