design-note · 2026-04-17

paisamaker-dual-lane

Subject: execution infrastructure for automated 0DTE options trading. Started March 30, 2026 · IBKR paper account · active.

This is paper trading to validate the execution infrastructure, not a strategy backtest. The measurement target is safety-gate behavior under live conditions, not P&L.

The problem

Automate 0DTE SPX options trading against two independent signal sources (a custom GEX evaluator and a separate advisor service) without blowing up when one part of the pipeline misbehaves. The system runs 24/7 on a Hetzner VM, polls a gamma-data vendor every 30 seconds, evaluates ~100 signals per session, and executes 3–5 trades per day through an IBKR paper account.

The decision I almost made

Single-threaded order submission with retries. One loop: read signals → pick strike → submit order → wait for fill → repeat. Simple. Testable. Obvious.

Why I killed it

The IBKR gateway has a ~6-second reconnect window I observed in pre-production testing. During that window, a synchronous retry loop blocks the entire pipeline — meaning the next signal doesn't even get evaluated, let alone acted on. On a 30-second cadence, losing 6 seconds to a reconnect means missing 20% of the evaluation window. On a 0DTE product where the last hour is where the move happens, that's unacceptable.
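A toy illustration of the blocking problem, not the production code: if the reconnect wait is awaited instead of slept through synchronously, the evaluation loop keeps ticking. The function names and the scaled-down timings (0.06 s standing in for the ~6-second reconnect, 0.01 s standing in for the polling cadence) are assumptions made only so the sketch runs quickly.

```python
import asyncio


async def reconnect(delay_s: float, log: list[str]) -> None:
    """Stand-in for a gateway reconnect; awaiting yields control to the event loop."""
    log.append("reconnect:start")
    await asyncio.sleep(delay_s)
    log.append("reconnect:done")


async def evaluate(ticks: int, interval_s: float, log: list[str]) -> None:
    """Stand-in for the signal evaluation loop."""
    for i in range(ticks):
        log.append(f"eval:{i}")
        await asyncio.sleep(interval_s)


async def run_demo(reconnect_s: float = 0.06, interval_s: float = 0.01) -> list[str]:
    """Run both concurrently and return the order in which events happened."""
    log: list[str] = []
    await asyncio.gather(
        reconnect(reconnect_s, log),
        evaluate(3, interval_s, log),
    )
    return log


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

With a synchronous `time.sleep` in place of the awaited sleep, every `eval:` entry would land after `reconnect:done`, which is the failure mode described above.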

I also realized the two signal sources needed different safety semantics. The GEX evaluator fires on regime flips — high frequency, noisy, needs tight per-ticker cooldowns. The advisor service fires on CASCADE_WATCH / CHARM_SQUEEZE / GAMMA_RECLAIM alerts — lower frequency, higher conviction, needs watermark-based deduplication. One loop couldn't do both.
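To make the difference in safety semantics concrete, here is a minimal sketch of the two gate styles. The class names, the `cooldown_s` parameter, and the idea that advisor alerts carry a monotonically increasing sequence number are all assumptions for illustration; the note does not specify how either gate is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class CooldownGate:
    """Per-ticker cooldown: reject a signal that arrives too soon after the last accepted one."""

    cooldown_s: float  # hypothetical tunable; the real value is not stated in this note
    _last_accept: dict[str, float] = field(default_factory=dict, init=False)

    def allow(self, ticker: str, now: float) -> bool:
        last = self._last_accept.get(ticker)
        if last is not None and now - last < self.cooldown_s:
            return False  # still cooling down; the rejected signal does not reset the timer
        self._last_accept[ticker] = now
        return True


@dataclass
class WatermarkGate:
    """Watermark dedup: accept an alert only if its sequence number exceeds the highest seen."""

    _high_water: dict[tuple[str, str], int] = field(default_factory=dict, init=False)

    def allow(self, ticker: str, alert_type: str, seq: int) -> bool:
        key = (ticker, alert_type)
        if seq <= self._high_water.get(key, -1):
            return False  # replayed or out-of-order alert
        self._high_water[key] = seq
        return True
```

The cooldown gate is time-based and forgets nothing but the last accept time, which suits a noisy source. The watermark gate is identity-based: a high-conviction alert is never suppressed by the clock, only by having been seen already.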

What I built instead

Dual-lane architecture. Separate ingestion paths for gexwatch and advisor. Per-source, per-ticker gates. One lane failing doesn't block the other. Unified execution dispatcher downstream, but upstream they don't know about each other.
IBKR mode state machine. Four modes: DATA_ONLY → RECONNECT_WARMUP → FULLY_LIVE on the way up, plus EXECUTION_DEGRADED on disconnect. Trades fire only in FULLY_LIVE. After a reconnect, the connection must stay stable for 60 seconds before the system returns to FULLY_LIVE. No synchronous retries. The system trades less during instability, by design.
Fail-closed by default. 20+ safety gates, each returning a specific reject code: kill switch (2 levels: halt entries / flatten all), spread gate (rejects when the bid-ask spread exceeds 30%), bid validation (rejects orders into empty markets), contract validation (0DTE expiry, trading class, right/direction match), paper-port fail-closed (rejects if port isn't 4002 or 7497), EOD flatten (force close 15:45 ET with MKT escalation starting 15:35), max hold time (30–90 min by ticker), 3-tier exit escalation (LMT@mid → LMT@bid → MKT).
Position deduplication with an atomic UNIQUE index. Two fast signals on the same contract can't race to create duplicate positions. The DB rejects the second one before IBKR sees it.
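The IBKR mode state machine above can be sketched roughly as follows. The mode names and the 60-second stability requirement come from this note; the class name, method names, the injected `now` clock, and the exact transition rules (for example, that any connect event enters RECONNECT_WARMUP and any disconnect resets the warmup timer) are assumptions, not the production implementation.

```python
from enum import Enum


class Mode(Enum):
    DATA_ONLY = "DATA_ONLY"
    RECONNECT_WARMUP = "RECONNECT_WARMUP"
    FULLY_LIVE = "FULLY_LIVE"
    EXECUTION_DEGRADED = "EXECUTION_DEGRADED"


class ModeMachine:
    """Tracks whether order submission is currently allowed."""

    def __init__(self, warmup_s: float = 60.0) -> None:
        self.warmup_s = warmup_s
        self.mode = Mode.DATA_ONLY
        self._connected_since: float | None = None

    def on_connect(self, now: float) -> None:
        # Every (re)connect starts a fresh warmup; no shortcut to FULLY_LIVE.
        self._connected_since = now
        self.mode = Mode.RECONNECT_WARMUP

    def on_disconnect(self) -> None:
        # Dropping the connection discards any warmup progress.
        self._connected_since = None
        self.mode = Mode.EXECUTION_DEGRADED

    def tick(self, now: float) -> Mode:
        # Promote to FULLY_LIVE only after an uninterrupted warmup period.
        if (
            self.mode is Mode.RECONNECT_WARMUP
            and self._connected_since is not None
            and now - self._connected_since >= self.warmup_s
        ):
            self.mode = Mode.FULLY_LIVE
        return self.mode

    def can_trade(self) -> bool:
        # Fail closed: anything other than FULLY_LIVE blocks execution.
        return self.mode is Mode.FULLY_LIVE
```

Passing the clock in as `now` rather than reading it inside the class keeps the machine deterministic under test, and `can_trade()` gives the dispatcher a single question to ask instead of a retry loop.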

What I'm watching for

Fill rate at market open (currently ~33%, improving). The 30% spread gate is aggressive during the first 15 minutes when bid-ask is wide. Tuning this means balancing fill rate against slippage.
Kill-switch trigger rate. L1 (halt entries) should fire occasionally on abnormal volatility. L2 (flatten all) should never fire in normal operation. If L2 ever fires, something upstream is broken.
False rejections from the spread gate. I'm logging every rejection with full context. If a real tradable contract is being rejected as too wide, the gate threshold needs revisiting.

What Week 1 proved

All 20+ safety gates held. Zero manual interventions. The system survived one IBKR reconnect event cleanly — no phantom orders, no stuck positions. Best single trade was +$410 on an SPX 0DTE.

What this tests is the infrastructure. Whether the strategy generates positive expected value is a separate, longer-run question — at least 6–8 weeks of paper data before that conversation is honest.

Stack

Python 3.12 · ib_insync · asyncio · single-host file-mounted state · GEXBot REST API · systemd + Docker · GitHub Actions CI · Discord webhooks · YAML config (56 tunable params) · 1,373 tests · 26,458 lines · 20+ safety gates
