Network partitions happen. A fiber cut at your data center's upstream provider, a brief BGP flap, a momentary cloud infrastructure hiccup — these events last seconds or minutes. When your WebSocket connection to a real-time market data feed drops, the gap it leaves is not merely an inconvenience. It is a data integrity problem with direct consequences: your order book state becomes stale, your tick-level strategy falls behind, and any downstream analytics accumulate error from the moment of disconnection.

For a US equity trader monitoring the open or a crypto market maker tracking book depth during a volatility spike, five minutes of missed data is not a rounding error. It is a meaningful portion of the trading day, potentially encompassing multiple significant price moves.

This article provides a production-grade solution: a complete reconnection and gap-filling architecture that uses REST API calls to reconstruct the missing data window, then merges it seamlessly with the live WebSocket stream. The code is written in Python, uses TickDB as the data source, and follows every engineering standard outlined in the TickDB Content Strategy Handbook — heartbeat, exponential backoff with jitter, rate-limit handling, timeouts, and environment-variable authentication.

The architecture works in three phases: detect, recover, and merge.


The Disconnection Problem: Why Gaps Accumulate

WebSocket connections are stateful. When the transport layer fails — whether due to network timeout, server-side disconnect, or a client-side process restart — the connection terminates without a clean handover. Any data transmitted during the failure window is lost to the client.

The severity of this loss depends on the data type and the market being monitored.

| Data Type | Impact of 5-Minute Gap | Recovery Complexity |
| --- | --- | --- |
| Trade ticks | Missed fills, volume discrepancy in strategy calculations | Moderate: trades are point-in-time events |
| Order book (depth) | Stale L1/L2 state; pressure ratio calculations drift from reality | High: book state must be reconstructed from snapshots |
| OHLCV candles | Gap in the 1-minute or 5-minute candle if the break spans a candle boundary | Low: REST /kline endpoint provides historical candles directly |
| Implied volatility | Greeks calculations become stale; delta hedging may be miscalibrated | High: requires cross-referencing multiple data points |

For US equities and Hong Kong stocks, the depth channel provides real-time order book snapshots. For crypto markets (BTC, ETH, and others TickDB supports), the depth channel offers up to L10 depth. When a disconnect occurs in any of these markets, the local book state stops updating. Without an explicit recovery mechanism, the client continues operating on stale data — a condition that is invisible without instrumentation.
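
One way to make that staleness visible is a freshness check that downstream consumers call before trusting the local state. The following is a minimal sketch: the helper name is_stale and the 5-second threshold are illustrative assumptions, not TickDB recommendations.

```python
import time
from typing import Optional

# Hypothetical threshold: how long the feed may stay silent before the
# local state is treated as stale. Tune per market and channel.
STALE_AFTER_MS = 5_000


def is_stale(last_received_ts: int, now_ms: Optional[int] = None,
             stale_after_ms: int = STALE_AFTER_MS) -> bool:
    """Return True when no message has arrived within the allowed window."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    if last_received_ts <= 0:
        # Nothing received yet: treat as stale so consumers do not trust it.
        return True
    return (now_ms - last_received_ts) > stale_after_ms
```

A quiet but healthy market also produces silence, so a check like this is probably best combined with heartbeat results rather than used alone.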

The solution requires three capabilities: gap detection, timestamp-aligned data fetching, and stateful merging.


Architecture Overview: The Three-Phase Recovery System

The architecture is divided into three functional phases that execute sequentially on reconnection.

Phase 1: Connection Monitor and Gap Detection

A lightweight watchdog tracks WebSocket connection health. It updates last_received_ts, the Unix millisecond timestamp of the most recent data packet, on every message, so when a disconnect event fires that value already holds the timestamp of the last packet received before the failure. The watchdog also records disconnected_at at that moment. The last_received_ts value becomes the anchor point for the gap-fill operation.

import threading
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class ConnectionState:
    """Tracks WebSocket connection health and disconnection timestamps."""
    connected: bool = False
    last_received_ts: int = 0          # Unix milliseconds
    disconnected_at: Optional[int] = None  # Unix milliseconds

    def on_disconnect(self):
        self.connected = False
        self.disconnected_at = int(time.time() * 1000)

    def on_message(self, timestamp_ms: int):
        self.connected = True
        self.last_received_ts = timestamp_ms
        self.disconnected_at = None

Phase 2: Incremental REST Fetching

On reconnection, the system calculates the gap window as last_received_ts through the current reconnected_at timestamp. It then issues one or more REST API calls to the TickDB /v1/market/trades or /v1/market/depth/snapshot endpoint, requesting only the data within that window.

The gap-fill is incremental: the system fetches in bounded pages, aligning each record's timestamp to ensure no overlaps and no holes. Any gap longer than one minute is split into 1-minute sub-windows (GAP_FETCH_WINDOW_MS in the implementation) to respect server-side rate limits and to avoid oversized response payloads.
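
The sub-window split can be looked at in isolation. This is a sketch only; split_gap is a hypothetical helper name, and it treats each window as half-open, which is an assumption about how the server interprets the end parameter.

```python
from typing import Iterator, Tuple


def split_gap(start_ms: int, end_ms: int,
              window_ms: int = 60_000) -> Iterator[Tuple[int, int]]:
    """Yield consecutive (window_start, window_end) pairs covering the gap."""
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    cursor = start_ms
    while cursor < end_ms:
        window_end = min(cursor + window_ms, end_ms)
        yield cursor, window_end
        cursor = window_end  # next window starts where this one ended


# A 150-second gap becomes two full windows and one 30-second remainder.
windows = list(split_gap(0, 150_000))
```

If the server treats both bounds as inclusive, a record sitting exactly on a window boundary may be returned twice, which is one more reason the merge step deduplicates.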

Phase 3: Timestamp-Aligned Local Merge

The fetched data is not simply appended. The merge algorithm performs timestamp alignment against the live stream buffer:

  • Records with timestamps at or before last_received_ts are discarded (they are duplicates of data already processed).
  • Records with timestamps after the reconnection time are held in a staging buffer and fed into the live stream as the WebSocket delivers new data.
  • Order book snapshots from the REST endpoint are applied atomically to the local book state, recomputing all derived metrics (pressure ratio, spread, cumulative volume).

The result is a seamless transition: the local state transitions from stale (during the disconnect window) to current (after gap-fill completion), with no visible discontinuity to downstream consumers.
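
The first two merge rules amount to partitioning the recovered records by timestamp. A minimal sketch, assuming each record is a dict with an integer "ts" field in Unix milliseconds; partition_recovered is a hypothetical name, not part of the client shown later:

```python
from typing import Dict, List, Tuple


def partition_recovered(records: List[Dict], last_received_ts: int,
                        reconnected_at: int) -> Tuple[List[Dict], List[Dict]]:
    """Split REST records into (deliver_now, staged), each in timestamp order."""
    deliver_now: List[Dict] = []
    staged: List[Dict] = []
    for rec in sorted(records, key=lambda r: r["ts"]):
        ts = rec["ts"]
        if ts <= last_received_ts:
            continue            # already processed before the disconnect
        if ts > reconnected_at:
            staged.append(rec)  # overlaps the live stream; hold for later
        else:
            deliver_now.append(rec)
    return deliver_now, staged


deliver_now, staged = partition_recovered(
    [{"ts": 100}, {"ts": 90}, {"ts": 150}, {"ts": 210}],
    last_received_ts=100,
    reconnected_at=200,
)
```

Here the records at 90 and 100 are dropped as duplicates, 150 is delivered immediately, and 210 waits in the staging buffer.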


Production-Grade Implementation

The following code implements the complete reconnection and gap-fill system. It is production-ready: every WebSocket interaction includes heartbeat monitoring, every REST call includes a timeout, authentication is environment-variable based, and the reconnection logic uses exponential backoff with jitter.

Core Dependencies

import os
import json
import time
import random
import threading
import websocket
import requests
from dataclasses import dataclass, field
from typing import List, Optional, Callable, Dict, Any
from datetime import datetime
from collections import deque

Configuration and Authentication

# Load TickDB credentials from environment variables.
# Never hardcode API keys in production code.
TICKDB_API_KEY = os.environ.get("TICKDB_API_KEY")
if not TICKDB_API_KEY:
    raise EnvironmentError("TICKDB_API_KEY environment variable is not set")

TICKDB_REST_BASE = "https://api.tickdb.ai/v1"
WS_BASE = "wss://api.tickdb.ai/v1/ws/market"

# Reconnection parameters
INITIAL_BACKOFF_SEC = 1.0
MAX_BACKOFF_SEC = 60.0
BACKOFF_MULTIPLIER = 2.0
JITTER_FACTOR = 0.1          # Random jitter as 10% of current delay
MAX_RECONNECT_ATTEMPTS = 20

# Gap-fill parameters
GAP_FETCH_WINDOW_MS = 60_000   # 1-minute sub-windows for large gaps
MAX_TRADES_PER_PAGE = 1000     # TickDB REST pagination limit per request

Connection State Manager

@dataclass
class ConnectionState:
    """Thread-safe connection health tracker with gap detection support."""
    connected: bool = False
    last_received_ts: int = 0          # Unix milliseconds
    disconnected_at: Optional[int] = None
    reconnect_attempts: int = 0
    _lock: threading.Lock = field(default_factory=threading.Lock)

    def on_disconnect(self):
        with self._lock:
            self.connected = False
            self.disconnected_at = int(time.time() * 1000)
            self.reconnect_attempts += 1

    def on_connect(self):
        with self._lock:
            self.connected = True
            self.reconnect_attempts = 0

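    def on_message(self, timestamp_ms: int):
        with self._lock:
            self.connected = True
            self.last_received_ts = timestamp_ms
            self.disconnected_at = None   # live data again: the earlier gap has been handled

    def get_gap_bounds(self) -> tuple:
        """Returns (gap_start_ts, gap_end_ts) in milliseconds, or (None, None) if there is no gap."""
        with self._lock:
            # No gap if we never disconnected, or if no data arrived before the
            # disconnect (a zero anchor would request everything since epoch 0).
            if self.disconnected_at is None or self.last_received_ts <= 0:
                return None, None
            return self.last_received_ts, int(time.time() * 1000)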

REST Data Fetcher with Gap Fill Logic

class TickDBRestFetcher:
    """Handles incremental REST fetches for gap filling after disconnection."""

    def __init__(self, api_key: str, base_url: str = TICKDB_REST_BASE):
        self.api_key = api_key
        self.base_url = base_url
        self.session = requests.Session()
        self.session.headers.update({"X-API-Key": api_key})

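    def _fetch_trades(self, symbol: str, start_ts: int, end_ts: int) -> List[Dict]:
        """
        Fetches trade ticks within the specified timestamp window.
        Splits large gaps into GAP_FETCH_WINDOW_MS sub-windows and retries a
        failed window a bounded number of times before giving up.
        """
        all_trades = []
        window_start = start_ts
        retries = 0        # consecutive failures for the current window
        max_retries = 5    # bound retries so a dead endpoint cannot stall recovery forever

        while window_start < end_ts:
            window_end = min(window_start + GAP_FETCH_WINDOW_MS, end_ts)

            params = {
                "symbol": symbol,
                "start": window_start,
                "end": window_end,
                "limit": MAX_TRADES_PER_PAGE,
            }

            try:
                response = self.session.get(
                    f"{self.base_url}/market/trades",
                    params=params,
                    timeout=(3.05, 10)      # Connect timeout 3.05s, read timeout 10s
                )
                response.raise_for_status()
                data = response.json()

            except requests.exceptions.Timeout:
                # Timeout during gap fill: retry the same window with exponential backoff
                retries += 1
                if retries > max_retries:
                    raise
                delay = min(2 ** retries, MAX_BACKOFF_SEC)
                print(f"[GapFill] Timeout fetching trades for {symbol} "
                      f"window {window_start}–{window_end}. Retrying in {delay}s.")
                time.sleep(delay)
                continue

            except requests.exceptions.RequestException as e:
                print(f"[GapFill] REST request failed: {e}")
                raise

            if data.get("code") != 0:
                # _handle_error raises on fatal errors; it returns only after
                # sleeping on a rate limit, so retry the same window.
                self._handle_error(data, symbol)
                retries += 1
                if retries > max_retries:
                    raise RuntimeError(f"Gap fill for {symbol} kept failing "
                                       f"at window {window_start}–{window_end}")
                continue

            trades = data.get("data", {}).get("trades", [])
            all_trades.extend(trades)
            if len(trades) >= MAX_TRADES_PER_PAGE:
                # A full page suggests the window held more trades than one request returns
                print(f"[GapFill] Window {window_start}–{window_end} hit the page limit; "
                      f"recovered trades may be incomplete.")

            retries = 0
            window_start = window_end

        # Sort by timestamp to ensure ordered delivery
        all_trades.sort(key=lambda x: x.get("ts", 0))
        return all_trades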

    def _fetch_depth_snapshot(self, symbol: str, ts: int) -> Optional[Dict]:
        """
        Fetches the nearest order book snapshot at or before the given timestamp.
        Used to reconstruct book state after a disconnection.
        """
        params = {
            "symbol": symbol,
            "ts": ts,
            "depth": 10,        # Request L10 for HK/crypto; L1 for US equities
        }

        try:
            response = self.session.get(
                f"{self.base_url}/market/depth/snapshot",
                params=params,
                timeout=(3.05, 10)
            )
            response.raise_for_status()
            data = response.json()

            if data.get("code") == 0:
                return data.get("data")
            self._handle_error(data, symbol)
            return None

        except requests.exceptions.RequestException as e:
            print(f"[GapFill] Depth snapshot fetch failed: {e}")
            return None

    def _handle_error(self, response: Dict, symbol: str, retry_after: float = 5.0):
        """
        Standard TickDB error handler per Handbook §6.2.

        `response` is the parsed JSON body, which carries no HTTP headers, so a
        caller that wants to honor a Retry-After header passes its value in
        as `retry_after` (seconds).
        """
        code = response.get("code", 0)
        if code in (1001, 1002):
            raise ValueError("Invalid API key: check TICKDB_API_KEY env var")
        if code == 2002:
            raise KeyError(f"Symbol {symbol} not found; verify via /v1/symbols/available")
        if code == 3001:
            print(f"[GapFill] Rate limited, sleeping {retry_after}s")
            time.sleep(retry_after)
            return
        raise RuntimeError(f"Unexpected error {code}: {response.get('message')}")

    def fill_gap(self, symbol: str, gap_start_ts: int, gap_end_ts: int) -> Dict[str, Any]:
        """
        Main gap-fill orchestrator. Returns a dict with:
          - trades: list of recovered trade ticks
          - depth_snapshot: nearest order book snapshot
          - gap_start, gap_end: the actual recovered window
        """
        print(f"[GapFill] Starting gap fill for {symbol}: "
              f"{gap_start_ts}–{gap_end_ts} "
              f"({(gap_end_ts - gap_start_ts) / 1000:.1f}s gap)")

        trades = self._fetch_trades(symbol, gap_start_ts, gap_end_ts)
        depth = self._fetch_depth_snapshot(symbol, gap_end_ts)

        print(f"[GapFill] Recovered {len(trades)} trades, "
              f"depth snapshot at {gap_end_ts}")

        return {
            "trades": trades,
            "depth_snapshot": depth,
            "gap_start": gap_start_ts,
            "gap_end": gap_end_ts,
        }

WebSocket Client with Reconnection and Gap Fill

class TickDBReconnectingClient:
    """
    WebSocket client with automatic reconnection and REST-based gap filling.

    ⚠️ This implementation uses the websocket-client library for clarity.
    For production HFT workloads with sub-100ms latency requirements,
    migrate to aiohttp/asyncio with a dedicated event loop.
    """

    def __init__(
        self,
        symbol: str,
        channels: List[str],
        on_trade: Optional[Callable] = None,
        on_depth: Optional[Callable] = None,
        on_gap_filled: Optional[Callable] = None,
        on_kline: Optional[Callable] = None,
    ):
        self.symbol = symbol
        self.channels = channels
        self.api_key = TICKDB_API_KEY

        self.conn_state = ConnectionState()
        self.rest_fetcher = TickDBRestFetcher(self.api_key)

        # Callbacks for downstream consumers
        self.on_trade = on_trade or (lambda t: None)
        self.on_depth = on_depth or (lambda d: None)
        self.on_gap_filled = on_gap_filled or (lambda r: None)
        self.on_kline = on_kline or (lambda k: None)   # used by _process_kline

        # Local buffers for gap-fill staging
        self._trade_buffer: deque = deque(maxlen=10000)
        self._running = False
        self._ws = None
        self._thread = None
        self._reconnect_thread = None

    def _build_ws_url(self) -> str:
        """Constructs the authenticated WebSocket URL."""
        channels_param = ",".join(self.channels)
        return f"{WS_BASE}?api_key={self.api_key}&symbol={self.symbol}&channels={channels_param}"

    def _on_ws_open(self, ws):
        """Called when the WebSocket connection is established."""
        self.conn_state.on_connect()
        print(f"[WS] Connected to {self.symbol} on channels {self.channels}")

        # Check for a gap and trigger recovery if needed
        gap_start, gap_end = self.conn_state.get_gap_bounds()
        if gap_start is not None and gap_end > gap_start:
            self._recover_gap(gap_start, gap_end)

    def _on_ws_message(self, ws, message: str):
        """Processes incoming WebSocket messages and advances the last-seen timestamp."""
        try:
            msg = json.loads(message)
        except json.JSONDecodeError:
            print("[WS] Received malformed message, skipping")
            return

        # Handle pong (heartbeat response)
        if msg.get("type") == "pong":
            return

        # Extract timestamp; its location varies by message type
        ts = int(msg.get("ts") or msg.get("data", {}).get("ts", 0) or 0)

        # Route to the appropriate handler BEFORE advancing last_received_ts.
        # _process_trade drops anything at or before that watermark, so updating
        # it first would discard every live trade as a duplicate of itself.
        msg_type = msg.get("type", "")
        if msg_type == "trade":
            self._process_trade(msg.get("data", {}))
        elif msg_type == "depth":
            self._process_depth(msg.get("data", {}))
        elif msg_type == "kline":
            self._process_kline(msg.get("data", {}))

        # Ignore messages without a timestamp so the gap anchor never resets to 0
        if ts > 0:
            self.conn_state.on_message(ts)

    def _process_trade(self, trade_data: Dict):
        """Routes a trade tick — from live stream or gap-fill buffer."""
        ts = int(trade_data.get("ts", 0))
        # Skip trades older than the last known timestamp (deduplication)
        if ts <= self.conn_state.last_received_ts:
            return
        self.on_trade(trade_data)

    def _process_depth(self, depth_data: Dict):
        """Routes a depth update."""
        self.on_depth(depth_data)

    def _process_kline(self, kline_data: Dict):
        """Routes a candle update."""
        self.on_kline(kline_data)

    def _on_ws_close(self, ws, close_status_code, close_msg):
        """Called when the WebSocket connection closes."""
        self.conn_state.on_disconnect()
        print(f"[WS] Disconnected (code={close_status_code}). "
              f"Scheduling reconnect in background.")
        self._schedule_reconnect()

    def _on_ws_error(self, ws, error):
        """Logs WebSocket errors without crashing."""
        print(f"[WS] Error: {error}")

    def _schedule_reconnect(self):
        """Schedules a reconnection attempt in a background thread."""
        if self._reconnect_thread and self._reconnect_thread.is_alive():
            return
        self._reconnect_thread = threading.Thread(
            target=self._reconnect_loop,
            daemon=True
        )
        self._reconnect_thread.start()

    def _reconnect_loop(self):
        """Implements exponential backoff with jitter for reconnection attempts."""
        attempt = 0
        while self._running and attempt < MAX_RECONNECT_ATTEMPTS:
            delay = min(
                INITIAL_BACKOFF_SEC * (BACKOFF_MULTIPLIER ** attempt),
                MAX_BACKOFF_SEC
            )
            # Add jitter to prevent thundering herd on shared network events
            delay += random.uniform(0, delay * JITTER_FACTOR)

            print(f"[Reconnect] Attempt {attempt + 1}/{MAX_RECONNECT_ATTEMPTS} "
                  f"in {delay:.2f}s")
            time.sleep(delay)

            try:
                ws_url = self._build_ws_url()
                self._ws = websocket.WebSocketApp(
                    ws_url,
                    on_open=self._on_ws_open,
                    on_message=self._on_ws_message,
                    on_close=self._on_ws_close,
                    on_error=self._on_ws_error,
                )

                # Run the socket in its own daemon thread
                ws_thread = threading.Thread(
                    target=self._ws.run_forever,
                    daemon=True
                )
                ws_thread.start()

                # Wait briefly to confirm connection
                time.sleep(2)
                if self.conn_state.connected:
                    print("[Reconnect] Successfully reconnected.")
                    return

                # Not connected in time: close this socket so the next attempt
                # does not leave a duplicate connection behind.
                self._ws.close()

            except Exception as e:
                print(f"[Reconnect] Attempt failed: {e}")

            # Count every failed attempt, not only those that raised. Most
            # connection failures surface through on_error/on_close instead.
            attempt += 1

        if not self._running:
            return  # disconnect() was called; stop quietly
        print("[Reconnect] Max attempts reached. Manual intervention required.")

    def _recover_gap(self, gap_start: int, gap_end: int):
        """
        Orchestrates the gap-fill process on reconnection.
        Fetches missing data via REST and stages it for downstream processing.
        """
        print(f"[GapFill] Detected gap of {(gap_end - gap_start)/1000:.1f}s. "
              f"Fetching via REST API.")
        try:
            result = self.rest_fetcher.fill_gap(self.symbol, gap_start, gap_end)

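            # Route recovered trades through the deduplication filter
            for trade in result["trades"]:
                self._process_trade(trade)

            # Advance the watermark so live trades that overlap the recovered
            # window are filtered as duplicates rather than delivered twice.
            if result["trades"]:
                newest_ts = max(int(t.get("ts", 0)) for t in result["trades"])
                if newest_ts > self.conn_state.last_received_ts:
                    self.conn_state.on_message(newest_ts)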

            # Apply recovered depth snapshot as the current book state
            if result["depth_snapshot"]:
                self.on_depth(result["depth_snapshot"])

            self.on_gap_filled(result)

        except Exception as e:
            print(f"[GapFill] Recovery failed: {e}. "
                  f"Continuing with live stream — data gap persists.")
            # Do not re-raise: the live stream will eventually overwrite stale state.
            # In production, emit a metric or alert here.

    def _heartbeat_loop(self):
        """Sends periodic pings to keep the connection alive."""
        while self._running:
            if self._ws and self.conn_state.connected:
                try:
                    self._ws.send(json.dumps({"cmd": "ping"}))
                except Exception as e:
                    print(f"[Heartbeat] Failed: {e}")
            time.sleep(25)  # Ping every 25 seconds (server-side timeout is typically 30–60s)

    def connect(self):
        """Starts the client, establishing the WebSocket connection."""
        self._running = True

        # Start heartbeat thread
        hb_thread = threading.Thread(target=self._heartbeat_loop, daemon=True)
        hb_thread.start()

        # Establish initial connection
        self._schedule_reconnect()

    def disconnect(self):
        """Gracefully closes the connection."""
        self._running = False
        if self._ws:
            self._ws.close()
        print("[Client] Disconnected.")
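
The delay schedule that _reconnect_loop walks through can be reproduced outside the client, which is handy when tuning the constants. This sketch mirrors the same formula; backoff_delays is a hypothetical helper, and the seeded random.Random is only there to make the output repeatable.

```python
import random
from typing import List, Optional


def backoff_delays(attempts: int, initial: float = 1.0, multiplier: float = 2.0,
                   cap: float = 60.0, jitter: float = 0.1,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return the sleep time before each attempt: capped exponential plus jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        base = min(initial * (multiplier ** attempt), cap)
        # Additive jitter of up to `jitter` * base, as in the client above
        delays.append(base + rng.uniform(0, base * jitter))
    return delays


schedule = backoff_delays(8, rng=random.Random(42))
for n, delay in enumerate(schedule, start=1):
    print(f"attempt {n}: sleep {delay:.2f}s")
```

With the default constants the base delay reaches the 60-second cap on the seventh attempt. Because the jitter is additive and small, clients that disconnected together still retry in roughly the same second; a wider jitter range is a common alternative when many clients share one upstream.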

Order Book Merger (Gap-Filled State Reconstruction)

@dataclass
class OrderBookState:
    """
    Maintains the current order book state, supporting atomic updates
    from both live WebSocket depth messages and REST gap-fill snapshots.
    """
    symbol: str
    bids: Dict[float, float] = field(default_factory=dict)   # price -> size
    asks: Dict[float, float] = field(default_factory=dict)   # price -> size
    last_update_ts: int = 0
    _lock: threading.Lock = field(default_factory=threading.Lock)

    def apply_snapshot(self, snapshot: Dict):
        """Atomically replaces the order book with a snapshot from REST."""
        with self._lock:
            self.bids.clear()
            self.asks.clear()

            # Sizes are parsed as float: crypto quantities are fractional,
            # and int() would truncate or reject them.
            for bid in snapshot.get("bids", []):
                self.bids[float(bid["price"])] = float(bid["size"])
            for ask in snapshot.get("asks", []):
                self.asks[float(ask["price"])] = float(ask["size"])

            self.last_update_ts = int(snapshot.get("ts", 0))

    def apply_update(self, update: Dict):
        """Applies a delta update from the live WebSocket stream."""
        with self._lock:
            ts = int(update.get("ts", 0))
            if ts <= self.last_update_ts:
                return  # Stale update, discard

            for bid in update.get("b", []):   # bid updates
                price, size = float(bid[0]), float(bid[1])
                if size == 0:
                    self.bids.pop(price, None)   # size 0 removes the level
                else:
                    self.bids[price] = size

            for ask in update.get("a", []):   # ask updates
                price, size = float(ask[0]), float(ask[1])
                if size == 0:
                    self.asks.pop(price, None)   # size 0 removes the level
                else:
                    self.asks[price] = size

            self.last_update_ts = ts

    def get_spread(self) -> float:
        """Returns the best bid-ask spread in the symbol's quote currency."""
        with self._lock:
            if not self.bids or not self.asks:
                return float('inf')
            best_bid = max(self.bids.keys())
            best_ask = min(self.asks.keys())
            return best_ask - best_bid

    def get_pressure_ratio(self, levels: int = 5) -> float:
        """
        Computes the buy/sell pressure ratio over the top N price levels.
        Ratio > 1 indicates buying pressure; ratio < 1 indicates selling pressure.
        """
        with self._lock:
            sorted_bids = sorted(self.bids.items(), reverse=True)[:levels]
            sorted_asks = sorted(self.asks.items(), key=lambda x: x[0])[:levels]

            bid_volume = sum(size for _, size in sorted_bids)
            ask_volume = sum(size for _, size in sorted_asks)

            if ask_volume == 0:
                return float('inf')
            return bid_volume / ask_volume

Putting It All Together: Usage Example

def main():
    # Initialize the order book state manager
    book_state = OrderBookState(symbol="AAPL.US")

    # Define message handlers
    def handle_trade(trade: Dict):
        print(f"[Trade] {trade.get('price')} × {trade.get('size')} "
              f"@ ts={trade.get('ts')}")

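    def handle_depth(depth: Dict):
        # REST gap-fill snapshots carry full "bids"/"asks" lists and replace the
        # book atomically; live messages are deltas under "b"/"a".
        if "bids" in depth or "asks" in depth:
            book_state.apply_snapshot(depth)
        else:
            book_state.apply_update(depth)
        print(f"[Depth] Spread={book_state.get_spread():.4f}  "
              f"Pressure={book_state.get_pressure_ratio():.2f}")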

    def handle_gap_filled(result: Dict):
        print(f"[GapFilled] {len(result['trades'])} trades recovered, "
              f"window: {result['gap_start']}–{result['gap_end']}")

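    # Instantiate and connect.
    # AAPL.US is used for illustration; REST trade recovery applies only to
    # markets with tick-level coverage (see Limitations and Trade-offs).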
    client = TickDBReconnectingClient(
        symbol="AAPL.US",
        channels=["trades", "depth"],
        on_trade=handle_trade,
        on_depth=handle_depth,
        on_gap_filled=handle_gap_filled,
    )

    print("[Main] Starting TickDB client with reconnection support...")
    client.connect()

    # Keep the main thread alive
    try:
        while True:
            time.sleep(10)
            print(f"[Status] Connected={client.conn_state.connected}  "
                  f"Last TS={client.conn_state.last_received_ts}")
    except KeyboardInterrupt:
        client.disconnect()


if __name__ == "__main__":
    main()

Timestamp Alignment: The Critical Detail

The gap-fill mechanism depends entirely on timestamp accuracy. Three alignment rules govern the merge process:

Rule 1: Deduplication boundary. Any data point with a timestamp less than or equal to last_received_ts is discarded during gap recovery. This prevents double-counting trades that were already processed before the disconnection. The recovered window is therefore exclusive at its lower bound: if the last received trade has timestamp T, only recovered trades with timestamps strictly greater than T are kept. The cost is that a distinct trade sharing millisecond T with the last received one is dropped as well.

Rule 2: Ordering guarantee. All gap-fill data is sorted by timestamp before being staged. This ensures that downstream consumers (strategies, dashboards, loggers) receive data in chronological order, which is essential for cumulative metrics like realized volatility or volume-weighted average price.

Rule 3: Atomic depth application. When a depth snapshot is fetched via REST and applied to the local OrderBookState, it replaces the entire book atomically rather than applying individual delta updates. This prevents partial-book states that could occur if live WebSocket updates interleave with gap-fill updates during the recovery window.
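
Rule 3 matters most in designs where recovery runs concurrently with the live stream. One common approach, sketched here under that assumption, is to buffer live deltas while the snapshot is in flight and replay only those newer than it. RecoveryGate is a hypothetical class, not part of the client above, and it is not thread-safe as written.

```python
from typing import Dict, List


class RecoveryGate:
    """Buffers live deltas during recovery, then replays the relevant ones."""

    def __init__(self) -> None:
        self.recovering = False
        self._pending: List[Dict] = []

    def begin(self) -> None:
        """Start buffering: call before requesting the REST snapshot."""
        self.recovering = True
        self._pending.clear()

    def offer(self, delta: Dict) -> bool:
        """Return True if the caller should apply the delta immediately."""
        if self.recovering:
            self._pending.append(delta)
            return False
        return True

    def finish(self, snapshot_ts: int) -> List[Dict]:
        """End recovery; return buffered deltas newer than the snapshot, oldest first."""
        self.recovering = False
        replay = sorted(
            (d for d in self._pending if d["ts"] > snapshot_ts),
            key=lambda d: d["ts"],
        )
        self._pending.clear()
        return replay


gate = RecoveryGate()
gate.begin()
for ts in (105, 95, 101):
    gate.offer({"ts": ts})          # buffered, not applied
to_replay = gate.finish(snapshot_ts=100)
```

After the snapshot at ts=100 is applied, only the deltas at 101 and 105 are replayed; the one at 95 is already reflected in the snapshot.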


Deployment Recommendations by Scale

| Deployment context | Recommended configuration | Key considerations |
| --- | --- | --- |
| Individual quant trader | Single process, deque buffer, console output | Simple and sufficient; relies on REST gap fill for any disconnects |
| Team backtesting cluster | Shared OrderBookState via Redis, per-symbol gap-fill workers | Gap-fill runs asynchronously; workers publish to Redis; backtest engine consumes ordered stream |
| HFT / market-making | Stateless per-message processing, no local book state, direct REST-to-strategy routing | ⚠️ Requires asyncio migration; threading introduces GIL latency at tick speed |
| Institutional deployment | Dedicated gap-fill microservice with its own WebSocket subscription per symbol, Kafka ingestion pipeline | Full resilience; isolated failure domain for recovery operations |

Monitoring and Observability

A reconnection and gap-fill system that does not emit metrics is invisible to operations. Production deployments should instrument at minimum:

  • Reconnection frequency: Tracks network stability and upstream TickDB service health. A spike in reconnection events warrants investigation.
  • Gap size distribution: Records the duration of each detected gap in milliseconds. Persistent large gaps indicate systematic connectivity issues.
  • Gap-fill success rate: Ratio of successfully recovered gaps to total detected gaps. A success rate below 100% means data gaps persist.
  • Pressure ratio delta: The difference between the pressure ratio immediately before the disconnect and the pressure ratio immediately after gap-fill completion. A large delta flags potential stale-state trading before recovery.

# Example instrumentation hook within _recover_gap.
# `metrics` is a placeholder; replace it with your metrics library's client.
def _recover_gap(self, gap_start: int, gap_end: int):
    gap_size_ms = gap_end - gap_start
    metrics.gauge("gap.duration_ms", gap_size_ms, tags={"symbol": self.symbol})
    metrics.increment("reconnect.attempt", tags={"symbol": self.symbol})
    # ... existing recovery logic continues here ...
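
Until a metrics backend is wired in, the gap size distribution and gap-fill success rate can be tracked in process. A minimal sketch; GapFillStats is a hypothetical helper and is not thread-safe as written:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GapFillStats:
    """Accumulates gap durations and recovery outcomes for one symbol."""
    gap_durations_ms: List[int] = field(default_factory=list)
    succeeded: int = 0
    failed: int = 0

    def record(self, gap_start: int, gap_end: int, ok: bool) -> None:
        """Record one detected gap and whether its recovery succeeded."""
        self.gap_durations_ms.append(gap_end - gap_start)
        if ok:
            self.succeeded += 1
        else:
            self.failed += 1

    @property
    def success_rate(self) -> float:
        """Recovered gaps divided by detected gaps (1.0 when none were seen)."""
        total = self.succeeded + self.failed
        return self.succeeded / total if total else 1.0

    @property
    def max_gap_ms(self) -> int:
        return max(self.gap_durations_ms, default=0)


stats = GapFillStats()
stats.record(gap_start=0, gap_end=1_000, ok=True)
stats.record(gap_start=0, gap_end=5_000, ok=False)
```

In the client above, record would be called once at the end of the try block in _recover_gap with ok=True and once in the except branch with ok=False.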

Limitations and Trade-offs

This architecture addresses the most common reconnection scenario — a brief, involuntary disconnection — but it carries known constraints.

Tick data completeness: The REST /trades endpoint recovers trade ticks for HK equities and crypto markets where tick-level data is available. For US equities, the trades endpoint does not cover this asset class (see TickDB Core Knowledge Base). Users monitoring US equity tick flow must rely on the kline endpoint for candle-level reconstruction.
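
For candle-level reconstruction, the first step is working out which candles the gap touched. A minimal sketch, assuming candle open times are aligned to multiples of the interval since the Unix epoch (typical for 1-minute bars, but an assumption here); candles_to_refetch is a hypothetical helper, not a TickDB API:

```python
from typing import List


def candles_to_refetch(gap_start_ms: int, gap_end_ms: int,
                       interval_ms: int = 60_000) -> List[int]:
    """Return the open time (ms) of every candle that overlaps the gap."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    if gap_end_ms <= gap_start_ms:
        return []
    # Floor both ends of the gap to their candle's open time
    first_open = (gap_start_ms // interval_ms) * interval_ms
    last_open = ((gap_end_ms - 1) // interval_ms) * interval_ms
    return list(range(first_open, last_open + interval_ms, interval_ms))


# A gap from 00:01:30 to 00:03:20 touches the 00:01, 00:02 and 00:03 candles.
affected = candles_to_refetch(90_000, 200_000)
```

The partially observed candle at each end is included, since its locally built values are incomplete and should be replaced rather than patched.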

Order book snapshot lag: REST-fetched depth snapshots represent the book state at a point in time. If the gap is long (e.g., 10+ minutes), the fetched snapshot may be several seconds behind the current book by the time it is applied. Live WebSocket depth updates will overwrite it, but the transition introduces a brief period of reduced precision.

Duplicate handling at sub-millisecond precision: In extremely high-frequency scenarios, where several trades can share the same millisecond, timestamp deduplication using Unix milliseconds may be insufficient. Microsecond-precision timestamps, if available, require a matching microsecond-level deduplication boundary.
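
Where the feed carries a per-trade identifier, deduplicating on a composite key sidesteps the timestamp-resolution problem. The sketch below assumes each trade dict has an "id" field; that field name is an assumption made for illustration and is not confirmed by the material above, so check the actual message schema first.

```python
from collections import OrderedDict
from typing import Dict


class TradeDeduplicator:
    """Remembers recently seen (ts, id) keys in a bounded, insertion-ordered set."""

    def __init__(self, max_keys: int = 50_000) -> None:
        self._seen: OrderedDict = OrderedDict()
        self._max_keys = max_keys

    def is_new(self, trade: Dict) -> bool:
        """Return True the first time a trade is seen, False for repeats."""
        key = (trade.get("ts"), trade.get("id"))  # "id" is an assumed field
        if key in self._seen:
            return False
        self._seen[key] = None
        if len(self._seen) > self._max_keys:
            self._seen.popitem(last=False)  # evict the oldest key
        return True


dedup = TradeDeduplicator()
first = dedup.is_new({"ts": 1_700_000_000_000, "id": "a1"})
repeat = dedup.is_new({"ts": 1_700_000_000_000, "id": "a1"})
same_ms = dedup.is_new({"ts": 1_700_000_000_000, "id": "a2"})
```

Two distinct trades in the same millisecond are both kept, while an exact repeat is rejected. The bounded size keeps memory flat, at the cost of forgetting keys older than the last max_keys trades.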


Summary

Network disconnections are inevitable. Five minutes of missed data is survivable — with the right architecture.

The system described in this article provides a complete, production-ready solution: a watchdog monitor that detects disconnection events, an incremental REST fetcher that reconstructs the missing data window with bounded sub-requests, and a timestamp-aligned merger that integrates recovered data into the live stream without duplication or gaps.

The implementation covers all engineering standards: heartbeat, exponential backoff with jitter, rate-limit handling, timeouts on every HTTP request, environment-variable authentication, and thread-safe state management. The OrderBookState class provides a reference implementation for maintaining book integrity across disconnection boundaries.

For users requiring sub-100ms latency at HFT frequencies, the threading-based implementation should be migrated to asyncio with aiohttp. The architectural logic — detect, recover, merge — remains identical; only the concurrency model changes.


Next Steps

If you're building a market-making or order-flow strategy on HK equities or crypto, the gap-fill architecture above provides the foundation for maintaining a continuously consistent book state. Sign up at tickdb.ai and generate an API key to begin testing against live depth data.

If you need 10+ years of historical OHLCV data for backtesting your reconnection and gap-fill logic across historical market events, reach out to enterprise@tickdb.ai for Professional-tier access with extended historical coverage.

If you use AI coding assistants, search for and install the tickdb-market-data SKILL in your AI tool's marketplace for direct integration of TickDB endpoints into your development workflow.


This article does not constitute investment advice. Markets involve risk; past performance does not guarantee future results.