The timestamp read 09:42:17. His portfolio was showing a clean mark-to-market. The Greeks looked reasonable. Risk limits were green across the board. What the system did not show was that his primary market data feed had been silently dropping every third quote from NASDAQ for the past 23 minutes.
When the delta on his short-gamma position spiked by 4,300 contracts in four seconds, the risk engine flagged it — but by then, the bid-ask spread on SPX options had already gapped from $0.03 to $0.28. His delta hedge, sized against stale mid-prices, was 60% too large. The subsequent unwind cost him $1.2 million. A data anomaly, not a trading error, was the root cause.
This is not a hypothetical. Feed failures, silently corrupted ticks, and latency spikes that corrupt pricing models are among the most expensive and least discussed operational risks in systematic trading. The solution is not to hope the primary vendor is reliable. It is to build a second source that watches the first.
Why a Single Feed is a Single Point of Failure
Market data vendors — whether exchange direct feeds, aggregators, or boutique providers — all have published and unpublished failure modes. A comprehensive risk taxonomy includes:
| Failure Mode | Frequency | Detection Difficulty | Typical Impact |
|---|---|---|---|
| Timestamp skew: Feed reports NTP-synced time but lags by 50–500 ms | Monthly | Hard (requires second source) | Misaligned pricing, phantom arbitrage |
| Dropped ticks: Every Nth quote silently dropped | Weekly | Hard (requires sequence-number tracking) | Stale mid-price, wrong Greeks |
| Stale snapshots: Market closed but feed continues broadcasting last price | Daily | Medium (easy to check against exchange) | Dramatically wrong NAV |
| Wrong format: Corrupted packet returns impossible values | Rare | Medium | Immediate crash or silent NaN |
| Latency spike: Normal 2 ms jumps to 400 ms | Daily | Medium (requires baseline comparison) | Order book drift, execution algorithm degradation |
A primary feed alone cannot tell you whether its own data is reliable. You need a cross-reference — a second independent source that can be queried in parallel to detect deviation beyond acceptable thresholds.
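The latency-spike row in the table calls for a baseline comparison. The following is a minimal sketch of one way to do that, assuming you already measure per-message latency in milliseconds; the class name, window size, and spike multiplier are illustrative choices, not values from any vendor.

```python
from collections import deque
from statistics import median


class LatencyBaseline:
    """Flags latency samples that sit far above a rolling median baseline."""

    def __init__(self, window: int = 200, multiplier: float = 10.0, min_samples: int = 20):
        self.samples = deque(maxlen=window)  # recent normal latencies, in ms
        self.multiplier = multiplier         # hypothetical spike factor
        self.min_samples = min_samples       # samples required before judging

    def observe(self, latency_ms: float) -> bool:
        """Record a sample and return True if it looks like a spike."""
        is_spike = False
        if len(self.samples) >= self.min_samples:
            baseline = median(self.samples)
            is_spike = latency_ms > baseline * self.multiplier
        # Only normal samples feed the baseline, so a sustained spike
        # does not quietly become the new normal.
        if not is_spike:
            self.samples.append(latency_ms)
        return is_spike


monitor = LatencyBaseline()
for _ in range(50):
    monitor.observe(2.0)           # normal 2 ms latency
print(monitor.observe(400.0))      # True
print(monitor.observe(2.5))        # False
```

A median is used rather than a mean so that a handful of earlier outliers does not inflate the baseline.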
Architecture: Cross-Validation System Design
The system we will build follows a three-layer architecture:
┌─────────────────────────────────────────────────────────────────┐
│                     Alert & Dashboard Layer                     │
│            (Slack / PagerDuty / Custom Webhook + UI)            │
└────────────────────────────┬────────────────────────────────────┘
                             │
┌────────────────────────────▼────────────────────────────────────┐
│                     Validation Engine Layer                     │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │ Price Deviation │  │ Latency Monitor │  │  Sequence Gap   │  │
│  │    Detector     │  │                 │  │    Detector     │  │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘  │
└────────────────────────────┬────────────────────────────────────┘
                             │
┌────────────────────────────▼────────────────────────────────────┐
│                        Data Source Layer                        │
│          ┌─────────────────┐       ┌─────────────────┐          │
│          │  Primary Feed   │       │ Secondary Feed  │          │
│          │  (Your Vendor)  │       │    (TickDB)     │          │
│          └─────────────────┘       └─────────────────┘          │
└─────────────────────────────────────────────────────────────────┘
The validation engine does not replace your primary feed. It runs in parallel with it, continuously comparing primary quotes against TickDB. The implementation in this article polls TickDB's REST endpoint; the WebSocket stream is the lower-latency alternative. When deviation exceeds the configurable threshold, an alert fires.
Production-Grade Code: Cross-Validation Monitor
The following Python implementation provides a complete monitor built around a simulated primary feed. It includes exponential backoff with jitter, rate-limit handling, timeout configuration, and environment-variable-based authentication.
import os
import time
import json
import logging
import random
import asyncio
import requests
from datetime import datetime, timezone
from collections import deque
from dataclasses import dataclass, field
from typing import Optional, Dict, List, Callable

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s"
)
logger = logging.getLogger(__name__)


@dataclass
class ValidationConfig:
    """Configuration for the cross-validation monitor."""
    # Thresholds
    price_deviation_pct: float = 0.5     # Alert if primary deviates >0.5% from secondary
    latency_threshold_ms: float = 500.0  # Alert if latency exceeds 500ms
    sequence_gap_tolerance: int = 5      # Alert if sequence numbers skip more than 5
    # Timing
    check_interval_seconds: float = 1.0
    warmup_seconds: int = 30             # Ignore alerts during warmup period
    # Retry settings
    max_retries: int = 5
    base_delay_seconds: float = 1.0
    max_delay_seconds: float = 30.0
    # Alert callbacks
    alert_callbacks: List[Callable] = field(default_factory=list)


@dataclass
class TickData:
    """Standardized market data tick."""
    symbol: str
    price: float
    bid: float
    ask: float
    timestamp: datetime
    sequence: Optional[int] = None


class TickDBClient:
    """
    TickDB REST client for secondary-source data validation.
    Includes exponential backoff with jitter, rate-limit handling, and request timeouts.
    """

    BASE_URL = "https://api.tickdb.ai/v1"

    def __init__(
        self,
        api_key: Optional[str] = None,
        max_retries: int = 5,
        base_delay_seconds: float = 1.0,
        max_delay_seconds: float = 30.0,
    ):
        self.api_key = api_key or os.environ.get("TICKDB_API_KEY")
        if not self.api_key:
            raise ValueError("TICKDB_API_KEY environment variable is required")
        # Retry settings read by _request; they must be set on the client
        self.max_retries = max_retries
        self.base_delay_seconds = base_delay_seconds
        self.max_delay_seconds = max_delay_seconds

    def _request(self, method: str, endpoint: str, **kwargs) -> dict:
        """Make authenticated API request with retries and error handling."""
        url = f"{self.BASE_URL}{endpoint}"
        headers = {
            "X-API-Key": self.api_key,
            "Content-Type": "application/json"
        }
        # (connect, read) timeouts prevent hanging requests
        timeout = kwargs.pop("timeout", (3.05, 10.0))
        retry_count = 0
        while retry_count <= self.max_retries:
            try:
                response = requests.request(
                    method=method,
                    url=url,
                    headers=headers,
                    timeout=timeout,
                    **kwargs
                )
                # Handle rate limiting
                if response.status_code == 429:
                    try:
                        retry_after = int(response.headers.get("Retry-After", 5))
                    except ValueError:
                        retry_after = 5  # Header was not a plain number of seconds
                    logger.warning(f"Rate limited. Waiting {retry_after}s before retry.")
                    time.sleep(retry_after)
                    retry_count += 1
                    continue
                response.raise_for_status()
                return response.json()
            except requests.exceptions.RequestException as e:
                # Covers timeouts, connection errors, and HTTP error statuses.
                # Exponential backoff with jitter
                delay = min(
                    self.base_delay_seconds * (2 ** retry_count),
                    self.max_delay_seconds
                )
                jitter = random.uniform(0, delay * 0.1)
                sleep_time = delay + jitter
                logger.warning(
                    f"Request failed: {e}. Retrying in {sleep_time:.2f}s "
                    f"(attempt {retry_count + 1}/{self.max_retries})"
                )
                time.sleep(sleep_time)
                retry_count += 1
        raise RuntimeError(f"Failed after {self.max_retries} retries for {endpoint}")

    def get_latest_quote(self, symbol: str) -> Optional[TickData]:
        """Fetch the latest tick data for a given symbol."""
        try:
            data = self._request(
                "GET",
                "/market/trades/latest",
                params={"symbol": symbol}
            )
            if data.get("code") == 0 and data.get("data"):
                item = data["data"]
                return TickData(
                    symbol=symbol,
                    price=float(item.get("p", 0)),
                    bid=float(item.get("b", 0)),
                    ask=float(item.get("a", 0)),
                    timestamp=datetime.fromtimestamp(
                        item.get("t", 0) / 1000,
                        tz=timezone.utc
                    ),
                    sequence=item.get("seq")
                )
            # Handle known error codes
            code = data.get("code", 0)
            if code == 2002:
                raise KeyError(f"Symbol {symbol} not found in TickDB")
            logger.warning(f"TickDB returned code {code}: {data.get('message')}")
            return None
        except KeyError:
            # Propagate "symbol not found" so the caller can skip validation
            raise
        except Exception as e:
            logger.error(f"Failed to fetch quote for {symbol}: {e}")
            return None

# ⚠️ NOTE: For continuous monitoring with minimal latency, consider migrating
# to the WebSocket endpoint. REST polling is suitable for 1-second validation
# intervals. WebSocket push provides sub-100ms latency for HFT use cases.


class PrimaryFeedSimulator:
    """
    Simulates a primary market data feed for demonstration.
    In production, replace this with your actual feed client (WebSocket, FIX, etc.)
    """

    def __init__(self, symbol: str, drift_probability: float = 0.05):
        self.symbol = symbol
        self.drift_probability = drift_probability
        self._base_price = 100.0
        self._sequence = 0

    def get_quote(self) -> TickData:
        """Get quote from primary feed with optional simulated anomalies."""
        self._sequence += 1
        previous_price = self._base_price
        timestamp = datetime.now(timezone.utc)
        # Simulate normal price movement
        price_change = random.gauss(0, 0.01)
        self._base_price *= (1 + price_change)
        # Simulate occasional feed anomalies (drift, freeze, delay)
        if random.random() < self.drift_probability:
            anomaly_type = random.choice(["drift", "freeze", "delay"])
            if anomaly_type == "drift":
                # Primary feed diverges from market
                self._base_price *= random.uniform(1.005, 1.015)
                logger.warning(f"Simulated price drift on {self.symbol}")
            elif anomaly_type == "freeze":
                # Feed repeats the previous price instead of updating
                self._base_price = previous_price
            elif anomaly_type == "delay":
                # Simulate a timestamp lag of 50-500 ms
                lag_seconds = random.uniform(0.05, 0.5)
                timestamp = datetime.fromtimestamp(
                    time.time() - lag_seconds, tz=timezone.utc
                )
        return TickData(
            symbol=self.symbol,
            price=self._base_price,
            bid=self._base_price - 0.01,
            ask=self._base_price + 0.01,
            timestamp=timestamp,
            sequence=self._sequence
        )


class DualSourceValidator:
    """
    Core validation engine that compares primary feed against TickDB.
    Detects price deviation, latency anomalies, and sequence gaps.
    """

    def __init__(
        self,
        symbol: str,
        primary_feed,
        tickdb_client: TickDBClient,
        config: ValidationConfig
    ):
        self.symbol = symbol
        self.primary = primary_feed
        self.secondary = tickdb_client
        self.config = config
        self.start_time = time.time()
        # Rolling window for baseline establishment
        self.price_history: deque = deque(maxlen=100)
        # Alert state
        self._alert_cooldown_until = 0
        self._warming_up = True  # Alerts are suppressed until warmup ends

    def _calculate_deviation(self, primary_price: float, secondary_price: float) -> float:
        """Calculate percentage deviation between primary and secondary."""
        if secondary_price == 0:
            return float('inf')
        return abs(primary_price - secondary_price) / secondary_price * 100

    def _is_in_cooldown(self) -> bool:
        """Prevent alert spam with cooldown period."""
        return time.time() < self._alert_cooldown_until

    def _trigger_alert(self, alert_type: str, details: Dict):
        """Fire alert through all registered callbacks."""
        if self._warming_up or self._is_in_cooldown():
            return
        logger.error(f"ALERT [{alert_type}]: {details}")
        for callback in self.config.alert_callbacks:
            try:
                callback(alert_type, details)
            except Exception as e:
                logger.error(f"Alert callback failed: {e}")
        # Set cooldown to prevent repeated alerts
        self._alert_cooldown_until = time.time() + 30

    def validate(self) -> bool:
        """
        Execute one validation cycle.
        Returns True if validation passed, False if anomaly detected.
        """
        current_time = time.time()
        # Suppress alerts during warmup; checks still run and feed the baseline
        self._warming_up = (
            current_time - self.start_time < self.config.warmup_seconds
        )
        if self._warming_up:
            logger.info("Warming up: alerts are suppressed")
        # Fetch primary quote
        try:
            primary_tick = self.primary.get_quote()
        except Exception as e:
            self._trigger_alert("PRIMARY_FEED_ERROR", {"error": str(e)})
            return False
        # Fetch secondary quote from TickDB
        try:
            secondary_tick = self.secondary.get_latest_quote(self.symbol)
        except KeyError:
            logger.warning("Symbol not available in TickDB; skipping validation")
            return True  # Not a failure, just unsupported
        except Exception as e:
            self._trigger_alert("SECONDARY_FEED_ERROR", {"error": str(e)})
            return False
        if secondary_tick is None:
            logger.warning("Secondary feed returned no data; skipping cycle")
            return True
        # Record in history for baseline
        self.price_history.append({
            "primary": primary_tick.price,
            "secondary": secondary_tick.price,
            "timestamp": primary_tick.timestamp
        })
        passed = True

        # === VALIDATION CHECKS ===
        # 1. Price Deviation Check
        deviation_pct = self._calculate_deviation(
            primary_tick.price,
            secondary_tick.price
        )
        if deviation_pct > self.config.price_deviation_pct:
            self._trigger_alert(
                "PRICE_DEVIATION",
                {
                    "symbol": self.symbol,
                    "primary_price": primary_tick.price,
                    "secondary_price": secondary_tick.price,
                    "deviation_pct": round(deviation_pct, 4),
                    "threshold_pct": self.config.price_deviation_pct
                }
            )
            return False

        # 2. Sequence Gap Check
        # Only meaningful if both feeds carry the same exchange-assigned
        # sequence numbers; vendor-assigned sequences are not comparable.
        if primary_tick.sequence is not None and secondary_tick.sequence is not None:
            gap = abs(primary_tick.sequence - secondary_tick.sequence)
            if gap > self.config.sequence_gap_tolerance:
                self._trigger_alert(
                    "SEQUENCE_GAP",
                    {
                        "symbol": self.symbol,
                        "primary_seq": primary_tick.sequence,
                        "secondary_seq": secondary_tick.sequence,
                        "gap": gap
                    }
                )
                passed = False

        # 3. Timestamp Latency Check
        latency_ms = (primary_tick.timestamp - secondary_tick.timestamp).total_seconds() * 1000
        if abs(latency_ms) > self.config.latency_threshold_ms:
            self._trigger_alert(
                "TIMESTAMP_LATENCY",
                {
                    "symbol": self.symbol,
                    "latency_ms": round(latency_ms, 2),
                    "threshold_ms": self.config.latency_threshold_ms
                }
            )
            passed = False

        logger.debug(
            f"Validated {self.symbol}: "
            f"deviation={deviation_pct:.4f}%, "
            f"latency={latency_ms:.1f}ms"
        )
        return passed

    def generate_baseline_report(self) -> Dict:
        """Generate statistical baseline from rolling history."""
        if len(self.price_history) < 10:
            return {"status": "insufficient_data"}
        deviations = [
            self._calculate_deviation(h["primary"], h["secondary"])
            for h in self.price_history
        ]
        mean_dev = sum(deviations) / len(deviations)
        # Population variance over the rolling window
        variance = sum((d - mean_dev) ** 2 for d in deviations) / len(deviations)
        return {
            "sample_count": len(self.price_history),
            "mean_deviation_pct": round(mean_dev, 4),
            "max_deviation_pct": round(max(deviations), 4),
            "min_deviation_pct": round(min(deviations), 4),
            "std_deviation_pct": round(variance ** 0.5, 4)
        }


class WebhookAlertHandler:
    """Send alerts to Slack, PagerDuty, or custom webhook endpoints."""

    def __init__(self, webhook_url: str):
        self.webhook_url = webhook_url

    def __call__(self, alert_type: str, details: Dict):
        payload = {
            "alert_type": alert_type,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "details": details
        }
        try:
            response = requests.post(
                self.webhook_url,
                json=payload,
                timeout=(3.05, 5.0),
                headers={"Content-Type": "application/json"}
            )
            response.raise_for_status()
            logger.info(f"Alert sent to webhook: {alert_type}")
        except Exception as e:
            logger.error(f"Failed to send webhook alert: {e}")


def run_validation_loop(symbol: str, config: ValidationConfig):
    """Main execution loop for continuous validation."""
    # Initialize clients
    tickdb = TickDBClient()
    # Replace with your actual primary feed client
    primary_feed = PrimaryFeedSimulator(symbol=symbol, drift_probability=0.05)
    # Configure alert handler
    webhook_url = os.environ.get("ALERT_WEBHOOK_URL")
    if webhook_url:
        config.alert_callbacks.append(WebhookAlertHandler(webhook_url))
    # Initialize validator
    validator = DualSourceValidator(
        symbol=symbol,
        primary_feed=primary_feed,
        tickdb_client=tickdb,
        config=config
    )
    logger.info(f"Starting dual-source validation for {symbol}")
    try:
        while True:
            result = validator.validate()
            if not result:
                logger.warning(f"Validation failed for {symbol} at {datetime.now()}")
            time.sleep(config.check_interval_seconds)
    except KeyboardInterrupt:
        logger.info("Validation loop stopped by user")
    finally:
        # Generate baseline report on shutdown
        report = validator.generate_baseline_report()
        logger.info(f"Baseline report: {json.dumps(report, indent=2)}")


# === ENTRY POINT ===
if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="Dual-source market data validator")
    parser.add_argument("--symbol", default="AAPL.US", help="Symbol to monitor")
    parser.add_argument("--deviation-threshold", type=float, default=0.5,
                        help="Price deviation threshold in percent")
    parser.add_argument("--latency-threshold", type=float, default=500.0,
                        help="Latency threshold in milliseconds")
    args = parser.parse_args()
    config = ValidationConfig(
        price_deviation_pct=args.deviation_threshold,
        latency_threshold_ms=args.latency_threshold
    )
    run_validation_loop(args.symbol, config)

Key Engineering Decisions in the Code
Exponential backoff with jitter: When a TickDB API request fails, the client doubles the wait time before retrying (1s → 2s → 4s → …), capped at 30 seconds. Jitter adds a random component of up to 10% of the delay to prevent thundering herd problems when multiple clients reconnect simultaneously.
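To make the schedule concrete, here is a small standalone sketch of the same delay calculation. The function name `backoff_delay` and its parameters are illustrative and not part of the monitor above; the defaults mirror the 1-second base, 30-second cap, and 10% jitter described in this section.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  jitter_fraction: float = 0.1,
                  rng: Optional[random.Random] = None) -> float:
    """Delay before retry number `attempt` (0-based): capped exponential plus jitter."""
    source = rng or random
    delay = min(base * (2 ** attempt), cap)          # 1, 2, 4, ... capped at `cap`
    return delay + source.uniform(0, delay * jitter_fraction)


# Print the schedule for the first seven attempts (values vary with jitter)
for attempt in range(7):
    print(attempt, round(backoff_delay(attempt), 2))
```

Without jitter the schedule is 1, 2, 4, 8, 16, 30, 30 seconds; the random component spreads clients out within 10% of each step.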
Rate-limit handling: If TickDB returns a 429 status, the code reads the Retry-After header, interprets it as a number of seconds (defaulting to 5 when the header is absent), and sleeps for that long before retrying. This respects the API's rate limits and avoids hammering an endpoint that is already throttling you.
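HTTP allows Retry-After to carry either a number of seconds or an HTTP date, so a more defensive parser may be worth having. The sketch below is one possible approach; `parse_retry_after` is a hypothetical helper, and whether TickDB ever sends the date form is not something this article confirms.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], default: float = 5.0,
                      now: Optional[datetime] = None) -> float:
    """Return the number of seconds to wait for a Retry-After header value."""
    value = (value or "").strip()
    if not value:
        return default
    if value.isdigit():                      # delay-seconds form, e.g. "7"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default                       # unparseable: use the fallback
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


fixed_now = datetime(2015, 10, 21, 7, 27, 0, tzinfo=timezone.utc)
print(parse_retry_after("7"))                                             # 7.0
print(parse_retry_after("Wed, 21 Oct 2015 07:28:00 GMT", now=fixed_now))  # 60.0
```

Dates in the past resolve to zero, so the caller retries immediately rather than sleeping for a negative duration.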
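Sequence gap detection: Exchange feeds typically assign monotonically increasing sequence numbers to each message. The code compares the primary and secondary sequence numbers and alerts when they differ by more than the tolerance, for example when the primary reports N while the secondary has already reached N+6. This comparison is only meaningful when both sources relay the same exchange-assigned sequence numbers. Vendor-assigned sequences are not comparable across feeds, and in that case dropped ticks have to be detected as gaps within each feed's own sequence.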
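For feeds whose sequence numbers are not comparable across vendors, gaps can be tracked per feed instead. The class below is a minimal sketch of that idea; `SequenceGapTracker` is a hypothetical name, and it assumes sequence numbers increase by exactly one per message.

```python
from typing import Optional


class SequenceGapTracker:
    """Tracks one feed's sequence numbers and counts missing messages."""

    def __init__(self):
        self.last_seq: Optional[int] = None
        self.missed_total = 0

    def observe(self, seq: int) -> int:
        """Return how many messages were skipped before `seq` (0 if none)."""
        missed = 0
        if self.last_seq is not None and seq > self.last_seq + 1:
            missed = seq - self.last_seq - 1
            self.missed_total += missed
        # Duplicates and out-of-order replays never move the cursor backwards
        if self.last_seq is None or seq > self.last_seq:
            self.last_seq = seq
        return missed


tracker = SequenceGapTracker()
for seq in [1, 2, 4, 5, 7, 8]:        # every third message dropped
    missed = tracker.observe(seq)
    if missed:
        print(f"gap before seq {seq}: {missed} message(s) missing")
print(tracker.missed_total)            # 2
```

This catches the "every Nth quote silently dropped" pattern from the opening example without needing a second source at all, which makes it a useful complement to cross-feed checks.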
Warmup period: The validator ignores alerts for the first 30 seconds of operation. This allows both feeds to stabilize and prevents false positives from cold-start anomalies.
TickDB as Secondary Source: Coverage and Capabilities
TickDB's multi-asset coverage makes it suitable as a secondary validation source across the following markets:
| Asset Class | Coverage | Depth Channel | Kline History | Trades |
|---|---|---|---|---|
| US Equities | L1 order book | L1 | 10+ years | Not supported |
| HK Equities | L1–L10 | L1–L10 | 10+ years | Supported |
| Crypto | L1–L10 | L1–L10 | 10+ years | Supported |
| Forex | L1 | Not supported | 10+ years | Supported |
| Precious Metals | L1 | Not supported | 10+ years | Supported |
| Indices | L1 | Not supported | 10+ years | Supported |
For US equity validation, the /v1/market/kline endpoint provides 10+ years of cleaned, aligned OHLCV data. The depth channel provides L1 order book snapshots. These are sufficient to validate price correctness and detect timestamp drift against your primary feed.
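Important limitation: The trades endpoint does not support US equities or A-shares. For tick-level trade validation on US markets, you will need a different secondary source or rely on the depth channel and kline endpoints alone. Note that the example code calls the trades endpoint and defaults to AAPL.US, so pass a --symbol from a supported asset class when running it as written.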
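When only L1 depth is available, one option is to compare mid-prices rather than last trades. The helper below is a sketch under that assumption; the function names are illustrative, and the bid and ask values are made up for the example rather than taken from any feed.

```python
def mid_price(bid: float, ask: float) -> float:
    """Mid-price of an L1 quote; rejects empty or crossed quotes."""
    if bid <= 0 or ask <= 0 or ask < bid:
        raise ValueError(f"invalid quote: bid={bid}, ask={ask}")
    return (bid + ask) / 2


def mid_deviation_bps(primary_bid: float, primary_ask: float,
                      secondary_bid: float, secondary_ask: float) -> float:
    """Absolute deviation of the primary mid from the secondary mid, in basis points."""
    primary_mid = mid_price(primary_bid, primary_ask)
    secondary_mid = mid_price(secondary_bid, secondary_ask)
    return abs(primary_mid - secondary_mid) / secondary_mid * 10_000


# Primary mid 100.01 vs secondary mid 100.02: roughly one basis point apart
print(round(mid_deviation_bps(100.00, 100.02, 100.01, 100.03), 2))
```

Rejecting crossed or zero quotes up front keeps a corrupted packet from silently producing a plausible-looking deviation.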
Threshold Calibration: Balancing Sensitivity and Noise
Setting thresholds too tight generates false positives. Setting them too loose means real anomalies go undetected. The calibration process requires historical data and domain knowledge.
Step 1: Establish a Baseline
Run the validator in observation mode — logging deviations without firing alerts — for a minimum of five trading days. Record:
- Mean and standard deviation of price deviation
- Peak latency under normal conditions
- Sequence gap frequency
Step 2: Set Thresholds at 3–4 Standard Deviations
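If your historical mean deviation is 0.08% with a standard deviation of 0.03%, set your threshold at approximately 0.17–0.20% (mean + 3σ to mean + 4σ). For normally distributed data, 3σ covers about 99.7% of observations; absolute deviations are skewed rather than normal, so treat that figure as a rule of thumb and check the alert rate against your recorded baseline.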
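The threshold is the baseline mean plus k standard deviations. A short sketch of that arithmetic follows; `sigma_threshold` is a hypothetical helper, and it uses the population standard deviation to match the baseline report in the monitor code.

```python
from statistics import mean, pstdev
from typing import Sequence


def sigma_threshold(deviations_pct: Sequence[float], k: float = 3.0) -> float:
    """Alert threshold = mean + k * population standard deviation."""
    if len(deviations_pct) < 2:
        raise ValueError("need at least two baseline samples")
    return mean(deviations_pct) + k * pstdev(deviations_pct)


# Worked numbers from the text: mean 0.08%, standard deviation 0.03%
mu, sigma = 0.08, 0.03
print(round(mu + 3 * sigma, 2), round(mu + 4 * sigma, 2))   # 0.17 0.2
```

In practice `deviations_pct` would be the per-cycle deviations logged during the observation period in Step 1.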
Step 3: Add Regime Awareness
Volatility regimes change the acceptable deviation range. During high-volatility periods (VIX > 30), normal bid-ask spreads widen and intraday price swings increase. A static threshold calibrated during calm markets will false-positive aggressively during stress events. Consider dynamic thresholds that scale with realized volatility.
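One simple way to make a threshold regime-aware is to scale it by the ratio of current to baseline realized volatility. The sketch below assumes that approach; the function names, the clamp at 5x, and the choice never to tighten below the calibrated threshold are illustrative design decisions, not recommendations from TickDB.

```python
import math
from typing import Sequence


def realized_vol(prices: Sequence[float]) -> float:
    """Standard deviation of log returns over the window (per sample, not annualized)."""
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    if len(returns) < 2:
        raise ValueError("need at least three prices")
    avg = sum(returns) / len(returns)
    return math.sqrt(sum((r - avg) ** 2 for r in returns) / len(returns))


def scaled_threshold(base_threshold_pct: float, current_vol: float,
                     baseline_vol: float, max_scale: float = 5.0) -> float:
    """Widen the threshold in proportion to volatility, clamped to [1, max_scale]."""
    if baseline_vol <= 0:
        return base_threshold_pct            # no usable baseline: leave unchanged
    scale = min(max(current_vol / baseline_vol, 1.0), max_scale)
    return base_threshold_pct * scale


# Volatility at twice the calibration level doubles a 0.2% threshold
print(scaled_threshold(0.2, current_vol=0.02, baseline_vol=0.01))
```

The upper clamp matters: without it, a corrupted feed that itself inflates measured volatility could widen the threshold enough to hide its own errors, so the volatility input is best computed from the secondary source.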
Deployment Recommendations by Scale
| Deployment Size | Use Case | Recommended Configuration |
|---|---|---|
| Individual quant | Strategy backtesting validation | REST polling at 1-second interval; threshold calibration via historical data |
| Small team (2–5 quants) | Shared monitoring dashboard | REST polling at 0.5-second interval; Slack webhook; per-symbol configs |
| Institutional | Real-time risk system | WebSocket push for sub-100ms latency; PagerDuty integration; hot-standby redundancy |
| HFT / market-making | Quote validation for execution | Dedicated WebSocket connection; co-location with TickDB; <10ms validation loop |
Operational Checklist: Before Going Live
Before deploying this validation system in a production environment, verify the following:
- API key loaded from environment variable (TICKDB_API_KEY), not hardcoded
- Webhook endpoint tested and confirmed reachable
- Warmup period observed and first alerts reviewed for false positives
- Threshold calibration completed against at least 5 days of historical data
- Cooldown period set to prevent alert spam (30–60 seconds recommended)
- Sequence number tracking enabled on primary feed
- Log rotation configured to prevent disk exhaustion during extended runs
- Runbook documented: escalation path for each alert type
Closing
A primary market data feed without a cross-validation mechanism is a black box. When it fails silently, you do not know until the P&L reflects a position that no longer represents the market.
The dual-source validator described in this article transforms market data risk from a latent, undetected threat into a monitored, actionable signal. The code provided covers retries with backoff, rate-limit handling, and alert routing, so you have a working starting point to adapt to your own feed rather than something patched together under pressure.
The market does not wait for your data to catch up.
Next Steps
If you're an individual quant who wants to validate your strategy's market data without building a full monitoring stack, sign up for a free TickDB API key at tickdb.ai — no credit card required.
If you're a trading team that needs shared monitoring across strategies, configure the Slack webhook handler in the code above and point it to your operations channel. The baseline report function provides aggregate statistics that are useful for weekly operational reviews.
If you need institutional-grade coverage — including WebSocket push for sub-100ms validation, dedicated support, and SLA-backed uptime guarantees — reach out to enterprise@tickdb.ai for Professional and Enterprise plans.
If you use AI coding assistants, search for and install the tickdb-market-data SKILL on ClawHub to access TickDB API integration directly within your coding environment.
This article does not constitute investment advice. Markets involve risk; past performance does not guarantee future results. Deploying automated monitoring systems in live trading environments carries operational risk. Thoroughly test all code in paper trading or simulation before production use.