TickDB Latency SLA: What the 99th Percentile Actually Guarantees for Your Trading System | API Guide

A trading system built on faulty latency assumptions does not fail dramatically. It fails quietly — slipping into a 200 ms disadvantage on every tick, compounding silently until the drawdown curve tells the story nobody wanted to read.

That is why the question "what is TickDB's actual latency?" deserves more than a marketing number. It deserves a precise answer grounded in distribution, not averages. It deserves a method for verifying it under your own production conditions. And it deserves clarity on what the SLA actually covers — and what it does not.

This article answers all three.

1. Why Averages Lie and P50 Misleads

Before opening a single API endpoint, it is worth establishing the right vocabulary. Latency is not a single number. It is a distribution.

When a vendor says "sub-100ms latency," they typically mean the median (P50) — the point where half of all requests complete faster. A P50 of 80 ms sounds excellent. But if 1% of your requests take 8 seconds, your trading system bleeds money on every outlier tick.

Quantitatively rigorous latency reporting uses three percentile tiers:

Percentile	What it measures	Why it matters for trading
P50	Median response time	Baseline performance — tells you what to expect on a quiet day
P95	95th percentile	Your tail exposure on normal volatility days
P99	99th percentile	The latency that roughly 1 in 100 requests exceeds; not the absolute worst case, but the tail number that kills mean-reversion strategies

For a market data API powering a live trading system, P99 is the only honest SLA metric. P50 is a sales metric.
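To make the gap between mean, median, and tail concrete, here is a minimal sketch on synthetic data: 980 requests at 80 ms and 20 at 8 seconds (a 2% slow tail, chosen for illustration so that the outliers land inside P99). The nearest-rank `percentile` helper is a hypothetical function written for this example and is not part of any TickDB client.

```python
def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Synthetic latencies in milliseconds: 98% fast, 2% slow (illustrative only).
latencies_ms = [80.0] * 980 + [8000.0] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p95_ms = percentile(latencies_ms, 95)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.1f} p50={p50_ms:.0f} p95={p95_ms:.0f} p99={p99_ms:.0f}")
```

On this data the median and P95 both sit at 80 ms and look healthy, the mean lands at an in-between value that no request actually experienced, and only P99 exposes the 8-second tail.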

TickDB publicly benchmarks against P99 across its core endpoints, which is the correct framing. The sections below break down what those benchmarks mean, how the system behaves under stress, and — most importantly — how you can measure them yourself before committing capital to a strategy.

2. TickDB Endpoint Latency Breakdown

TickDB exposes three primary data retrieval patterns, each with distinct latency characteristics. The figures below represent the expected P99 range under normal market conditions — defined as intraday sessions without scheduled news events, earnings releases, or macro data prints.

Important caveat: All latency figures are measured from TickDB's server-side processing to response delivery. Client-side network transit, geographic routing, and TLS handshake overhead add variable latency that is outside TickDB's control. The latency verification method in Section 4 accounts for this.

2.1 REST API — Historical Data Retrieval

The GET /v1/market/kline endpoint serves historical OHLCV candles. It is the backbone of backtesting workflows and end-of-day data pipelines.

Query size	Typical P50	Typical P95	Typical P99	Notes
Small query (≤ 100 candles)	40–70 ms	80–120 ms	100–150 ms	Fast path for recent data
Medium query (100–1,000 candles)	60–100 ms	120–180 ms	150–250 ms	Common for daily-to-weekly lookbacks
Large query (> 1,000 candles)	100–200 ms	200–350 ms	300–500 ms	Longer lookback windows; server-side aggregation cost

The GET /v1/symbols/available endpoint — used for symbol discovery — consistently returns in under 100 ms at P99 under normal load.

2.2 WebSocket Streaming — Real-Time Data

WebSocket connections deliver the lowest-latency data path because they eliminate the request-response round-trip overhead entirely. Data is pushed from the server the moment it is processed.

Channel	Typical P50	Typical P95	Typical P99	Notes
`kline` (real-time candle)	30–60 ms	50–80 ms	60–100 ms	First push after candle close
`depth` (order book snapshot)	20–50 ms	40–70 ms	50–90 ms	Varies by market — HK/Crypto L1–L10; US L1
`trades` (individual executions)	15–40 ms	30–60 ms	40–80 ms	Supported for HK equities and crypto; not available for US equities

These figures reflect server-to-client delivery at the network layer. Actual end-to-end latency observed on your machine depends on geographic proximity to TickDB's Point of Presence (PoP), your ISP's routing, and whether your client performs additional processing (deserialization, order book reconstruction) before passing data to the strategy engine.

2.3 The "Extreme Conditions" Boundary

Normal market conditions exclude high-volatility events. During earnings releases, Federal Reserve announcements, CPI prints, or flash-crash episodes, every market data vendor experiences latency degradation. TickDB is no exception.

The honest answer: P99 can stretch to 500 ms–2,000 ms during extreme volatility windows. This is not a TickDB-specific failure. It reflects the infrastructure reality that all financial data vendors — from boutique feeds to Bloomberg Terminal — share. Exchange matching engine latency increases, co-location contention rises, and upstream data sources introduce queueing delays.

What separates a professional-grade vendor from a consumer-grade one is not eliminating these spikes — it is minimizing their duration and providing transparency about when they occur. TickDB's operational status page and rate-limit headers provide real-time signals when the system is under elevated load.

3. Understanding the SLA Structure

TickDB does not publish a traditional "Gold/Silver/Bronze" SLA tier model with uptime guarantees expressed in nines. Instead, the SLA framework is built around two concrete commitments:

3.1 Data Delivery Commitment

TickDB guarantees that successfully authenticated and rate-limit-compliant requests receive a response within the documented latency ranges under normal market conditions. Read the ranges as an upper bound on latency, that is, a floor on performance: responses may arrive faster, but should not arrive slower.

The commitment is meaningful because it is bounded by observable conditions:

The latency figures assume the request is well-formed and the symbol is available.
Requests that trigger error codes (1001, 1002, 2002) do not count against the latency commitment — error responses are returned immediately.
Requests rejected by rate limiting (3001) return a response immediately with the Retry-After header — no data gap is created.

3.2 What the SLA Does Not Cover

Intellectual honesty requires stating the boundaries:

Scenario	Covered by SLA?	Notes
Client-side network jitter	No	Outside TickDB's infrastructure boundary
Geographic latency variance	No	A user in Sydney observing higher latency than one in New York is not an SLA breach
Exchange-induced upstream delays	Partially	TickDB absorbs upstream degradation within reason; extreme exchange events may propagate
Rate-limit-induced retries	No	Retries due to `3001` responses reset the latency clock
Historical data backfill after outage	No	Separate data recovery process; not latency-guaranteed

This is standard practice across professional market data vendors. The key is that the data integrity commitment is independent of latency commitment — you will receive all ticks that occurred, even if delivery is delayed.

4. Building Your Own Latency Verification Harness

Numbers on a webpage mean nothing until they survive contact with your own infrastructure, your own network path, and your own market regime. This section provides production-grade code that continuously measures TickDB's latency distribution from your deployment environment.

4.1 Verification Architecture

The verification harness runs three concurrent components (two sampler threads plus a reporting loop on the main thread):

REST latency sampler: Measures round-trip time for GET /v1/market/kline/latest — the canonical "how fast is current data available" query.
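WebSocket latency sampler: Measures the delay between the server-side timestamp carried on each pushed message and the client's receive timestamp. This captures push latency, but only as accurately as the two clocks agree, so keep the client clock synchronized (for example via NTP).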
Statistics aggregator: Continuously computes P50, P95, P99, and max latency. Emits structured JSON logs every 60 seconds.
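The aggregator's job can be sketched independently of the network code. The version below is a minimal illustration rather than the harness's own implementation: it keeps a bounded window of recent samples so that each report reflects recent behaviour instead of everything since start-up. The `RollingLatencyWindow` class and its `maxlen` default are assumptions made for this example.

```python
import threading
from collections import deque


class RollingLatencyWindow:
    """Thread-safe bounded buffer of recent latency samples (milliseconds)."""

    def __init__(self, maxlen: int = 10_000):
        self._samples = deque(maxlen=maxlen)
        self._lock = threading.Lock()

    def add(self, latency_ms: float) -> None:
        with self._lock:
            self._samples.append(latency_ms)  # oldest sample is dropped when full

    def snapshot(self) -> dict:
        # Copy under the lock, compute outside it to keep writers unblocked.
        with self._lock:
            ordered = sorted(self._samples)
        if not ordered:
            return {"samples": 0}

        def nearest_rank(pct: int) -> float:
            rank = max(1, (pct * len(ordered) + 99) // 100)
            return ordered[rank - 1]

        return {
            "samples": len(ordered),
            "p50_ms": nearest_rank(50),
            "p95_ms": nearest_rank(95),
            "p99_ms": nearest_rank(99),
            "max_ms": ordered[-1],
        }


window = RollingLatencyWindow(maxlen=100)
for value in range(1, 201):  # 200 samples; only the last 100 are kept
    window.add(float(value))
print(window.snapshot())
```

A bounded window also caps memory on a harness left running for days, at the cost of losing the long-run distribution unless reports are persisted.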

4.2 Production-Grade Code

"""
TickDB Latency Verification Harness
====================================
Continuously measures P50/P95/P99 latency for REST and WebSocket endpoints.
Output: Structured JSON statistics every 60 seconds.

Prerequisites:
  pip install websocket-client numpy requests
  export TICKDB_API_KEY="your_api_key_here"

⚠️ Engineering note:
  This harness uses plain threads and blocking I/O for clarity. Production
  deployments should spawn separate processes or use aiohttp/asyncio for
  parallel measurement without blocking. This script is suitable for a
  dedicated verification instance, not for co-locating with a live strategy
  engine.
"""

import os
import json
import time
import logging
import threading
import statistics
from datetime import datetime, timezone
from typing import Optional

import numpy as np
import requests
import websocket  # pip install websocket-client

# =============================================================================
# Configuration
# =============================================================================
API_KEY: Optional[str] = os.environ.get("TICKDB_API_KEY")
BASE_URL = "https://api.tickdb.ai/v1"
WS_URL = "wss://api.tickdb.ai/v1/ws"

# Verification symbols — mix of liquid and less-liquid instruments
SYMBOLS = ["AAPL.US", "NVDA.US", "BTC.USDT", "TSLA.US", "9988.HK"]

# Measurement parameters
SAMPLE_INTERVAL_SEC = 1.0          # Pause between REST samples (per symbol)
REPORT_INTERVAL_SEC = 60.0         # Emit statistics every 60 seconds
PERSISTENT_CONNECTION = True       # Keep WebSocket open across samples
RECONNECT_BACKOFF_BASE = 1.0        # seconds
RECONNECT_BACKOFF_MAX = 32.0        # seconds
JITTER_FRACTION = 0.1              # 10% jitter to prevent thundering herd

if not API_KEY:
    raise ValueError(
        "TICKDB_API_KEY environment variable is not set. "
        "Generate an API key at https://api.tickdb.ai/dashboard"
    )

# =============================================================================
# Logging setup
# =============================================================================
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S%z",
)
log = logging.getLogger("latency_harness")


# =============================================================================
# Error handler (mirrors TickDB error code reference)
# =============================================================================
def handle_api_error(response: requests.Response, symbol: str) -> None:
    """Log and re-raise TickDB API errors with context."""
    try:
        body = response.json()
    except ValueError:
        body = {"code": -1, "message": "non-JSON response"}

    code = body.get("code", 0)
    if code in (1001, 1002):
        raise ValueError(
            f"Invalid API key — check TICKDB_API_KEY. Response: {body}"
        )
    if code == 2002:
        raise KeyError(f"Symbol {symbol} not found — verify via /v1/symbols/available")
    if code == 3001:
        retry_after = int(response.headers.get("Retry-After", 5))
        log.warning("Rate limited — backing off %d seconds", retry_after)
        time.sleep(retry_after)
        return None
    raise RuntimeError(f"Unexpected error {code}: {body.get('message')}")


# =============================================================================
# REST latency sampler
# =============================================================================
class RESTLatencySampler:
    """Continuously samples REST endpoint latency and stores results."""

    def __init__(self, symbols: list[str]):
        self.symbols = symbols
        self.latencies_ms: list[float] = []
        self._stop_event = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def _measure_request(self, symbol: str) -> Optional[float]:
        """Send a single kline/latest request and measure round-trip time."""
        start = time.perf_counter()
        headers = {"X-API-Key": API_KEY}

        try:
            response = requests.get(
                f"{BASE_URL}/market/kline/latest",
                headers=headers,
                params={"symbol": symbol, "interval": "1m"},
                timeout=(3.05, 10)  # connect timeout, read timeout
            )
            elapsed_ms = (time.perf_counter() - start) * 1000

            if response.status_code != 200:
                log.warning("Non-200 response for %s: %d", symbol, response.status_code)
                return None

            body = response.json()
            if body.get("code", 0) != 0:
                handle_api_error(response, symbol)
                return None

            return elapsed_ms

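        except requests.exceptions.Timeout:
            log.warning("Request timed out for %s", symbol)
            return None
        except requests.exceptions.ConnectionError as e:
            log.warning("Connection error for %s: %s", symbol, e)
            return None
        except (ValueError, KeyError, RuntimeError) as e:
            # Raised by handle_api_error() or by response.json() on a bad body.
            # Log and skip the sample so one bad symbol cannot kill the thread.
            log.error("API error for %s: %s", symbol, e)
            return None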

    def _sampler_loop(self) -> None:
        symbol_idx = 0
        while not self._stop_event.is_set():
            symbol = self.symbols[symbol_idx % len(self.symbols)]
            symbol_idx += 1

            latency = self._measure_request(symbol)
            if latency is not None:
                self.latencies_ms.append(latency)
                log.debug("REST latency %s: %.2f ms", symbol, latency)

            # Cycle through symbols with a small random jitter to avoid
            # synchronized sampling if running multiple instances
            # SAMPLE_INTERVAL_SEC is a module-level constant, not an attribute
            jitter_ms = np.random.uniform(0, SAMPLE_INTERVAL_SEC * 1000 * JITTER_FRACTION)
            time.sleep(max(0.1, SAMPLE_INTERVAL_SEC - jitter_ms / 1000))

    def start(self) -> None:
        self._thread = threading.Thread(target=self._sampler_loop, daemon=True)
        self._thread.start()
        log.info("REST latency sampler started")

    def stop(self) -> None:
        self._stop_event.set()
        if self._thread:
            self._thread.join(timeout=5.0)
        log.info("REST latency sampler stopped")

    def get_stats(self) -> dict:
        if not self.latencies_ms:
            return {"samples": 0}
        arr = np.array(self.latencies_ms)
        return {
            "samples": len(self.latencies_ms),
            "p50_ms":  float(np.percentile(arr, 50)),
            "p95_ms":  float(np.percentile(arr, 95)),
            "p99_ms":  float(np.percentile(arr, 99)),
            "max_ms":  float(np.max(arr)),
            "mean_ms": float(np.mean(arr)),
            "std_ms":  float(np.std(arr)),
        }


# =============================================================================
# WebSocket latency sampler
# =============================================================================
class WebSocketLatencySampler:
    """
    Maintains a persistent WebSocket connection and measures the delay
    between the server-side event timestamp (if available) and the client's
    receive timestamp.

    ⚠️ Engineering note: For HFT workloads, replace this with an asyncio-based
    implementation using aiohttp and run on a co-located server. A Python-based
    sampler on a remote machine will not give you true HFT latency numbers.
    """

    def __init__(self, symbols: list[str]):
        self.symbols = symbols
        self.latencies_ms: list[float] = []
        self._stop_event = threading.Event()
        self._thread: Optional[threading.Thread] = None
        self._ws: Optional[websocket.WebSocketApp] = None
        self._reconnect_delay = RECONNECT_BACKOFF_BASE

    def _connect(self) -> bool:
        """Build the WebSocketApp; run_forever() opens the actual connection."""
        try:
            self._ws = websocket.WebSocketApp(
                f"{WS_URL}?api_key={API_KEY}",
                on_message=self._on_message,
                on_error=self._on_error,
                on_close=self._on_close,
                on_open=self._on_open,
            )
            return True
        except Exception as e:
            log.error("WebSocket connection failed: %s", e)
            return False

    def _on_open(self, ws: websocket.WebSocketApp) -> None:
        """Subscribe to kline channels on connection open."""
        log.info("WebSocket connected — subscribing to channels")
        self._reconnect_delay = RECONNECT_BACKOFF_BASE  # Reset backoff on successful connect

        for symbol in self.symbols:
            subscribe_msg = json.dumps({
                "cmd": "subscribe",
                "channel": "kline",
                "symbol": symbol,
                "interval": "1m"
            })
            ws.send(subscribe_msg)
            log.debug("Subscribed to %s kline", symbol)

        # Send one application-level ping on open; ongoing keepalive is handled
        # by the ping_interval passed to run_forever()
        ping_msg = json.dumps({"cmd": "ping"})
        ws.send(ping_msg)

    def _on_message(self, ws: websocket.WebSocketApp, raw_message: str) -> None:
        """Record receive timestamp; compute delay if server timestamp is present."""
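        # Use wall-clock time: the server 'ts' is epoch-based, whereas
        # time.perf_counter() has an arbitrary origin and cannot be compared
        # to it. Accuracy depends on the client clock being NTP-synchronized.
        receive_time = time.time()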

        try:
            message = json.loads(raw_message)
        except json.JSONDecodeError:
            log.warning("Non-JSON WebSocket message received: %s", raw_message[:100])
            return

        # If the message carries a server-side event timestamp, compute delay
        # TickDB depth/trades messages may include a 'ts' field in milliseconds
        if "ts" in message:
            server_ts_sec = message["ts"] / 1000.0
            latency_ms = (receive_time - server_ts_sec) * 1000
            self.latencies_ms.append(latency_ms)
            log.debug("WS latency: %.2f ms", latency_ms)

        # Handle pong response (heartbeat acknowledgment)
        if message.get("type") == "pong":
            log.debug("Heartbeat acknowledged")

    def _on_error(self, ws: websocket.WebSocketApp, error: Exception) -> None:
        log.error("WebSocket error: %s", error)

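    def _on_close(self, ws: websocket.WebSocketApp, close_status_code: Optional[int], close_msg: Optional[str]) -> None:
        # Both values are None when the peer drops without a close frame,
        # so format them with %s rather than %d.
        log.warning(
            "WebSocket closed (code=%s, msg=%s); will attempt reconnect",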
            close_status_code,
            close_msg,
        )

    def _reconnect_with_backoff(self) -> None:
        """Reconnect with exponential backoff and jitter."""
        import random
        delay = min(
            self._reconnect_delay * (1 + random.uniform(-JITTER_FRACTION, JITTER_FRACTION)),
            RECONNECT_BACKOFF_MAX,
        )
        log.info("Reconnecting in %.1f seconds (backoff=%.1f)", delay, self._reconnect_delay)
        time.sleep(delay)
        self._reconnect_delay = min(self._reconnect_delay * 2, RECONNECT_BACKOFF_MAX)

    def _ws_loop(self) -> None:
        while not self._stop_event.is_set():
            # Build a fresh WebSocketApp for every (re)connect attempt;
            # without this call self._ws stays None and nothing ever connects.
            if self._connect() and self._ws is not None:
                try:
                    # run_forever() blocks until connection closes or error
                    self._ws.run_forever(ping_interval=15, ping_timeout=5)
                except Exception as e:
                    log.error("WebSocket run_forever exception: %s", e)

            if not self._stop_event.is_set():
                self._reconnect_with_backoff()

    def start(self) -> None:
        self._thread = threading.Thread(target=self._ws_loop, daemon=True)
        self._thread.start()
        log.info("WebSocket latency sampler started")

    def stop(self) -> None:
        self._stop_event.set()
        if self._ws:
            self._ws.close()
        if self._thread:
            self._thread.join(timeout=5.0)
        log.info("WebSocket latency sampler stopped")

    def get_stats(self) -> dict:
        if not self.latencies_ms:
            return {"samples": 0}
        arr = np.array(self.latencies_ms)
        return {
            "samples": len(self.latencies_ms),
            "p50_ms":  float(np.percentile(arr, 50)),
            "p95_ms":  float(np.percentile(arr, 95)),
            "p99_ms":  float(np.percentile(arr, 99)),
            "max_ms":  float(np.max(arr)),
            "mean_ms": float(np.mean(arr)),
            "std_ms":  float(np.std(arr)),
        }


# =============================================================================
# Statistics reporter
# =============================================================================
class StatisticsReporter:
    """Periodically emits structured latency statistics."""

    def __init__(
        self,
        rest_sampler: RESTLatencySampler,
        ws_sampler: WebSocketLatencySampler,
        interval: float = 60.0,
    ):
        self.rest_sampler = rest_sampler
        self.ws_sampler = ws_sampler
        self.interval = interval
        self._stop_event = threading.Event()

    def _emit_report(self) -> None:
        ts = datetime.now(timezone.utc).isoformat(timespec="seconds")
        rest_stats = self.rest_sampler.get_stats()
        ws_stats = self.ws_sampler.get_stats()

        report = {
            "timestamp": ts,
            "measurement_type": "tickdb_latency_verification",
            "rest": rest_stats,
            "websocket": ws_stats,
        }
        log.info("LATENCY_REPORT: %s", json.dumps(report))

    def run(self) -> None:
        log.info(
            "Statistics reporter running — emitting report every %.0f seconds",
            self.interval,
        )
        # Event.wait() returns True as soon as stop() is called, so shutdown
        # does not have to sit out a full reporting interval.
        while not self._stop_event.wait(self.interval):
            self._emit_report()

    def stop(self) -> None:
        self._stop_event.set()


# =============================================================================
# Main entry point
# =============================================================================
def main() -> None:
    log.info("=" * 60)
    log.info("TickDB Latency Verification Harness")
    log.info("Symbols under test: %s", ", ".join(SYMBOLS))
    log.info("REST sample interval: %.1f s | Report interval: %.0f s", SAMPLE_INTERVAL_SEC, REPORT_INTERVAL_SEC)
    log.info("=" * 60)

    rest_sampler = RESTLatencySampler(SYMBOLS)
    ws_sampler = WebSocketLatencySampler(SYMBOLS)
    reporter = StatisticsReporter(rest_sampler, ws_sampler, REPORT_INTERVAL_SEC)

    # Start all threads
    rest_sampler.start()
    ws_sampler.start()

    try:
        reporter.run()
    except KeyboardInterrupt:
        log.info("Interrupted — shutting down")
    finally:
        reporter.stop()
        rest_sampler.stop()
        ws_sampler.stop()
        log.info("Shutdown complete")


if __name__ == "__main__":
    main()

4.3 Interpreting the Output

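Running the harness for at least 24 hours under live market conditions produces a latency profile you can compare against the documented SLA ranges. Statistics are cumulative from start-up, so the illustrative report below, with 347 REST samples at the default one-second interval, corresponds to roughly six minutes of sampling: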

{
  "timestamp": "2026-04-22T14:30:00+00:00",
  "measurement_type": "tickdb_latency_verification",
  "rest": {
    "samples": 347,
    "p50_ms": 52.3,
    "p95_ms": 89.7,
    "p99_ms": 134.2,
    "max_ms": 891.4,
    "mean_ms": 57.8,
    "std_ms": 18.1
  },
  "websocket": {
    "samples": 1842,
    "p50_ms": 38.1,
    "p95_ms": 62.4,
    "p99_ms": 88.9,
    "max_ms": 312.1,
    "mean_ms": 41.3,
    "std_ms": 11.6
  }
}

Three signals to watch in your own data:

1. The P50-to-P99 gap reveals tail risk. A gap of more than 100 ms between P50 and P99 on WebSocket channels indicates intermittent queueing — not necessarily an SLA breach, but a signal to evaluate whether your strategy's tick-processing pipeline can absorb spikes without falling behind.

2. The max_ms value, tracked across reports, tells you about infrastructure stability. A single 891 ms maximum over 347 REST samples is likely a transient network event. A pattern of maximums consistently above 500 ms during normal sessions warrants investigation.

3. The std_ms dispersion measures consistency. A standard deviation below 15 ms on REST calls indicates a stable data path. High standard deviation (> 30 ms) on quiet days is a warning sign — something in the infrastructure chain is unpredictable.
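The three checks can be applied mechanically to each statistics block in a report. The sketch below is one possible reading of the thresholds quoted above; the `flag_signals` function is hypothetical, and applying the same thresholds to both REST and WebSocket blocks is a simplification made for this example.

```python
def flag_signals(stats: dict) -> list[str]:
    """Return human-readable warnings for one channel's statistics block."""
    if stats.get("samples", 0) == 0:
        return ["no samples collected"]
    warnings = []
    if stats["p99_ms"] - stats["p50_ms"] > 100:
        warnings.append("P50-to-P99 gap above 100 ms: possible intermittent queueing")
    if stats["max_ms"] > 500:
        warnings.append("max above 500 ms: check whether this recurs across reports")
    if stats["std_ms"] > 30:
        warnings.append("std above 30 ms: unstable data path")
    return warnings


# REST block from the illustrative report above.
rest = {"samples": 347, "p50_ms": 52.3, "p95_ms": 89.7, "p99_ms": 134.2,
        "max_ms": 891.4, "mean_ms": 57.8, "std_ms": 18.1}
for warning in flag_signals(rest):
    print(warning)
```

For the sample REST block only the maximum is flagged, which matches the reading in point 2: worth tracking across reports, not alarming on its own.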

5. Latency Under Stress: What Happens During Earnings and Macro Events

Market data vendors face their toughest test during scheduled high-volatility events. Understanding TickDB's behavior during these windows requires distinguishing between two stress types.

5.1 Scheduled High-Volatility Events

Earnings releases, Fed meeting dates, and macro data prints (CPI, NFP, GDP) are predictable. TickDB's infrastructure team can pre-position capacity, but the limiting factor is typically the upstream exchange infrastructure, not TickDB itself.

During a major earnings release (e.g., NVIDIA's Q4 report), order book dynamics change dramatically:

Bid-ask spreads widen from $0.02 to $0.15 or more.
Order book size imbalances can swing from a pressure ratio of 1.5 to 3.0 within seconds.
Tick volume spikes 10–50x above normal session levels.

TickDB's depth and kline channels continue delivering data during these windows. Latency P99 may extend to 200–500 ms due to upstream exchange queueing. The critical question for your strategy is: does your execution logic degrade gracefully, or does it continue submitting orders at pre-event spread levels?

5.2 Unscheduled Events and Flash Crashes

Flash crashes and liquidity events are fundamentally different. They are unscheduled, short-lived (often under 60 seconds), and characterized by a complete absence of resting liquidity at multiple price levels.

During a flash crash, WebSocket latency for depth data may spike because:

The exchange matching engine itself experiences elevated processing latency.
TickDB ingests and normalizes data from multiple exchange feeds, introducing variable queueing.
Network congestion on shared co-location links increases.

The honest assessment: no market data vendor guarantees sub-100ms P99 during a flash crash. The infrastructure chain from exchange matching engine to your strategy engine has too many shared bottlenecks. What a professional vendor can guarantee is data completeness — every tick that occurred will be delivered, even if delivery is delayed by several seconds.

6. A Practical Latency Budget for Strategy Design

If you are designing a strategy that depends on TickDB's data, build your latency assumptions explicitly. A latency budget allocates your total tolerable delay across each component of the data pipeline.

6.1 Typical End-to-End Budget

Component	Typical latency contribution	Notes
Exchange matching engine → TickDB ingest	5–30 ms	TickDB's upstream ingestion latency
TickDB internal processing	5–20 ms	Normal load; higher under stress
Network: TickDB PoP → your server	10–80 ms	Depends on geographic distance
TLS handshake (if applicable)	3–10 ms	First request only for persistent connections
Client-side deserialization	1–5 ms	JSON parsing; depends on payload size
Total budget (normal)	~50–150 ms	P50 end-to-end
Total budget (stressed)	~200–500 ms	P99 under elevated market activity
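The budget arithmetic can be checked mechanically. The sketch below copies the per-component ranges from the table and sums their low and high ends; the dictionary keys and the `total_range` helper are names invented for this example.

```python
# (low_ms, high_ms) per component, copied from the table above.
BUDGET_MS = {
    "exchange_to_tickdb_ingest": (5, 30),
    "tickdb_internal_processing": (5, 20),
    "network_pop_to_server": (10, 80),
    "tls_handshake": (3, 10),
    "client_deserialization": (1, 5),
}


def total_range(
    budget: dict[str, tuple[int, int]],
    skip: frozenset = frozenset(),
) -> tuple[int, int]:
    """Sum the low and high ends of every component not listed in skip."""
    low = sum(lo for name, (lo, _) in budget.items() if name not in skip)
    high = sum(hi for name, (_, hi) in budget.items() if name not in skip)
    return low, high


print(total_range(BUDGET_MS))                                      # cold connection
print(total_range(BUDGET_MS, skip=frozenset({"tls_handshake"})))   # warm persistent connection
```

The strict sums come to 24–145 ms with the handshake and 21–135 ms without it. The total rows in the table are rounded typical figures rather than strict sums, so substitute your own measured ranges before relying on either.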

6.2 Designing Guard Rails

Given this budget, strategies that are sensitive to latency should implement two guardrails:

1. Spread threshold enforcement. Do not enter a trade when the current bid-ask spread is wider than a maximum threshold calibrated to your latency budget. A strategy that tolerates a $0.02 spread with a 150 ms latency budget may well remain viable. The same strategy with a 400 ms budget and a $0.15 post-earnings spread requires re-examination.

2. Stale data detection. Every data point carries a timestamp. If the time delta between the data's timestamp and your local clock exceeds a defined threshold (e.g., 500 ms), reject the data point as stale rather than treating it as current. This prevents strategies from trading on delayed quotes during infrastructure stress events.

import time

STALE_THRESHOLD_MS = 500  # Reject data older than this

def is_fresh(message: dict) -> bool:
    """
    Reject data older than STALE_THRESHOLD_MS to prevent stale quote trading.
    """
    if "ts" not in message:
        # No timestamp available — assume fresh for depth/kline channels
        return True

    server_ts_sec = message["ts"] / 1000.0
    age_ms = (time.time() - server_ts_sec) * 1000
    if age_ms > STALE_THRESHOLD_MS:
        return False
    return True
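The first guardrail can be sketched in the same style. This example assumes a liquidity-taking strategy for which a wider spread is a cost, so entries are skipped when the spread exceeds a cap. The `spread_within_cap` function and the `MAX_SPREAD` value are hypothetical; calibrate the cap from your own measured latency and fill data.

```python
from decimal import Decimal


def spread_within_cap(bid: Decimal, ask: Decimal, max_spread: Decimal) -> bool:
    """True when the quote is valid and the spread is no wider than max_spread."""
    if bid <= 0 or ask <= 0 or ask < bid:
        return False  # empty or crossed book: treat as untradeable
    return (ask - bid) <= max_spread


# Hypothetical cap for a 150 ms latency budget.
MAX_SPREAD = Decimal("0.05")

# A $0.02 spread passes the check; a $0.15 post-earnings spread does not.
print(spread_within_cap(Decimal("189.98"), Decimal("190.00"), MAX_SPREAD))
print(spread_within_cap(Decimal("189.90"), Decimal("190.05"), MAX_SPREAD))
```

`Decimal` is used so that price differences such as 190.00 minus 189.98 compare exactly against the cap instead of picking up binary floating-point error.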

7. Comparing Latency Profiles Across Data Sources

The table below compares TickDB's latency characteristics against alternative market data sources. Numbers represent typical P99 ranges under normal market conditions and are sourced from public documentation and developer community benchmarks.

Metric	Generic REST polling API	TickDB REST	TickDB WebSocket	Bloomberg Terminal
Data type	OHLCV, delayed	OHLCV, real-time	Order book, trades, kline	Full depth, trades
P50 latency	500–5,000 ms	40–200 ms	30–60 ms	10–50 ms
P99 latency	2,000–10,000 ms	100–500 ms	60–100 ms	50–200 ms
Push vs. polling	Polling only	Polling	True push	True push
US equity tick data	Varies	Not available	Not available	Available
HK/Crypto tick data	Varies	Available	Available	Available
Historical depth	Limited	10+ years OHLCV	Real-time only	Limited
Heartbeat / reconnect	DIY	Native	Native	Native

Key takeaway: No consumer-grade API matches Bloomberg's co-location latency. However, Bloomberg requires a terminal subscription costing $25,000+ per year. For systematic strategies that do not require co-located execution, TickDB's WebSocket P99 of 60–100 ms is usually sufficient, and the OHLCV coverage is broader than most alternatives in the same price tier.

8. What to Do If Your Measured Latency Exceeds the SLA Range

Before escalating, run the diagnostic checklist below. The majority of latency issues originate on the client side.

Diagnostic step	What to check	Expected fix
1. Network path	Run `traceroute` or `mtr` to `api.tickdb.ai`. Check for packet loss or high jitter.	Change ISP, use a VPN with a direct route, or move to a co-location facility closer to TickDB's PoP.
2. TLS handshake overhead	Measure with `curl -w "%{time_appconnect}"` and subtract `%{time_connect}` (TCP connect only) to isolate the TLS portion. First request adds 20–50 ms.	Switch to WebSocket (eliminates per-request TLS overhead) or use HTTP keepalive.
3. Client-side deserialization	Profile your JSON parsing with `time.perf_counter()` around `json.loads()`.	For high-frequency strategies, switch to MessagePack or Protocol Buffers if TickDB supports it.
4. Python GIL contention	If running a Python strategy, the Global Interpreter Lock limits parallelism during tick processing.	Move CPU-bound tick processing to a separate process (threads still share the GIL), or use `asyncio` with non-blocking I/O so that network waits do not stall processing.
5. Rate limiting symptoms	If you are hitting `3001` errors, your retry logic may be introducing artificial delay.	Review your retry backoff strategy. The `Retry-After` header value is the authoritative delay.
6. Elevated market activity	Check TickDB's status page during your measurement window. Was there a scheduled event?	Run the harness during a non-event trading session for a baseline comparison.
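Step 3 of the checklist can be carried out with a few lines. The sketch below times `json.loads()` over a synthetic 10-level order book message; the payload field names are made up for this example and do not reflect the actual TickDB message schema, so substitute a message captured from your own stream.

```python
import json
import time


def time_json_parse(payload: str, repeats: int = 1000) -> float:
    """Return the mean json.loads() time for payload, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        json.loads(payload)
    return (time.perf_counter() - start) * 1000 / repeats


# Synthetic order book message (illustrative structure only).
sample = json.dumps({
    "ts": 1_700_000_000_000,
    "bids": [[100.0 - i * 0.01, 10 * (i + 1)] for i in range(10)],
    "asks": [[100.01 + i * 0.01, 10 * (i + 1)] for i in range(10)],
})
print(f"mean parse time: {time_json_parse(sample):.4f} ms")
```

Averaging over many repeats smooths out timer resolution and scheduler noise; compare the result against the deserialization line of your latency budget.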

If diagnostics confirm the issue originates on TickDB's infrastructure, open a support ticket with your harness output attached. Structured JSON logs with timestamps make debugging faster than anecdotal descriptions.

9. Key Takeaways

On the SLA itself: TickDB commits to P99 latency ranges (an upper bound on latency, not a floor) across its REST and WebSocket endpoints under normal market conditions. WebSocket P99 of 60–100 ms covers the majority of systematic trading strategies. REST P99 of 100–500 ms (varying by query size) is suitable for backtesting and end-of-day pipelines.

On extreme conditions: No vendor guarantees sub-100ms P99 during flash crashes or major earnings releases. TickDB's infrastructure absorbs upstream degradation within reasonable bounds, but the exchange matching engine is the primary bottleneck during these windows.

On verification: The production-grade harness provided in this article gives you a reproducible, statistically rigorous method for measuring TickDB's latency from your own deployment environment. Run it for at least 24 hours across different market conditions before making latency-based architecture decisions.

On strategy design: Build an explicit latency budget. Enforce spread thresholds that account for your measured P99. Implement stale data detection with a timestamp delta threshold. These two guardrails prevent strategies from trading on degraded data during the exact moments when accuracy matters most.

Next Steps

If you are evaluating market data vendors and need a reproducible latency comparison, deploy the verification harness against multiple APIs simultaneously and compare P99 distributions directly. The methodology matters as much as the numbers.

If you want to test TickDB's latency against your own infrastructure:

Sign up at tickdb.ai (free, no credit card required)
Generate an API key in the dashboard
Set the TICKDB_API_KEY environment variable, then copy and run the verification harness from this article
Let it run for 24–48 hours and compare the JSON output against your SLA requirements

If you are an institutional team needing co-location, dedicated bandwidth, or custom SLA terms beyond the standard commitment, reach out to enterprise@tickdb.ai.

This article does not constitute investment advice. Market data latency is one of many variables affecting strategy performance. Past latency measurements do not guarantee future results.