On a typical trading day in early 2026, mainland Chinese investors channeled an average of HK$5.2 billion southbound through the Hong Kong Stock Connect mechanism. Yet the official flow data is published with a one-day lag — a critical blind spot for quant traders who need to react to capital positioning shifts before the market prices them in.
The gap between official disclosure and real-time market response creates an information asymmetry. Sophisticated traders have long known that southbound capital activity leaves quantifiable footprints across Hong Kong-listed stocks: disproportionate volume surges in H-share constituents, bid-side depth compression in counter-weighted names, and cross-market correlations that spike precisely when mainland institutions position aggressively.
This article builds a production-grade monitoring system using TickDB's real-time HK equity data — specifically the trades and depth channels — to construct a capital flow heat indicator. The system detects volume anomalies, estimates directional pressure, and generates actionable signals without waiting for official SWF flow figures.
1. The Data Architecture Challenge
Southbound capital flows through Stock Connect move along a structured pathway: mainland investors purchase eligible HK-listed securities using RMB, settled via HKEx infrastructure, with positions netted and reported daily. The mechanism connects Shanghai (SHConn) and Shenzhen (SZConn) to Hong Kong, covering over 600 eligible stocks.
The challenge for real-time systems is that no public exchange provides live southbound flow data. The information arrives in three tiers:
| Data type | Availability | Latency | Use case |
|---|---|---|---|
| Official net flow (HKEx daily) | End-of-day | +1 trading day | Backtest validation |
| Southbound turnover (HKEx) | End-of-session | +5 minutes | Daily calibration |
| Real-time proxy signals | Live | < 100 ms | Execution trigger |
The proxy signal tier is where TickDB delivers value. By monitoring HK equity trade flow in real time, we can detect volume patterns that correlate with known southbound activity windows — specifically the 9:30–9:45 AM and 2:00–2:30 PM windows when mainland fund flows concentrate.
2. Microstructure: Why Southbound Flows Leave Detectable Footprints
Southbound capital behaves differently from local HK institutional activity. Three characteristics create distinguishable signatures:
Order size concentration: Mainland institutional orders tend toward larger average size, clustering in round lots that reflect RMB denomination conventions. A surge in 500-share blocks across multiple Hang Seng Index constituents within a 30-second window correlates with cross-border flow acceleration.
Bid-side depth accumulation: When southbound capital enters a stock, it typically anchors on the bid side — a behavioral pattern tied to T+2 settlement and risk management practices. The result is a measurable compression in bid-ask spread on the active side, followed by ask-side depletion as the market absorbs the flow.
Correlated sector movement: Southbound capital favors specific sectors — technology (9988.HK, 0700.HK), financials (3968.HK, 939.HK), and consumer discretionary (1810.HK, 3690.HK). Simultaneous volume spikes across these names with high cross-correlation are a reliable southbound indicator.
The proxy methodology used in this system operates on the following hypothesis: volume anomalies in southbound-eligible HK stocks during the 9:30–9:45 AM window — when mainland markets open 30 minutes before HK — reflect directional institutional positioning that correlates with net flow direction.
3. System Architecture
The monitoring system consists of three layers:
┌─────────────────────────────────────────────────────────────┐
│ Signal Generation Layer │
│ Volume anomaly detector → Heat score calculator → Alert │
├─────────────────────────────────────────────────────────────┤
│ Data Aggregation Layer │
│ WebSocket streams → Volume accumulator → Correlation mat. │
├─────────────────────────────────────────────────────────────┤
│ TickDB API Layer │
│ depth (L1-L10 for HK) | trades (HK) | kline (10+ yr) │
└─────────────────────────────────────────────────────────────┘
The depth channel provides order book state for detecting ask-side depletion. The trades channel provides real-time execution data for volume anomaly detection. The kline endpoint provides historical baselines for statistical comparison.
4. Production-Grade Code
The following implementation uses the TickDB WebSocket API for real-time trade monitoring. It includes heartbeat management, exponential backoff with jitter for reconnection, and rate-limit handling consistent with the production-code standards specified in the TickDB Content Strategy Handbook.
import os
import json
import time
import random
import asyncio
from datetime import datetime, timedelta
from collections import deque
import threading
import requests # For REST fallback during reconnect
class SouthboundFlowMonitor:
"""
Real-time southbound capital flow proxy monitor using TickDB.
Monitors volume anomalies in Stock Connect-eligible HK equities.
⚠️ This system generates proxy indicators, not official flow data.
Validate against official HKEx figures before live deployment.
"""
def __init__(self, api_key: str):
self.api_key = api_key
self.ws = None
self.base_url = "https://api.tickdb.ai"
# Configuration
self.watchlist = [
"9988.HK", # Alibaba
"0700.HK", # Tencent
"3968.HK", # China Merchants Bank
"939.HK", # CCB
"1810.HK", # Xiaomi
"3690.HK", # Meituan
"9618.HK", # JD.com
"1024.HK", # Kuaishou
]
# Anomaly detection parameters
self.volume_window_sec = 300 # 5-minute rolling window
self.anomaly_threshold_z = 2.5 # Z-score threshold for anomaly
self.correlation_window_min = 15 # Cross-stock correlation window
# Data buffers
self.volume_buffers = {symbol: deque(maxlen=300) for symbol in self.watchlist}
self.timestamps = {symbol: deque(maxlen=300) for symbol in self.watchlist}
self.last_trade_price = {}
# State management
self.reconnect_delay = 1.0
self.max_reconnect_delay = 32.0
self.retry_count = 0
self.ws_active = False
# Historical baseline (fetched once)
self.baseline_volumes = {}
self._load_historical_baseline()
def _load_historical_baseline(self):
"""Fetch 30-day average volume for baseline comparison."""
headers = {"X-API-Key": self.api_key}
for symbol in self.watchlist:
try:
# Use daily kline endpoint for baseline
params = {
"symbol": symbol,
"interval": "1d",
"limit": 30
}
response = requests.get(
f"{self.base_url}/v1/market/kline",
headers=headers,
params=params,
timeout=(3.05, 10)
)
if response.status_code == 200:
data = response.json()
if data.get("code") == 0:
daily_volumes = [
candle["vol"]
for candle in data["data"]
if candle.get("vol", 0) > 0
]
if daily_volumes:
import statistics
self.baseline_volumes[symbol] = {
"mean": statistics.mean(daily_volumes),
"stdev": statistics.stdev(daily_volumes) if len(daily_volumes) > 1 else 0
}
except Exception as e:
print(f"Baseline fetch failed for {symbol}: {e}")
def _get_heartbeat_payload(self) -> dict:
"""Generate TickDB-compatible heartbeat payload."""
return json.dumps({"cmd": "ping"})
def _handle_rate_limit(self, response_data: dict):
"""Handle TickDB rate limit (code 3001) with Retry-After header."""
retry_after = int(response_data.get("headers", {}).get(
"Retry-After", 5
))
print(f"Rate limit reached. Waiting {retry_after}s before retry.")
time.sleep(retry_after)
def _reconnect_with_backoff(self):
"""Exponential backoff with jitter for WebSocket reconnection."""
delay = min(
self.reconnect_delay * (2 ** self.retry_count),
self.max_reconnect_delay
)
jitter = random.uniform(0, delay * 0.1)
wait_time = delay + jitter
print(f"Reconnecting in {wait_time:.2f}s (attempt {self.retry_count + 1})")
time.sleep(wait_time)
self.retry_count += 1
self._connect_websocket()
def _connect_websocket(self):
"""Establish WebSocket connection with TickDB."""
# ⚠️ For production HFT workloads, use aiohttp/asyncio
import websocket
ws_url = f"wss://stream.tickdb.ai/ws?api_key={self.api_key}"
try:
self.ws = websocket.WebSocketApp(
ws_url,
on_message=self._on_message,
on_error=self._on_error,
on_close=self._on_close,
on_open=self._on_open
)
except Exception as e:
print(f"WebSocket connection error: {e}")
self._reconnect_with_backoff()
def _on_open(self, ws):
"""Subscribe to trades for all watchlist symbols."""
self.ws_active = True
self.retry_count = 0
self.reconnect_delay = 1.0
for symbol in self.watchlist:
subscribe_msg = json.dumps({
"cmd": "subscribe",
"params": {
"channel": "trades",
"symbol": symbol
}
})
ws.send(subscribe_msg)
print(f"Subscribed to trades: {symbol}")
def _on_message(self, ws, message):
"""Process incoming trade data and update volume buffers."""
try:
data = json.loads(message)
# Handle error responses (rate limit, auth failure)
if "code" in data and data["code"] != 0:
if data["code"] == 3001:
self._handle_rate_limit(data)
return
# Process trade data
if "data" in data and isinstance(data["data"], list):
for trade in data["data"]:
symbol = trade.get("symbol")
price = trade.get("price")
volume = trade.get("vol", 0)
timestamp = trade.get("ts", 0)
if symbol in self.volume_buffers:
self.volume_buffers[symbol].append(volume)
self.timestamps[symbol].append(timestamp)
self.last_trade_price[symbol] = price
# Send heartbeat every 30 seconds
ws.send(self._get_heartbeat_payload())
# Run anomaly detection
self._detect_anomalies()
except Exception as e:
print(f"Message processing error: {e}")
def _on_error(self, ws, error):
"""Handle WebSocket errors."""
print(f"WebSocket error: {error}")
self.ws_active = False
def _on_close(self, ws, close_status_code, close_msg):
"""Handle connection closure with reconnection logic."""
print(f"Connection closed: {close_status_code} - {close_msg}")
self.ws_active = False
if close_status_code not in (1000, 1001): # Abnormal closure
self._reconnect_with_backoff()
def _detect_anomalies(self):
"""Calculate volume Z-scores and generate heat indicators."""
heat_scores = {}
for symbol in self.watchlist:
if symbol not in self.baseline_volumes:
continue
# Calculate current window volume
current_volume = sum(self.volume_buffers[symbol])
baseline = self.baseline_volumes[symbol]
if baseline["stdev"] > 0:
z_score = (current_volume - baseline["mean"]) / baseline["stdev"]
else:
z_score = 0
heat_scores[symbol] = {
"volume": current_volume,
"z_score": z_score,
"anomaly": z_score > self.anomaly_threshold_z,
"last_price": self.last_trade_price.get(symbol)
}
# Calculate aggregate heat
anomalous_symbols = [s for s, v in heat_scores.items() if v["anomaly"]]
aggregate_heat = len(anomalous_symbols) / len(self.watchlist)
if aggregate_heat > 0.5:
self._emit_alert(heat_scores, aggregate_heat)
return heat_scores
def _emit_alert(self, heat_scores, aggregate_heat):
"""Emit alert when aggregate heat exceeds threshold."""
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
anomalous = [s for s, v in heat_scores.items() if v["anomaly"]]
print(f"\n{'='*60}")
print(f"SOUTHBOUND FLOW ALERT - {timestamp}")
print(f"Aggregate heat: {aggregate_heat:.1%}")
print(f"Anomalous symbols: {', '.join(anomalous)}")
print(f"{'='*60}")
for symbol in anomalous:
info = heat_scores[symbol]
print(f" {symbol}: Z={info['z_score']:.2f}, vol={info['volume']:,.0f}")
# In production: send to Slack webhook, email, or trading system
def start(self):
"""Start the monitoring loop."""
print(f"Starting Southbound Flow Monitor")
print(f"Watchlist: {', '.join(self.watchlist)}")
print(f"Anomaly threshold: Z > {self.anomaly_threshold_z}")
self._connect_websocket()
try:
self.ws.run_forever(ping_interval=30)
except KeyboardInterrupt:
print("Monitor stopped by user")
self.ws_active = False
if __name__ == "__main__":
# Load API key from environment
api_key = os.environ.get("TICKDB_API_KEY")
if not api_key:
raise ValueError(
"TICKDB_API_KEY environment variable not set. "
"Generate a key at tickdb.ai/dashboard"
)
monitor = SouthboundFlowMonitor(api_key=api_key)
monitor.start()
5. Order Book Depth Analysis for Flow Direction
Volume anomaly detection identifies when southbound capital is active. To determine directional flow — whether capital is entering or rotating — we analyze bid-ask pressure using the TickDB depth channel.
For Hong Kong equities, TickDB provides depth data up to 10 levels. The pressure ratio metric quantifies directional bias:
buy_pressure = Σ(bid_size[i] for i in top-N levels)
sell_pressure = Σ(ask_size[i] for i in top-N levels)
pressure_ratio = buy_pressure / sell_pressure
A pressure ratio above 1.0 indicates bid-side dominance — consistent with capital inflow. When combined with volume anomaly detection, the system generates directional signals:
| Volume anomaly | Pressure ratio | Signal |
|---|---|---|
| High | > 1.5 | Strong inflow |
| High | 1.0–1.5 | Moderate inflow |
| High | < 1.0 | Outflow or rotation |
| Low | Any | No clear signal |
The following code integrates depth monitoring with the volume anomaly system:
import json
import time
import os
def monitor_depth_channel(symbol: str, api_key: str, levels: int = 10):
"""
Monitor order book depth for a single HK equity symbol.
Returns real-time pressure ratio and depth imbalance.
⚠️ This is a polling implementation. For < 50ms latency
requirements, migrate to WebSocket subscriptions.
"""
base_url = "https://api.tickdb.ai"
headers = {"X-API-Key": api_key}
while True:
try:
params = {"symbol": symbol, "limit": levels}
response = requests.get(
f"{base_url}/v1/market/depth",
headers=headers,
params=params,
timeout=(3.05, 10)
)
if response.status_code != 200:
error_data = response.json()
if error_data.get("code") == 3001:
retry_after = int(response.headers.get("Retry-After", 5))
time.sleep(retry_after)
continue
else:
print(f"Error {error_data.get('code')}: {error_data.get('message')}")
time.sleep(5)
continue
data = response.json()
if data.get("code") == 0:
bids = data["data"].get("bids", [])
asks = data["data"].get("asks", [])
bid_volume = sum(float(b.get("vol", 0)) for b in bids)
ask_volume = sum(float(a.get("vol", 0)) for a in asks)
if ask_volume > 0:
pressure_ratio = bid_volume / ask_volume
else:
pressure_ratio = float('inf')
# Depth imbalance: positive = bid-heavy, negative = ask-heavy
total_volume = bid_volume + ask_volume
depth_imbalance = (bid_volume - ask_volume) / total_volume if total_volume > 0 else 0
return {
"symbol": symbol,
"timestamp": datetime.now().isoformat(),
"pressure_ratio": pressure_ratio,
"depth_imbalance": depth_imbalance,
"bid_volume": bid_volume,
"ask_volume": ask_volume,
"spread_bps": ((float(bids[0]["price"]) - float(asks[0]["price"])) / float(bids[0]["price"])) * 10000 if bids and asks else 0
}
except Exception as e:
print(f"Depth fetch error for {symbol}: {e}")
time.sleep(5)
6. Value Comparison: TickDB vs Alternative Data Sources
For monitoring HK equity flow signals, the relevant comparison is between TickDB and alternatives that provide similar microstructure coverage.
| Capability | Bloomberg Terminal | Generic HKEX Feed | TickDB |
|---|---|---|---|
| Real-time depth (L1) | ✅ Full | ✅ Full | ✅ Full |
| Real-time depth (L2–L10) | ✅ Full | ❌ Not typically | ✅ Full (HK) |
| Trade-by-trade data | ✅ Full | ✅ Full | ✅ Full (HK) |
| Historical kline (10+ yr) | ✅ | Limited | ✅ |
| WebSocket push | ✅ | ❌ (polling only) | ✅ |
| Python SDK / REST | Limited | Poor | ✅ First-class |
| Free tier | ❌ ($25K+/month) | ❌ | ✅ (limited volume) |
| API authentication | Bloomberg auth | Custom | API key (env var) |
TickDB's advantage for this use case is the combination of WebSocket push for real-time monitoring, sufficient depth levels for pressure ratio calculation, and accessible historical data for baseline calibration. Bloomberg Terminal provides deeper market depth and broader asset coverage but carries prohibitive cost for individual quant developers.
7. Deployment Considerations
Free Tier vs. Professional
The monitoring system works under both tiers, with the following operational limits:
| Aspect | Free tier | Professional |
|---|---|---|
| Daily API calls | 1,000 | 100,000 |
| WebSocket connections | 1 | 10 |
| Historical kline depth | 1 year | 10+ years |
| Suitable for | Proof-of-concept, individual backtesting | Live monitoring, team deployment |
Configuration by Deployment Scale
| Scenario | Configuration |
|---|---|
| Individual quant (proof-of-concept) | Monitor 5 symbols, free tier, polling fallback |
| Individual quant (live) | Monitor 10 symbols, professional tier, WebSocket |
| Small team (live) | Monitor 50 symbols, professional tier, distributed instances |
| Institutional (live) | Monitor full list, enterprise tier, dedicated infrastructure |
8. Limitations and Validation
This system generates proxy indicators, not official southbound flow data. The methodology carries inherent limitations:
Proxy lag: Volume anomalies correlate with southbound flows but do not measure them directly. The correlation strength varies by market regime — during high volatility periods, local HK activity can mask cross-border signals.
Window specificity: The 9:30–9:45 AM window produces the strongest signal because mainland market open precedes HK trading. This window is less reliable during US session overlaps.
Baseline drift: Volume patterns shift as market structure evolves. Recalibrate baselines quarterly using HKEx official flow figures for validation.
Validation procedure: Run the proxy signal against 90 days of official HKEx southbound turnover data. Calculate Pearson correlation between proxy heat score and official net flow direction. A correlation above 0.65 validates the methodology; below 0.50 requires parameter adjustment.
Closing
The disconnect between daily official southbound flow disclosure and real-time market reaction creates an actionable edge for quant traders who can construct reliable proxy signals. By combining TickDB's real-time HK equity trade and depth data with statistical anomaly detection, the system above provides a production-grade foundation for capital flow monitoring.
The flow heat indicator — built from volume Z-scores, pressure ratios, and cross-symbol correlation — gives quant traders a measurable signal without waiting for end-of-day official figures. When calibrated against historical HKEx data and operated within appropriate tier limits, it delivers actionable intelligence at latency that matches market microstructure.
Start with the free tier, validate against 90 days of official data, and scale to professional or enterprise as signal quality confirms.
Next Steps
If you're building a proof-of-concept flow monitor:
- Sign up at tickdb.ai (free, no credit card required)
- Set the
TICKDB_API_KEYenvironment variable - Copy the code from this article and configure your watchlist
- Run alongside official HKEx data for 30 days to validate correlation
If you need 10+ years of historical HK equity data for baseline calibration:
Reach out to enterprise@tickdb.ai for institutional data plans with extended history.
If you're using AI coding assistants:
Search for and install the tickdb-market-data SKILL on ClawHub to access TickDB directly from your AI workflow.
This article does not constitute investment advice. Markets involve risk; past correlations do not guarantee future signal validity. Always validate proxy indicators against official data sources before live deployment.