"History is written by the survivors."
This maxim, borrowed from Taleb's antifragility literature, cuts to the heart of a fundamental problem in quantitative strategy development. Your 10-year backtest on US equities captured the 2020 COVID crash—but did it capture how your strategy would behave if that crash repeated with 1.5x the velocity? What if it happened twice in six months? What if it coincided with a liquidity crisis in the underlying options market?
Historical backtesting answers one question with precision: what did this strategy do, given this specific sequence of market events? It cannot answer the more consequential question: what might this strategy do under conditions we have not yet observed?
Monte Carlo simulation provides the bridge. By treating historical returns not as a fixed reality but as a statistical distribution, you can generate thousands of plausible alternative histories—and stress your strategy against the full spectrum of market outcomes, not just the one that actually occurred.
This article walks through the mathematics of Monte Carlo simulation for trading strategy evaluation, provides production-grade Python implementation code, and demonstrates how to integrate synthetic path generation with real-time TickDB data feeds for live risk monitoring.
The Fundamental Limitation of Historical Backtesting
Before diving into the methodology, it is worth being precise about what historical backtesting cannot do.
Sample Path Dependency
A backtest assumes the sequence of returns that occurred is the only sequence that could have occurred. Consider the 2020 COVID crash:
| Period | SPY Return | VIX Close | Condition |
|---|---|---|---|
| Feb 19 – Mar 23, 2020 | −33.9% | 82.7 | Actual crash sequence |
| Scenario A | −33.9% | 82.7 | Same magnitude, different order |
| Scenario B | −33.9% | 82.7 | Same magnitude, interrupted by sharp rallies |
A mean-reversion strategy that worked in the actual sequence—buying the dip with VIX fading from 82 to 45—would have had drastically different performance in Scenario B, where VIX spiked, faded, spiked again, and faded again. The strategy's viability depends not just on the distribution of returns but on the ordering of those returns.
Historical backtesting cannot distinguish between these scenarios because only one ordering actually occurred. Monte Carlo simulation can generate all three.
Regime Blindness
Historical data is silent on regimes that did not appear in the sample. If your 10-year dataset excludes a sustained high-inflation environment (as many 2013–2023 datasets do), your strategy's performance under Fed rate hikes above 4% is genuinely unknown—not estimated, but unknown.
This is not a minor edge case. The 2022 rate-hike cycle produced a 60/40 portfolio drawdown of approximately −16%, the worst since 2008. A strategy that backtested cleanly on 2013–2021 data with its benign low-rate environment would have faced significant unexpected drawdown in 2022.
Fat Tails and Nonlinear Payoffs
Standard backtesting implicitly assumes returns are drawn from a stable distribution—typically normal, or at most Student's t. Real market returns exhibit:
- Leptokurtosis: Excess kurtosis of 3–10x normal in equity returns during stress periods
- Skewness asymmetry: Crash scenarios produce negative skew that normal-assumption backtests cannot capture
- Nonlinear payoff sensitivity: Options-like strategy payoffs (momentum breakouts, mean-reversion with hard stops) are extremely sensitive to tail events
A strategy with a 0.95 Sharpe ratio on historical data may have a 0.4 Sharpe ratio when evaluated against fat-tailed return distributions. The difference is not measurement error—it is a fundamentally different risk profile that historical data cannot reveal.
The Monte Carlo Framework for Strategy Evaluation
Monte Carlo simulation in quantitative finance refers to the use of random sampling to compute numerical results that would be deterministic given infinite computational power. For strategy evaluation, the core application is path generation: simulating thousands of plausible equity return sequences and evaluating strategy performance on each.
Mathematical Foundation
Let $r_t$ denote the log return at time $t$. Under the Geometric Brownian Motion (GBM) assumption:
$$r_t = \mu \Delta t + \sigma \sqrt{\Delta t} \cdot Z_t$$
where $Z_t \sim N(0,1)$ is a standard normal random variable, $\mu$ is the drift (expected return), and $\sigma$ is the volatility.
However, GBM is insufficient for financial applications because it assumes:
- Returns are independently and identically distributed (i.i.d.)
- Volatility is constant
- Price paths are continuous
A more robust approach uses bootstrapping with volatility clustering or jump-diffusion processes.
Bootstrap with Volatility Clustering
The empirical bootstrap resamples actual historical returns but preserves the autocorrelation structure:
- Estimate the volatility process: $\sigma_t = \text{EGARCH}(1,1)$ or $\sigma_t = \text{GARCH}(1,1)$
- Standardize returns: $z_t = r_t / \sigma_t$
- Resample ${z_t}$ with replacement to generate synthetic return sequences
- Apply resampled $z_t$ to realized $\sigma_t$ to generate path
This preserves the empirical distribution of returns—including fat tails—while generating novel orderings.
Jump-Diffusion Extension
For strategies sensitive to sudden market discontinuities (momentum, event-driven), incorporate Poisson jumps:
$$r_t = \mu \Delta t + \sigma \sqrt{\Delta t} \cdot Z_t + J_t$$
where $J_t = Y_t \cdot q_t$, with $Y_t \sim \text{LogNormal}(\mu_J, \sigma_J)$ and $q_t \sim \text{Poisson}(\lambda)$.
The jump intensity $\lambda$ controls the expected number of discontinuities per period. Empirical estimates for US equity indices suggest $\lambda \approx 2$–$4$ per year, with jump magnitudes of $3\sigma$ to $8\sigma$.
Production-Grade Python Implementation
The following implementation provides a complete Monte Carlo simulation engine with three path generation methods: geometric Brownian motion, empirical bootstrap, and jump-diffusion. The code includes proper error handling, configurable parameters, and integration with TickDB for parameter calibration.
import os
import time
import random
import numpy as np
import pandas as pd
import requests
from typing import Optional, Dict, List, Tuple
from dataclasses import dataclass
from enum import Enum
# ⚠️ For production HFT workloads, use aiohttp/asyncio for parallel path generation
class PathMethod(Enum):
GBM = "gbm"
BOOTSTRAP = "bootstrap"
JUMP_DIFFUSION = "jump_diffusion"
@dataclass
class SimulationConfig:
n_paths: int = 10000
n_steps: int = 252 # One trading year
dt: float = 1 / 252 # Daily time step
initial_capital: float = 100_000.0
drift: Optional[float] = None # If None, estimated from data
volatility: Optional[float] = None # If None, estimated from data
jump_intensity: float = 3.0 # Annualized jump frequency
jump_mean: float = -0.05 # Mean jump size (log-normal location)
jump_std: float = 0.08 # Jump size volatility
random_seed: Optional[int] = 42
confidence_levels: Tuple[float, ...] = (0.01, 0.05, 0.25, 0.50, 0.75, 0.95, 0.99)
class TickDBDataProvider:
"""Fetch historical OHLCV data from TickDB for parameter estimation."""
BASE_URL = "https://api.tickdb.ai/v1"
def __init__(self, api_key: Optional[str] = None):
self.api_key = api_key or os.environ.get("TICKDB_API_KEY")
if not self.api_key:
raise ValueError("TICKDB_API_KEY environment variable or api_key argument required")
def _get(self, endpoint: str, params: Optional[Dict] = None) -> Dict:
"""Standard TickDB GET request with timeout."""
response = requests.get(
f"{self.BASE_URL}{endpoint}",
headers={"X-API-Key": self.api_key},
params=params,
timeout=(3.05, 10) # (connect_timeout, read_timeout)
)
data = response.json()
# Handle rate limiting
if data.get("code") == 3001:
retry_after = int(response.headers.get("Retry-After", 5))
time.sleep(retry_after)
return self._get(endpoint, params) # Retry after wait
if data.get("code") not in (0, None):
raise RuntimeError(f"TickDB error {data.get('code')}: {data.get('message')}")
return data.get("data", {})
def get_historical_returns(
self,
symbol: str,
interval: str = "1d",
limit: int = 2520 # ~10 years of daily data
) -> pd.Series:
"""
Fetch OHLCV data and compute log returns.
Args:
symbol: Market symbol (e.g., "SPY.US", "AAPL.US")
interval: Candle interval ("1d", "1h", "15m")
limit: Maximum number of candles to fetch
Returns:
pd.Series of log returns
"""
kline_data = self._get(
"/market/kline",
params={"symbol": symbol, "interval": interval, "limit": limit}
)
df = pd.DataFrame(kline_data)
df["close"] = df["close"].astype(float)
df = df.sort_values("ts").reset_index(drop=True)
# Compute log returns
returns = np.log(df["close"] / df["close"].shift(1)).dropna()
return returns
class MonteCarloEngine:
"""
Monte Carlo simulation engine for strategy evaluation.
Supports three path generation methods:
- GBM: Geometric Brownian Motion
- BOOTSTRAP: Empirical bootstrap with volatility clustering
- JUMP_DIFFUSION: GBM with Poisson jumps for tail risk modeling
"""
def __init__(
self,
config: Optional[SimulationConfig] = None,
returns: Optional[pd.Series] = None,
drift: Optional[float] = None,
volatility: Optional[float] = None
):
self.config = config or SimulationConfig()
if returns is not None:
self.drift = drift if drift is not None else returns.mean() * 252 # Annualized
self.volatility = volatility if volatility is not None else returns.std() * np.sqrt(252)
self.returns = returns
elif drift is not None and volatility is not None:
self.drift = drift
self.volatility = volatility
self.returns = None
else:
raise ValueError("Either returns or (drift, volatility) must be provided")
if self.config.random_seed is not None:
random.seed(self.config.random_seed)
np.random.seed(self.config.random_seed)
def _generate_gbm_paths(self) -> np.ndarray:
"""Generate paths using Geometric Brownian Motion."""
n_paths = self.config.n_paths
n_steps = self.config.n_steps
dt = self.config.dt
# Generate random shocks: shape (n_paths, n_steps)
z = np.random.standard_normal((n_paths, n_steps))
# Accumulate price paths
drift_term = (self.drift - 0.5 * self.volatility ** 2) * dt
diffusion_term = self.volatility * np.sqrt(dt) * z
log_returns = drift_term + diffusion_term
log_prices = np.cumsum(log_returns, axis=1)
# Prepend initial price (log = 0)
paths = np.hstack([np.zeros((n_paths, 1)), log_prices])
return np.exp(paths) # Convert to price levels
def _generate_bootstrap_paths(self) -> np.ndarray:
"""
Generate paths using empirical bootstrap.
Resamples historical returns with replacement to preserve
the empirical distribution including fat tails and autocorrelation.
"""
if self.returns is None:
raise ValueError("Bootstrap method requires historical returns")
n_paths = self.config.n_paths
n_steps = self.config.n_steps
historical_returns = self.returns.values
n_historical = len(historical_returns)
paths = np.zeros((n_paths, n_steps + 1))
for i in range(n_paths):
# Resample with replacement
resampled_idx = np.random.choice(n_historical, size=n_steps, replace=True)
resampled_returns = historical_returns[resampled_idx]
paths[i, 1:] = np.cumprod(1 + resampled_returns)
return paths
def _generate_jump_diffusion_paths(self) -> np.ndarray:
"""
Generate paths using Merton jump-diffusion model.
Each time step has probability p = 1 - exp(-λ*dt) of a jump occurring.
Jump sizes follow a log-normal distribution.
"""
n_paths = self.config.n_paths
n_steps = self.config.n_steps
dt = self.config.dt
λ = self.config.jump_intensity
μ_j = self.config.jump_mean
σ_j = self.config.jump_std
# GBM component
z = np.random.standard_normal((n_paths, n_steps))
drift_term = (self.drift - 0.5 * self.volatility ** 2) * dt
diffusion_term = self.volatility * np.sqrt(dt) * z
# Jump component
jump_prob = 1 - np.exp(-λ * dt) # Probability of at least one jump
n_jumps = np.random.binomial(1, jump_prob, size=(n_paths, n_steps))
jump_sizes = np.random.lognormal(μ_j, σ_j, size=(n_paths, n_steps))
log_returns = drift_term + diffusion_term + n_jumps * (np.log(jump_sizes) - μ_j)
log_prices = np.cumsum(log_returns, axis=1)
paths = np.hstack([np.zeros((n_paths, 1)), log_prices])
return np.exp(paths)
def generate_paths(self, method: PathMethod = PathMethod.GBM) -> np.ndarray:
"""Generate simulated price paths using the specified method."""
if method == PathMethod.GBM:
return self._generate_gbm_paths()
elif method == PathMethod.BOOTSTRAP:
return self._generate_bootstrap_paths()
elif method == PathMethod.JUMP_DIFFUSION:
return self._generate_jump_diffusion_paths()
else:
raise ValueError(f"Unknown path method: {method}")
def evaluate_strategy(
self,
paths: np.ndarray,
entry_price: float = 1.0,
position_type: str = "long",
stop_loss: Optional[float] = None,
take_profit: Optional[float] = None
) -> Dict:
"""
Evaluate a simple strategy on simulated paths.
Args:
paths: Simulated price paths (n_paths, n_steps + 1)
entry_price: Entry price for the position
position_type: "long" or "short"
stop_loss: Stop-loss level (absolute price, None for no stop)
take_profit: Take-profit level (absolute price, None for no target)
Returns:
Dictionary with performance statistics
"""
n_paths = paths.shape[0]
# Terminal prices relative to entry
terminal_prices = paths[:, -1] * entry_price
# Calculate returns based on position type
if position_type == "long":
raw_returns = (terminal_prices - entry_price) / entry_price
path_prices = paths * entry_price
else: # short
raw_returns = (entry_price - terminal_prices) / entry_price
path_prices = entry_price - (paths - 1) * entry_price
# Apply stop-loss and take-profit
if stop_loss is not None or take_profit is not None:
adjusted_returns = np.zeros(n_paths)
for i in range(n_paths):
prices = path_prices[i]
if position_type == "long":
hit_stop = stop_loss is not None and np.any(prices <= stop_loss)
hit_tp = take_profit is not None and np.any(prices >= take_profit)
else:
hit_stop = stop_loss is not None and np.any(prices >= stop_loss)
hit_tp = take_profit is not None and np.any(prices <= take_profit)
if hit_stop:
adjusted_returns[i] = (stop_loss - entry_price) / entry_price if position_type == "long" else (entry_price - stop_loss) / entry_price
elif hit_tp:
adjusted_returns[i] = (take_profit - entry_price) / entry_price if position_type == "long" else (entry_price - take_profit) / entry_price
else:
adjusted_returns[i] = raw_returns[i]
returns = adjusted_returns
else:
returns = raw_returns
# Compute statistics
stats = {
"mean_return": float(np.mean(returns)),
"std_return": float(np.std(returns)),
"min_return": float(np.min(returns)),
"max_return": float(np.max(returns)),
"sharpe_ratio": float(np.mean(returns) / np.std(returns) * np.sqrt(252)) if np.std(returns) > 0 else 0.0,
"win_rate": float(np.mean(returns > 0)),
"loss_rate": float(np.mean(returns < 0)),
"percentile_1": float(np.percentile(returns, 1)),
"percentile_5": float(np.percentile(returns, 5)),
"percentile_25": float(np.percentile(returns, 25)),
"percentile_50": float(np.percentile(returns, 50)),
"percentile_75": float(np.percentile(returns, 75)),
"percentile_95": float(np.percentile(returns, 95)),
"percentile_99": float(np.percentile(returns, 99)),
"max_drawdown_5pct": float(np.mean(returns <= -0.05)), # Probability of >5% loss
"max_drawdown_10pct": float(np.mean(returns <= -0.10)),
"max_drawdown_20pct": float(np.mean(returns <= -0.20)),
"tail_risk_ratio": float(np.percentile(returns, 5) / np.percentile(returns, 95)) if np.percentile(returns, 95) != 0 else 0.0,
}
return stats
def handle_api_error(response: Dict, symbol: Optional[str] = None) -> None:
"""Standard TickDB error handler."""
code = response.get("code", 0)
if code in (1001, 1002):
raise ValueError("Invalid API key — check your TICKDB_API_KEY env var")
if code == 2002:
raise KeyError(f"Symbol {symbol} not found — verify via /v1/symbols/available")
if code == 3001:
retry_after = int(response.headers.get("Retry-After", 5))
time.sleep(retry_after)
return
if code != 0:
raise RuntimeError(f"Unexpected error {code}: {response.get('message')}")
def run_stress_test_example():
"""
Example: Stress-test a momentum strategy on SPY using
10 years of TickDB historical data + Monte Carlo simulation.
"""
# Fetch historical data from TickDB
provider = TickDBDataProvider()
returns = provider.get_historical_returns("SPY.US", interval="1d", limit=2520)
print(f"Fetched {len(returns)} days of SPY returns from TickDB")
print(f"Estimated annual drift: {returns.mean() * 252:.2%}")
print(f"Estimated annual volatility: {returns.std() * np.sqrt(252):.2%}")
# Configure simulation
config = SimulationConfig(
n_paths=10000,
n_steps=252,
initial_capital=100_000.0,
jump_intensity=3.0,
jump_mean=-0.06,
jump_std=0.10,
random_seed=42
)
# Initialize engine with historical returns
engine = MonteCarloEngine(config=config, returns=returns)
# Run all three methods
results = {}
for method in PathMethod:
print(f"\nGenerating {method.value} paths...")
paths = engine.generate_paths(method=method)
stats = engine.evaluate_strategy(
paths,
entry_price=1.0,
position_type="long",
stop_loss=0.95, # 5% stop-loss
take_profit=1.20 # 20% take-profit
)
results[method.value] = stats
print(f"\n{method.value.upper()} Results:")
print(f" Mean return: {stats['mean_return']:.2%}")
print(f" Sharpe ratio: {stats['sharpe_ratio']:.3f}")
print(f" Win rate: {stats['win_rate']:.2%}")
print(f" 5th percentile (VaR 95%): {stats['percentile_5']:.2%}")
print(f" 1st percentile (VaR 99%): {stats['percentile_1']:.2%}")
print(f" Prob. of >20% loss: {stats['max_drawdown_20pct']:.2%}")
return results
if __name__ == "__main__":
results = run_stress_test_example()
Three Stress Scenarios for Equity Strategies
Monte Carlo simulation becomes powerful when you design stress scenarios tailored to your strategy's specific vulnerabilities. Below are three scenarios derived from historical crisis analysis, each targeting a different risk dimension.
Scenario 1: Volatility Clustering Surge
What it tests: Strategies that assume volatility mean-reverts quickly. Momentum strategies with 10-day lookback periods are particularly vulnerable.
Implementation: Increase realized volatility by 2.5x while preserving drift. This is calibrated to the VIX spike in March 2020 (VIX went from 15 to 82, a 5.5x increase; we use a more conservative 2.5x to avoid extreme pathological cases).
def stress_volatility_clustering(engine: MonteCarloEngine, volatility_multiplier: float = 2.5):
"""Stress test: surge in realized volatility."""
stressed_engine = MonteCarloEngine(
config=engine.config,
returns=None,
drift=engine.drift,
volatility=engine.volatility * volatility_multiplier
)
paths = stressed_engine.generate_paths(PathMethod.GBM)
stats = stressed_engine.evaluate_strategy(paths)
return stats
Scenario 2: Correlation Breakdown
What it tests: Multi-asset strategies that rely on diversification. If correlations spike to 1.0 during a crisis (as they did in 2008 and March 2020), a "diversified" portfolio may not diversify at all.
Implementation: Generate correlated paths using a Cholesky decomposition with a stress correlation matrix where all pairwise correlations are set to 0.8+.
def generate_correlated_paths(
n_assets: int,
n_paths: int,
n_steps: int,
drifts: np.ndarray,
volatilities: np.ndarray,
correlation_matrix: np.ndarray
) -> np.ndarray:
"""
Generate correlated paths using Cholesky decomposition.
Args:
n_assets: Number of assets
n_paths: Number of simulation paths
n_steps: Number of time steps
drifts: Annual drift for each asset
volatilities: Annual volatility for each asset
correlation_matrix: Target correlation matrix (n_assets, n_assets)
Returns:
Price paths for all assets (n_assets, n_paths, n_steps)
"""
dt = 1 / 252
# Cholesky decomposition for correlation structure
L = np.linalg.cholesky(correlation_matrix)
# Generate uncorrelated standard normals
uncorrelated = np.random.standard_normal((n_paths, n_steps, n_assets))
# Apply correlation structure
correlated = np.einsum('ij,ijk->ijk', L, uncorrelated)
# Compute returns
paths = np.zeros((n_assets, n_paths, n_steps + 1))
for a in range(n_assets):
drift_term = (drifts[a] - 0.5 * volatilities[a] ** 2) * dt
diffusion_term = volatilities[a] * np.sqrt(dt) * correlated[:, :, a]
log_returns = drift_term + diffusion_term
log_prices = np.cumsum(log_returns, axis=1)
paths[a, :, 1:] = np.exp(log_prices)
return paths
# Stress correlation: 0.85 across all assets (2008 crisis level)
stress_corr = np.full((3, 3), 0.85)
np.fill_diagonal(stress_corr, 1.0)
Scenario 3: Prolonged Drawdown Regime
What it tests: Strategies with optionality that degrades over time—short premium strategies, momentum strategies with carry costs, or any strategy with theta bleed.
Implementation: Force a sustained 6-month drawdown with -30% cumulative decline, followed by a slow recovery. This tests whether your strategy can survive the drawdown period without margin calls or behavioral capitulation.
def stress_prolonged_drawdown(n_paths: int, n_steps: int) -> np.ndarray:
"""
Generate paths with a forced prolonged drawdown phase.
Phase 1 (40% of timeline): -30% cumulative decline
Phase 2 (40% of timeline): Mean-reversion, flat returns
Phase 3 (20% of timeline): Recovery to initial levels
"""
paths = np.ones((n_paths, n_steps + 1))
drawdown_end = int(n_steps * 0.4)
recovery_start = drawdown_end
recovery_end = int(n_steps * 0.8)
# Phase 1: Drawdown
phase1_returns = -0.30 / drawdown_end
for t in range(1, drawdown_end + 1):
paths[:, t] = paths[:, t - 1] * (1 + phase1_returns)
# Phase 2: Flat with noise
for t in range(drawdown_end + 1, recovery_end + 1):
noise = np.random.normal(0, 0.01, n_paths)
paths[:, t] = paths[:, t - 1] * (1 + noise)
# Phase 3: Recovery with noise
for t in range(recovery_end + 1, n_steps + 1):
recovery_rate = 0.30 / (n_steps - recovery_end)
noise = np.random.normal(0, 0.02, n_paths)
paths[:, t] = paths[:, t - 1] * (1 + recovery_rate + noise)
return paths
Integrating Monte Carlo with TickDB for Live Calibration
The simulation framework above is most powerful when calibrated to live market conditions rather than static historical parameters. The following workflow demonstrates continuous parameter updates using TickDB's real-time data feeds.
class LiveMonteCarloMonitor:
"""
Real-time Monte Carlo risk monitor using TickDB WebSocket stream.
Maintains a rolling window of returns and updates stress-test
parameters every N minutes based on current market conditions.
"""
WS_URL = "wss://stream.tickdb.ai/v1/market/kline"
def __init__(
self,
api_key: str,
symbol: str,
interval: str = "1m",
window_size: int = 60, # Rolling window of 60 periods
update_frequency: int = 300 # Update params every 300 seconds
):
self.api_key = api_key
self.symbol = symbol
self.interval = interval
self.window_size = window_size
self.update_frequency = update_frequency
self.returns_window: List[float] = []
self.last_update = 0
self.engine: Optional[MonteCarloEngine] = None
self.ws: Optional[WebSocketClient] = None
def _update_parameters(self):
"""Recalibrate Monte Carlo engine with latest return window."""
if len(self.returns_window) < self.window_size:
return
returns = pd.Series(self.returns_window[-self.window_size:])
config = SimulationConfig(
n_paths=5000, # Reduced for real-time performance
n_steps=252,
random_seed=None # No seed for genuine randomness
)
self.engine = MonteCarloEngine(config=config, returns=returns)
print(f"[{time.strftime('%H:%M:%S')}] Parameters updated:")
print(f" Rolling volatility: {returns.std() * np.sqrt(252 * 60 / self.interval_to_minutes()):.2%}")
print(f" Drift estimate: {returns.mean() * 252 * 60 / self.interval_to_minutes():.2%}")
self.last_update = time.time()
def interval_to_minutes(self) -> float:
"""Convert interval string to minutes."""
intervals = {"1m": 1, "5m": 5, "15m": 15, "1h": 60, "1d": 1440}
return intervals.get(self.interval, 1)
def _on_message(self, message: dict):
"""Handle incoming TickDB WebSocket message."""
if message.get("type") == "kline":
candle = message.get("data", {})
close_price = float(candle.get("close", 0))
open_price = float(candle.get("open", close_price))
if open_price > 0:
log_return = np.log(close_price / open_price)
self.returns_window.append(log_return)
# Maintain window size
if len(self.returns_window) > self.window_size * 2:
self.returns_window = self.returns_window[-self.window_size:]
# Check if update is needed
if time.time() - self.last_update >= self.update_frequency:
self._update_parameters()
self._run_stress_report()
def _run_stress_report(self):
"""Run and log stress-test results."""
if self.engine is None:
return
for method in [PathMethod.GBM, PathMethod.JUMP_DIFFUSION]:
paths = self.engine.generate_paths(method=method)
stats = self.engine.evaluate_strategy(paths)
print(f"\n [{method.value.upper()}] Live stress test:")
print(f" VaR 95%: {stats['percentile_5']:.2%}")
print(f" VaR 99%: {stats['percentile_1']:.2%}")
print(f" Max drawdown >20%: {stats['max_drawdown_20pct']:.2%}")
def start(self):
"""Connect to TickDB WebSocket and start monitoring."""
import websockets
params = f"symbol={self.symbol}&interval={self.interval}&api_key={self.api_key}"
async def connect():
async for websocket in websockets.connect(f"{self.WS_URL}?{params}"):
try:
# Heartbeat
await websocket.send('{"cmd": "ping"}')
await websocket.recv()
async for message in websocket:
data = json.loads(message)
self._on_message(data)
# Reconnect with exponential backoff + jitter
# ⚠️ Production implementation should handle disconnects here
except websockets.exceptions.ConnectionClosed:
print("WebSocket disconnected. Reconnecting...")
continue
asyncio.run(connect())
Interpreting Monte Carlo Results: Key Metrics
Running 10,000 simulations produces a distribution of outcomes. The following metrics provide actionable risk intelligence beyond what a single backtest can offer.
| Metric | Definition | Interpretation |
|---|---|---|
| VaR 95% (Value at Risk) | 5th percentile of returns | "In 95% of scenarios, losses do not exceed this figure." The conventional risk measure, but be aware it says nothing about the 5% tail. |
| CVaR 95% (Conditional VaR / Expected Shortfall) | Average return in the worst 5% of scenarios | The actual expected loss given that you are in the tail. For risk management, this is more informative than VaR. |
| Tail Risk Ratio | 5th percentile / 95th percentile | Measures asymmetry: a ratio of −0.5 means losses in the worst 5% are twice as large as gains in the best 5%. |
| Maximum Drawdown Distribution | Percentile table of worst peak-to-trough declines | Reveals the range of drawdown outcomes, not just the average. |
| Sharpe Ratio Distribution | Sharpe computed per path | A mean-reversion strategy with mean Sharpe 1.2 and 10th percentile Sharpe 0.3 is fragile; one with mean 1.0 and 10th percentile 0.8 is robust. |
Critical interpretation note: A Monte Carlo simulation is only as good as its assumptions. If you calibrate to a 2013–2021 dataset, you are implicitly assuming the next 10 years will resemble the most benign equity market in decades. Stress-test your assumptions, not just your strategy.
Model Validation: Backtesting the Monte Carlo
A counterintuitive but essential step: validate your Monte Carlo model against historical data.
The procedure:
- Use a rolling window: for each month in the dataset, calibrate Monte Carlo parameters using only data prior to that month.
- Run simulations and record the distribution of predicted outcomes.
- Compare predicted percentiles against actual realized outcomes.
- If actual outcomes systematically fall outside your 90% confidence bands, your model is miscalibrated.
def validate_monte_carlo(
returns: pd.Series,
window_size: int = 504, # ~2 years
step_size: int = 21, # Monthly
n_paths: int = 5000
) -> pd.DataFrame:
"""
Validate Monte Carlo model by comparing predictions to realized outcomes.
For each validation period, the model predicts a return distribution.
We then check whether the realized return falls within the predicted distribution.
"""
results = []
for i in range(window_size, len(returns) - step_size, step_size):
train_window = returns.iloc[i - window_size:i]
test_return = returns.iloc[i:i + step_size].sum()
# Calibrate on training window
drift = train_window.mean() * 252
volatility = train_window.std() * np.sqrt(252)
config = SimulationConfig(n_paths=n_paths, n_steps=step_size)
engine = MonteCarloEngine(config=config, returns=train_window)
paths = engine.generate_paths(PathMethod.JUMP_DIFFUSION)
stats = engine.evaluate_strategy(paths)
# Check if realized return is within predicted percentiles
in_5th = test_return < stats["percentile_5"]
in_95th = test_return > stats["percentile_95"]
in_99th = test_return > stats["percentile_99"] or test_return < stats["percentile_1"]
results.append({
"date": returns.index[i],
"realized_return": test_return,
"predicted_5th": stats["percentile_5"],
"predicted_50th": stats["percentile_50"],
"predicted_95th": stats["percentile_95"],
"outside_90ci": in_5th or in_95th,
"outside_99ci": in_99th
})
df = pd.DataFrame(results)
coverage_90 = 1 - df["outside_90ci"].mean()
coverage_99 = 1 - df["outside_99ci"].mean()
print(f"90% Confidence Interval Coverage: {coverage_90:.1%}")
print(f"99% Confidence Interval Coverage: {coverage_99:.1%}")
print(f"Expected coverage: 90% and 99% respectively")
if abs(coverage_90 - 0.90) > 0.05:
print("⚠️ Model under/over-covers 90% CI — recalibrate jump parameters")
if abs(coverage_99 - 0.99) > 0.02:
print("⚠️ Model under/over-covers 99% CI — consider adding correlation structure")
return df
Limitations and Honest Disclosures
Monte Carlo simulation is a powerful tool, but it is not a crystal ball. Every practitioner must be transparent about the following limitations:
GARCH effects: Standard GBM and jump-diffusion models assume volatility evolves independently of returns. In reality, leverage effects cause volatility to spike when prices fall. GARCH-family models produce more realistic tail distributions, but at the cost of additional calibration complexity.
Regime switching: Market behavior fundamentally differs across regimes (low-vol trending vs. high-vol mean-reverting). A single set of calibrated parameters cannot capture both. Hidden Markov models or regime-switching models address this, but they require substantially more historical data to estimate reliably.
Liquidity constraints: Monte Carlo simulations typically assume infinite liquidity—your strategy can always enter and exit at the simulated price. In practice, large orders move prices, especially in crisis scenarios when liquidity evaporates. Order-book models (available via TickDB's
depthchannel for HK and crypto markets) can partially address this.Behavioral factors: The simulation assumes rational, mechanical execution. In live trading, drawdown-induced capitulation, FOMO-driven position increases, and execution slippage from human delay all degrade realized performance relative to simulation.
Parameter instability: Calibrating to 10 years of data implicitly assumes the statistical properties of returns are stable over that period. They are not. A rolling calibration is more robust but requires more data.
Closing
Ten years of historical data tells you what your strategy did. Monte Carlo simulation tells you what it might do—across 10,000 plausible futures, each generated by a different shuffle of the historical deck, a different ordering of market shocks, a different manifestation of tail risk.
The core insight is not that Monte Carlo is more accurate than backtesting. It is that accuracy is the wrong goal. Backtesting optimizes for fidelity to one past reality. Monte Carlo optimizes for preparedness across all possible futures.
The disciplined approach is to use both: backtesting for parameter optimization and signal validation, Monte Carlo for risk quantification and scenario stress-testing. The two methods answer different questions. Together, they give you a more complete picture of what your strategy is, what it expects, and what could surprise it.
The code in this article integrates TickDB as the data acquisition layer—fetching 10 years of cleaned, aligned US equity OHLCV data via the REST API for parameter calibration, and streaming real-time candles via WebSocket for live risk monitoring. The simulation engine itself is market-data agnostic; it operates on the return distributions you provide, whether those distributions are derived from SPY, BTC-USD, or a custom factor dataset.
Next Steps
If you are evaluating a strategy on US equities, start by fetching 10 years of daily OHLCV data from TickDB and running the validation procedure above. Check whether your Monte Carlo model provides honest coverage of realized outcomes before trusting it for risk management.
If you want to stress-test against current market conditions, configure the LiveMonteCarloMonitor class with your strategy's symbols and set the update_frequency to match your rebalancing cadence. The rolling parameter recalibration ensures your stress-test reflects today's volatility regime, not the one from six months ago.
If you need multi-asset correlation stress-testing for a portfolio of equities, futures, and crypto, reach out to enterprise@tickdb.ai for access to cross-asset historical data covering forex, commodities, and digital assets alongside equities.
If you use AI coding assistants, search for and install the tickdb-market-data SKILL in your AI tool's marketplace for streamlined API integration in your own simulation workflows.
This article does not constitute investment advice. Markets involve risk; past performance does not guarantee future results. Monte Carlo simulation produces probabilistic estimates based on model assumptions; actual market outcomes may differ materially from simulated distributions.