Building Production-Grade Data Infrastructure for a 3-Person Quant Team | API Guide

Three people. Tight budgets. Ambitious alpha targets. The 3-person quantitative trading team occupies a strange middle ground: too small for enterprise infrastructure, too ambitious for spreadsheets and sticky notes, yet far enough from solo operation that collaboration friction becomes the silent productivity killer.

A solo quant developer works fast because decisions flow through one mind. Three people introduce coordination costs that compound invisibly — the wrong data version in a backtest, a hardcoded API key that breaks when one person leaves, a Jupyter notebook that works on one laptop and fails on another. These are not hypothetical problems. They are the daily friction points that steal hours from research and turn team dynamics into blame games.

This article maps a practical infrastructure stack for a 3-person quant team. It covers shared data access via TickDB, Git-based code collaboration patterns, secure API key management, and permission control that scales from the startup phase through growth. Every recommendation includes production-grade code — not toy examples, but the exact configurations your team will deploy on day one.

1. The Core Problem: Data and Code Are Not Atomic

Before diving into tooling, it is worth naming the structural problem precisely.

In a solo quant workflow, data and code are tightly coupled: you download data, you write code, you run backtests, you iterate. The entire loop lives in one person's environment. When you introduce two more collaborators, the loop fragments:

Data fragmentation: Person A pulls today's options flow. Person B runs a backtest on last week's data. Person C cannot replicate either without asking. Results diverge because the underlying data is not shared.
Code fragmentation: Without a disciplined Git workflow, three people writing code produces merge conflicts, stale branches, and the infamous "works on my machine" syndrome.
Credential fragmentation: Hardcoded API keys spread across laptops, pasted into Slack messages, or stored in spreadsheets. Revocation when a team member leaves becomes a security incident.

The 3-person team needs infrastructure that eliminates these fragmentation points without introducing enterprise-level complexity that slows everyone down.

2. Shared Data Architecture with TickDB

2.1 Why TickDB Solves the Data Sharing Problem

The simplest solution to data fragmentation is a centralized data source that every team member queries programmatically. TickDB's REST API and WebSocket endpoints provide exactly this: a single source of truth for market data that your team's code accesses directly.

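Each team member runs the same data-fetching code with their own API key. Data is never "downloaded and shared" as a file; it is fetched on demand from one consistent source. This removes the stale-copy mismatches that come from passing files around, because everyone queries the same endpoint. For a backtest to be repeatable, the team still needs to record the exact query (symbol, interval, and the period covered) alongside the result.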

2.2 Centralized Data Fetching Module

Create a shared Python module that all your research scripts import. This module handles authentication, error handling, and response normalization. One person maintains it; everyone benefits.

# quant_team_data/data_client.py
"""
Shared TickDB data client for team use.
All team members import this module instead of writing their own fetch logic.
"""
import os
import time
import logging
from typing import Optional, Dict, Any, List

import requests

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(name)s | %(message)s"
)
logger = logging.getLogger("data_client")


class TickDBError(Exception):
    """Base exception for TickDB-related errors."""
    pass


class RateLimitError(TickDBError):
    """Raised when the API rate limit is exceeded."""
    pass


def get_api_key() -> str:
    """
    Load API key from environment variable.
    Prevents hardcoded keys in shared scripts.
    """
    api_key = os.environ.get("TICKDB_API_KEY")
    if not api_key:
        raise EnvironmentError(
            "TICKDB_API_KEY not set. "
            "Run: export TICKDB_API_KEY='your-key-here'"
        )
    return api_key


def handle_api_response(
    response: requests.Response,
    max_retries: int = 3,
    base_delay: float = 1.0
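) -> Dict[str, Any]:
    """
    Validate a TickDB API response and return its "data" payload.

    This function does not retry. On a rate limit it waits for the
    Retry-After interval, then raises RateLimitError so the caller can
    decide whether to retry. max_retries is currently unused.
    """
    # Handle rate limit responses (HTTP 429)
    if response.status_code == 429:
        # Retry-After may be missing or an HTTP date; fall back to base_delay.
        try:
            retry_after = float(response.headers.get("Retry-After", base_delay))
        except ValueError:
            retry_after = base_delay
        logger.warning(f"Rate limited. Waiting {retry_after}s before retry.")
        time.sleep(retry_after)
        raise RateLimitError(f"Rate limited. Retry after {retry_after}s.")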

    # Parse JSON response
    try:
        data = response.json()
    except ValueError:
        raise TickDBError(f"Invalid JSON response: {response.text[:200]}")

    # Check application-level error codes
    code = data.get("code", 0)
    if code == 3001:
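        # Retry-After may be an HTTP date rather than a number of seconds.
        try:
            retry_after = float(response.headers.get("Retry-After", 5))
        except ValueError:
            retry_after = 5.0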
        logger.warning(f"TickDB rate limit (code 3001). Waiting {retry_after}s.")
        time.sleep(retry_after)
        raise RateLimitError(f"Rate limited (code 3001). Retry after {retry_after}s.")
    elif code in (1001, 1002):
        raise TickDBError("Invalid API key. Check TICKDB_API_KEY environment variable.")
    elif code == 2002:
        raise TickDBError(f"Symbol not found: {data.get('message')}")
    elif code != 0:
        raise TickDBError(f"TickDB error {code}: {data.get('message')}")

    return data.get("data", {})


def fetch_klines(
    symbol: str,
    interval: str = "1h",
    limit: int = 100,
    timeout: tuple = (3.05, 10)
) -> List[Dict[str, Any]]:
    """
    Fetch OHLCV klines for a given symbol.

    Args:
        symbol: Trading pair symbol (e.g., "AAPL.US", "BTC.BITSTAMP")
        interval: Candlestick interval (e.g., "1m", "5m", "1h", "1d")
        limit: Number of candles to fetch (max 1000 for most endpoints)
        timeout: (connect_timeout, read_timeout) in seconds

    Returns:
        List of kline dictionaries with OHLCV data.
    """
    api_key = get_api_key()
    headers = {"X-API-Key": api_key}

    url = "https://api.tickdb.ai/v1/market/kline"
    params = {
        "symbol": symbol,
        "interval": interval,
        "limit": limit
    }

    logger.info(f"Fetching {limit} klines for {symbol} ({interval})")

    response = requests.get(
        url,
        headers=headers,
        params=params,
        timeout=timeout
    )

    return handle_api_response(response)


def fetch_latest_price(
    symbol: str,
    timeout: tuple = (3.05, 10)
) -> Optional[float]:
    """
    Fetch the latest price for a given symbol.
    Useful for real-time monitoring dashboards.
    """
    api_key = get_api_key()
    headers = {"X-API-Key": api_key}

    url = "https://api.tickdb.ai/v1/market/kline/latest"
    params = {"symbol": symbol}

    response = requests.get(
        url,
        headers=headers,
        params=params,
        timeout=timeout
    )

    data = handle_api_response(response)

    if data and "close" in data:
        return float(data["close"])
    return None
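
Because handle_api_response raises RateLimitError rather than retrying, any retry loop has to live in the caller. Below is a minimal sketch of caller-side exponential backoff with jitter. The name call_with_backoff, its parameters, and the "full jitter" choice are assumptions for illustration, not part of TickDB or of the module above. The local RateLimitError only stands in for the one defined in data_client.py so the sketch runs on its own.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Stand-in for data_client.RateLimitError so this sketch is self-contained."""


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on RateLimitError with exponential backoff.

    Before retry n (0-based) the wait is drawn uniformly from
    [0, min(max_delay, base_delay * 2**n)], often called "full jitter".
    """
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    attempt = 0
    while True:
        try:
            return func()
        except RateLimitError:
            if attempt >= max_retries:
                raise  # Out of retries: surface the error to the caller.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
            attempt += 1
```

A research script could wrap a fetch as call_with_backoff(lambda: fetch_klines("AAPL.US", "1h", 100)). Note that handle_api_response already sleeps for the Retry-After interval before raising, so the backoff delay here is added on top of that wait.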

2.3 Team Data Access Pattern

Each team member keeps their key in a local environment file outside the repository:

# ~/.quant_env
TICKDB_API_KEY=tk_live_xxxxxxxxxxxxxxxxxxxxxxxx

All scripts read the key from the TICKDB_API_KEY environment variable. The shared module never contains a key, only the variable name. When a new team member joins, they generate their own API key in the TickDB dashboard and make it available to their shell, for example by exporting the variable from their shell profile.

3. Git Workflow for Quantitative Teams

3.1 Repository Structure

A well-structured repository prevents the "everything in one folder" chaos that afflicts small teams. Use a monorepo with clearly separated concerns:

quant-team/
├── src/
│   ├── data/           # Data fetching, cleaning, storage
│   ├── features/       # Feature engineering pipelines
│   ├── strategies/     # Strategy implementations
│   ├── backtest/       # Backtesting framework and results
│   └── utils/          # Shared utilities
├── notebooks/         # Jupyter notebooks (one per researcher)
├── config/            # Environment configs, parameter grids
├── data/              # Local data cache (gitignored)
├── tests/             # Unit and integration tests
├── pyproject.toml
├── .env.example        # Template for local .env
├── .gitignore
└── README.md

The .env.example file is critical. It tells every team member exactly which environment variables their local setup needs:

# .env.example
# Copy this file to .env and fill in your values.
# NEVER commit .env to the repository.

TICKDB_API_KEY=tk_live_your_key_here
ALPACA_API_KEY=your_alpaca_key
ALPACA_SECRET_KEY=your_alpaca_secret

# Research parameters
DEFAULT_BACKTEST_START=2020-01-01
DEFAULT_BACKTEST_END=2024-12-31
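
Since .env.example doubles as the list of required variables, a script can read it and report what is missing from the current environment. The sketch below assumes simple KEY=value lines with # comments, as in the template above. The function names required_vars and missing_vars are hypothetical.

```python
import os
from typing import Iterable, List, Mapping


def required_vars(example_lines: Iterable[str]) -> List[str]:
    """Return the variable names declared in .env.example-style lines."""
    names = []
    for raw in example_lines:
        line = raw.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        names.append(line.split("=", 1)[0].strip())
    return names


def missing_vars(
    example_lines: Iterable[str],
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return declared names that are unset or empty in env."""
    return [name for name in required_vars(example_lines) if not env.get(name)]
```

To use it, open .env.example, pass the file object to missing_vars, and exit with an error if the returned list is not empty. Run at the start of a backtest, this turns a confusing mid-run failure into a clear setup message.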

3.2 Git Branching Strategy

For a 3-person team, a lightweight branching model works better than GitFlow:

main: Production-ready code. Only merged after code review. Protected branch — requires pull request.
develop: Integration branch for completed features. All team members merge here before promoting to main.
feature/<researcher-name>/<topic>: Individual research branches. Short-lived (1–3 days), frequently rebased on develop.

Example workflow:

# Start a new research branch
git checkout develop
git pull origin develop
git checkout -b feature/alice/earnings-spread-model

# Work on the strategy
# ... write code, run backtests locally ...

# Commit with descriptive messages
git add src/strategies/earnings_spread.py
git commit -m "feat: initial earnings spread model with vol surface"

# Push and create pull request
git push -u origin feature/alice/earnings-spread-model
# ... open PR on GitHub/GitLab, request review from teammates ...

# After review, merge into develop
git checkout develop
git merge feature/alice/earnings-spread-model --no-ff
git push origin develop
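
The branch naming rule is easy to let slip, so it can help to check it mechanically, for instance from a pre-push hook. A minimal sketch follows. The allowed character set (lowercase letters, digits, hyphens) is an assumption, and is_valid_branch is a hypothetical helper name.

```python
import re

# feature/<researcher-name>/<topic>, with an assumed charset of a-z, 0-9 and "-".
FEATURE_BRANCH = re.compile(r"feature/(?P<researcher>[a-z0-9-]+)/(?P<topic>[a-z0-9-]+)")

# The two long-lived branches named in the branching model.
LONG_LIVED = {"main", "develop"}


def is_valid_branch(name: str) -> bool:
    """Return True for main, develop, or feature/<researcher>/<topic>."""
    return name in LONG_LIVED or FEATURE_BRANCH.fullmatch(name) is not None
```

The current branch name is available from git rev-parse --abbrev-ref HEAD, which a hook can pass to this function.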

3.3 Commit Message Convention

Standardize commit messages so the team can read history efficiently:

<type>: <short description>

[Optional body with context]

Types:
  feat:     New strategy or feature
  fix:      Bug fix
  data:     Data pipeline changes
  test:     Adding or updating tests
  refactor: Code restructuring (no behavior change)
  docs:     Documentation updates

Example:

feat: add earnings gap mean-reversion strategy

- Implements 30-minute post-earnings window detection
- Uses depth channel buy/sell pressure ratio as entry signal
- Backtested on 45 earnings events (2021-2024)

Closes #23
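
A convention like this holds up better when a commit-msg hook checks it. The sketch below validates the subject line against the type list above. The blank-line rule between subject and body is an assumption based on the template, and check_commit_message is a hypothetical name.

```python
import re
from typing import List

ALLOWED_TYPES = ("feat", "fix", "data", "test", "refactor", "docs")

# "<type>: <short description>" where the description starts with a non-space.
SUBJECT = re.compile(r"(?P<type>[a-z]+): (?P<description>\S.*)")


def check_commit_message(message: str) -> List[str]:
    """Return a list of problems; an empty list means the message passes."""
    problems = []
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    match = SUBJECT.fullmatch(subject)
    if match is None:
        problems.append("subject must look like '<type>: <short description>'")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match.group('type')}'")
    # If there is a body, it should be separated from the subject by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("leave a blank line between subject and body")
    return problems
```

Git passes the path of the message file to the commit-msg hook as its first argument, so a hook script can read that file, call this function, and exit non-zero when the list is not empty.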

3.4 Preventing Data Files in Git

Your .gitignore must exclude all data files:

# .gitignore

# Environment and secrets
.env
*.pem
*.key

# Data files (never commit market data)
# The leading slash anchors this rule to the repo root so src/data/ (code) stays tracked
/data/
*.csv
*.parquet
*.feather
*.pkl

# Jupyter notebook checkpoints
.ipynb_checkpoints/

# Python
__pycache__/
*.py[cod]
*$py.class
.venv/
venv/

# IDE
.vscode/
.idea/

# OS
.DS_Store
Thumbs.db
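
A .gitignore does not stop git add -f, and it has no effect on files that are already tracked. As a second line of defense, a pre-commit check can flag staged paths that match the same rules. The sketch below mirrors the secret and data patterns above and treats only the root-level data directory as the local cache. The constant names and blocked_paths are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable, List

BLOCKED_FILE_PATTERNS = ("*.csv", "*.parquet", "*.feather", "*.pkl", "*.pem", "*.key")
BLOCKED_NAMES = (".env",)
BLOCKED_TOP_LEVEL_DIRS = ("data",)


def blocked_paths(paths: Iterable[str]) -> List[str]:
    """Return the repo-relative paths that should not be committed."""
    flagged = []
    for raw in paths:
        path = PurePosixPath(raw)
        name = path.name
        in_blocked_dir = len(path.parts) > 1 and path.parts[0] in BLOCKED_TOP_LEVEL_DIRS
        matches_pattern = any(fnmatchcase(name, p) for p in BLOCKED_FILE_PATTERNS)
        if name in BLOCKED_NAMES or matches_pattern or in_blocked_dir:
            flagged.append(raw)
    return flagged
```

The list of staged files comes from git diff --cached --name-only. A hook can pass those lines to blocked_paths and abort the commit if anything is returned.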

4. API Key Management at Team Scale

4.1 The Key Management Problem

When three people share one hardcoded API key, you face three compounding risks:

Key rotation becomes a coordination nightmare. Changing one key means updating three laptops simultaneously.
Revocation is impossible without disrupting everyone. If one team member leaves, you cannot revoke their access without forcing the other two to update their configuration.
Audit trails disappear. You cannot tell which team member made which API call.

The solution is per-user API keys with role-based access.

4.2 TickDB Team Key Architecture

Each team member creates their own API key via the TickDB dashboard. Assign keys to the minimum set of permissions required for each person's role:

Role	Typical permissions	Example use case
Researcher (2 people)	Read-only market data (kline, depth, trades)	Pulling historical data for backtesting
Infrastructure (1 person)	Read + write if deploying live trading	Managing deployed strategies

Store keys in environment variables, never in code:

# Correct: key loaded from environment
api_key = os.environ.get("TICKDB_API_KEY")

# Wrong: key hardcoded in source file
api_key = "tk_live_abc123..."  # Never do this
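
A rule like "never hardcode keys" is easier to keep when something checks for violations. The sketch below scans source text for literals that look like TickDB keys. The key shape (tk_live_ followed by at least eight letters or digits) is an assumption inferred from the placeholders in this guide, not a documented format, and find_hardcoded_keys is a hypothetical name.

```python
import re
from typing import List, Tuple

# Assumed key shape; adjust to the real format of your keys.
KEY_PATTERN = re.compile(r"tk_live_[A-Za-z0-9]{8,}")


def find_hardcoded_keys(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, masked_key) for each key-like literal in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in KEY_PATTERN.finditer(line):
            key = match.group(0)
            # Mask the key so the report itself does not leak it.
            hits.append((lineno, key[:12] + "..."))
    return hits
```

The template value tk_live_your_key_here is not flagged, because the underscores break the assumed pattern. That keeps .env.example out of the report.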

4.3 Key Rotation Protocol

When a team member leaves:

Log into the TickDB dashboard as the admin.
Revoke the departing member's API key immediately.
If a shared service key existed, rotate it and distribute the new value through a password manager with shared vaults (1Password, Bitwarden), not through chat. Confirm that each remaining member has updated their .env file.
Document the rotation in the team's internal wiki.

4.4 Local Key Loading for Development

Create a project-level script that loads environment variables from a .env file, making local development seamless:

# load_env.py
"""Load environment variables from .env file for local development."""
import os
from pathlib import Path
from dotenv import load_dotenv

# Find .env in project root
project_root = Path(__file__).parent
env_path = project_root / ".env"

if env_path.exists():
    load_dotenv(env_path)
    print(f"Loaded environment from {env_path}")
else:
    print("Warning: .env file not found. Ensure TICKDB_API_KEY is set in your shell.")

Import it at the top of every research script:

import sys
from pathlib import Path

# Add project root to path (assumes the script lives one level below the root)
sys.path.insert(0, str(Path(__file__).parent.parent))

import load_env  # noqa: F401 - imported for its side effect of loading .env

5. Permission Control and Access Governance

5.1 Principle of Least Privilege

Every system your team uses should enforce the principle of least privilege: each member accesses only the data and functionality required for their current role. This is not paranoia — it is operational discipline.

For a 3-person quant team, least privilege means:

Data access: Researchers can read market data but cannot delete historical datasets. Infrastructure can read and write to production databases.
Code access: Everyone can read all code (transparency is a feature). Only senior researchers can merge to main.
Execution access: Only designated infrastructure accounts can place live trades. Individual laptops cannot touch production brokerages.

5.2 Shared Infrastructure Permissions

If your team uses cloud resources (AWS, GCP, or a co-located server), define IAM-style roles:

# infrastructure/permissions.yaml
# Example: define roles for team infrastructure

roles:
  researcher_alice:
    tickdb:
      - read:kline
      - read:depth
    s3_data_bucket:
      - s3:GetObject
      - s3:ListBucket
    trading_database:
      - read:all
      - write:backtest_results

  researcher_bob:
    tickdb:
      - read:kline
      - read:depth
      - read:trades
    s3_data_bucket:
      - s3:GetObject
      - s3:ListBucket
    trading_database:
      - read:all
      - write:backtest_results

  infrastructure_charlie:
    tickdb:
      - read:kline
      - read:depth
      - write:alerts
    s3_data_bucket:
      - s3:*
    trading_database:
      - read:all
      - write:all
    broker_api:
      - trade:paper
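
A permissions file only matters if something enforces it. The sketch below shows one way a deny-by-default check over such a role map could look. It copies two roles from the YAML above into a Python dict so it runs without a YAML dependency. Only * is treated as a wildcard, and entries such as read:all are matched literally. The function is_allowed is a hypothetical name, not part of any library.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

Roles = Dict[str, Dict[str, List[str]]]

ROLES: Roles = {
    "researcher_alice": {
        "tickdb": ["read:kline", "read:depth"],
        "s3_data_bucket": ["s3:GetObject", "s3:ListBucket"],
        "trading_database": ["read:all", "write:backtest_results"],
    },
    "infrastructure_charlie": {
        "tickdb": ["read:kline", "read:depth", "write:alerts"],
        "s3_data_bucket": ["s3:*"],
        "trading_database": ["read:all", "write:all"],
        "broker_api": ["trade:paper"],
    },
}


def is_allowed(roles: Roles, role: str, resource: str, action: str) -> bool:
    """Deny by default; allow only if a granted pattern matches the action."""
    granted = roles.get(role, {}).get(resource, [])
    return any(fnmatchcase(action, pattern) for pattern in granted)
```

Unknown roles and unlisted resources fall through to an empty grant list, so they are refused without any special casing.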

5.3 Protecting Production Credentials

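Production broker credentials (Alpaca, Interactive Brokers, etc.) should never exist on individual laptops. Keep them on the production host, encrypted at rest, and decrypt them only at runtime. The module below is a minimal file-based sketch of that idea; a managed vault is the sturdier option: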

# infrastructure/secrets_manager.py
"""
Secure secrets manager for production credentials.
Stores secrets encrypted at rest; decrypts only at runtime.
"""
import os
import json
from typing import Dict

from cryptography.fernet import Fernet

# In production, load KEY from a secure vault (AWS Secrets Manager, HashiCorp Vault)
# Never store the encryption key in the codebase
ENCRYPTION_KEY = os.environ.get("SECRETS_ENCRYPTION_KEY")
if not ENCRYPTION_KEY:
    raise EnvironmentError("SECRETS_ENCRYPTION_KEY not set in production environment.")

_cipher = Fernet(ENCRYPTION_KEY.encode())


def get_production_credentials(service: str) -> Dict[str, str]:
    """
    Retrieve and decrypt production credentials for a given service.
    In production, these are fetched from a secure vault.
    """
    encrypted_path = f"/secrets/{service}.enc"

    if not os.path.exists(encrypted_path):
        raise FileNotFoundError(f"Encrypted credentials for {service} not found.")

    with open(encrypted_path, "rb") as f:
        encrypted_data = f.read()

    decrypted = _cipher.decrypt(encrypted_data)
    return json.loads(decrypted.decode())


# Usage in production trading code:
# credentials = get_production_credentials("alpaca")
# api_key = credentials["api_key"]
# secret_key = credentials["secret_key"]

6. Shared Backtest Results and Experiment Tracking

6.1 The Results Fragmentation Problem

Backtest results scatter across laptops when teams lack a shared tracking system. Alice has 12 versioned backtest folders. Bob has a spreadsheet. Charlie has results in a Slack thread from three weeks ago. Reproducibility collapses.

6.2 Structured Results Repository

Establish a convention for storing backtest results:

results/
├── earnings_gap/
│   ├── 2024-02-15_nvda_gap_reversion_v1.yaml
│   ├── 2024-03-10_aapl_gap_reversion_v1.yaml
│   └── 2024-03-10_aapl_gap_reversion_v2.yaml
├── momentum/
│   └── ...
└── arbitrage/
    └── ...
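
A naming convention becomes useful once scripts can rely on it. The sketch below parses file names of the form <date>_<label>_v<N>.yaml, as used in the tree above. The pattern is inferred from those three examples, and ResultName and parse_result_name are hypothetical names.

```python
import re
from datetime import date
from typing import NamedTuple, Optional


class ResultName(NamedTuple):
    run_date: date
    label: str
    version: int


# <YYYY-MM-DD>_<label>_v<N>.yaml, where the label may itself contain underscores.
NAME = re.compile(r"(\d{4}-\d{2}-\d{2})_([a-z0-9_]+)_v(\d+)\.yaml")


def parse_result_name(filename: str) -> Optional[ResultName]:
    """Parse a result file name, or return None if it does not fit the convention."""
    match = NAME.fullmatch(filename)
    if match is None:
        return None
    try:
        run_date = date.fromisoformat(match.group(1))
    except ValueError:
        return None  # Looks like a date but is not a valid calendar day.
    return ResultName(run_date, match.group(2), int(match.group(3)))
```

With names parsed this way, finding the latest version of an experiment is a matter of grouping by label and taking the highest version number.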

Each result file is a structured YAML document:

# results/earnings_gap/2024-03-10_aapl_gap_reversion_v2.yaml

experiment_id: exp_2024_03_10_aapl_v2
strategy: earnings_gap_reversion
version: v2
date: 2024-03-10
analyst: alice

parameters:
  entry_window_minutes: 30
  exit_window_minutes: 120
  pressure_ratio_threshold: 2.5
  max_position_size: 0.05

backtest_config:
  start_date: 2021-01-01
  end_date: 2024-02-28
  symbols: [AAPL.US, MSFT.US, GOOGL.US]
  initial_capital: 100000
  commission: 0.001
  slippage: 0.0005

results:
  total_return: 0.234
  sharpe_ratio: 1.45
  max_drawdown: -0.082
  win_rate: 0.62
  profit_factor: 1.78
  num_trades: 127

conclusion: "Improved entry timing with pressure_ratio > 2.5 reduces false breakouts."
next_steps: "Test on Q1 2024 earnings specifically."

6.3 Automated Results Logging

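Integrate results logging into your backtest framework so it happens automatically. Note that the runner below names files by timestamp and parameter hash rather than by the date, label, and version convention from section 6.2; pick one scheme and use it everywhere: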

# backtest/runner.py
"""Automated backtest runner with structured result logging."""
import yaml
import json
from datetime import datetime
from pathlib import Path
from typing import Dict, Any
import hashlib

def log_backtest_result(
    results_dir: Path,
    experiment: Dict[str, Any],
    metrics: Dict[str, Any]
) -> str:
    """
    Log backtest results to a structured YAML file.
    Generates a unique experiment ID based on parameters and timestamp.
    """
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    param_hash = hashlib.md5(
        json.dumps(experiment["parameters"], sort_keys=True).encode()
    ).hexdigest()[:8]

    experiment_id = f"exp_{timestamp}_{param_hash}"

    result_doc = {
        "experiment_id": experiment_id,
        "strategy": experiment.get("strategy", "unknown"),
        "timestamp": timestamp,
        "parameters": experiment["parameters"],
        "backtest_config": experiment["backtest_config"],
        "results": metrics
    }

    strategy_dir = results_dir / experiment.get("strategy", "unknown")
    strategy_dir.mkdir(parents=True, exist_ok=True)

    result_file = strategy_dir / f"{timestamp}_{param_hash}.yaml"
    with open(result_file, "w") as f:
        yaml.dump(result_doc, f, default_flow_style=False, sort_keys=False)

    print(f"Results logged: {result_file}")
    return experiment_id
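
The experiment ID relies on json.dumps with sort_keys=True so that the hash depends on the parameter values and not on the order in which they were written. The sketch below pulls that step into a helper to make the property easy to test. The name param_hash is hypothetical, and default=str is an added assumption so that values such as dates do not raise during serialization.

```python
import hashlib
import json
from typing import Any, Dict


def param_hash(parameters: Dict[str, Any]) -> str:
    """Return a short, order-independent fingerprint of a parameter dict."""
    # sort_keys gives a canonical key order; default=str covers non-JSON types.
    canonical = json.dumps(parameters, sort_keys=True, default=str)
    # MD5 is used as a fingerprint here, not for security.
    return hashlib.md5(canonical.encode()).hexdigest()[:8]
```

Two runs with the same parameters get the same hash regardless of key order, which makes duplicate experiments easy to spot in the results directory.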

7. Practical Onboarding Checklist for New Team Members

When a third person joins your quant team, the onboarding process should take less than two hours if the infrastructure is properly documented. Here is the checklist:

Day 1 — Setup (approximately 90 minutes):

Clone the team repository: git clone <repo-url>
Copy .env.example to .env and request API keys from the team lead
Set TICKDB_API_KEY in your shell profile
Install dependencies: pip install -e . (the repository declares them in pyproject.toml)
Run the test suite: pytest tests/
Verify data access: python -c "from quant_team_data.data_client import fetch_latest_price; print(fetch_latest_price('AAPL.US'))"
Read the README.md and docs/architecture.md
Schedule a 30-minute walkthrough with one existing team member

Week 1 — Integration:

Pick up a starter task from the issue tracker (tagged good-first-issue)
Submit your first pull request, even for a small fix
Review one pull request from an existing member
Attend the weekly research sync and present your initial findings

8. Scaling Considerations: From 3 to 10 People

The infrastructure described here scales comfortably to 5–8 people. When you reach 8–10 researchers, consider these additions:

Concern	3-person solution	10-person solution
Data storage	TickDB API (on-demand)	Add S3/GCS data lake for computed features
Code review	Simple PR review	Mandatory review by 2 people
Backtest tracking	YAML files in repo	Weights & Biases, MLflow, or DVC
Secrets management	Encrypted files	HashiCorp Vault or AWS Secrets Manager
Deployment	Manual scripts	CI/CD pipeline (GitHub Actions)
Documentation	README + ad hoc	Sphinx or MkDocs with CI publishing

The jump from 3 to 10 is where operational overhead becomes the primary constraint on research velocity. Investing in infrastructure early pays compounding dividends as the team grows.

Closing

Three people in a quant team is not half of six. It is a fundamentally different mode of operation — small enough for direct communication, large enough to need systems. The goal of this infrastructure is not to add bureaucracy. It is to eliminate the coordination costs that scale superlinearly with team size.

Data lives in a centralized API. Code lives in a disciplined Git repository. Credentials live in environment variables, never in source files. Permissions follow the principle of least privilege. Results are logged in a structured format that everyone can read and build upon.

These are not exotic engineering requirements. They are the same practices used by software teams of any size that want to ship reliable, reproducible work. Your alpha research deserves the same engineering discipline as production trading systems.

Next Steps

If you're setting up a small quant team and need market data infrastructure:

Create a free TickDB account at tickdb.ai
Generate individual API keys for each team member
Clone the example repository structure and adapt it to your strategies
Set up your first shared research branch and run a collaborative backtest

If you're an individual quant developer looking to collaborate:

The patterns in this article apply to any team size. Start with the shared data client module and Git workflow — they will pay off the moment a second person touches your code.

If you need enterprise-grade data for multi-year backtests:

Reach out to enterprise@tickdb.ai for historical OHLCV data spanning 10+ years across 6 asset classes.

This article does not constitute investment advice. Markets involve risk; past performance does not guarantee future results.