Simulations

Run agent-based market simulations and retrieve results.

All simulation methods are available on client.simulation. What you can do depends on your tier.

Free tier

List cached simulations with list_cached()
Download data with get_sim_data(sim_id)

Pro tier

Submit a job with run() — returns a job_id
Track progress with get_job_status(job_id)
Find past jobs with get_jobs()
Retrieve results with get_sim_data(sim_id)
Summarize a job's results with get_job_results(job_id)

Free tier

Cached simulations

Free tier users can browse and download a fixed set of pre-run baseline simulations. These cover two HKEX symbols across two calibration dates and two scenarios, with 10 Monte Carlo runs per group.

Available cached simulations

Symbol	Name	Date	Scenario	Runs
700.HK	Tencent	2025-09-01	normal	10
700.HK	Tencent	2025-09-01	flash_crash	10
700.HK	Tencent	2025-09-02	normal	10
700.HK	Tencent	2025-09-02	flash_crash	10
9999.HK	NetEase	2025-09-01	normal	10
9999.HK	NetEase	2025-09-01	flash_crash	10
9999.HK	NetEase	2025-09-02	normal	10
9999.HK	NetEase	2025-09-02	flash_crash	10
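
The table is the full cross product of symbols, dates, and scenarios. A minimal sketch that enumerates the groups locally, with values copied from the table (nothing is fetched from the API):

```python
from itertools import product

# Values copied from the table of cached simulations
symbols = ["700.HK", "9999.HK"]
dates = ["2025-09-01", "2025-09-02"]
scenarios = ["normal", "flash_crash"]
runs_per_group = 10

# One group per (symbol, date, scenario) combination
groups = list(product(symbols, dates, scenarios))
total_runs = len(groups) * runs_per_group

print(f"{len(groups)} groups, {total_runs} runs in total")
# prints "8 groups, 80 runs in total"
```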

List cached simulations

python

# List all available cached simulations
cached = client.simulation.list_cached()
print(f"Found {cached['total']} simulation groups")

for sim in cached["simulations"]:
    print(f"{sim['symbol']} {sim['date']} {sim['scenario']}: {sim['n_runs']} runs")
    print(f"  sim_id: {sim['example_sim_id']}")

# Filter by symbol
cached = client.simulation.list_cached(symbol="700.HK")

# Filter by scenario
cached = client.simulation.list_cached(scenario="flash_crash")

Download cached simulation data

python

# Find a cached simulation
cached = client.simulation.list_cached(symbol="700.HK", scenario="normal")
sim_id = cached["simulations"][0]["example_sim_id"]

# Download full simulation data
df = client.simulation.get_sim_data(sim_id)
print(df.head())

# Download mid-price series
mid_df = client.simulation.get_sim_data(sim_id, "mid_price_by_min.parquet")

Pro tier

Submit a simulation

Use run() to submit a custom simulation job. The call returns a job ID immediately, and the job itself runs asynchronously.

python

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10
)

job_id = result["job_id"]
sim_ids = result["queued_sim_ids"]
print(f"Submitted job {job_id} with {len(sim_ids)} simulations")

Required parameters

Parameter	Type	Description
symbol	str	Trading symbol (e.g., 9999.HK, 0005.HK)
cal_date	str	Calibration date in YYYY-MM-DD format

Optional parameters

Parameter	Type	Default	Description
n_runs	int	100	Number of Monte Carlo runs
seed	int	42	Random seed for reproducibility
scenario	str	normal	Market scenario to simulate
scenario_params	dict	None	Override scenario defaults
exec_algos	list	None	Execution algorithms to test

Finding calibration dates

Use get_available_symbols() to discover valid symbol and date combinations. Only dates where status = "complete" and stage = "model_calibration" will work — any other date will fail. See Available symbols for the full response schema and filtering example.
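
The rule above can be applied as a local filter before calling run(). The sketch below assumes the response can be flattened into a list of dicts with the keys symbol, date, status, and stage. Those key names and the sample rows are assumptions for illustration; check the Available symbols page for the real schema.

```python
def usable_calibration_dates(rows, symbol):
    """Return the sorted dates for `symbol` that are complete at the calibration stage."""
    return sorted(
        row["date"]
        for row in rows
        if row["symbol"] == symbol
        and row["status"] == "complete"
        and row["stage"] == "model_calibration"
    )

# Hypothetical rows for illustration only; this is not real API output
sample_rows = [
    {"symbol": "9999.HK", "date": "2025-09-01", "status": "complete", "stage": "model_calibration"},
    {"symbol": "9999.HK", "date": "2025-09-02", "status": "running", "stage": "model_calibration"},
    {"symbol": "9999.HK", "date": "2025-09-03", "status": "complete", "stage": "some_earlier_stage"},
    {"symbol": "700.HK", "date": "2025-09-01", "status": "complete", "stage": "model_calibration"},
]

valid_dates = usable_calibration_dates(sample_rows, "9999.HK")
print(valid_dates)
```

Only the first sample row passes both conditions, so it is the only date that would be safe to pass as cal_date.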

Market scenarios

Inject market events into your simulation to stress-test strategies. Set the scenario parameter to one of:

Scenario	Description
normal	No injection — background agents only (default)
flash_crash	Large rapid SELL depleting bid-side liquidity
buy_panic	Large rapid BUY depleting ask-side liquidity
gradual_selloff	Slow sustained SELL over extended period
trending_up	Small steady BUY producing persistent uptrend
trending_down	Small steady SELL producing persistent downtrend
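
A typo in the scenario name is cheaper to catch locally than after submitting a job. A minimal sketch of a client-side check, using only the names from the table above (check_scenario is a hypothetical helper, not part of the client):

```python
# Scenario names copied from the table above
VALID_SCENARIOS = {
    "normal",
    "flash_crash",
    "buy_panic",
    "gradual_selloff",
    "trending_up",
    "trending_down",
}

def check_scenario(name):
    """Return `name` unchanged if it is a known scenario, else raise ValueError."""
    if name not in VALID_SCENARIOS:
        raise ValueError(
            f"Unknown scenario {name!r}; expected one of {sorted(VALID_SCENARIOS)}"
        )
    return name

scenario = check_scenario("flash_crash")
```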

Example: Flash crash

python

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=20,
    scenario="flash_crash",
    scenario_params={
        "start_time": "11:00:00",
        "impact_multiplier": 15.0
    }
)

Scenario parameters

Parameter	Type	Description
impact_multiplier	float	Total volume as multiple of resting liquidity
order_size_ratio	float	Child order size as fraction of liquidity
order_freq	str	Order spacing (e.g., 500ms, 5s, 30s)
start_time	str	Time to begin scenario (e.g., 10:30:00)
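
Read literally, the first three parameters interact: if total volume is impact_multiplier times resting liquidity and each child order is order_size_ratio times liquidity, the injection needs roughly impact_multiplier / order_size_ratio child orders, spaced order_freq apart. The sketch below works through that arithmetic. It is an inference from the table's definitions, not documented simulator behavior, and both helper functions and the 0.5 ratio are hypothetical. The parser handles only the ms and s suffixes shown in the examples.

```python
import re
from datetime import timedelta

_UNITS = {"ms": "milliseconds", "s": "seconds"}

def parse_order_freq(text):
    """Convert strings such as '500ms' or '5s' to a timedelta."""
    match = re.fullmatch(r"(\d+)(ms|s)", text)
    if match is None:
        raise ValueError(f"Unsupported order_freq: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})

def estimate_injection(impact_multiplier, order_size_ratio, order_freq):
    """Rough child-order count and duration implied by the three parameters."""
    n_orders = round(impact_multiplier / order_size_ratio)
    duration = parse_order_freq(order_freq) * n_orders
    return n_orders, duration

# Hypothetical values: 15x liquidity in clips of 0.5x liquidity, one every 5 seconds
n_orders, duration = estimate_injection(15.0, 0.5, "5s")
print(n_orders, duration)
```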

Execution algorithms

Test execution strategies by passing an exec_algos list. Each algorithm config must include a type key.

TWAP (Time-Weighted Average Price)

python

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=50,
    exec_algos=[{
        "type": "twap",
        "order_size": 50000,              # Total volume to execute
        "horizon": "02:00:00",            # Execution window (HH:MM:SS)
        "start_time": "09:30:00",         # Optional, defaults to market open
        "frequency": "00:00:30",          # Optional, default 1s
        "side": "sell",                   # Optional, inferred from order_size sign
        "random_offset": True             # Optional, default True
    }]
)
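
To get a feel for what these settings imply, the sketch below slices an order the way a textbook TWAP would: one equal child order per frequency interval across the horizon. It is an illustration of the general technique, not the simulator's implementation, and it ignores random_offset. The helper names are hypothetical.

```python
from datetime import timedelta

def hms_to_timedelta(text):
    """Parse an HH:MM:SS string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

def twap_slices(order_size, horizon, frequency):
    """Split order_size into near-equal child orders, one per interval."""
    n_slices = hms_to_timedelta(horizon) // hms_to_timedelta(frequency)
    base, extra = divmod(order_size, n_slices)
    # The first `extra` slices carry one additional share so the total is exact
    return [base + 1 if i < extra else base for i in range(n_slices)]

slices = twap_slices(50_000, "02:00:00", "00:00:30")
print(len(slices), slices[0], slices[-1], sum(slices))
# prints "240 209 208 50000"
```

With the parameters from the example above, a two-hour horizon at 30-second spacing gives 240 child orders of about 208 shares each.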

VWAP (Volume-Weighted Average Price)

python

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=50,
    exec_algos=[{
        "type": "vwap",
        "order_size": 100000,
        "horizon": "02:00:00",
        "start_time": "10:00:00",
        "frequency": "00:00:30",
        "random_offset": True
    }]
)

CSS (Custom Static Schedule)

The orders parameter must be a pd.Series with a full datetime index — date and time combined. Use datetime.combine(cal_date, t) to build the start and end datetimes, then pd.date_range(..., periods=n) to space orders evenly across the window. Orders placed outside HKEX continuous trading hours (09:30–12:00, 13:00–16:00) are silently dropped.

python

import numpy as np
import pandas as pd
from datetime import date, datetime

def make_schedule(total_shares, start_t, end_t, cal_date, target_clip=100):
    """Distribute total_shares evenly across a datetime window."""
    start_dt = datetime.combine(cal_date, start_t)
    end_dt   = datetime.combine(cal_date, end_t)

    n_orders = int(np.ceil(total_shares / target_clip))
    idx = pd.date_range(start=start_dt, end=end_dt, periods=n_orders)

    # Spread remainder across first orders so total is exact
    q, r = divmod(total_shares, n_orders)
    sizes = np.full(n_orders, q, dtype=int)
    sizes[:r] += 1

    return pd.Series(sizes, index=idx)

cal_date = date(2025, 9, 1)
schedule = make_schedule(
    total_shares=10_000,
    start_t=datetime.strptime("09:30", "%H:%M").time(),
    end_t=datetime.strptime("12:00", "%H:%M").time(),
    cal_date=cal_date,
)

print(schedule)
print(f"Orders: {len(schedule)}, Total shares: {int(schedule.sum())}")

text

2025-09-01 09:30:00.000000000    100
2025-09-01 09:31:30.909090909    100
2025-09-01 09:33:01.818181818    100
2025-09-01 09:34:32.727272727    100
2025-09-01 09:36:03.636363636    100
                                 ...
2025-09-01 11:53:56.363636363    100
2025-09-01 11:55:27.272727272    100
2025-09-01 11:56:58.181818181    100
2025-09-01 11:58:29.090909090    100
2025-09-01 12:00:00.000000000    100
Length: 100, dtype: int64

Orders: 100, Total shares: 10000

python

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10,
    exec_algos=[{
        "type": "css",
        "orders": schedule
    }]
)
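
Because orders outside continuous trading hours are dropped without an error, it can help to check a schedule before submitting it. The sketch below is a hypothetical pre-flight check using only the standard library. It treats both ends of each session as inclusive, which is an assumption; the documentation does not say how boundary timestamps are handled.

```python
from datetime import datetime, time

# HKEX continuous trading sessions as stated above
SESSIONS = [(time(9, 30), time(12, 0)), (time(13, 0), time(16, 0))]

def outside_trading_hours(timestamps):
    """Return the timestamps that fall outside every session."""
    return [
        ts
        for ts in timestamps
        if not any(start <= ts.time() <= end for start, end in SESSIONS)
    ]

# Hypothetical timestamps: the middle one falls in the lunch break
stamps = [
    datetime(2025, 9, 1, 9, 30),
    datetime(2025, 9, 1, 12, 30),
    datetime(2025, 9, 1, 15, 59),
]
dropped = outside_trading_hours(stamps)
print(dropped)
```

For a schedule built with make_schedule, pass schedule.index; pandas Timestamps are datetime subclasses, so the .time() call works unchanged. A window that spans the lunch break, such as 09:30 to 16:00, will report the 12:00 to 13:00 orders here.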

Multiple CSS strategies in one job

Pass multiple CSS configs in a single exec_algos list to compare strategies against the same shared baseline. Each strategy gets its own set of Monte Carlo runs within the same job.

python

from datetime import time

# Build a grid of execution windows to compare
windows = [
    (time(9, 30),  time(12, 0)),   # full morning
    (time(9, 30),  time(16, 0)),   # full day; orders in the 12:00-13:00 break are dropped
    (time(13, 0),  time(16, 0)),   # full afternoon
]

exec_algos = [
    {"type": "css", "orders": make_schedule(10_000, s, e, cal_date)}
    for s, e in windows
]

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10,
    exec_algos=exec_algos,
)

# sim_ids are ordered: baseline[0:n_runs], strategy_0[n_runs:2*n_runs], ...
sim_ids = result["queued_sim_ids"]
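
Given the ordering described in the comment above, the flat list can be cut into a baseline group plus one group per strategy. A minimal sketch, assuming that ordering holds and that every group has exactly n_runs entries (split_sim_ids is a hypothetical helper and the ids are placeholders):

```python
def split_sim_ids(sim_ids, n_runs, n_strategies):
    """Return (baseline_ids, [strategy_0_ids, strategy_1_ids, ...])."""
    expected = n_runs * (n_strategies + 1)
    if len(sim_ids) != expected:
        raise ValueError(f"Expected {expected} sim_ids, got {len(sim_ids)}")
    # Consecutive chunks of n_runs: baseline first, then each strategy in order
    chunks = [sim_ids[i:i + n_runs] for i in range(0, expected, n_runs)]
    return chunks[0], chunks[1:]

# Placeholder ids standing in for result["queued_sim_ids"]
fake_ids = [f"sim-{i:04d}" for i in range(12)]
baseline, strategies = split_sim_ids(fake_ids, n_runs=3, n_strategies=3)

print(baseline)
print(strategies[0])
```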

Execution algorithm parameters

Parameter	Type	Required	Description
type	str	Yes	Algorithm type: twap, vwap, or css
order_size	int	twap/vwap	Total volume to execute
horizon	str (HH:MM:SS)	twap/vwap	Execution window duration
orders	pd.Series	css	Order schedule indexed by datetime
start_time	str (HH:MM:SS)	No	Start time, defaults to market open
frequency	str (HH:MM:SS)	No	Order frequency, default 1 second
side	str	No	buy or sell, inferred from order_size sign
random_offset	bool/timedelta	No	Randomize order times, default True

Check job status

Use get_job_status() to track simulation progress. Jobs move through: queued → running → completed (or error).

python

status = client.simulation.get_job_status(job_id)

print(f"Total: {status['total_simulations']}")
print(f"Status: {status['status_summary']}")
print(f"Complete: {status['is_complete']}")

Polling for completion

python

import time

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10
)
job_id = result["job_id"]
sim_ids = result["queued_sim_ids"]

while True:
    status = client.simulation.get_job_status(job_id)
    summary = status.get("status_summary", {})
    # Sum both key spellings defensively
    completed = summary.get("complete", 0) + summary.get("completed", 0)
    total = status.get("total_simulations", len(sim_ids))
    # "\r" returns to the start of the line so the counter updates in place
    print(f"\r  {completed}/{total} complete", end="", flush=True)
    if status.get("is_complete"):
        break

    time.sleep(30)  # Check every 30 seconds
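
The loop above runs until the job reports completion, so a stalled job would block forever. One way to bound the wait is to wrap the polling in a function with a timeout. This is a sketch, not part of the client: wait_for_job and its parameters are hypothetical, and the status function is passed in so the example runs without an API key.

```python
import time

def wait_for_job(get_status, job_id, poll_seconds=30, timeout_seconds=3600):
    """Poll get_status(job_id) until is_complete is truthy or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status(job_id)
        if status.get("is_complete"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} not complete after {timeout_seconds}s")
        time.sleep(poll_seconds)

# Stand-in for client.simulation.get_job_status, for illustration only
_responses = iter([
    {"is_complete": False},
    {"is_complete": False},
    {"is_complete": True, "total_simulations": 10},
])

def fake_get_status(job_id):
    return next(_responses)

final = wait_for_job(fake_get_status, "job-123", poll_seconds=0)
print(final)
```

In real use, pass client.simulation.get_job_status as the first argument and keep the default 30-second interval.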

View past jobs

Use get_jobs() to list all your simulation jobs. Useful for retrieving job IDs from previous sessions.

python

result = client.simulation.get_jobs()

print(f"You have {result['total']} jobs")

for job in result["jobs"]:
    print(f"Job {job['job_id']}")
    print(f"  Simulations: {len(job['sim_ids'])}")
    print(f"  Created: {job['created_at']}")

Both tiers

Retrieving results

Use get_sim_data(sim_id) to download output files as Polars DataFrames. This works for both cached sim_ids (free tier) and sim_ids from your own jobs (pro tier). See Download data for the full list of output files, schemas, and bulk download.

Get job results summary (Pro)

python

results = client.simulation.get_job_results(job_id)

print(f"Completed: {results['completed']}/{results['total_simulations']}")

for sim in results["simulations"]:
    if sim["status"] == "completed":
        print(f"Sim {sim['sim_id']}")
        print(f"  Files: {sim['available_files']}")
        print(f"  Metrics: {sim['metrics']}")
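
A common next step is to collect the sim_ids of the runs that finished so each can be passed to get_sim_data(). A minimal sketch using the same keys as the example above; the helper name and the sample payload are hypothetical.

```python
def completed_sim_ids(results):
    """Return the sim_ids of simulations whose status is 'completed'."""
    return [
        sim["sim_id"]
        for sim in results["simulations"]
        if sim["status"] == "completed"
    ]

# Hypothetical payload shaped like the fields used above; not real API output
sample_results = {
    "completed": 2,
    "total_simulations": 3,
    "simulations": [
        {"sim_id": "sim-a", "status": "completed"},
        {"sim_id": "sim-b", "status": "error"},
        {"sim_id": "sim-c", "status": "completed"},
    ],
}

done = completed_sim_ids(sample_results)
print(done)
# prints "['sim-a', 'sim-c']"
```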

Understanding simulation IDs

Each Monte Carlo run has a unique sim_id with the format:

text

{symbol}:{cal_date}:{cal_hash}:{time}:{sim_hash}:{scenario}_{hash}:{algo}:{run}

For example: 9999.HK:2025-09-01:d96cf520:0930-1600:c8961b94:flash_crash_6e6d5b3a:baseline:0000

The sim_id encodes all configuration parameters, making it deterministic and reproducible. Use sim_ids to retrieve specific results from storage.
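
Since the format is colon-delimited, a sim_id can be split back into its fields. The sketch below is a hypothetical parser based only on the format string and example above. It assumes no field contains a colon and that the scenario hash is everything after the last underscore in the scenario field.

```python
from typing import NamedTuple

class SimId(NamedTuple):
    symbol: str
    cal_date: str
    cal_hash: str
    time_window: str
    sim_hash: str
    scenario: str
    scenario_hash: str
    algo: str
    run: int

def parse_sim_id(sim_id):
    """Split a sim_id into its eight colon-separated fields."""
    parts = sim_id.split(":")
    if len(parts) != 8:
        raise ValueError(f"Expected 8 fields, got {len(parts)}: {sim_id!r}")
    symbol, cal_date, cal_hash, window, sim_hash, scenario_part, algo, run = parts
    # Scenario names may contain underscores, so split on the last one only
    scenario, _, scenario_hash = scenario_part.rpartition("_")
    return SimId(symbol, cal_date, cal_hash, window, sim_hash,
                 scenario, scenario_hash, algo, int(run))

parsed = parse_sim_id(
    "9999.HK:2025-09-01:d96cf520:0930-1600:c8961b94:flash_crash_6e6d5b3a:baseline:0000"
)
print(parsed.scenario, parsed.algo, parsed.run)
# prints "flash_crash baseline 0"
```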
