# Simulations

Run agent-based market simulations and retrieve results.

All simulation methods are available on `client.simulation`. What you can do depends on your tier.
**Free tier**

- List cached simulations with `list_cached()`
- Download data with `get_sim_data(sim_id)`

**Pro tier**

- Submit a job with `run()` (returns a `job_id`)
- Track progress with `get_job_status(job_id)`
- Find past jobs with `get_jobs()`
- Retrieve results with `get_sim_data()`
## Free tier

### Cached simulations
Free tier users can browse and download a fixed set of pre-run baseline simulations. These cover two HKEX symbols across two calibration dates and two scenarios, with 10 Monte Carlo runs per group.
#### Available cached simulations
| Symbol | Name | Date | Scenario | Runs |
|---|---|---|---|---|
| 700.HK | Tencent | 2025-09-01 | normal | 10 |
| 700.HK | Tencent | 2025-09-01 | flash_crash | 10 |
| 700.HK | Tencent | 2025-09-02 | normal | 10 |
| 700.HK | Tencent | 2025-09-02 | flash_crash | 10 |
| 9999.HK | NetEase | 2025-09-01 | normal | 10 |
| 9999.HK | NetEase | 2025-09-01 | flash_crash | 10 |
| 9999.HK | NetEase | 2025-09-02 | normal | 10 |
| 9999.HK | NetEase | 2025-09-02 | flash_crash | 10 |
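The catalog above is the full cross-product of two symbols, two dates, and two scenarios. A quick sketch of enumerating the eight groups locally (values taken from the table):

```python
from itertools import product

# Values from the cached-simulation table above
symbols = ["700.HK", "9999.HK"]
dates = ["2025-09-01", "2025-09-02"]
scenarios = ["normal", "flash_crash"]

groups = [
    {"symbol": s, "date": d, "scenario": sc, "n_runs": 10}
    for s, d, sc in product(symbols, dates, scenarios)
]
print(len(groups))  # 8 groups, matching the table
```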
### List cached simulations

```python
# List all available cached simulations
cached = client.simulation.list_cached()
print(f"Found {cached['total']} simulation groups")

for sim in cached["simulations"]:
    print(f"{sim['symbol']} {sim['date']} {sim['scenario']}: {sim['n_runs']} runs")
    print(f"  sim_id: {sim['example_sim_id']}")

# Filter by symbol
cached = client.simulation.list_cached(symbol="700.HK")

# Filter by scenario
cached = client.simulation.list_cached(scenario="flash_crash")
```

### Download cached simulation data
```python
# Find a cached simulation
cached = client.simulation.list_cached(symbol="700.HK", scenario="normal")
sim_id = cached["simulations"][0]["example_sim_id"]

# Download full simulation data
df = client.simulation.get_sim_data(sim_id)
print(df.head())

# Download the mid-price series
mid_df = client.simulation.get_sim_data(sim_id, "mid_price_by_min.parquet")
```

## Pro tier
### Submit a simulation

Use `run()` to submit a custom simulation job. Jobs run asynchronously and return a job ID immediately.

```python
result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10,
)
job_id = result["job_id"]
sim_ids = result["queued_sim_ids"]
print(f"Submitted job {job_id} with {len(sim_ids)} simulations")
```

#### Required parameters
| Parameter | Type | Description |
|---|---|---|
| symbol | str | Trading symbol (e.g., 9999.HK, 0005.HK) |
| cal_date | str | Calibration date in YYYY-MM-DD format |
#### Optional parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| n_runs | int | 100 | Number of Monte Carlo runs |
| seed | int | 42 | Random seed for reproducibility |
| scenario | str | normal | Market scenario to simulate |
| scenario_params | dict | None | Override scenario defaults |
| exec_algos | list | None | Execution algorithms to test |
### Finding calibration dates

Use `get_available_symbols()` to discover valid symbol and date combinations. Only dates where `status = "complete"` and `stage = "model_calibration"` will work; any other date will fail. See Available symbols for the full response schema and a filtering example.
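Assuming the response is a list of records carrying `symbol`, `cal_date`, `status`, and `stage` fields (a guess at the shape; the real schema is on the Available symbols page), filtering for usable calibration dates might look like:

```python
def usable_cal_dates(records, symbol):
    """Return calibration dates that completed the model_calibration stage.

    `records` is assumed to be a list of dicts with 'symbol', 'cal_date',
    'status', and 'stage' keys -- an illustrative guess at the
    get_available_symbols() response shape.
    """
    return sorted(
        r["cal_date"]
        for r in records
        if r["symbol"] == symbol
        and r["status"] == "complete"
        and r["stage"] == "model_calibration"
    )

# Hypothetical sample records for illustration
sample = [
    {"symbol": "9999.HK", "cal_date": "2025-09-01", "status": "complete", "stage": "model_calibration"},
    {"symbol": "9999.HK", "cal_date": "2025-09-02", "status": "running", "stage": "model_calibration"},
    {"symbol": "700.HK", "cal_date": "2025-09-01", "status": "complete", "stage": "model_calibration"},
]
print(usable_cal_dates(sample, "9999.HK"))  # ['2025-09-01']
```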
### Market scenarios

Inject market events into your simulation to stress-test strategies. Set the `scenario` parameter to one of:
| Scenario | Description |
|---|---|
| normal | No injection — background agents only (default) |
| flash_crash | Large rapid SELL depleting bid-side liquidity |
| buy_panic | Large rapid BUY depleting ask-side liquidity |
| gradual_selloff | Slow sustained SELL over extended period |
| trending_up | Small steady BUY producing persistent uptrend |
| trending_down | Small steady SELL producing persistent downtrend |
#### Example: Flash crash

```python
result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=20,
    scenario="flash_crash",
    scenario_params={
        "start_time": "11:00:00",
        "impact_multiplier": 15.0,
    },
)
```

#### Scenario parameters
| Parameter | Type | Description |
|---|---|---|
| impact_multiplier | float | Total volume as multiple of resting liquidity |
| order_size_ratio | float | Child order size as fraction of liquidity |
| order_freq | str | Order spacing (e.g., 500ms, 5s, 30s) |
| start_time | str | Time to begin scenario (e.g., 10:30:00) |
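The `order_freq` values use a compact duration syntax (`500ms`, `5s`, `30s`). A small parser for that format can help when sanity-checking `scenario_params` locally; this is an illustration, not part of the client library, and it only covers the units shown in the table:

```python
import re
from datetime import timedelta

def parse_order_freq(freq):
    """Parse compact durations like '500ms', '5s', or '30s' into a timedelta."""
    m = re.fullmatch(r"(\d+)(ms|s)", freq)
    if not m:
        raise ValueError(f"unrecognized order_freq: {freq!r}")
    value, unit = int(m.group(1)), m.group(2)
    factors = {"ms": timedelta(milliseconds=1), "s": timedelta(seconds=1)}
    return value * factors[unit]

print(parse_order_freq("500ms"))  # 0:00:00.500000
print(parse_order_freq("30s"))    # 0:00:30
```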
### Execution algorithms

Test execution strategies by passing an `exec_algos` list. Each algorithm config must include a `type` key.

#### TWAP (Time-Weighted Average Price)

```python
result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=50,
    exec_algos=[{
        "type": "twap",
        "order_size": 50000,        # Total volume to execute
        "horizon": "02:00:00",      # Execution window (HH:MM:SS)
        "start_time": "09:30:00",   # Optional, defaults to market open
        "frequency": "00:00:30",    # Optional, default 1s
        "side": "sell",             # Optional, inferred from order_size sign
        "random_offset": True,      # Optional, default True
    }],
)
```

#### VWAP (Volume-Weighted Average Price)
```python
result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=50,
    exec_algos=[{
        "type": "vwap",
        "order_size": 100000,
        "horizon": "02:00:00",
        "start_time": "10:00:00",
        "frequency": "00:00:30",
        "random_offset": True,
    }],
)
```

#### CSS (Custom Static Schedule)
The `orders` parameter must be a `pd.Series` with a full datetime index (date and time combined). Use `datetime.combine(cal_date, t)` to build the start and end datetimes, then `pd.date_range(..., periods=n)` to space orders evenly across the window. Orders placed outside HKEX continuous trading hours (09:30–12:00, 13:00–16:00) are silently dropped.
```python
import numpy as np
import pandas as pd
from datetime import date, datetime

def make_schedule(total_shares, start_t, end_t, cal_date, target_clip=100):
    """Distribute total_shares evenly across a datetime window."""
    start_dt = datetime.combine(cal_date, start_t)
    end_dt = datetime.combine(cal_date, end_t)
    n_orders = int(np.ceil(total_shares / target_clip))
    idx = pd.date_range(start=start_dt, end=end_dt, periods=n_orders)
    # Spread the remainder across the first orders so the total is exact
    q, r = divmod(total_shares, n_orders)
    sizes = np.full(n_orders, q, dtype=int)
    sizes[:r] += 1
    return pd.Series(sizes, index=idx)

cal_date = date(2025, 9, 1)
schedule = make_schedule(
    total_shares=10_000,
    start_t=datetime.strptime("09:30", "%H:%M").time(),
    end_t=datetime.strptime("12:00", "%H:%M").time(),
    cal_date=cal_date,
)
print(schedule)
print(f"Orders: {len(schedule)}, Total shares: {int(schedule.sum())}")
```

```
2025-09-01 09:30:00.000000000    100
2025-09-01 09:31:30.909090909    100
2025-09-01 09:33:01.818181818    100
2025-09-01 09:34:32.727272727    100
2025-09-01 09:36:03.636363636    100
...
2025-09-01 11:56:57.272727272    100
2025-09-01 11:58:28.181818181    100
2025-09-01 11:59:59.090909090    100
2025-09-01 12:00:00.000000000    100
dtype: int64
Orders: 100, Total shares: 10000
```

```python
result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10,
    exec_algos=[{
        "type": "css",
        "orders": schedule,
    }],
)
```

#### Multiple CSS strategies in one job
Pass multiple CSS configs in a single exec_algos list to compare strategies against the same shared baseline. Each strategy gets its own set of Monte Carlo runs within the same job.
```python
from datetime import time

# Build a grid of execution windows to compare
windows = [
    (time(9, 30), time(12, 0)),   # full morning
    (time(9, 30), time(16, 0)),   # full day
    (time(13, 0), time(16, 0)),   # full afternoon
]

exec_algos = [
    {"type": "css", "orders": make_schedule(10_000, s, e, cal_date)}
    for s, e in windows
]

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10,
    exec_algos=exec_algos,
)

# sim_ids are ordered: baseline[0:n_runs], strategy_0[n_runs:2*n_runs], ...
sim_ids = result["queued_sim_ids"]
```

#### Execution algorithm parameters
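Given the ordering noted in the comment above (baseline runs first, then each strategy's runs in submission order), a small helper to group the flat `queued_sim_ids` list per strategy might look like this; it is illustrative, not part of the client:

```python
def group_sim_ids(sim_ids, n_runs, strategy_names):
    """Split a flat queued_sim_ids list into baseline + per-strategy chunks.

    Assumes the ordering described above: n_runs baseline sims first,
    then n_runs sims for each strategy in submission order.
    """
    labels = ["baseline"] + list(strategy_names)
    expected = len(labels) * n_runs
    if len(sim_ids) != expected:
        raise ValueError(f"expected {expected} sim_ids, got {len(sim_ids)}")
    return {
        label: sim_ids[i * n_runs : (i + 1) * n_runs]
        for i, label in enumerate(labels)
    }

# Hypothetical ids for illustration
ids = [f"sim_{i:02d}" for i in range(30)]
groups = group_sim_ids(ids, n_runs=10, strategy_names=["morning", "full_day"])
print(groups["baseline"][0], groups["full_day"][-1])  # sim_00 sim_29
```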
| Parameter | Type | Required | Description |
|---|---|---|---|
| type | str | Yes | Algorithm type: twap, vwap, or css |
| order_size | int | twap/vwap | Total volume to execute |
| horizon | str (HH:MM:SS) | twap/vwap | Execution window duration |
| orders | pd.Series | css | Order schedule indexed by datetime |
| start_time | str (HH:MM:SS) | No | Start time, defaults to market open |
| frequency | str (HH:MM:SS) | No | Order frequency, default 1 second |
| side | str | No | buy or sell, inferred from order_size sign |
| random_offset | bool/timedelta | No | Randomize order times, default True |
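The required/optional split in the table can be checked locally before submitting a job. A minimal validator sketched from the table (the server may enforce additional rules beyond these):

```python
# Required keys per algorithm type, taken from the parameter table above
REQUIRED_BY_TYPE = {
    "twap": {"order_size", "horizon"},
    "vwap": {"order_size", "horizon"},
    "css": {"orders"},
}

def validate_exec_algo(config):
    """Raise ValueError if an exec_algos entry is missing required keys."""
    algo_type = config.get("type")
    if algo_type not in REQUIRED_BY_TYPE:
        raise ValueError(f"type must be one of {sorted(REQUIRED_BY_TYPE)}")
    missing = REQUIRED_BY_TYPE[algo_type] - config.keys()
    if missing:
        raise ValueError(f"{algo_type} config missing: {sorted(missing)}")

validate_exec_algo({"type": "twap", "order_size": 50000, "horizon": "02:00:00"})  # passes
```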
### Check job status

Use `get_job_status()` to track simulation progress. Jobs move through: queued → running → completed (or error).

```python
status = client.simulation.get_job_status(job_id)
print(f"Total: {status['total_simulations']}")
print(f"Status: {status['status_summary']}")
print(f"Complete: {status['is_complete']}")
```

#### Polling for completion
```python
import time

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10,
)
job_id = result["job_id"]
sim_ids = result["queued_sim_ids"]

while True:
    status = client.simulation.get_job_status(job_id)
    summary = status.get("status_summary", {})
    completed = summary.get("complete", 0) + summary.get("completed", 0)
    total = status.get("total_simulations", len(sim_ids))
    print(f"\r{completed}/{total} complete", end="", flush=True)
    if status.get("is_complete"):
        break
    time.sleep(30)  # Check every 30 seconds
```

### View past jobs
Use `get_jobs()` to list all your simulation jobs. Useful for retrieving job IDs from previous sessions.

```python
result = client.simulation.get_jobs()
print(f"You have {result['total']} jobs")

for job in result["jobs"]:
    print(f"Job {job['job_id']}")
    print(f"  Simulations: {len(job['sim_ids'])}")
    print(f"  Created: {job['created_at']}")
```

## Both tiers
### Retrieving results

Use `get_sim_data(sim_id)` to download output files as Polars DataFrames. This works both for cached sim_ids (free tier) and for sim_ids from your own jobs (pro tier). See Download data for the full list of output files, schemas, and bulk download.
### Get job results summary (Pro)

```python
results = client.simulation.get_job_results(job_id)
print(f"Completed: {results['completed']}/{results['total_simulations']}")

for sim in results["simulations"]:
    if sim["status"] == "completed":
        print(f"Sim {sim['sim_id']}")
        print(f"  Files: {sim['available_files']}")
        print(f"  Metrics: {sim['metrics']}")
```

### Understanding simulation IDs
Each Monte Carlo run has a unique sim_id with the format:

```
{symbol}:{cal_date}:{cal_hash}:{time}:{sim_hash}:{scenario}_{hash}:{algo}:{run}
```

For example: `9999.HK:2025-09-01:d96cf520:0930-1600:c8961b94:flash_crash_6e6d5b3a:baseline:0000`

The sim_id encodes all configuration parameters, making it deterministic and reproducible. Use sim_ids to retrieve specific results from storage.
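Since the format is colon-delimited (with the scenario hash joined by an underscore), a sim_id can be split back into its components. A convenience sketch based on the format shown above, not a client-library function:

```python
def parse_sim_id(sim_id):
    """Split a sim_id into its named components per the format above."""
    parts = sim_id.split(":")
    if len(parts) != 8:
        raise ValueError(f"expected 8 colon-separated fields, got {len(parts)}")
    symbol, cal_date, cal_hash, time_window, sim_hash, scenario_field, algo, run = parts
    # scenario field is "{scenario}_{hash}"; split on the last underscore
    scenario, _, scenario_hash = scenario_field.rpartition("_")
    return {
        "symbol": symbol,
        "cal_date": cal_date,
        "cal_hash": cal_hash,
        "time": time_window,
        "sim_hash": sim_hash,
        "scenario": scenario,
        "scenario_hash": scenario_hash,
        "algo": algo,
        "run": int(run),
    }

parsed = parse_sim_id(
    "9999.HK:2025-09-01:d96cf520:0930-1600:c8961b94:flash_crash_6e6d5b3a:baseline:0000"
)
print(parsed["scenario"], parsed["run"])  # flash_crash 0
```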