Simulations

Run agent-based market simulations and retrieve results.

All simulation methods are available on client.simulation. What you can do depends on your tier.

Free tier

  1. List cached simulations with list_cached()
  2. Download data with get_sim_data(sim_id)

Pro tier

  1. Submit a job with run() — returns a job_id
  2. Track progress with get_job_status(job_id)
  3. Find past jobs with get_jobs()
  4. Retrieve results with get_sim_data(sim_id)

Free tier

Cached simulations

Free tier users can browse and download a fixed set of pre-run baseline simulations. These cover two HKEX symbols across two calibration dates and two scenarios, with 10 Monte Carlo runs per group.

Available cached simulations

Symbol   Name     Date        Scenario     Runs
700.HK   Tencent  2025-09-01  normal       10
700.HK   Tencent  2025-09-01  flash_crash  10
700.HK   Tencent  2025-09-02  normal       10
700.HK   Tencent  2025-09-02  flash_crash  10
9999.HK  NetEase  2025-09-01  normal       10
9999.HK  NetEase  2025-09-01  flash_crash  10
9999.HK  NetEase  2025-09-02  normal       10
9999.HK  NetEase  2025-09-02  flash_crash  10

List cached simulations

python
# List all available cached simulations
cached = client.simulation.list_cached()
print(f"Found {cached['total']} simulation groups")

for sim in cached["simulations"]:
    print(f"{sim['symbol']} {sim['date']} {sim['scenario']}: {sim['n_runs']} runs")
    print(f"  sim_id: {sim['example_sim_id']}")

# Filter by symbol
cached = client.simulation.list_cached(symbol="700.HK")

# Filter by scenario
cached = client.simulation.list_cached(scenario="flash_crash")
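
When you work with several cached groups, it can help to index the listing by (symbol, date, scenario). The sketch below relies only on the response fields used above (simulations, symbol, date, scenario, example_sim_id). The helper name index_cached is hypothetical, and the sample payload and its sim_id strings are made-up placeholders, not real identifiers.

```python
def index_cached(cached):
    """Map (symbol, date, scenario) -> example_sim_id for a list_cached() response."""
    return {
        (sim["symbol"], sim["date"], sim["scenario"]): sim["example_sim_id"]
        for sim in cached["simulations"]
    }


# Hypothetical payload shaped like the response printed above.
sample = {
    "total": 2,
    "simulations": [
        {"symbol": "700.HK", "date": "2025-09-01", "scenario": "normal",
         "n_runs": 10, "example_sim_id": "placeholder-sim-id-a"},
        {"symbol": "700.HK", "date": "2025-09-01", "scenario": "flash_crash",
         "n_runs": 10, "example_sim_id": "placeholder-sim-id-b"},
    ],
}

lookup = index_cached(sample)
print(lookup[("700.HK", "2025-09-01", "flash_crash")])
```

With a real client, pass the return value of client.simulation.list_cached() instead of sample.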

Download cached simulation data

python
# Find a cached simulation
cached = client.simulation.list_cached(symbol="700.HK", scenario="normal")
sim_id = cached["simulations"][0]["example_sim_id"]

# Download full simulation data
df = client.simulation.get_sim_data(sim_id)
print(df.head())

# Download mid-price series
mid_df = client.simulation.get_sim_data(sim_id, "mid_price_by_min.parquet")

Pro tier

Submit a simulation

Use run() to submit a custom simulation job. Jobs run asynchronously: run() returns a job ID immediately, without waiting for the simulations to finish.

python
result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10
)

job_id = result["job_id"]
sim_ids = result["queued_sim_ids"]
print(f"Submitted job {job_id} with {len(sim_ids)} simulations")

Required parameters

Parameter  Type  Description
symbol     str   Trading symbol (e.g., 9999.HK, 0005.HK)
cal_date   str   Calibration date in YYYY-MM-DD format

Optional parameters

Parameter        Type  Default  Description
n_runs           int   100      Number of Monte Carlo runs
seed             int   42       Random seed for reproducibility
scenario         str   normal   Market scenario to simulate
scenario_params  dict  None     Override scenario defaults
exec_algos       list  None     Execution algorithms to test

Finding calibration dates

Use get_available_symbols() to discover valid symbol and date combinations. Only dates with status = "complete" and stage = "model_calibration" can be used as cal_date; a job submitted with any other date will fail. See Available symbols for the full response schema and a filtering example.

Market scenarios

Inject market events into your simulation to stress-test strategies. Set the scenario parameter to one of:

Scenario         Description
normal           No injection, background agents only (default)
flash_crash      Large rapid SELL depleting bid-side liquidity
buy_panic        Large rapid BUY depleting ask-side liquidity
gradual_selloff  Slow sustained SELL over extended period
trending_up      Small steady BUY producing persistent uptrend
trending_down    Small steady SELL producing persistent downtrend

Example: Flash crash

python
result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=20,
    scenario="flash_crash",
    scenario_params={
        "start_time": "11:00:00",
        "impact_multiplier": 15.0
    }
)

Scenario parameters

Parameter          Type   Description
impact_multiplier  float  Total volume as multiple of resting liquidity
order_size_ratio   float  Child order size as fraction of liquidity
order_freq         str    Order spacing (e.g., 500ms, 5s, 30s)
start_time         str    Time to begin scenario (e.g., 10:30:00)
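
A misspelled scenario name or parameter key is easy to miss until a job has already been submitted. The following is a minimal client-side check built only from the two tables above. The helper name check_scenario is hypothetical, and whether the service accepts other keys or stricter value ranges is not covered here.

```python
# Names and types copied from the scenario tables above.
SCENARIOS = {
    "normal", "flash_crash", "buy_panic",
    "gradual_selloff", "trending_up", "trending_down",
}
SCENARIO_PARAM_TYPES = {
    "impact_multiplier": (int, float),
    "order_size_ratio": (int, float),
    "order_freq": str,
    "start_time": str,
}


def check_scenario(scenario, scenario_params=None):
    """Raise ValueError for names or types that the tables do not list."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario!r}")
    for key, value in (scenario_params or {}).items():
        expected = SCENARIO_PARAM_TYPES.get(key)
        if expected is None:
            raise ValueError(f"unknown scenario parameter: {key!r}")
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"{key} has the wrong type: {type(value).__name__}")
    return True


print(check_scenario("flash_crash",
                     {"start_time": "11:00:00", "impact_multiplier": 15.0}))
```

Call it just before client.simulation.run() with the same scenario and scenario_params arguments.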

Execution algorithms

Test execution strategies by passing an exec_algos list. Each algorithm config must include a type key.

TWAP (Time-Weighted Average Price)

python
result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=50,
    exec_algos=[{
        "type": "twap",
        "order_size": 50000,              # Total volume to execute
        "horizon": "02:00:00",            # Execution window (HH:MM:SS)
        "start_time": "09:30:00",         # Optional, defaults to market open
        "frequency": "00:00:30",          # Optional, default 1s
        "side": "sell",                   # Optional, inferred from order_size sign
        "random_offset": True             # Optional, default True
    }]
)
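
As a rough sizing check for the config above, you can estimate how many child orders a given horizon and frequency imply. This sketch assumes an even split with one child order per frequency interval; the simulator's actual slicing (endpoint handling, rounding, random offsets) is not documented here, and the helper names parse_hms and twap_slices are hypothetical.

```python
from datetime import timedelta


def parse_hms(text):
    """Convert an 'HH:MM:SS' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def twap_slices(order_size, horizon, frequency="00:00:01"):
    """Return (number of child orders, average child size) for an even split."""
    n_slices = parse_hms(horizon) // parse_hms(frequency)
    return n_slices, order_size / n_slices


n_slices, avg_size = twap_slices(50000, "02:00:00", "00:00:30")
print(n_slices, round(avg_size, 2))  # 240 208.33
```

Under the same assumption, the default 1-second frequency would give 7,200 child orders of about 7 shares each, so setting frequency explicitly is worth considering for long horizons.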

VWAP (Volume-Weighted Average Price)

python
result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=50,
    exec_algos=[{
        "type": "vwap",
        "order_size": 100000,
        "horizon": "02:00:00",
        "start_time": "10:00:00",
        "frequency": "00:00:30",
        "random_offset": True
    }]
)

CSS (Custom Static Schedule)

The orders parameter must be a pd.Series whose index holds full datetimes (date and time combined). Use datetime.combine(cal_date, t) to build the start and end datetimes, then pd.date_range(..., periods=n) to space orders evenly across the window. Orders placed outside HKEX continuous trading hours (09:30–12:00, 13:00–16:00) are silently dropped, so a window that spans the 12:00–13:00 lunch break executes fewer shares than scheduled.

python
import numpy as np
import pandas as pd
from datetime import date, datetime

def make_schedule(total_shares, start_t, end_t, cal_date, target_clip=100):
    """Distribute total_shares evenly across a datetime window."""
    start_dt = datetime.combine(cal_date, start_t)
    end_dt   = datetime.combine(cal_date, end_t)

    n_orders = int(np.ceil(total_shares / target_clip))
    idx = pd.date_range(start=start_dt, end=end_dt, periods=n_orders)

    # Spread remainder across first orders so total is exact
    q, r = divmod(total_shares, n_orders)
    sizes = np.full(n_orders, q, dtype=int)
    sizes[:r] += 1

    return pd.Series(sizes, index=idx)

cal_date = date(2025, 9, 1)
schedule = make_schedule(
    total_shares=10_000,
    start_t=datetime.strptime("09:30", "%H:%M").time(),
    end_t=datetime.strptime("12:00", "%H:%M").time(),
    cal_date=cal_date,
)

print(schedule)
print(f"Orders: {len(schedule)}, Total shares: {int(schedule.sum())}")

text
2025-09-01 09:30:00.000000000    100
2025-09-01 09:31:30.909090909    100
2025-09-01 09:33:01.818181818    100
2025-09-01 09:34:32.727272727    100
2025-09-01 09:36:03.636363636    100
                                 ...
2025-09-01 11:53:56.363636363    100
2025-09-01 11:55:27.272727272    100
2025-09-01 11:56:58.181818181    100
2025-09-01 11:58:29.090909090    100
2025-09-01 12:00:00.000000000    100
Length: 100, dtype: int64

Orders: 100, Total shares: 10000

python
result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10,
    exec_algos=[{
        "type": "css",
        "orders": schedule
    }]
)
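
Because out-of-session orders are dropped without warning, it can be useful to check a schedule before submitting it. The sketch below uses only the standard library and the session times quoted above. The helper names are hypothetical, and treating both session endpoints as inside the session is an assumption rather than documented behaviour.

```python
from datetime import datetime, time

# Continuous trading sessions quoted above. Counting both endpoints as
# "inside" is an assumption, not documented behaviour.
SESSIONS = [(time(9, 30), time(12, 0)), (time(13, 0), time(16, 0))]


def in_session(ts):
    """True if the timestamp's time of day falls inside a trading session."""
    return any(start <= ts.time() <= end for start, end in SESSIONS)


def split_by_session(timestamps):
    """Partition timestamps into (kept, dropped) lists."""
    kept = [ts for ts in timestamps if in_session(ts)]
    dropped = [ts for ts in timestamps if not in_session(ts)]
    return kept, dropped


stamps = [
    datetime(2025, 9, 1, 11, 59),
    datetime(2025, 9, 1, 12, 30),   # lunch break
    datetime(2025, 9, 1, 13, 0),
]
kept, dropped = split_by_session(stamps)
print(len(kept), len(dropped))  # 2 1
```

pd.Timestamp is a datetime subclass, so split_by_session(schedule.index) works on a schedule built with make_schedule.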

Multiple CSS strategies in one job

Pass multiple CSS configs in a single exec_algos list to compare strategies against the same shared baseline. Each strategy gets its own set of Monte Carlo runs within the same job.

python
from datetime import time

# Build a grid of execution windows to compare
windows = [
    (time(9, 30),  time(12, 0)),   # full morning
    (time(9, 30),  time(16, 0)),   # full day (orders in the 12:00–13:00 break are dropped)
    (time(13, 0),  time(16, 0)),   # full afternoon
]

exec_algos = [
    {"type": "css", "orders": make_schedule(10_000, s, e, cal_date)}
    for s, e in windows
]

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10,
    exec_algos=exec_algos,
)

# sim_ids are ordered: baseline[0:n_runs], strategy_0[n_runs:2*n_runs], ...
sim_ids = result["queued_sim_ids"]
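
Given the ordering described in the comment above, the flat list can be cut into one chunk per group. This is a minimal sketch that assumes exactly that ordering (baseline first, then each strategy in the order it was passed) and n_runs IDs per group. The helper name split_sim_ids and the placeholder IDs are hypothetical.

```python
def split_sim_ids(sim_ids, n_runs):
    """Group a flat sim_id list into baseline plus one chunk per strategy."""
    if n_runs <= 0 or not sim_ids or len(sim_ids) % n_runs != 0:
        raise ValueError("sim_ids must be a non-empty multiple of n_runs")
    chunks = [sim_ids[i:i + n_runs] for i in range(0, len(sim_ids), n_runs)]
    return {"baseline": chunks[0], "strategies": chunks[1:]}


# Placeholder IDs standing in for result["queued_sim_ids"]:
# one baseline group plus 3 strategies, 2 runs each.
fake_ids = [f"sim-{i}" for i in range(8)]
groups = split_sim_ids(fake_ids, n_runs=2)
print(groups["baseline"])          # ['sim-0', 'sim-1']
print(len(groups["strategies"]))   # 3
```

groups["strategies"][k] then lines up with exec_algos[k], which makes per-strategy comparisons against the baseline straightforward.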

Execution algorithm parameters

Parameter      Type            Required   Description
type           str             Yes        Algorithm type: twap, vwap, or css
order_size     int             twap/vwap  Total volume to execute
horizon        str (HH:MM:SS)  twap/vwap  Execution window duration
orders         pd.Series       css        Order schedule indexed by datetime
start_time     str (HH:MM:SS)  No         Start time, defaults to market open
frequency      str (HH:MM:SS)  No         Order frequency, default 1 second
side           str             No         buy or sell, inferred from order_size sign
random_offset  bool/timedelta  No         Randomize order times, default True
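
The Required column above can be turned into a quick pre-submission check. The sketch below covers only required keys per algorithm type as listed in the table. It does not validate values, and the helper name missing_keys is hypothetical.

```python
# Required keys per algorithm type, taken from the table above.
REQUIRED_KEYS = {
    "twap": {"order_size", "horizon"},
    "vwap": {"order_size", "horizon"},
    "css": {"orders"},
}


def missing_keys(algo):
    """Return the sorted required keys absent from one exec_algos entry."""
    algo_type = algo.get("type")
    if algo_type not in REQUIRED_KEYS:
        raise ValueError(
            f"type must be one of {sorted(REQUIRED_KEYS)}, got {algo_type!r}"
        )
    return sorted(REQUIRED_KEYS[algo_type] - set(algo))


print(missing_keys({"type": "twap", "order_size": 50000}))  # ['horizon']
```

Running it over every entry in exec_algos before calling run() catches incomplete configs locally.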

Check job status

Use get_job_status() to track simulation progress. Jobs move through: queued → running → completed (or error).

python
status = client.simulation.get_job_status(job_id)

print(f"Total: {status['total_simulations']}")
print(f"Status: {status['status_summary']}")
print(f"Complete: {status['is_complete']}")

Polling for completion

python
import time

result = client.simulation.run(
    symbol="9999.HK",
    cal_date="2025-09-01",
    n_runs=10
)
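job_id = result["job_id"]
sim_ids = result["queued_sim_ids"]

while True:
    status = client.simulation.get_job_status(job_id)
    summary = status.get("status_summary", {})
    # Count finished runs; the summary key may be "complete" or "completed"
    completed = summary.get("complete", 0) + summary.get("completed", 0)
    total = status.get("total_simulations", len(sim_ids))
    # "\r" returns to the start of the line so the progress updates in place
    print(f"\r  {completed}/{total} complete", end="", flush=True)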
    if status.get("is_complete"):
        break

    time.sleep(30)  # Check every 30 seconds
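
The loop above runs until is_complete becomes true, so it never returns if a job stalls. One possible variant adds a timeout. This sketch takes the status function as an argument so it can be tried without a client. The helper name wait_for_job, its parameters, and the fake status function are all hypothetical.

```python
import time


def wait_for_job(get_status, job_id, interval=30.0, timeout=3600.0,
                 sleep=time.sleep):
    """Poll get_status(job_id) until is_complete is truthy or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status.get("is_complete"):
            return status
        # Stop if the next poll would land after the deadline
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job {job_id} not complete after {timeout} s")
        sleep(interval)


# Stand-in for client.simulation.get_job_status: completes on the third call.
calls = {"n": 0}


def fake_status(job_id):
    calls["n"] += 1
    return {"is_complete": calls["n"] >= 3}


final = wait_for_job(fake_status, "demo-job", interval=0.0, timeout=5.0)
print(calls["n"], final["is_complete"])  # 3 True
```

With a real client, the call would be wait_for_job(client.simulation.get_job_status, job_id).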

View past jobs

Use get_jobs() to list all your simulation jobs. This is useful for retrieving job IDs from previous sessions.

python
result = client.simulation.get_jobs()

print(f"You have {result['total']} jobs")

for job in result["jobs"]:
    print(f"Job {job['job_id']}")
    print(f"  Simulations: {len(job['sim_ids'])}")
    print(f"  Created: {job['created_at']}")

Both tiers

Retrieving results

Use get_sim_data(sim_id) to download output files as Polars DataFrames. This works for both cached sim_ids (free tier) and sim_ids from your own jobs (pro tier). See Download data for the full list of output files, schemas, and bulk download.

Get job results summary (Pro)

python
results = client.simulation.get_job_results(job_id)

print(f"Completed: {results['completed']}/{results['total_simulations']}")

for sim in results["simulations"]:
    if sim["status"] == "completed":
        print(f"Sim {sim['sim_id']}")
        print(f"  Files: {sim['available_files']}")
        print(f"  Metrics: {sim['metrics']}")
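
To decide which runs to download, it can help to tally statuses and collect the completed sim_ids first. The sketch below uses only the fields shown above (simulations, status, sim_id). The helper name summarize_results and the sample payload with placeholder IDs are hypothetical.

```python
from collections import Counter


def summarize_results(results):
    """Return (status counts, sim_ids of completed runs) from a job results dict."""
    sims = results["simulations"]
    counts = Counter(sim["status"] for sim in sims)
    done = [sim["sim_id"] for sim in sims if sim["status"] == "completed"]
    return counts, done


# Hypothetical payload shaped like the get_job_results() response above.
sample_results = {"simulations": [
    {"sim_id": "placeholder-0000", "status": "completed"},
    {"sim_id": "placeholder-0001", "status": "error"},
    {"sim_id": "placeholder-0002", "status": "completed"},
]}

counts, done = summarize_results(sample_results)
print(dict(counts))
print(done)
```

Each ID in done can then be passed to client.simulation.get_sim_data(sim_id).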

Understanding simulation IDs

Each Monte Carlo run has a unique sim_id with the format:

text
{symbol}:{cal_date}:{cal_hash}:{time}:{sim_hash}:{scenario}_{hash}:{algo}:{run}

For example: 9999.HK:2025-09-01:d96cf520:0930-1600:c8961b94:flash_crash_6e6d5b3a:baseline:0000

The sim_id encodes all configuration parameters, making it deterministic and reproducible. Use sim_ids to retrieve specific results from storage.
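
Since the fields are colon-separated, a sim_id can be split back into its parts, for example to group results by scenario or run number. This is a minimal sketch based on the format and example above. The SimId type and parse_sim_id function are hypothetical names, and the sketch assumes that no field contains a colon and that the scenario field always ends in an underscore followed by a hash.

```python
from typing import NamedTuple


class SimId(NamedTuple):
    symbol: str
    cal_date: str
    cal_hash: str
    time_window: str
    sim_hash: str
    scenario: str
    scenario_hash: str
    algo: str
    run: int


def parse_sim_id(sim_id):
    """Split a sim_id string into its eight colon-separated fields."""
    parts = sim_id.split(":")
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}: {sim_id!r}")
    symbol, cal_date, cal_hash, window, sim_hash, scenario_field, algo, run = parts
    # Scenario names may contain underscores, so split on the last one only
    scenario, _, scenario_hash = scenario_field.rpartition("_")
    return SimId(symbol, cal_date, cal_hash, window, sim_hash,
                 scenario, scenario_hash, algo, int(run))


parsed = parse_sim_id(
    "9999.HK:2025-09-01:d96cf520:0930-1600:c8961b94:"
    "flash_crash_6e6d5b3a:baseline:0000"
)
print(parsed.symbol, parsed.scenario, parsed.algo, parsed.run)
```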