CausalRL

MDP OPE Tutorial (Script)

This tutorial mirrors examples/quickstart/mdp_ope.py and focuses on trajectory-based off-policy evaluation (OPE) estimators in a finite-horizon MDP.

Run the script

python -m examples.quickstart.mdp_ope

What it does

Generates synthetic trajectories with known ground truth.
Declares Markov and overlap assumptions.
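Compares importance sampling (IS), weighted importance sampling (WIS), per-decision importance sampling (PDIS), doubly robust (DR), and fitted Q evaluation (FQE).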
Prints a summary table for quick inspection.
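
As a rough illustration of the three importance-sampling estimators compared above, the sketch below computes IS, WIS, and PDIS from per-step rewards and importance ratios using NumPy only. The function name is_wis_pdis and the (n_trajectories, horizon) array layout are assumptions made for this example; they are not taken from examples/quickstart/mdp_ope.py or from the CausalRL API.

```python
import numpy as np


def is_wis_pdis(rewards, ratios, gamma=1.0):
    """Hypothetical helper: IS, WIS, and PDIS value estimates.

    rewards, ratios: arrays of shape (n_trajectories, horizon), where
    ratios[i, t] = pi_e(a_t | s_t) / pi_b(a_t | s_t) for trajectory i.
    """
    rewards = np.asarray(rewards, dtype=float)
    ratios = np.asarray(ratios, dtype=float)
    horizon = rewards.shape[1]

    discounts = gamma ** np.arange(horizon)
    cum_ratios = np.cumprod(ratios, axis=1)   # product of ratios up to step t
    traj_weight = cum_ratios[:, -1]           # full-trajectory weight
    returns = (discounts * rewards).sum(axis=1)

    # IS: weight each discounted return by the full-trajectory ratio.
    is_est = np.mean(traj_weight * returns)
    # WIS: same weights, normalized by their sum instead of n.
    wis_est = np.sum(traj_weight * returns) / np.sum(traj_weight)
    # PDIS: weight each reward only by the ratios up to its own step.
    pdis_est = np.mean((discounts * cum_ratios * rewards).sum(axis=1))

    return {"IS": is_est, "WIS": wis_est, "PDIS": pdis_est}


if __name__ == "__main__":
    # Synthetic inputs for illustration only, not the script's data.
    rng = np.random.default_rng(0)
    demo_rewards = rng.normal(size=(500, 10))
    demo_ratios = rng.uniform(0.5, 1.5, size=(500, 10))
    print(is_wis_pdis(demo_rewards, demo_ratios, gamma=0.99))
```

When every ratio equals 1 (the evaluation and behavior policies coincide), all three estimates reduce to the plain average return, which is a convenient sanity check.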

Interpretation tips

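Horizon effects: the effective sample size (ESS) of the cumulative importance weights often decays with the time step, because the per-step ratios multiply.
DR/FQE: lower variance than importance sampling alone if the value model is well specified; FQE in particular is biased when it is not.
PDIS: usually more stable than full-trajectory IS over long horizons, since each reward is weighted only by the ratios up to its own time step.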
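
To make the horizon effect concrete, the following sketch computes a per-step effective sample size from cumulative importance weights using the common Kish formula, (sum of weights)^2 / (sum of squared weights). The function name ess_per_step and the input layout are assumptions for illustration; the script may compute ESS differently.

```python
import numpy as np


def ess_per_step(ratios):
    """Hypothetical helper: Kish ESS of cumulative weights at each step.

    ratios: array of shape (n_trajectories, horizon) holding per-step
    importance ratios pi_e(a_t | s_t) / pi_b(a_t | s_t).
    Returns an array of length horizon. An entry is NaN if every
    cumulative weight at that step is zero.
    """
    ratios = np.asarray(ratios, dtype=float)
    weights = np.cumprod(ratios, axis=1)      # cumulative weight up to step t
    return weights.sum(axis=0) ** 2 / (weights ** 2).sum(axis=0)


if __name__ == "__main__":
    # Synthetic ratios for illustration only.
    rng = np.random.default_rng(0)
    demo_ratios = rng.uniform(0.5, 1.5, size=(1000, 20))
    print(np.round(ess_per_step(demo_ratios), 1))
```

With uniform weights the ESS equals the number of trajectories at every step; as the weights become more uneven across trajectories, the ESS drops toward 1.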

Next steps

See the long-horizon comparison notebook: notebooks/11_mdp_long_horizon_comparison.ipynb
Review estimator selection guidance: Estimator Selection Guide