
CausalRL


Estimand-first causal RL and off-policy evaluation

Know what you're estimating. Know when to trust it. Know how it was produced.


Package: causalrl · Import: crl · Version: 0.2.0 · GitHub


Why CausalRL?

  • Estimand-first Design

    Every estimator is tied to a formal estimand with explicit identification assumptions. Know what you're estimating.

  • Diagnostics by Default

    Overlap, ESS, weight tails, and shift checks run automatically. Know when to trust your estimates.

  • 20+ Estimators

    IS, WIS, DR, WDR, MAGIC, MRDR, MIS, FQE, DualDICE, GenDICE, DRL—all in a unified pipeline.

  • Sensitivity Analysis

    Bounded-confounding curves for bandits and sequential settings. Quantify robustness to hidden confounders.

  • D4RL & RL Unplugged

    Built-in adapters for standard RL benchmarks. Load datasets with one line of code.

  • Audit-Ready Reports

    HTML reports with tables, figures, and full metadata bundles. Share reproducible results.

  • Ground-Truth Benchmarks

    Synthetic bandit/MDP suites with known true values. Validate estimators before deployment.

  • Production Ready

    Type-checked, tested, with deterministic seeding throughout. Built for research reliability.


60-Second Quickstart

# Install from PyPI
pip install causalrl

# With all optional extras
pip install "causalrl[all]"

# Or install from source
git clone https://github.com/gsaco/causalrl
cd causalrl
pip install -e .

from crl.benchmarks.bandit_synth import SyntheticBandit, SyntheticBanditConfig
from crl.ope import evaluate_ope

# Create benchmark with known ground truth
benchmark = SyntheticBandit(SyntheticBanditConfig(seed=0))
dataset = benchmark.sample(num_samples=1000, seed=1)

# Run end-to-end evaluation
report = evaluate_ope(dataset=dataset, policy=benchmark.target_policy)

# View results and generate report
print(report.summary_table())
report.save_html("report.html")

# Bandit OPE demo
python -m examples.quickstart.bandit_ope

# MDP evaluation
python -m examples.quickstart.mdp_ope

# Full benchmark suite
python -m experiments.run_benchmarks --suite all --out results/

Scope

The current evaluation pipeline assumes discrete action spaces for importance sampling estimators. See Limitations for details on continuous actions.
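To make the discrete-action assumption concrete, here is a standalone NumPy sketch of the two simplest estimators from the list above (ordinary IS and self-normalized WIS) on one-step bandit data. This is an illustration of the technique, not the causalrl implementation.

```python
import numpy as np

def is_and_wis(rewards, behavior_probs, target_probs):
    """Ordinary and weighted (self-normalized) importance sampling
    for one-step logged bandit data with discrete actions."""
    w = target_probs / behavior_probs          # per-sample importance weights
    is_est = np.mean(w * rewards)              # IS: unbiased, higher variance
    wis_est = np.sum(w * rewards) / np.sum(w)  # WIS: biased, usually lower variance
    return is_est, wis_est

# Toy log: behavior policy uniform over 2 actions, target always picks action 1,
# reward is 1 exactly when action 1 was taken (so the true target value is 1).
rng = np.random.default_rng(0)
actions = rng.integers(0, 2, size=1000)
rewards = (actions == 1).astype(float)
behavior_probs = np.full(1000, 0.5)
target_probs = (actions == 1).astype(float)    # target puts zero mass on action 0

is_est, wis_est = is_and_wis(rewards, behavior_probs, target_probs)
```

Both estimates should land near the true value of 1; WIS hits it exactly here because weights and weighted rewards coincide.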


The Three Pillars

Pillar | Why It Matters | What You Get
Estimands | Know what quantity you're estimating, not just which estimator | Explicit estimands with identification assumptions via AssumptionSet
Diagnostics | Know when an estimate is fragile before acting on it | Overlap checks, ESS, weight tails, shift diagnostics, sensitivity curves
Evidence | Know how results were produced for auditing | Versioned configs, deterministic seeds, structured report bundles
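The ESS diagnostic mentioned under the Diagnostics pillar has a simple closed form. Here is a minimal standalone sketch of the Kish effective sample size, shown only to illustrate the idea; the library's own diagnostics may differ.

```python
import numpy as np

def effective_sample_size(weights):
    """Kish effective sample size: ESS = (sum w)^2 / sum(w^2).
    Near n means good overlap; near 1 means a few samples dominate."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)

uniform = np.ones(1000)                  # perfect overlap: ESS == n
skewed = np.r_[np.ones(999), 1000.0]     # one sample carries most of the weight

ess_uniform = effective_sample_size(uniform)
ess_skewed = effective_sample_size(skewed)
```

With uniform weights the ESS equals the sample size (1000); with one dominant weight it collapses to about 4, a clear signal not to trust the point estimate.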

See the full Results Gallery


Why Trust CausalRL?

  • Explicit Assumptions

    Every estimand declares its identification assumptions via AssumptionSet—no hidden requirements.

  • Deterministic Benchmarks

    Synthetic generators with fixed seeds produce identical results across runs.

  • Comprehensive Testing

    Test suite covering estimators, diagnostics, and full pipeline integration.

  • Docs ↔ Code Parity

    Automated checks keep formulas and APIs aligned with documentation.
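The deterministic-benchmarks claim rests on a standard property of seeded generators: the same seed yields bitwise-identical draws. A quick standalone demonstration with NumPy (not crl-specific code):

```python
import numpy as np

# Two generators constructed with the same seed produce identical streams,
# which is what makes fixed-seed synthetic benchmarks reproducible across runs.
a = np.random.default_rng(42).normal(size=5)
b = np.random.default_rng(42).normal(size=5)
assert np.array_equal(a, b)
```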


Data Contracts

Use the dataset contracts in crl.data and follow the shape rules exactly:

Data Type | Class | Use Case
Bandits | LoggedBanditDataset | Single-step contextual decisions
Trajectories | TrajectoryDataset | Episode-based sequential data
Transitions | TransitionDataset | Step-by-step (s, a, r, s') tuples

Dataset Format and Validation
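To illustrate the kind of shape rules a logged bandit contract enforces, here is a minimal validation sketch. The field names and shapes below are illustrative assumptions, not the actual LoggedBanditDataset API; consult the Dataset Format and Validation page for the real contract.

```python
import numpy as np

def validate_logged_bandit(contexts, actions, rewards, propensities):
    """Minimal shape/value checks in the spirit of a logged-bandit
    data contract (illustrative only)."""
    n = len(rewards)
    assert contexts.shape[0] == n, "one context row per logged decision"
    assert actions.shape == (n,) and np.issubdtype(actions.dtype, np.integer)
    assert rewards.shape == (n,)
    assert propensities.shape == (n,)
    assert np.all(propensities > 0), "behavior policy needs positive support"
    return True

ok = validate_logged_bandit(
    contexts=np.zeros((100, 4)),
    actions=np.zeros(100, dtype=np.int64),
    rewards=np.zeros(100),
    propensities=np.full(100, 0.25),
)
```

The positive-propensity check matters most: importance sampling weights divide by these values, so a zero would make the estimand unidentified.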


Learn by Example


Estimator Selection

Not sure which estimator to use? See the Estimator Selection Guide for a practical decision tree and recommended defaults.