Quickstart (Bandit)

This tutorial walks through a complete bandit off-policy evaluation (OPE) workflow using the built-in synthetic benchmark. It mirrors the notebook, so you can follow along in Markdown or run it interactively.

What you will do

  • Generate a logged bandit dataset with known ground truth.
  • Run crl.ope.evaluate_ope with multiple estimators.
  • Inspect estimator diagnostics and plots.

Walkthrough

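from crl.benchmarks.bandit_synth import SyntheticBandit, SyntheticBanditConfig
from crl.ope import evaluate_ope

# Build the synthetic benchmark with a fixed seed for reproducibility.
benchmark = SyntheticBandit(SyntheticBanditConfig(seed=0))
# Draw 1,000 logged interactions from the benchmark.
dataset = benchmark.sample(num_samples=1000, seed=1)
# Evaluate the benchmark's target policy on the logged data.
report = evaluate_ope(dataset=dataset, policy=benchmark.target_policy)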
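
To make the idea of evaluating against known ground truth concrete, here is a minimal standalone sketch that does not use CausalRL at all. It assumes a hypothetical three-armed Bernoulli bandit and two common OPE estimators, inverse propensity scoring (IPS) and its self-normalized variant (SNIPS). The names REWARD_PROBS, BEHAVIOR, TARGET, and the helper functions are illustrative assumptions, and the estimators that evaluate_ope actually runs may differ.

```python
import random

# Hypothetical 3-armed Bernoulli bandit (illustrative, not CausalRL's benchmark).
REWARD_PROBS = [0.2, 0.5, 0.8]  # true mean reward of each action
BEHAVIOR = [0.5, 0.3, 0.2]      # logging (behavior) policy action probabilities
TARGET = [0.1, 0.2, 0.7]        # target policy we want to evaluate


def true_value(policy):
    """Exact expected reward of a policy, computable because the bandit is synthetic."""
    return sum(p * r for p, r in zip(policy, REWARD_PROBS))


def sample_logs(num_samples, seed):
    """Draw (action, reward, behavior propensity) tuples under the logging policy."""
    rng = random.Random(seed)
    logs = []
    for _ in range(num_samples):
        action = rng.choices(range(len(BEHAVIOR)), weights=BEHAVIOR)[0]
        reward = 1.0 if rng.random() < REWARD_PROBS[action] else 0.0
        logs.append((action, reward, BEHAVIOR[action]))
    return logs


def ips(logs, target):
    """Inverse propensity scoring: mean of importance-weighted rewards."""
    return sum(target[a] / p * r for a, r, p in logs) / len(logs)


def snips(logs, target):
    """Self-normalized IPS: divide by the sum of weights instead of the sample count."""
    weights = [target[a] / p for a, _, p in logs]
    weighted = sum(w * r for w, (_, r, _) in zip(weights, logs))
    return weighted / sum(weights)


logs = sample_logs(num_samples=1000, seed=1)
truth = true_value(TARGET)
print(f"truth={truth:.3f} ips={ips(logs, TARGET):.3f} snips={snips(logs, TARGET):.3f}")
```

Because the reward probabilities are known, each estimate can be compared directly with the exact policy value, which is the same kind of check the synthetic benchmark enables.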

Plot the estimator comparison and export a report:

# Ground-truth value of the target policy, known because the benchmark is synthetic.
truth = benchmark.true_policy_value(benchmark.target_policy)
fig = report.plot_estimator_comparison(truth=truth)
report.save_html("report.html")

Figure: Bandit estimator comparison with uncertainty, rendered with the CausalRL grayscale style.
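
One common way to attach uncertainty to an estimate like those in the comparison is a percentile bootstrap over the per-sample importance-weighted rewards. The sketch below is an assumption for illustration only: it uses made-up weighted rewards and a hypothetical bootstrap_interval helper, and it is not necessarily how CausalRL computes the intervals shown in its plots.

```python
import random


def bootstrap_interval(values, num_resamples=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(num_resamples):
        # Resample the data with replacement and record the resampled mean.
        resample = rng.choices(values, k=n)
        means.append(sum(resample) / n)
    means.sort()
    lower = means[int((alpha / 2) * num_resamples)]
    upper = means[int((1 - alpha / 2) * num_resamples) - 1]
    return lower, upper


# Hypothetical importance-weighted rewards (weight * reward), invented for this sketch.
data_rng = random.Random(1)
weighted_rewards = [data_rng.choice([0.0, 0.0, 0.4, 3.5]) for _ in range(1000)]

point = sum(weighted_rewards) / len(weighted_rewards)
low, high = bootstrap_interval(weighted_rewards)
print(f"estimate={point:.3f} 95% interval=({low:.3f}, {high:.3f})")
```

A wide interval relative to the gap between estimators suggests that the ranking in a comparison plot should be read with caution.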

Notebook