OPE Pipeline

End-to-end off-policy evaluation pipeline.

OpeReport dataclass

Aggregate report for an OPE evaluation run.

plot_effective_sample_size(weights, by_time=False)

Plot effective sample size diagnostics.
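The effective sample size plotted here is presumably the standard importance-sampling diagnostic, ESS = (Σw)² / Σw², which equals n for uniform weights and collapses toward 1 when a few weights dominate. A minimal self-contained sketch (the function name is illustrative, not part of this API):

```python
import numpy as np

def effective_sample_size(weights):
    """ESS = (sum w)^2 / sum(w^2); equals n when all weights are equal."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

# Uniform weights: ESS equals the sample size.
print(effective_sample_size([1.0, 1.0, 1.0, 1.0]))  # 4.0

# One dominant weight: ESS collapses toward 1.
print(effective_sample_size([100.0, 1.0, 1.0, 1.0]))
```

With `by_time=True`, a per-timestep version of the same statistic would be the natural reading for trajectory data.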

plot_estimator_comparison(truth=None)

Plot estimator comparison with confidence intervals.

plot_importance_weights(weights, logy=True)

Plot importance weight distribution.
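In off-policy evaluation, the weights being plotted are per-sample propensity ratios between the target and behavior policies. A hedged sketch of how such weights arise (the arrays are illustrative data, not this API):

```python
import numpy as np

# Propensity of each logged action under the behavior policy
# (which collected the data) and under the target policy (being evaluated).
behavior_propensities = np.array([0.5, 0.25, 0.8, 0.1])
target_propensities = np.array([0.5, 0.5, 0.4, 0.9])

# Importance weight: how much each logged sample is up- or down-weighted.
weights = target_propensities / behavior_propensities
print(weights)  # [1.  2.  0.5 9. ]
```

The heavy right tail typical of such ratios is why the default here is `logy=True`: a log-scaled axis keeps both the bulk near 1 and the rare extreme weights visible.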

save_bundle(output_dir)

Write a bundle containing HTML, JSON, CSV, and figures.

save_html(out_path)

Write a self-contained HTML report.

summary_table()

Return a pandas DataFrame summary if pandas is available.

to_dataframe()

Alias for summary_table().

to_html(out_path=None)

Return a self-contained HTML report string (optionally write to file).

to_report_data()

Build a structured report payload.

evaluate(dataset, policy=None, estimand=None, estimators='default', diagnostics='default', inference=None, sensitivity=None, seed=0)

evaluate(spec: EvaluationSpec) -> Any
evaluate(dataset: BanditDataset | TrajectoryDataset | TransitionDataset, policy: Policy, estimand: PolicyValueEstimand | None = None, estimators: Iterable[str | OPEEstimator] | str = 'default', diagnostics: list[str] | str = 'default', inference: dict[str, Any] | None = None, sensitivity: SensitivityPolicyValueEstimand | None = None, seed: int = 0) -> OpeReport

Run an end-to-end OPE evaluation with reporting.

Note

Passing an EvaluationSpec is deprecated; use run_spec(spec) instead.

evaluate_ope(dataset, policy, estimand=None, estimators='default', diagnostics='default', inference=None, sensitivity=None, seed=0)

Run an end-to-end OPE evaluation with reporting.
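The `'default'` estimator set is not spelled out here, but the canonical OPE estimators for bandit data are inverse propensity scoring (IPS) and its self-normalized variant (SNIPS). A self-contained sketch of what such estimators compute, assuming per-sample rewards and importance weights (function names are illustrative, not part of this API):

```python
import numpy as np

def ips(rewards, weights):
    """Inverse propensity scoring: mean of importance-weighted rewards."""
    r, w = np.asarray(rewards, float), np.asarray(weights, float)
    return (w * r).mean()

def snips(rewards, weights):
    """Self-normalized IPS: weighted average of rewards.
    Trades a small bias for much lower variance under extreme weights."""
    r, w = np.asarray(rewards, float), np.asarray(weights, float)
    return (w * r).sum() / w.sum()

rewards = [1.0, 0.0, 1.0, 1.0]
weights = [1.0, 2.0, 0.5, 9.0]
print(ips(rewards, weights))    # (1 + 0 + 0.5 + 9) / 4 = 2.625
print(snips(rewards, weights))  # 10.5 / 12.5 = 0.84
```

The spread between such estimators on the same data is exactly what `plot_estimator_comparison` would surface in the resulting report.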

run_spec(spec)

Run an EvaluationSpec and return an EvaluationResult.