OPE Pipeline
End-to-end off-policy evaluation pipeline.
OpeReport (dataclass)
Aggregate report for an OPE evaluation run.
plot_effective_sample_size(weights, by_time=False)
Plot effective sample size diagnostics.
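The quantity behind this diagnostic is the standard Kish effective sample size, ESS = (Σw)² / Σw²: it equals n for uniform weights and collapses toward 1 as a few weights dominate. A minimal self-contained sketch of that formula (plain Python, not this library's internals):

```python
def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

# Uniform weights: every sample counts fully.
print(effective_sample_size([1.0] * 100))  # 100.0

# One dominant weight: far fewer effective samples.
print(effective_sample_size([100.0] + [1.0] * 99))  # ~3.92
```

A low ESS relative to the dataset size is the usual warning sign that importance-weighted estimates will be high-variance.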
plot_estimator_comparison(truth=None)
Plot estimator comparison with confidence intervals.
plot_importance_weights(weights, logy=True)
Plot importance weight distribution.
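The weights plotted here are, in standard off-policy evaluation, the per-sample ratios π_e(a|x) / π_b(a|x): the probability the evaluation policy assigns to the logged action divided by the behavior policy's propensity. A hedged sketch with hypothetical propensity values (not produced by this library):

```python
def importance_weights(target_probs, behavior_probs):
    """Per-sample importance ratios pi_e(a|x) / pi_b(a|x)."""
    return [p_e / p_b for p_e, p_b in zip(target_probs, behavior_probs)]

# Hypothetical logged propensities for the actions actually taken,
# and the probability the evaluation policy assigns to those actions.
behavior = [0.5, 0.25, 0.8, 0.1]
target = [0.9, 0.05, 0.8, 0.4]
print(importance_weights(target, behavior))  # ~[1.8, 0.2, 1.0, 4.0]
```

Heavy right tails in this distribution (a few very large ratios) are what the log-scale default (`logy=True`) is meant to expose.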
save_bundle(output_dir)
Write a bundle containing HTML, JSON, CSV, and figures.
save_html(out_path)
Write a self-contained HTML report.
summary_table()
Return a pandas DataFrame summary if pandas is available.
to_dataframe()
Alias for summary_table().
to_html(out_path=None)
Return a self-contained HTML report string (optionally write to file).
to_report_data()
Build a structured report payload.
evaluate(dataset, policy=None, estimand=None, estimators='default', diagnostics='default', inference=None, sensitivity=None, seed=0)
evaluate(dataset: BanditDataset | TrajectoryDataset | TransitionDataset, policy: Policy, estimand: PolicyValueEstimand | None = None, estimators: Iterable[str | OPEEstimator] | str = 'default', diagnostics: list[str] | str = 'default', inference: dict[str, Any] | None = None, sensitivity: SensitivityPolicyValueEstimand | None = None, seed: int = 0) -> OpeReport
Run an end-to-end OPE evaluation with reporting.
Note
Passing an EvaluationSpec is deprecated; use run_spec(spec) instead.
evaluate_ope(dataset, policy, estimand=None, estimators='default', diagnostics='default', inference=None, sensitivity=None, seed=0)
Run an end-to-end OPE evaluation with reporting.
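The simplest estimator an OPE run like this typically includes is inverse propensity scoring (IPS): the mean of importance-weighted rewards, V̂ = (1/n) Σᵢ wᵢ rᵢ. A self-contained sketch of that estimator on hypothetical logged data (illustrative only; the library's built-in estimators may differ):

```python
def ips_estimate(rewards, target_probs, behavior_probs):
    """Inverse propensity scoring: mean of importance-weighted rewards."""
    n = len(rewards)
    return sum(r * (p_e / p_b)
               for r, p_e, p_b in zip(rewards, target_probs, behavior_probs)) / n

# Hypothetical logged bandit data: rewards, behavior propensities for the
# logged actions, and the evaluation policy's probabilities for them.
rewards = [1.0, 0.0, 1.0, 1.0]
behavior = [0.5, 0.5, 0.25, 0.5]
target = [1.0, 0.0, 0.5, 0.5]
print(ips_estimate(rewards, target, behavior))  # 1.25
```

IPS is unbiased when the logged propensities are correct, but its variance grows with the importance ratios, which is why the weight and effective-sample-size diagnostics above accompany the point estimates.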
run_spec(spec)
Run an EvaluationSpec and return an EvaluationResult.