Per-Decision Importance Sampling (PDIS)
Implementation: crl.estimators.importance_sampling.PDISEstimator
Assumptions
- Sequential ignorability
- Overlap/positivity
Requires
- TrajectoryDataset
- behavior_action_probs for logged actions
Diagnostics to check
- overlap.support_violations
- ess.ess_ratio
- weights.tail_fraction
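The exact definitions behind these diagnostics are not given here. As a rough sketch of what such checks usually measure, the NumPy helpers below compute a normalized effective sample size, the share of weights above a threshold, and a count of logged steps that lack behavior-policy support. The function names, the threshold default, and the formulas are assumptions for illustration and may not match what crl reports.

```python
import numpy as np

def ess_ratio(weights):
    """Effective sample size divided by n (assumed definition: (sum w)^2 / sum w^2 / n)."""
    w = np.asarray(weights, dtype=float)
    denom = float((w ** 2).sum())
    if denom == 0.0:
        return 0.0  # all weights are zero, so no sample carries any information
    return float(w.sum() ** 2 / denom / w.size)

def tail_fraction(weights, threshold=10.0):
    """Fraction of weights strictly above a hypothetical threshold."""
    w = np.asarray(weights, dtype=float)
    return float(np.mean(w > threshold))

def support_violations(target_probs, behavior_probs):
    """Count logged steps where the target policy has mass but the behavior probability is zero."""
    pi = np.asarray(target_probs, dtype=float)
    mu = np.asarray(behavior_probs, dtype=float)
    return int(np.sum((mu <= 0.0) & (pi > 0.0)))

uniform = ess_ratio([1.0, 1.0, 1.0, 1.0])      # every trajectory contributes equally
degenerate = ess_ratio([4.0, 0.0, 0.0, 0.0])   # one trajectory dominates
heavy = tail_fraction([1.0, 20.0, 0.5, 11.0])
missing = support_violations([[0.5, 0.2]], [[0.5, 0.0]])
```

An ESS ratio near 1 suggests the weights are close to uniform, while a value near 1/n means a single trajectory dominates the estimate.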
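Formula
$\hat V = \frac{1}{n} \sum_i \sum_t \gamma^t \rho_{i,t} r_{i,t}$, where $\rho_{i,t} = \prod_{k=0}^{t} \frac{\pi(a_{i,k} \mid s_{i,k})}{\mu(a_{i,k} \mid s_{i,k})}$ is the cumulative importance ratio through step $t$ of trajectory $i$, $\pi$ is the target policy, and $\mu$ is the behavior policy.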
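To make the formula concrete, the sketch below evaluates it with NumPy, assuming equal-length trajectories stored as arrays of shape (n, T). Here the weight at step t is the running product of target-to-behavior probability ratios for the logged actions up to and including t. The function name pdis_estimate and its arguments are hypothetical and are not part of crl; the library's implementation may differ, for example in clipping or in handling variable-length trajectories.

```python
import numpy as np

def pdis_estimate(rewards, target_probs, behavior_probs, gamma=1.0):
    """Per-decision importance sampling over n trajectories of equal horizon T.

    All three inputs have shape (n, T); the probabilities are those of the
    logged actions under the target and behavior policies.
    """
    rewards = np.asarray(rewards, dtype=float)
    step_ratios = (np.asarray(target_probs, dtype=float)
                   / np.asarray(behavior_probs, dtype=float))
    # rho[i, t] = product of step ratios from step 0 through step t
    rho = np.cumprod(step_ratios, axis=1)
    discounts = gamma ** np.arange(rewards.shape[1])
    # Weighted discounted return of each trajectory, then the sample mean
    per_traj = (discounts * rho * rewards).sum(axis=1)
    return float(per_traj.mean()), per_traj

# Toy data: two trajectories, two steps each
value, per_traj = pdis_estimate(
    rewards=[[1.0, 1.0], [0.0, 2.0]],
    target_probs=[[0.5, 0.5], [1.0, 0.5]],
    behavior_probs=[[0.5, 0.25], [0.5, 0.5]],
    gamma=0.9,
)
```

In this toy data the per-trajectory contributions are 2.8 and 3.6, so the estimate is their mean, 3.2.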
Uncertainty
- Normal-approximation CI by default.
- Bootstrap CI available via bootstrap=True.
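How the two interval types are computed internally is not documented here. The sketch below shows one common construction of each from per-trajectory weighted returns: a normal-approximation interval from the standard error of the mean, and a percentile bootstrap that resamples whole trajectories. The function names, the 1.96 multiplier for a 95% interval, and the number of resamples are assumptions for illustration.

```python
import numpy as np

def normal_ci(per_traj, z=1.96):
    """Mean +/- z * standard error, treating trajectories as i.i.d."""
    x = np.asarray(per_traj, dtype=float)
    mean = x.mean()
    se = x.std(ddof=1) / np.sqrt(x.size)
    return float(mean - z * se), float(mean + z * se)

def bootstrap_ci(per_traj, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap over trajectories (resampled with replacement)."""
    x = np.asarray(per_traj, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row of idx is one resample of trajectory indices
    idx = rng.integers(0, x.size, size=(n_boot, x.size))
    means = x[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

contributions = [2.8, 3.6, 3.0, 3.4]  # hypothetical per-trajectory weighted returns
n_lo, n_hi = normal_ci(contributions)
b_lo, b_hi = bootstrap_ci(contributions)
```

With heavy-tailed weights the normal approximation can be poor at small n, which is one reason to compare it against the bootstrap interval.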
Failure modes
- Variance grows with horizon under weak overlap.
- Stepwise ratios can still be heavy-tailed.
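A small worked calculation illustrates the horizon effect. It assumes a toy setting that does not come from crl: two actions, state-independent policies, a behavior policy that picks each action with probability 0.5, and a target policy that picks the first action with probability 0.9. Step ratios are then independent with mean 1, so the variance of the cumulative ratio after T steps is E[ratio^2]^T - 1, which grows geometrically in T.

```python
import numpy as np

def cumulative_ratio_variance(horizon, p_target=0.9, p_behavior=0.5):
    """Exact variance of the T-step cumulative ratio in the toy two-action setting."""
    # Possible step ratios, one per action, and how often the behavior policy draws them
    ratios = np.array([p_target / p_behavior,
                       (1.0 - p_target) / (1.0 - p_behavior)])
    probs = np.array([p_behavior, 1.0 - p_behavior])
    second_moment = float(np.sum(probs * ratios ** 2))
    # Independent steps with E[ratio] = 1, so Var[rho_T] = E[ratio^2]^T - 1
    return second_moment ** horizon - 1.0

variances = {T: cumulative_ratio_variance(T) for T in (1, 5, 10, 20)}
```

The single-step variance is 0.64 here, and each additional step multiplies the second moment by 1.64, so later-step terms in the estimator carry much noisier weights than early ones.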
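Minimal example
from crl.estimators.importance_sampling import PDISEstimator

# `estimand` and `dataset` (a TrajectoryDataset with behavior_action_probs) must already be defined.
report = PDISEstimator(estimand, clip_rho=10.0).estimate(dataset)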
References
- Precup, Sutton, Singh (2000)