Behavior Policy Known

This assumption states that the behavior-policy propensities used by the estimators are either known (logged) or, if estimated, correctly specified. Without it, importance-weighted estimators can be biased even when overlap holds.
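To make the bias concrete, here is a minimal sketch for a one-step (bandit) setting with two actions. The function name `expected_is_estimate` and all numbers are hypothetical, chosen only for illustration. It computes the exact expectation of the IS estimate when actions are drawn from the behavior policy `mu` but the weights use some other set of propensities.

```python
def expected_is_estimate(mu, pi, rewards, propensities):
    """Exact expectation of the one-step IS estimate.

    Actions are drawn from the true behavior policy `mu`, but the weight for
    action a is computed as pi[a] / propensities[a]. Rewards are assumed
    deterministic per action (an illustrative simplification).
    """
    return sum(m * (p / q) * r for m, p, q, r in zip(mu, pi, propensities, rewards))


# Hypothetical two-action problem; every action has positive probability
# under both mu and the misspecified propensities, so overlap holds.
mu = [0.8, 0.2]        # true behavior policy
pi = [0.5, 0.5]        # target (evaluation) policy
rewards = [1.0, 0.0]   # reward of each action

true_value = sum(p * r for p, r in zip(pi, rewards))
with_logged = expected_is_estimate(mu, pi, rewards, propensities=mu)
with_misspecified = expected_is_estimate(mu, pi, rewards, propensities=[0.6, 0.4])

print(true_value, with_logged, with_misspecified)
```

With the true propensities the expectation matches the target policy's value of 0.5. With the misspecified propensities it is about 0.667, even though no action has zero probability.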

When it matters

IS / WIS / PDIS
DR / WDR / MRDR / MAGIC
High-confidence OPE bounds

If you estimate propensities, document the model and treat this assumption as additional modeling risk rather than a guarantee.
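One inexpensive sanity check on estimated propensities, sketched below under the same one-step assumptions, uses the fact that importance weights have expectation 1 under the true behavior policy. A sample mean far from 1 suggests misspecification; a mean near 1 does not prove the propensities are correct. The helper `mean_importance_weight` and the toy log are hypothetical.

```python
def mean_importance_weight(actions, pi, propensities):
    """Average of pi[a] / propensities[a] over the logged actions."""
    weights = [pi[a] / propensities[a] for a in actions]
    return sum(weights) / len(weights)


# Hypothetical log: 8 plays of action 0 and 2 plays of action 1,
# i.e. empirical frequencies consistent with a behavior policy of [0.8, 0.2].
logged_actions = [0] * 8 + [1] * 2
pi = [0.5, 0.5]  # target policy

mean_w_logged = mean_importance_weight(logged_actions, pi, [0.8, 0.2])
mean_w_estimated = mean_importance_weight(logged_actions, pi, [0.6, 0.4])

print(mean_w_logged, mean_w_estimated)
```

In this toy log the mean weight is 1 with the logged propensities and roughly 0.92 with the misspecified ones. On real data the mean is noisy, so treat the check as a diagnostic to report alongside the propensity model rather than as a pass/fail test.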