Behavior Policies

Behavior policy estimation utilities.

BehaviorPolicyFit dataclass

Container for estimated behavior propensities and diagnostics.

apply(dataset)

Return a dataset with estimated propensities attached.
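As a rough illustration of what "attaching" propensities could look like, the sketch below uses two hypothetical frozen dataclasses, `LoggedDataset` and `PropensityFit`. Neither is part of this library, and the field names are assumptions. It also assumes that `apply` returns a copy rather than mutating its input, which the description above does not confirm.

```python
from dataclasses import dataclass, replace
from typing import Optional, Sequence


@dataclass(frozen=True)
class LoggedDataset:
    """Hypothetical logged-bandit dataset (not this library's dataset type)."""
    actions: Sequence[int]
    rewards: Sequence[float]
    propensities: Optional[Sequence[float]] = None


@dataclass(frozen=True)
class PropensityFit:
    """Hypothetical stand-in for BehaviorPolicyFit; field names are assumed."""
    propensities: Sequence[float]

    def apply(self, dataset: LoggedDataset) -> LoggedDataset:
        # Refuse to attach propensities that do not line up with the log.
        if len(self.propensities) != len(dataset.actions):
            raise ValueError("one propensity is required per logged action")
        # Return a copy so the original dataset is left untouched.
        return replace(dataset, propensities=list(self.propensities))


logged = LoggedDataset(actions=[0, 1], rewards=[1.0, 0.0])
with_props = PropensityFit([0.75, 0.25]).apply(logged)
```

Returning a copy keeps the raw log reusable with a different fit, at the cost of one extra object per call.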

behavior_diagnostics(action_probs, actions, propensities, *, clip_min=0.001, num_bins=10)

Aggregate diagnostics for behavior policy estimation.
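The documentation does not list which diagnostics are returned, so the following is only a sketch of the kind of summary such a function might compute. The helper name `summarize_propensities` and the keys of its result are assumptions. It reuses the `clip_min` and `num_bins` parameters from the signature above to report the share of propensities at the clipping floor, the mean negative log-likelihood of the logged actions, and an equal-width histogram of propensities.

```python
import math


def summarize_propensities(action_probs, actions, propensities,
                           *, clip_min=0.001, num_bins=10):
    """Hypothetical diagnostics summary; not this library's implementation."""
    n = len(actions)
    # Share of samples whose propensity sits at or below the clipping floor.
    clipped_fraction = sum(p <= clip_min for p in propensities) / n
    # Mean negative log-likelihood of the logged actions under the model.
    mean_nll = -sum(
        math.log(max(row[a], clip_min)) for row, a in zip(action_probs, actions)
    ) / n
    # Equal-width histogram of propensities over [0, 1].
    histogram = [0] * num_bins
    for p in propensities:
        histogram[min(int(p * num_bins), num_bins - 1)] += 1
    return {
        "min_propensity": min(propensities),
        "clipped_fraction": clipped_fraction,
        "mean_nll": mean_nll,
        "histogram": histogram,
    }


action_probs = [[0.75, 0.25], [0.999, 0.001], [0.5, 0.5]]
actions = [0, 1, 1]
propensities = [0.75, 0.001, 0.5]
report = summarize_propensities(action_probs, actions, propensities)
```

A large `clipped_fraction` or a histogram concentrated in the lowest bin suggests that importance weights built from these propensities will have high variance.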

fit_behavior_policy(dataset, *, method='logit', model=None, model_kwargs=None, clip_min=0.001, seed=0, store_action_probs=False)

Estimate behavior policy propensities from logged data.
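To make the idea concrete without relying on this library's internals, here is a minimal sketch that estimates propensities by empirical frequency within each discrete context and then applies a `clip_min` floor. This is a deliberately simpler technique than the `'logit'` method named in the signature, and the helper `estimate_propensities` is hypothetical.

```python
from collections import Counter


def estimate_propensities(contexts, actions, clip_min=0.001):
    """Hypothetical helper: empirical P(action | context), clipped from below."""
    context_counts = Counter(contexts)
    pair_counts = Counter(zip(contexts, actions))
    return [
        # Frequency of the logged action within its context, floored at clip_min.
        max(pair_counts[(c, a)] / context_counts[c], clip_min)
        for c, a in zip(contexts, actions)
    ]


contexts = ["new", "new", "new", "returning"]
actions = [0, 0, 1, 1]
propensities = estimate_propensities(contexts, actions)
```

The floor matters downstream: inverse-propensity weights are bounded by `1 / clip_min`, so the default of 0.001 caps any single weight at 1000.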