Logged Bandit from a DataFrame
This guide shows how to build a LoggedBanditDataset from a pandas DataFrame.
Expected columns
You need:
- context features (one or more columns)
- action column with integer action indices
- reward column
- optional behavior_prob column for logged propensities
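A quick pre-flight check along the following lines can catch column problems before the dataset is built. This is a minimal sketch that uses only pandas: the helper name check_bandit_columns is hypothetical and not part of crl, and the rule that propensities must lie in (0, 1] is a general assumption about logged probabilities rather than something this guide specifies.

```python
import pandas as pd


def check_bandit_columns(df, context_columns, action_column="action",
                         reward_column="reward", behavior_prob_column=None):
    """Hypothetical helper (not part of crl): return a list of problems in df."""
    problems = []

    # Every named column must be present before any further checks make sense.
    required = list(context_columns) + [action_column, reward_column]
    if behavior_prob_column is not None:
        required.append(behavior_prob_column)
    missing = [c for c in required if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
        return problems

    # Actions are expected to be integer indices.
    if not pd.api.types.is_integer_dtype(df[action_column]):
        problems.append(f"{action_column!r} is not an integer column")

    # Assumption: logged propensities are probabilities in (0, 1] with no gaps.
    if behavior_prob_column is not None:
        p = df[behavior_prob_column]
        if p.isna().any() or not ((p > 0) & (p <= 1)).all():
            problems.append(f"{behavior_prob_column!r} has values outside (0, 1]")

    return problems


# Toy data with the column layout described above.
df = pd.DataFrame({
    "x1": [0.1, 0.5, 0.9],
    "x2": [1.0, 0.0, 1.0],
    "action": [0, 1, 2],
    "reward": [1.0, 0.0, 1.0],
    "behavior_prob": [0.5, 0.25, 0.25],
})
problems = check_bandit_columns(df, ["x1", "x2"],
                                behavior_prob_column="behavior_prob")
print(problems)
```

An empty list means the frame has the expected shape; anything else is worth fixing before calling from_dataframe.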
Example
from crl.data.datasets import LoggedBanditDataset

# df columns: ["x1", "x2", "action", "reward", "behavior_prob"]
bandit = LoggedBanditDataset.from_dataframe(
    df,
    context_columns=["x1", "x2"],
    action_column="action",
    reward_column="reward",
    behavior_prob_column="behavior_prob",
)
Common pitfalls
- Action indices are not 0..K-1: remap your actions before creating the dataset.
- Propensities missing: IS/WIS require behavior_action_probs. See Behavior Propensities Missing.
- Contexts are mixed types: cast to numeric or one-hot encode categorical features.
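Two of these pitfalls can usually be handled in pandas before the dataset is created. The sketch below is one possible approach, not a crl-prescribed one: the raw frame, the device column, and the action labels 10/20/30 are made-up illustration data. It remaps arbitrary action labels to 0..K-1 with pd.factorize and one-hot encodes a categorical context with pd.get_dummies.

```python
import pandas as pd

# Illustrative data only: action labels are not 0..K-1 and one context is text.
raw = pd.DataFrame({
    "x1": [0.2, 0.7, 0.4, 0.9],
    "device": ["mobile", "desktop", "mobile", "tablet"],
    "action": [10, 30, 10, 20],
    "reward": [1.0, 0.0, 0.0, 1.0],
})

# Remap action labels to contiguous indices 0..K-1.
# sort=True assigns indices in sorted label order, so the mapping is deterministic.
codes, labels = pd.factorize(raw["action"], sort=True)
raw["action"] = codes
action_labels = labels.tolist()  # position i holds the original label of index i

# One-hot encode the categorical context so every context column is numeric.
encoded = pd.get_dummies(raw, columns=["device"], dtype=float)

# Everything that is not the action or reward is treated as a context feature.
context_columns = [c for c in encoded.columns if c not in ("action", "reward")]
print(context_columns)
```

Keeping action_labels around makes it possible to translate indices back to the original labels when reporting results. The resulting encoded frame and context_columns list can then be passed to from_dataframe in the same way as in the example above.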