Configuration Schemas
OPE config (configs/ope.yaml)
benchmark:
  type: bandit | mdp  # choose one
  seed: 0
  num_samples: 1000  # bandit only
  num_trajectories: 200  # mdp only
  config: {}  # passed to SyntheticBanditConfig/SyntheticMDPConfig
estimators: ["is", "wis", "double_rl"]
diagnostics: default
seed: 0
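
The schema above can be checked before a run with a small validator. The following is only a sketch, not part of the project's API: the function name `validate_ope_config` is hypothetical, and it assumes that the values shown in the schema (`seed: 0`, `diagnostics: default`) act as defaults, that `num_samples` is required for `bandit` benchmarks, and that `num_trajectories` is required for `mdp` benchmarks. It operates on an already-parsed mapping, such as the result of loading `configs/ope.yaml` with a YAML parser (for example PyYAML's `yaml.safe_load`).

```python
from typing import Any

# The two benchmark types listed in the schema.
ALLOWED_BENCHMARK_TYPES = {"bandit", "mdp"}


def validate_ope_config(cfg: dict[str, Any]) -> dict[str, Any]:
    """Check a parsed OPE config and return a copy with defaults filled in.

    Hypothetical helper: the required/optional rules here are assumptions
    inferred from the schema comments, not confirmed project behavior.
    """
    benchmark = dict(cfg.get("benchmark") or {})

    btype = benchmark.get("type")
    if btype not in ALLOWED_BENCHMARK_TYPES:
        raise ValueError(f"benchmark.type must be 'bandit' or 'mdp', got {btype!r}")

    # The schema marks num_samples as bandit only and num_trajectories as mdp only.
    size_key = "num_samples" if btype == "bandit" else "num_trajectories"
    size = benchmark.get(size_key)
    if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
        raise ValueError(f"benchmark.{size_key} must be a positive integer, got {size!r}")

    # Assumed defaults, taken from the values shown in the schema.
    benchmark.setdefault("seed", 0)
    benchmark["config"] = dict(benchmark.get("config") or {})

    estimators = cfg.get("estimators")
    if (
        not isinstance(estimators, list)
        or not estimators
        or not all(isinstance(name, str) for name in estimators)
    ):
        raise ValueError("estimators must be a non-empty list of strings")

    return {
        "benchmark": benchmark,
        "estimators": list(estimators),
        "diagnostics": cfg.get("diagnostics", "default"),
        "seed": cfg.get("seed", 0),
    }


# Usage: a dict equivalent to the schema above with type set to "bandit".
example = {
    "benchmark": {"type": "bandit", "seed": 0, "num_samples": 1000, "config": {}},
    "estimators": ["is", "wis", "double_rl"],
    "diagnostics": "default",
    "seed": 0,
}
validated = validate_ope_config(example)
```

Validating up front means a config that leaves `type` as the literal placeholder `bandit | mdp` fails with a clear message instead of surfacing later inside the benchmark code.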