Skip to content

Q-Model RealizabilityΒΆ

Assumption - The function approximator used for Q or reward models is well specified.

Applies to - FQE and other model-based estimators. - DR-family estimators benefit when the Q-model is accurate, but they do not require this assumption for consistency.

Definition - The regression model class contains the true conditional expectations.

Diagnostics - Residual plots, cross-validation, and sensitivity to model class.

Failure modes - Bias if model misspecification persists across folds.

References - Kallus and Uehara (2020).