Fitted Q Evaluation (FQE)

Implementation: crl.estimators.fqe.FQEEstimator

Assumptions

  • Sequential ignorability
  • Overlap/positivity
  • Markov property
  • Q-model realizability

Requires

  • TrajectoryDataset
  • behavior_action_probs optional (used only for diagnostics)

Diagnostics to check

  • model.q_model_mse
  • model.bellman_residual_mse
  • overlap.support_violations (if propensities provided)

Formula

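FQE fits a Q-function $\hat Q$ by iterated Bellman regression on the logged transitions. At iteration $k$, each transition $(s, a, r, s')$ gets the regression target $r + \gamma \sum_{a'} \pi(a' \mid s') \hat Q_k(s', a')$ (written here for discrete actions), and $\hat Q_{k+1}$ is fit to those targets. The policy value $V^\pi$ is then estimated by averaging $\hat V(s_0) = \sum_a \pi(a \mid s_0) \hat Q(s_0, a)$ over the initial states in the dataset, where $\pi$ is the target policy.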
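To make the iteration concrete, here is a minimal tabular sketch in NumPy. It is not the `FQEEstimator` implementation: the function names, the `(s, a, r, s_next, done)` transition layout, and the use of a lookup table as the Q-model are all assumptions made for illustration. In the tabular case the least-squares fit reduces to a per-cell mean of the targets (reward plus the discounted expected next-state value under the target policy).

```python
import numpy as np

def tabular_fqe(transitions, pi, n_states, n_actions, gamma=0.9, n_iters=200):
    """Illustrative tabular FQE (hypothetical helper, not the crl API).

    transitions: iterable of (s, a, r, s_next, done) with integer s, a, s_next.
    pi: array of shape [n_states, n_actions] with target-policy probabilities.
    """
    s, a, r, s2, done = (np.asarray(x) for x in zip(*transitions))
    r = r.astype(float)
    not_done = 1.0 - done.astype(float)

    # Visit counts per (state, action) cell, used to average the targets.
    counts = np.zeros((n_states, n_actions))
    np.add.at(counts, (s, a), 1.0)

    q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        # Expected next-state value under the target policy.
        v_next = (pi[s2] * q[s2]).sum(axis=1)
        targets = r + gamma * not_done * v_next
        # Least-squares fit of a table = mean target in each visited cell.
        sums = np.zeros_like(q)
        np.add.at(sums, (s, a), targets)
        # Unvisited cells stay at 0: nothing in the data constrains them.
        q = np.divide(sums, counts, out=np.zeros_like(q), where=counts > 0)
    return q

def initial_state_value(q, pi, initial_states):
    """Average V_hat(s0) = sum_a pi(a|s0) * Q_hat(s0, a) over initial states."""
    s0 = np.asarray(initial_states)
    return float((pi[s0] * q[s0]).sum(axis=1).mean())
```

Cells that never appear in the data keep their initial value of zero, which is a small-scale picture of the extrapolation problem: the estimate for those state-action pairs is determined by the model, not by the data.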

Uncertainty

  • Normal-approximation CI by default.
  • Block bootstrap recommended for sequential dependence (bootstrap=True).
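As a rough illustration of what a normal-approximation interval looks like, the sketch below builds one from per-trajectory values of $\hat V(s_0)$. Whether the library computes its default interval this way is an assumption; `normal_ci` is a hypothetical helper. Note that this construction treats the initial-state values as independent draws and ignores the error in the fitted Q-function itself.

```python
import math
import numpy as np

def normal_ci(per_trajectory_values, z=1.96):
    """Mean and z-interval from per-trajectory V_hat(s0) values (illustrative).

    z=1.96 corresponds to a nominal 95% interval.
    """
    x = np.asarray(per_trajectory_values, dtype=float)
    mean = float(x.mean())
    # Standard error of the mean, using the sample standard deviation.
    se = float(x.std(ddof=1) / math.sqrt(x.size))
    return mean, (mean - z * se, mean + z * se)
```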

Failure modes

  • Extrapolation error when the target policy visits state-action pairs that are rare or absent in the logged data.
  • Estimates are sensitive to Q-model capacity and to optimization settings.

Minimal example

from crl.estimators.fqe import FQEEstimator

# `estimand` and `dataset` (a TrajectoryDataset) must already be defined.
report = FQEEstimator(estimand).estimate(dataset)

Bootstrap notes

  • An IID bootstrap over individual transitions ignores temporal dependence and can produce intervals that are too narrow.
  • Trajectory or block bootstraps are preferable for sequential data.
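The trajectory-level variant can be sketched as follows: resample whole trajectories with replacement, recompute the estimate on each resample, and take percentile bounds. This is a generic sketch, not the library's `bootstrap=True` code path; `trajectory_bootstrap_ci` and `estimate_fn` are hypothetical names. For FQE, `estimate_fn` would refit the Q-function on each resample, which is what makes the procedure expensive.

```python
import numpy as np

def trajectory_bootstrap_ci(trajectories, estimate_fn, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap that resamples whole trajectories (illustrative).

    trajectories: list of per-trajectory records, in any format estimate_fn accepts.
    estimate_fn: maps a list of trajectories to a scalar estimate.
    """
    rng = np.random.default_rng(seed)
    n = len(trajectories)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Sample trajectory indices with replacement; each trajectory stays
        # intact, so within-trajectory dependence is preserved.
        idx = rng.integers(0, n, size=n)
        stats[b] = estimate_fn([trajectories[i] for i in idx])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)
```

Resampling at the trajectory level keeps each sequence of transitions together, whereas resampling individual transitions would break the dependence the bullets above warn about.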

References

  • Le et al. (2019)
  • Hao et al. (2021) for bootstrap inference

Notebook