References

Off-policy evaluation

  • Precup, D., Sutton, R. S., and Singh, S. (2000). Eligibility Traces for Off-Policy Policy Evaluation. ICML.
  • Mahmood, A. R., van Hasselt, H., and Sutton, R. S. (2014). Weighted Importance Sampling for Off-Policy Learning with Linear Function Approximation. NeurIPS.
  • Jiang, N. and Li, L. (2016). Doubly Robust Off-Policy Value Evaluation for Reinforcement Learning. ICML.
  • Thomas, P. S. and Brunskill, E. (2016). Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning. ICML.
  • Farajtabar, M., Chow, Y., and Ghavamzadeh, M. (2018). More Robust Doubly Robust Off-Policy Evaluation. ICML.
  • Xie, T., Ma, Y., and Wang, Y.-X. (2019). Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling. NeurIPS.
  • Nachum, O., Chow, Y., Dai, B., and Li, L. (2019). DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections. NeurIPS.
  • Kallus, N. and Uehara, M. (2020). Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes. JMLR.
  • Thomas, P. S., Theocharous, G., and Ghavamzadeh, M. (2015). High-Confidence Off-Policy Evaluation. AAAI.
  • Hao, B., Ji, X., Duan, Y., Lu, H., Szepesvari, C., and Wang, M. (2021). Bootstrapping Statistical Inference for Off-Policy Evaluation. arXiv.
  • Zhang, R., Zhang, X., Ni, C., and Wang, M. (2022). Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory. ICML.
  • Uehara, M., Kiyohara, T., Fujimoto, S., and Hachiya, H. (2022). A Taxonomy of Off-Policy Evaluation. arXiv.

Causal RL overview

  • Deng, Z., Jiang, J., Long, G., and Zhang, C. (2023). Causal Reinforcement Learning: A Survey. arXiv:2307.01452.
  • Bennett, A., Kallus, N., and Uehara, M. (2021). Proximal Causal Inference for Reinforcement Learning. arXiv.

Sequential ignorability

  • Robins, J. M., Hernan, M. A., and Brumback, B. (2000). Marginal Structural Models and Causal Inference in Epidemiology. Epidemiology.

Documentation tooling

  • Diataxis documentation framework: https://diataxis.fr/
  • mkdocs-jupyter: https://github.com/danielfrg/mkdocs-jupyter
  • SciencePlots: https://github.com/garrettj403/SciencePlots