Temporal Abstraction
Policy Search
Deep Reinforcement Learning
The optimistic principle for online planning in Markov decision processes
Reinforcement Learning
Monte-Carlo Planning: Basic Principles and Recent Progress
Off-policy Model-based Learning under Unknown Factored Dynamics
Future Information Minimization as PAC Bayes regularization in Reinforcement Learning
Dynamic Programming and Stochastic Control
How to discount Information: Information flow in sensing-acting systems and the emergence of hierarchies