Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning.
Kai Wang, Sanket Shah, Haipeng Chen, Andrew Perrault, Finale Doshi-Velez, Milind Tambe
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- sequential decision problems
- learning algorithm
- learning process
- state space
- function approximation
- markov decision processes
- learning problems
- partially observable
- model free
- active exploration
- reinforcement learning algorithms
- supervised learning
- feature space
- learning tasks
- active learning
- finite state
- dynamic programming
- policy search
- case study
- model based reinforcement learning
- continuous state and action spaces
- machine learning