Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning.
Kai Wang, Sanket Shah, Haipeng Chen, Andrew Perrault, Finale Doshi-Velez, Milind Tambe
Published in: NeurIPS (2021)
Keyphrases
- reinforcement learning
- sequential decision making
- Markov decision processes
- learning process
- learning algorithm
- partially observable
- state space
- function approximation
- policy search
- supervised learning
- machine learning
- optimal policy
- temporal difference
- data mining
- Bayesian networks
- model free
- dynamic programming
- reinforcement learning algorithms
- cost function
- function approximators
- artificial intelligence
- model based reinforcement learning
- interactive dynamic influence diagrams