Policy Learning of MDPs with Mixed Continuous/Discrete Variables: A Case Study on Model-Free Control of Markovian Jump Systems.
Joao Paulo Jansch-Porto, Bin Hu, Geir E. Dullerud
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- model free
- discrete variables
- policy iteration
- optimal policy
- markov decision processes
- average reward
- action selection
- machine learning
- markov decision problems
- learning algorithm
- active learning
- action space
- rl algorithms
- continuous variables
- temporal difference
- reinforcement learning algorithms
- partially observable
- function approximation
- learning problems
- supervised learning
- state space