Impacts of Mathematical Optimizations on Reinforcement Learning Policy Performance.
Sam Green, Craig M. Vineyard, Çetin Kaya Koç
Published in: IJCNN (2018)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- action selection
- Markov decision process
- reinforcement learning problems
- actor critic
- action space
- partially observable
- reinforcement learning algorithms
- policy gradient
- function approximators
- state space
- Markov decision processes
- partially observable environments
- control policies
- reward function
- control policy
- policy evaluation
- continuous state spaces
- average reward
- partially observable domains
- dynamic programming
- temporal difference
- approximate dynamic programming
- infinite horizon
- partially observable Markov decision processes
- long run
- policy iteration
- state action
- Markov decision problems
- learning algorithm
- transition model
- state and action spaces
- model free
- function approximation
- machine learning
- mathematical expressions
- temporal difference learning
- policy gradient methods
- agent learns
- inverse reinforcement learning
- state dependent
- RL algorithms
- decision problems
- transfer learning
- learning problems
- learning process
- multiple agents
- multi agent
- robotic control
- decision making
- approximate policy iteration