Invariant Policy Optimization: Towards Stronger Generalization in Reinforcement Learning.
Anoopkumar Sonar, Vincent Pacelli, Anirudha Majumdar
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- optimal policy
- action selection
- policy search
- global optimization
- optimization problems
- optimization algorithm
- partially observable environments
- action space
- temporal difference
- dynamic programming
- function approximation
- control policies
- approximate dynamic programming
- Markov decision process
- state space
- Markov decision problems
- affine transformation
- neural network
- reinforcement learning problems
- average reward
- reinforcement learning algorithms
- policy evaluation
- optimization process
- function approximators
- reward function
- constrained optimization
- optimization method
- Markov chain
- learning process
- multi-agent
- learning algorithm
- genetic algorithm