Good Actors can come in Smaller Sizes: A Case Study on the Value of Actor-Critic Asymmetry.
Siddharth Mysore, Bassel Mabsout, Renato Mancuso, Kate Saenko
Published in: CoRR (2021)
Keyphrases
- actor-critic
- reinforcement learning
- policy gradient
- approximate dynamic programming
- neuro-fuzzy
- temporal difference
- optimal control
- gradient method
- reinforcement learning algorithms
- policy iteration
- decision making
- function approximation
- average reward
- least squares
- machine learning
- evaluation function
- Markov decision processes