Improving Actor-Critic Reinforcement Learning Via Hamiltonian Monte Carlo Method.
Duo Xu, Faramarz Fekri
Published in: ICASSP (2022)
Keyphrases
- actor critic
- reinforcement learning
- monte carlo method
- temporal difference
- monte carlo
- markov chain
- policy gradient
- reinforcement learning algorithms
- optimal control
- approximate dynamic programming
- state space
- function approximation
- neuro fuzzy
- policy iteration
- gradient method
- markov decision processes
- genetic algorithm
- average reward
- posterior distribution
- simulated annealing
- dynamic programming
- model free
- machine learning
- action selection
- finite state
- learning algorithm
- learning problems