Bayesian Soft Actor-Critic: A Directed Acyclic Strategy Graph Based Deep Reinforcement Learning.
Qin Yang, Ramviyas Parasuraman
Published in: SAC (2024)
Keyphrases
- actor critic
- reinforcement learning
- directed acyclic
- temporal difference
- policy gradient
- approximate dynamic programming
- function approximation
- reinforcement learning algorithms
- optimal control
- gradient method
- bayesian networks
- neuro fuzzy
- policy iteration
- graphical models
- model free
- control problems
- linear program
- semi supervised
- policy gradient methods
- machine learning
- state space
- linear programming
- markov decision processes
- nearest neighbor
- average reward
- dynamic programming
- data structure
- multi agent
- evaluation function