Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs.
Qingyang Zhang, Yiming Yang, Jingqing Ruan, Xuantang Xiong, Dengpeng Xing, Bo Xu
Published in: CoRR (2023)
Keyphrases
- hierarchical reinforcement learning
- balancing exploration and exploitation
- reinforcement learning
- state abstraction
- model-free
- latent variables
- weighted graph
- undirected graph
- Markov decision process
- state space
- relevance feedback
- machine learning
- pairwise
- average reward
- support vector
- multi-agent
- learning algorithm