Learning Two-Step Hybrid Policy for Graph-Based Interpretable Reinforcement Learning.
Tongzhou Mu, Kaixiang Lin, Feiyang Niu, Govind Thattai
Published in: Trans. Mach. Learn. Res. (2022)
Keyphrases
- reinforcement learning
- learning algorithm
- learning process
- supervised learning
- learning systems
- action selection
- policy search
- learning tasks
- learning problems
- function approximation
- partially observable environments
- partially observable
- hybrid learning
- learning capabilities
- markov decision process
- state action
- machine learning
- real robot
- learning agents
- temporal difference learning
- online learning