Modular deep reinforcement learning from reward and punishment for robot navigation.
Jiexin Wang, Stefan Elfwing, Eiji Uchibe
Published in: Neural Networks (2021)
Keyphrases
- robot navigation
- reinforcement learning
- agent receives
- continuous state
- autonomous mobile robot
- function approximation
- autonomous robots
- initially unknown
- landmark recognition
- real-time stereo
- model-free
- reward function
- optimal policy
- Markov decision processes
- map building
- scene understanding
- state space
- temporal difference
- partially observable
- partially observable environments
- machine learning
- eligibility traces
- policy iteration
- topological map
- learning algorithm
- average reward
- video surveillance
- dynamic programming
- multi-agent
- policy gradient
- action selection
- feature vectors
- high quality