MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation.
Aaron M. Roth, Jing Liang, Ram D. Sriram, Elham Tabassi, Dinesh Manocha
Published in: CoRR (2022)
Keyphrases
- robot navigation
- reinforcement learning
- continuous state
- policy search
- optimal policy
- autonomous mobile robot
- autonomous robots
- real time stereo
- markov decision processes
- markov decision process
- scene understanding
- topological map
- action selection
- state space
- function approximation
- initially unknown
- action space
- dynamic programming
- average reward
- landmark recognition
- reinforcement learning problems
- partially observable
- map building
- control policy
- reward function
- model free
- d objects
- high resolution