Outperformance of Mall-Receptionist Android as Inverse Reinforcement Learning is Transitioned to Reinforcement Learning.
Zhichao Chen, Yutaka Nakamura, Hiroshi Ishiguro
Published in: IEEE Robotics Autom. Lett. (2023)
Keyphrases
- inverse reinforcement learning
- partially observable environments
- reinforcement learning
- reward function
- Bayesian nonparametric
- temporal difference
- preference elicitation
- reinforcement learning algorithms
- partially observable
- Markov decision processes
- function approximation
- Markov decision process
- state space
- optimal policy
- decision making
- dynamic systems
- model free
- simple examples
- dynamical systems
- probabilistic model
- machine learning