Aligning Human Intent From Imperfect Demonstrations With Confidence-Based Inverse Soft-Q Learning.
Xizhou Bu, Wenjuan Li, Zhengxiong Liu, Zhiqiang Ma, Panfeng Huang
Published in: IEEE Robotics Autom. Lett. (2024)
Keyphrases
- reinforcement learning
- state space
- learning algorithm
- cooperative
- confidence level
- function approximation
- human behavior
- multi agent
- high confidence
- human interaction
- learning rate
- temporal difference learning
- neural network
- confidence measure
- human subjects
- human experts
- markov decision processes
- mobile robot
- computer vision