Deep Reinforcement Learning for Chatbots Using Clustered Actions and Human-Likeness Rewards.
Heriberto Cuayáhuitl, Donghyeon Lee, Seonghan Ryu, Sungja Choi, Inchul Hwang, Jihie Kim
Published in: IJCNN (2019)
Keyphrases
- reinforcement learning
- behavioural cloning
- reward function
- perceptual aliasing
- action selection
- markov decision processes
- partially observable
- reward signal
- state space
- reinforcement learning algorithms
- function approximation
- state and action spaces
- action space
- agent learns
- multi agent
- machine learning
- multiagent reinforcement learning
- agent behavior
- human operators
- human subjects
- sensory inputs
- learned knowledge
- learning agent
- state action
- optimal policy
- learning algorithm
- reinforcement learning methods
- partial observability
- initially unknown
- reasoning about actions
- markov decision process
- emotional state
- human activities
- partially observable domains