Rewarding What Matters: Step-by-Step Reinforcement Learning for Task-Oriented Dialogue.
Huifang Du, Shuqin Li, Minghao Wu, Xuejing Feng, Yuan-Fang Li, Haofen Wang
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- dialogue system
- tutorial dialogue
- dialogue management
- function approximation
- human computer
- mixed initiative
- natural language
- state space
- natural language dialogue
- spoken language
- spoken dialogue systems
- optimal policy
- optimal control
- human machine
- reinforcement learning algorithms
- markov decision processes
- description language
- learning algorithm
- learning process
- model free
- temporal difference
- robotic control
- business intelligence
- speech acts
- temporal difference learning
- easy to follow
- action selection
- conversational agent
- multi agent reinforcement learning
- multi agent
- machine learning