Personalizing Task-oriented Dialog Systems via Zero-shot Generalizable Reward Function.
A. B. Siddique, Muhammad Hasan Maqbool, Kshitija Taywade, Hassan Foroosh
Published in: CIKM (2022)
Keyphrases
- dialog systems
- reward function
- dialogue system
- natural language generation
- human computer
- markov decision processes
- inverse reinforcement learning
- reinforcement learning
- state space
- natural language
- reinforcement learning algorithms
- optimal policy
- multiple agents
- partially observable
- conversational agents
- natural language interfaces
- dialogue management
- transition probabilities
- speech recognition
- description language
- human communication
- state variables
- learning agent
- human computer interaction
- user profiles
- generative model
- probabilistic model
- markov models