Personalizing Task-oriented Dialog Systems via Zero-shot Generalizable Reward Function.
A. B. Siddique, Muhammad Hasan Maqbool, Kshitija Taywade, Hassan Foroosh
Published in: CoRR (2023)
Keyphrases
- dialog systems
- reward function
- dialogue system
- natural language generation
- human computer
- markov decision processes
- natural language
- reinforcement learning
- inverse reinforcement learning
- reinforcement learning algorithms
- state space
- optimal policy
- conversational agents
- partially observable
- transition probabilities
- natural language interfaces
- human communication
- multiple agents
- dialogue management
- description language
- state variables
- speech recognition
- user profiles
- natural language interface
- hidden markov models
- knowledge representation
- generative model
- machine learning