Multi-Action Dialog Policy Learning from Logged User Feedback
Shuo Zhang
Junzhou Zhao
Pinghui Wang
Tianxiang Wang
Zi Liang
Jing Tao
Yi Huang
Junlan Feng
Published in:
AAAI (2023)
Keyphrases
user feedback
learning process
action selection
learning algorithm
reinforcement learning
user interaction
user preferences
active learning
supervised learning
similarity measure
relational databases
user interface
state action
amazon mechanical turk