Phrase-Level Action Reinforcement Learning for Neural Dialog Response Generation.

Takato Yamazaki, Akiko Aizawa

Published in: ACL/IJCNLP (Findings) (2021)

Keyphrases

reinforcement learning
fitted Q iteration
action selection
sensory inputs
higher level
levels of abstraction
Markov decision processes
neural network
network architecture
generation process
learning process
user interface
optimal policy
learning classifier systems
state action
learning algorithm
reward shaping
partially observable domains
machine learning