Dialogue POMDP components (Part II): learning the reward function.
Hamidreza Chinaei
Brahim Chaib-draa
Published in:
Int. J. Speech Technol. (2014)
Keyphrases
reinforcement learning
learning algorithm
reward function
inverse reinforcement learning
machine learning
partially observable
Markov decision process
hierarchical reinforcement learning
active learning
state space
dynamical systems