Sample Efficient On-Line Learning of Optimal Dialogue Policies with Kalman Temporal Differences.
Olivier Pietquin
Matthieu Geist
Senthilkumar Chandramohan
Published in:
IJCAI (2011)
Keyphrases
temporal difference
e-learning
optimal policy
model-free
machine learning
reinforcement learning
evolutionary algorithm
dynamic programming