Sample Efficient On-Line Learning of Optimal Dialogue Policies with Kalman Temporal Differences.

Olivier Pietquin, Matthieu Geist, Senthilkumar Chandramohan
Published in: IJCAI (2011)
Keyphrases
  • temporal difference
  • e learning
  • optimal policy
  • model free
  • machine learning
  • reinforcement learning
  • evolutionary algorithm
  • dynamic programming