Diverse Exploration for Fast and Safe Policy Improvement.
Andrew Cohen
Lei Yu
Robert Wright
Published in:
AAAI (2018)
Keyphrases
action selection
wide variety
significant improvement
optimal policy
data mining
database
information systems
reinforcement learning
dynamic programming
management system
expected cost
asymptotically optimal
active exploration