Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning.

Sam Lobel, Akhil Bagaria, George Konidaris

Published in: ICML (2023)

Keyphrases

reinforcement learning
active exploration
action selection
exploration strategy
learning algorithm
function approximation
machine learning
autonomous learning
learning process
Markov decision processes
model free
exploration exploitation
balancing exploration and exploitation
model based reinforcement learning
robotic control
temporal difference learning
estimation process
Markov decision process
accurate estimation
reinforcement learning algorithms
temporal difference
case study
search engine