Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning.
Sam Lobel, Akhil Bagaria, George Konidaris
Published in: ICML (2023)
Keyphrases
- reinforcement learning
- active exploration
- action selection
- exploration strategy
- learning algorithm
- function approximation
- machine learning
- autonomous learning
- learning process
- Markov decision processes
- model free
- exploration exploitation
- balancing exploration and exploitation
- model based reinforcement learning
- robotic control
- temporal difference learning
- estimation process
- Markov decision process
- accurate estimation
- reinforcement learning algorithms
- temporal difference
- case study
- search engine