Deterministic Sequencing of Exploration and Exploitation for Reinforcement Learning.
Piyush Gupta, Vaibhav Srivastava
Published in: CDC (2022)
Keyphrases
- exploration exploitation tradeoff
- reinforcement learning
- active exploration
- exploration strategy
- function approximation
- objective function
- action selection
- deterministic domains
- relevance feedback
- exploration exploitation
- model based reinforcement learning
- Markov decision processes
- temporal difference
- autonomous learning
- state space
- dynamic programming
- robotic control
- initially unknown
- partially observable domains
- reinforcement learning algorithms
- optimal control
- partially observable
- optimal policy
- active learning
- multi agent systems
- search engine
- learning algorithm
- function approximators
- machine learning
- black box
- learning process