Online Learning for Unknown Partially Observable MDPs.
Mehdi Jafarnia-Jahromi, Rahul Jain, Ashutosh Nayyar
Published in: CoRR (2021)
Keyphrases
- partially observable
- online learning
- markov decision processes
- state space
- markov decision problems
- reinforcement learning
- initially unknown
- dynamical systems
- decision problems
- partial observability
- reward function
- infinite horizon
- partially observable environments
- partial observations
- belief state
- action models
- optimal policy
- probabilistic planning
- active learning
- dynamic programming
- e learning
- finite state
- partially observable markov decision process
- temporal difference
- planning under uncertainty
- policy iteration
- partially observable markov decision processes
- bayesian networks
- reinforcement learning algorithms
- machine learning
- sufficient conditions
- decision making