Information theoretic reward shaping for curiosity driven learning in POMDPs.
Nassim Mafi, Farnaz Abtahi, Ian R. Fasel
Published in: ICDL-EPIROB (2011)
Keyphrases
- information theoretic
- driven learning
- reward shaping
- reinforcement learning
- markov decision problems
- mutual information
- state space
- partially observable markov decision processes
- reinforcement learning algorithms
- partially observable
- policy search
- complex domains
- linear programming
- optimal policy
- markov decision processes
- semi supervised learning
- dynamic programming
- domain theory
- model free
- belief state
- word alignment
- multi agent
- markov decision process
- supervised learning
- temporal difference
- infinite horizon
- transition probabilities
- finite state
- learning algorithm
- utility function
- bayesian networks
- training data
- pairwise
- semi supervised
- markov chain
- reward function
- dynamical systems