Potential-based reward shaping for finite horizon online POMDP planning.
Adam Eck, Leen-Kiat Soh, Sam Devlin, Daniel Kudenko
Published in: Auton. Agents Multi Agent Syst. (2016)
Keyphrases
- finite horizon
- reward shaping
- optimal policy
- Markov decision problems
- infinite horizon
- Markov decision process
- reinforcement learning
- Markov decision processes
- partially observable
- state space
- partially observable Markov decision processes
- reinforcement learning algorithms
- reward function
- planning problems
- average cost
- decision problems
- initial state
- dynamic programming
- long run
- optimal control
- complex domains
- finite state
- policy iteration
- multistage
- planning under uncertainty
- control policies
- state dependent
- finite number
- continuous state
- decision theoretic
- dynamical systems
- sufficient conditions