A Dynamic and Task-Independent Reward Shaping Approach for Discrete Partially Observable Markov Decision Processes.
Sepideh Nahali, Hajer Ayadi, Jimmy X. Huang, Esmat Pakizeh, Mir Mohsen Pedram, Leila Safari
Published in: PAKDD (2) (2023)
Keyphrases
- partially observable markov decision processes
- continuous state
- reinforcement learning
- dynamic environments
- finite state
- dynamical systems
- belief state
- reward shaping
- planning under uncertainty
- dynamic programming
- dynamic systems
- decision problems
- markov decision processes
- optimal policy
- planning problems
- policy search
- state space