A Huber reward function-driven deep reinforcement learning solution for cart-pole balancing problem.
Mishra Shaili, Anuja Arora
Published in: Neural Comput. Appl. (2023)
Keyphrases
- reward function
- reinforcement learning
- reinforcement learning algorithms
- Markov decision processes
- state space
- optimal policy
- inverse reinforcement learning
- small number of iterations
- partially observable
- policy search
- transition model
- hierarchical reinforcement learning
- model free
- Markov decision process
- dynamic programming
- learning algorithm
- multiple agents
- initially unknown
- temporal difference
- Markov decision problems
- function approximation
- EM algorithm
- image segmentation