Maneuver Decision-Making Through Automatic Curriculum Reinforcement Learning Without Handcrafted Reward Functions.
Hong-Peng Zhang
Published in: CoRR (2023)
Keyphrases
- reward function
- reinforcement learning
- decision making
- policy search
- Markov decision processes
- reinforcement learning algorithms
- state space
- optimal policy
- Markov decision process
- function approximation
- inverse reinforcement learning
- action selection
- partially observable
- decision makers
- dynamic programming
- multiple agents
- hand crafted
- state action
- data mining
- learning algorithm
- transition model
- Markov decision problems
- multi agent