Reward-Machine-Guided, Self-Paced Reinforcement Learning.
Cevahir Köprülü, Ufuk Topcu
Published in: AAMAS (2023)
Keyphrases
- reinforcement learning
- function approximation
- state space
- eligibility traces
- learning algorithm
- reward function
- model free
- learning problems
- multi agent
- batch processing
- Markov decision processes
- partially observable environments
- robotic control
- optimal policy
- machine learning
- learning process
- temporal difference
- reinforcement learning algorithms
- dynamic programming
- control policy
- genetic algorithm
- reward shaping
- initially unknown
- policy search
- multi agent reinforcement learning
- transfer learning
- mobile robot
- action selection
- online learning
- optimal control
- flowshop