SIBRE: Self Improvement Based REwards for Adaptive Feedback in Reinforcement Learning.
Somjit Nath, Richa Verma, Abhik Ray, Harshad Khadilkar
Published in: AAMAS (2021)
Keyphrases
- reinforcement learning
- Markov decision processes
- function approximation
- adaptive control
- dynamic programming
- reinforcement learning algorithms
- learning algorithm
- optimal policy
- state space
- reward function
- machine learning
- supervised learning
- multi agent reinforcement learning
- significant improvement
- learning process
- learning capabilities
- multi agent
- actor critic
- model free
- function approximators
- learning problems
- control policy
- temporal difference learning
- policy search
- robot control
- optimal control
- real time
- Markov chain
- information retrieval
- data sets