Deep Q-Learning versus Proximal Policy Optimization: Performance Comparison in a Material Sorting Task.
Reuf Kozlica, Stefan Wegenkittl, Simon Hiränder
Published in: ISIE (2023)
Keyphrases
- optimal policy
- reinforcement learning
- action selection
- global optimization
- learning algorithm
- optimization process
- multi agent
- state space
- state action
- constrained optimization
- optimization method
- optimization problems
- dynamic programming
- cooperative
- policy iteration
- policy search
- neural network
- multi agent reinforcement learning
- agent receives
- stochastic approximation
- temporal difference
- infinite horizon
- machine learning