Deep Q-Learning versus Proximal Policy Optimization: Performance Comparison in a Material Sorting Task.
Reuf Kozlica, Stefan Wegenkittl, Simon Hirländer
Published in: CoRR (2023)
Keyphrases
- optimal policy
- action selection
- cooperative
- reinforcement learning
- state space
- learning algorithm
- global optimization
- optimization problems
- optimization algorithm
- neural network
- policy iteration
- multi agent
- function approximation
- constrained optimization
- Markov decision processes
- Markov decision process
- long run
- optimization process
- optimization method
- genetic algorithm