An Action-Selection Policy Generator for Reinforcement Learning Hardware Accelerators.
Gian Carlo Cardarilli, Luca Di Nunzio, Rocco Fazzolari, Daniele Giardino, Marco Matta, Marco Re, Sergio Spanò
Published in: ApplePies (2020)
Keyphrases
- action selection
- reinforcement learning
- basal ganglia
- temporal difference
- robot soccer
- action space
- state space
- human robot
- function approximation
- decision making
- total reward
- optimal policy
- continuous state and action spaces
- control policy
- policy search
- reinforcement learning algorithms
- reward function
- rl algorithms
- control policies
- model free
- markov decision problems
- continuous state
- optimal control
- dynamic programming
- neural network