Dyna-PPO reinforcement learning with Gaussian process for the continuous action decision-making in autonomous driving.
Guanlin Wu, Wenqi Fang, Ji Wang, Pin Ge, Jiang Cao, Yang Ping, Peng Gou
Published in: Appl. Intell. (2023)
Keyphrases
- gaussian process
- autonomous driving
- continuous action
- policy search
- reinforcement learning
- continuous state
- temporal difference learning
- function approximation
- action space
- grand challenge
- function approximators
- regression model
- action selection
- stereo vision
- temporal difference
- reinforcement learning algorithms
- partially observable markov decision processes
- latent variables
- reinforcement learning methods
- bayesian framework
- approximate inference
- model selection
- rl algorithms
- semi supervised
- hyperparameters
- state space
- model free
- machine learning
- real valued
- markov decision processes
- learning process
- dynamic programming
- multi agent
- control policies
- object recognition
- feature selection
- policy gradient
- learning algorithm
- computer vision
- learning problems
- prior knowledge
- supervised learning
- optimal policy
- random variables
- learning tasks