A CNN-based policy for optimizing continuous action control by learning state sequences.
Tianyi Huang
Min Li
Xiaolong Qin
William Zhu
Published in:
Neurocomputing (2022)
Keyphrases
learning algorithm
robotic systems
policy search
reinforcement learning
hidden Markov models
supervised learning
action selection
multi-agent
search space
state space