Action-Conditioned Contrastive Policy Pretraining.

Qihang Zhang, Zhenghao Peng, Bolei Zhou

Published in: CoRR (2022)

Keyphrases

action selection
optimal policy
state action
joint action
partially observable domains
data sets
neural network
computer vision
website
Markov chain
human actions
action space
allocation policy
discounted reward