Mega-Reward: Achieving Human-Level Play without Extrinsic Rewards.

Yuhang Song, Jianyi Wang, Thomas Lukasiewicz, Zhenghua Xu, Shangtong Zhang, Mai Xu

Published in: CoRR (2019)

Keyphrases

human level
human level intelligence
reinforcement learning
machine intelligence
artificial general intelligence
bandit problems
reward function
intelligent systems
general intelligence
web intelligence
human level AI
artificial intelligence
human intelligence
expected reward
AI systems
cognitive science
cognitive psychology
cognitive architecture
Markov decision processes
total reward
optimal policy
control system
database systems
machine learning
neural network