Online reinforcement learning with sparse rewards through an active inference capsule.

Alejandro Daniel Noel, Charel van Hoof, Beren Millidge

Published in: CoRR (2021)

Keyphrases

reinforcement learning
Markov decision processes
online learning
state space
function approximation
reward shaping
optimal policy
model-free
reinforcement learning algorithms
high-dimensional
reward function
optimal control
balancing exploration and exploitation
inference process
sparse data
real time
probabilistic inference
Bayesian networks
learning algorithm
learning classifier systems
action selection
temporal difference
multi-agent
Markov decision process
average reward
batch mode
machine learning