Joint Inference of Reward Machines and Policies for Reinforcement Learning.

Zhe Xu, Ivan Gavran, Yousef Ahmad, Rupak Majumdar, Daniel Neider, Ufuk Topcu, Bo Wu

Published in: CoRR (2019)

Keyphrases

reinforcement learning
joint inference
reward function
optimal policy
total reward
entity resolution
information extraction
probabilistic model
Markov decision process
reinforcement learning algorithms
semantic role labeling
average reward
conditional random fields
Markov decision processes
partially observable Markov decision processes
natural language understanding
graphical models
control policy
state space
dynamic programming
joint segmentation
learning algorithm
supervised learning
natural language
exact inference
variational inference
record linkage
data integration
generative model