Joint Inference of Reward Machines and Policies for Reinforcement Learning.
Zhe Xu, Ivan Gavran, Yousef Ahmad, Rupak Majumdar, Daniel Neider, Ufuk Topcu, Bo Wu
Published in: ICAPS (2020)
Keyphrases
- reinforcement learning
- joint inference
- optimal policy
- reward function
- total reward
- information extraction
- probabilistic model
- entity resolution
- semantic role labeling
- partially observable Markov decision processes
- reinforcement learning algorithms
- state space
- conditional random fields
- Markov decision process
- average reward
- graphical models
- control policy
- Markov decision processes
- natural language understanding
- bidirectional
- machine learning
- joint segmentation
- policy gradient
- learning algorithm
- dynamic programming
- dependency parsing
- database systems
- relation extraction
- Bayesian inference
- variational inference
- pairwise
- domain specific