MS-Ranker: Accumulating evidence from potentially correct candidates via reinforcement learning for answer selection.
Yingxue Zhang
Fandong Meng
Peng Li
Ping Jian
Jie Zhou
Published in:
Neurocomputing (2021)
Keyphrases
reinforcement learning
final answer
correct answers
markov decision processes
function approximation
model free
reinforcement learning algorithms
data mining
dynamic programming
empirical evidence
machine learning
learning process
optimal policy
selection algorithm
robotic control