Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning.
Yifang Chen, Shuohang Wang, Ziyi Yang, Hiteshi Sharma, Nikos Karampatziakis, Donghan Yu, Kevin G. Jamieson, Simon Shaolei Du, Yelong Shen
Published in: CoRR (2024)
Keyphrases
- cost effective
- model construction
- active learning
- reward function
- cost effectiveness
- low cost
- influence diagrams
- optimal policy
- expected reward
- pose estimation
- reinforcement learning
- decision problems
- learning algorithm
- semi supervised
- training set
- Markov decision processes
- dynamic programming
- supervised learning
- object recognition
- machine learning
- control policy
- computer vision
- data sets
- real time