What about Inputting Policy in Value Function: Policy Representation and Policy-Extended Value Function Approximator.
Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang
Published in: AAAI (2022)
Keyphrases
- function approximators
- optimal policy
- reinforcement learning
- policy gradient
- reinforcement learning problems
- function approximation
- Markov decision problems
- neural network
- decision making
- training set
- policy search
- policy gradient methods
- control policy
- action space
- action selection
- learning tasks
- evolutionary computation
- active learning
- evolutionary algorithm
- feature selection