Towards Multi-Objective Object Push-Grasp Policy Based on Maximum Entropy Deep Reinforcement Learning under Sparse Rewards.
Tengteng Zhang, Hongwei Mo
Published in: Entropy (2024)
Keyphrases
- maximum entropy
- reinforcement learning
- multi objective
- optimal policy
- reward function
- policy search
- maximum entropy principle
- markov decision processes
- markov models
- control policy
- markov decision process
- action selection
- evolutionary algorithm
- maximum entropy model
- 3D objects
- state space
- principle of maximum entropy
- conditional random fields
- policy iteration
- reinforcement learning algorithms
- class conditional
- markov decision problems
- iterative scaling
- partially observable markov decision processes
- temporal difference
- transformation based learning
- minimum cross entropy
- average reward
- machine learning
- model free
- generative model
- particle swarm optimization
- image processing
- bregman divergences
- expected reward
- reward shaping
- least squares
- similarity measure
- learning algorithm