Reward-Punishment Reinforcement Learning with Maximum Entropy.
Jiexin Wang, Eiji Uchibe
Published in: CoRR (2024)
Keyphrases
- maximum entropy
- reinforcement learning
- agent receives
- maximum entropy principle
- Markov models
- state space
- eligibility traces
- reward function
- random fields
- principle of maximum entropy
- maximum entropy model
- Markov decision processes
- machine learning
- iterative scaling
- transformation based learning
- supervised learning
- model free
- temporal difference
- class conditional
- learning agent
- learning algorithm
- reinforcement learning algorithms
- transfer learning
- conditional random fields
- optimal policy
- average reward
- partially observable Markov decision processes
- ground truth
- hidden Markov models
- Bayesian networks
- minimum cross entropy