ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback.
Ju-Seung Byun, Jiyun Chun, Jihyung Kil, Andrew Perrault
Published in: CoRR (2024)
Keyphrases
- multi modal
- fine tuning
- reinforcement learning
- knowledge representation and reasoning
- knowledge representation
- supervised learning
- learning algorithm
- artificial intelligence
- machine learning
- viable alternative
- fine tune
- fine tuned
- multi modality
- expert systems
- audio visual
- state space
- knowledge base
- high dimensional
- relevance feedback
- case based reasoning
- cross modal
- truth maintenance systems
- semantic concepts
- fusing multiple
- smart room