MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation.

Zihao Wang, Haoxuan Liu, Jiaxing Yu, Tao Zhang, Yan Liu, Kejun Zhang
Published in: CoRR (2024)
Keyphrases
  • high level
  • image alignment
  • shape description
  • real world
  • decision trees
  • three dimensional
  • generation process
  • word alignment