From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations.
Evonne Ng, Javier Romero, Timur M. Bagautdinov, Shaojie Bai, Trevor Darrell, Angjoo Kanazawa, Alexander Richard
Published in: CoRR (2024)
Keyphrases
- multimedia
- cognitive systems
- artificial systems
- human communication
- signal processing
- ai researchers
- visual information
- human behavior
- audio visual
- artificial agents
- content analysis
- audio video
- audio stream
- real world
- neural network
- human agent interaction
- music score
- conversational speech
- data sets
- broadcast news
- cross modal
- human subjects
- visual features