Replacing Human Audio with Synthetic Audio for on-Device Unspoken Punctuation Prediction.
Daria Soboleva, Ondrej Skopek, Márius Sajgalík, Victor Carbune, Felix Weissenberger, Julia Proskurnia, Bogdan Prisacari, Daniel Valcarce, Justin Lu, Rohit Prabhavalkar, Balint Miklos
Published in: ICASSP (2021)
Keyphrases
- multimedia
- audio video
- signal processing
- audio visual
- music score
- audio signals
- visual information
- prediction accuracy
- cross modal
- music information retrieval
- human language
- emotion recognition
- broadcast news
- feature vectors
- artificial intelligence
- human activities
- prediction model
- human behavior
- multimedia information
- e learning
- real images are presented