Shivam Mehta

Publication Activity (10 Years)

Years Active: 2020-2024
Publications (10 Years): 19

Top Topics

Prosodic Features

Sequence Labeling

Spontaneous Speech

Artificial Neural

Publications

Shivam Mehta, Ruibo Tu, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
Unified Speech and Gesture Synthesis Using Flow Matching. ICASSP (2024)
Shivam Mehta, Harm Lameris, Rajiv Punmiya, Jonas Beskow, Éva Székely, Gustav Eje Henter
Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech. CoRR (2024)
Shivam Mehta, Ruibo Tu, Jonas Beskow, Éva Székely, Gustav Eje Henter
Matcha-TTS: A Fast TTS Architecture with Conditional Flow Matching. ICASSP (2024)
Shivam Mehta, Anna Deichler, Jim O'Regan, Birger Moëll, Jonas Beskow, Gustav Eje Henter, Simon Alexanderson
Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis. CoRR (2024)
Shivam Mehta, Ruibo Tu, Jonas Beskow, Éva Székely, Gustav Eje Henter
Matcha-TTS: A fast TTS architecture with conditional flow matching. CoRR (2023)
Harm Lameris, Shivam Mehta, Gustav Eje Henter, Joakim Gustafson, Éva Székely
Prosody-Controllable Spontaneous TTS with Neural HMMs. ICASSP (2023)
Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis. CoRR (2023)
Ronald Cumbal, Agnes Axelsson, Shivam Mehta, Olov Engwall
Stereotypical nationality representations in HRI: perspectives from international young adults. Frontiers Robotics AI 10 (2023)
Ambika Kirkland, Shivam Mehta, Harm Lameris, Gustav Eje Henter, Éva Székely, Joakim Gustafson
Stuck in the MOS pit: A critical analysis of MOS test methodology in TTS evaluation. SSW (2023)
Shivam Mehta, Ruibo Tu, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
Unified speech and gesture synthesis using flow matching. CoRR (2023)
Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis. SSW (2023)
Anna Deichler, Shivam Mehta, Simon Alexanderson, Jonas Beskow
Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation. CoRR (2023)
Shivam Mehta, Ambika Kirkland, Harm Lameris, Jonas Beskow, Éva Székely, Gustav Eje Henter
OverFlow: Putting flows on top of neural transducers for better TTS. INTERSPEECH (2023)
Anna Deichler, Shivam Mehta, Simon Alexanderson, Jonas Beskow
Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation. ICMI (2023)
Shivam Mehta, Ambika Kirkland, Harm Lameris, Jonas Beskow, Éva Székely, Gustav Eje Henter
OverFlow: Putting flows on top of neural transducers for better TTS. CoRR (2022)
Harm Lameris, Shivam Mehta, Gustav Eje Henter, Joakim Gustafson, Éva Székely
Prosody-controllable spontaneous TTS with neural HMMs. CoRR (2022)
Shivam Mehta, Éva Székely, Jonas Beskow, Gustav Eje Henter
Neural HMMs Are All You Need (For High-Quality Attention-Free TTS). ICASSP (2022)
Shivam Mehta, Éva Székely, Jonas Beskow, Gustav Eje Henter
Neural HMMs are all you need (for high-quality attention-free TTS). CoRR (2021)
Shivam Mehta, Ivan Smettanikov
Finding the Blank with Sequence Labeling for English Learning. CCRIS (2020)