Shivam Mehta
Publication Activity (10 Years)
Years Active: 2020-2024
Publications (10 Years): 19
Top Topics
Prosodic Features
Sequence Labeling
Spontaneous Speech
Artificial Neural
Top Venues
CoRR
ICASSP
SSW
CCRIS
Publications
Shivam Mehta, Ruibo Tu, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
Unified Speech and Gesture Synthesis Using Flow Matching.
ICASSP (2024)
Shivam Mehta, Harm Lameris, Rajiv Punmiya, Jonas Beskow, Éva Székely, Gustav Eje Henter
Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech.
CoRR (2024)
Shivam Mehta, Ruibo Tu, Jonas Beskow, Éva Székely, Gustav Eje Henter
Matcha-TTS: A Fast TTS Architecture with Conditional Flow Matching.
ICASSP (2024)
Shivam Mehta, Anna Deichler, Jim O'Regan, Birger Moëll, Jonas Beskow, Gustav Eje Henter, Simon Alexanderson
Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis.
CoRR (2024)
Shivam Mehta, Ruibo Tu, Jonas Beskow, Éva Székely, Gustav Eje Henter
Matcha-TTS: A fast TTS architecture with conditional flow matching.
CoRR (2023)
Harm Lameris, Shivam Mehta, Gustav Eje Henter, Joakim Gustafson, Éva Székely
Prosody-Controllable Spontaneous TTS with Neural HMMs.
ICASSP (2023)
Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis.
CoRR (2023)
Ronald Cumbal, Agnes Axelsson, Shivam Mehta, Olov Engwall
Stereotypical nationality representations in HRI: perspectives from international young adults.
Frontiers Robotics AI 10 (2023)
Ambika Kirkland, Shivam Mehta, Harm Lameris, Gustav Eje Henter, Éva Székely, Joakim Gustafson
Stuck in the MOS pit: A critical analysis of MOS test methodology in TTS evaluation.
SSW (2023)
Shivam Mehta, Ruibo Tu, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
Unified speech and gesture synthesis using flow matching.
CoRR (2023)
Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis.
SSW (2023)
Anna Deichler, Shivam Mehta, Simon Alexanderson, Jonas Beskow
Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation.
CoRR (2023)
Shivam Mehta, Ambika Kirkland, Harm Lameris, Jonas Beskow, Éva Székely, Gustav Eje Henter
OverFlow: Putting flows on top of neural transducers for better TTS.
INTERSPEECH (2023)
Anna Deichler, Shivam Mehta, Simon Alexanderson, Jonas Beskow
Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation.
ICMI (2023)
Shivam Mehta, Ambika Kirkland, Harm Lameris, Jonas Beskow, Éva Székely, Gustav Eje Henter
OverFlow: Putting flows on top of neural transducers for better TTS.
CoRR (2022)
Harm Lameris, Shivam Mehta, Gustav Eje Henter, Joakim Gustafson, Éva Székely
Prosody-controllable spontaneous TTS with neural HMMs.
CoRR (2022)
Shivam Mehta, Éva Székely, Jonas Beskow, Gustav Eje Henter
Neural HMMs Are All You Need (For High-Quality Attention-Free TTS).
ICASSP (2022)
Shivam Mehta, Éva Székely, Jonas Beskow, Gustav Eje Henter
Neural HMMs are all you need (for high-quality attention-free TTS).
CoRR (2021)
Shivam Mehta, Ivan Smettanikov
Finding the Blank with Sequence Labeling for English Learning.
CCRIS (2020)