Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation.
Sravya Popuri, Peng-Jen Chen, Changhan Wang, Juan Pino, Yossi Adi, Jiatao Gu, Wei-Ning Hsu, Ann Lee
Published in: INTERSPEECH (2022)
Keyphrases
- data sets
- speech recognition
- synthetic data
- data analysis
- image data
- database
- data points
- data mining techniques
- labelled data
- speech signal
- data quality
- missing data
- data collection
- data processing
- input data
- small number
- data structure
- high quality
- knowledge discovery
- probability distribution
- xml documents
- training samples
- training data
- original data
- database systems
- training process
- social networks
- translation model
- neural network
- audio stream
- hearing impaired