Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data?
Qingkai Fang, Shaolei Zhang, Zhengrui Ma, Min Zhang, Yang Feng
Published in: ACL (1) (2024)
Keyphrases
- high quality
- speech recognition
- data sets
- audio stream
- audio visual
- raw data
- endpoint detection
- database
- prior knowledge
- speech signal
- experimental data
- recognition engine
- emotion recognition
- data points
- knowledge discovery
- image quality
- data processing
- low quality
- synthetic data
- spatial data
- data sources
- dialogue system
- training data
- original data
- data structure
- statistical analysis
- data collection
- acoustic signals
- clustering algorithm
- xml documents
- text to speech
- databases
- automatic speech recognition
- data distribution
- data analysis
- high dimensional data
- computer systems