SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations.
Ioannis Tsiamas, José A. R. Fonollosa, Marta R. Costa-jussà
Published in: EMNLP (Findings) (2023)
Keyphrases
- data sets
- raw data
- data quality
- database
- statistical analysis
- data processing
- data analysis
- original data
- image segmentation
- spatial data
- data points
- multiscale
- databases
- data structure
- hidden markov models
- data collection
- computer systems
- training data
- segmentation algorithm
- probability distribution
- data distribution
- multimedia data
- data sources
- spectral clustering
- image analysis