IndicVoices: Towards building an Inclusive Multilingual Speech Dataset for Indian Languages.
Tahir Javed, Janki Nawale, Eldho Ittan George, Sakshi Joshi, Kaushal Santosh Bhogale, Deovrat Mehendale, Ishvinder Virender Sethi, Aparna Ananthanarayanan, Hafsah Faquih, Pratiti Palit, Sneha Ravishankar, Saranya Sukumaran, Tripura Panchagnula, Sunjay Murali, Kunal Sharad Gandhi, Ambujavalli R, Manickam K. M, C. Venkata Vaijayanthi, Krishnan Srinivasa Raghavan Karunganni, Pratyush Kumar, Mitesh M. Khapra
Published in: ACL (Findings) (2024)
Keyphrases
- indian languages
- multi lingual
- cross lingual
- language identification
- spoken language
- cross lingual information retrieval
- document images
- speech recognition
- audio visual
- english text
- speaker identification
- information access
- machine learning
- dialogue system
- cross language
- digital libraries
- automatic speech recognition
- broadcast news
- wordnet
- document image analysis
- maximum likelihood
- feature selection
- search engine