Languages Through the Looking Glass of BPE Compression.
Ximena Gutierrez-Vasques, Christian Bentz, Tanja Samardzic
Published in: Comput. Linguistics (2023)
Keyphrases
- compression ratio
- expressive power
- data compression
- image compression
- compression algorithm
- compression scheme
- language independent
- machine learning
- multi lingual
- database
- neural network
- database systems
- text summarization
- compressed data
- databases
- text compression
- language identification
- lossless compression
- search engine
- syntactic and semantic dependencies