Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning.
Bingbing Li, Zhenglun Kong, Tianyun Zhang, Ji Li, Zhengang Li, Hang Liu, Caiwen Ding
Published in: CoRR (2020)
Keyphrases
- low cost
- parallel architectures
- real world
- tree construction
- natural language
- hardware and software
- real time
- fault diagnosis
- pruning strategy
- computing systems
- programming language
- fuzzy logic
- computer systems
- language learning
- small scale
- search space
- massively parallel
- relational databases
- search algorithm
- knowledge base
- semantic representations
- high voltage
- effective pruning