How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider Transformer Models.

Xin Lu, Yanyan Zhao, Bing Qin
Published in: CoRR (2024)
Keyphrases