Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention.
Biao Zhang, Ivan Titov, Rico Sennrich
Published in: EMNLP/IJCNLP (1) (2019)
Keyphrases
- fuzzy logic
- depth information
- fault diagnosis
- focus of attention
- visual attention
- depth map
- evolutionary algorithm
- real time
- probabilistic model
- deep learning
- depth data
- view synthesis
- time of flight
- search algorithm
- database systems
- three dimensional
- e learning
- information systems
- computer vision
- information retrieval
- data mining
- real world