Transformer-Evolution-Paper

Others

  • Accelerating Neural Transformer via an Average Attention Network
  • Do Transformer Modifications Transfer Across Implementations and Applications?
  • Object-Centric Learning with Slot Attention
  • Why self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries
