Transformer-Evolution-Paper
  • README
  • 数学符号 (Mathematical Notation)
  • Act
  • Arch
  • FFN
  • Head
  • Memory
  • MHA
    • FFT
    • LocalGlobal
    • MatrixMethod
    • RightProduct
      • Kronecker Attention Networks
      • An Attention Free Transformer
      • Transformer with Fourier Integral Attentions
      • Linear Complexity Randomized Self-attention Mechanism
      • UFO-ViT: High Performance Linear Vision Transformer without Softmax
      • XCiT: Cross-Covariance Image Transformers
      • SimpleTRON: Simple Transformer with O(N) Complexity
      • A Dot Product Attention Free Transformer
      • On Learning the Transformer Kernel
      • Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
    • SparseOrLowRank
    • Others
  • Normalize_And_Residual
  • Pe
  • Pretrain
  • Softmax
  • Others
  • LongConv
  • Rnn
  • CrossAttention
  • Inference
  • Peft
  • LLM
MHA

RightProduct


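The grouping name refers to the trick the papers in this subsection share in some form: evaluating attention as a right-to-left product, so the n × n attention matrix is never materialized and the cost drops from O(n²·d) to roughly O(n·d²). The sketch below only illustrates that idea; the elu(x)+1 feature map, the shapes, and all function and variable names are assumptions for illustration, not the method of any single paper listed above.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: forms the full n x n weight matrix, O(n^2 * d).
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def right_product_attention(Q, K, V):
    # "Right product" / linearized attention: phi(Q) @ (phi(K).T @ V), O(n * d^2).
    # phi(x) = elu(x) + 1 is one commonly used positive feature map; assumed here for illustration.
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V                               # (d, d_v): aggregate keys with values first
    z = Qf @ Kf.sum(axis=0, keepdims=True).T    # (n, 1): per-query normalizer
    return (Qf @ kv) / z

rng = np.random.default_rng(0)
n, d = 6, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
print(softmax_attention(Q, K, V).shape)        # (6, 4)
print(right_product_attention(Q, K, V).shape)  # (6, 4)
```

The two functions return the same shapes but not the same values; the papers collected above differ mainly in which feature map, kernel, or normalization stands in for the softmax and in how close the result stays to standard attention.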