Transformer-Evolution-Paper

RightProduct

• Kronecker Attention Networks
• An Attention Free Transformer
• Transformer with Fourier Integral Attentions
• Linear Complexity Randomized Self-attention Mechanism
• UFO-ViT: High Performance Linear Vision Transformer without Softmax
• XCiT: Cross-Covariance Image Transformers
• SimpleTRON: Simple Transformer with O(N) Complexity
• A Dot Product Attention Free Transformer
• On Learning the Transformer Kernel
• Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
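The papers above linearize attention by exploiting associativity of matrix products: instead of forming the N×N matrix Q Kᵀ and then multiplying by V, they apply a feature map to Q and K and compute the right product Kᵀ V first, so the cost drops from O(N²d) to O(Nd²). Below is a minimal, non-causal sketch of this generic right-product formulation; it is not taken from any one of the listed papers, and the feature map elu(x)+1 and the name linear_attention are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def linear_attention(q, k, v, eps=1e-6):
    """Non-causal right-product attention sketch (illustrative, not from a specific paper).

    q, k, v: tensors of shape (batch, heads, seq_len, head_dim).
    Instead of softmax(Q K^T) V, map Q and K through a positive feature map
    and use associativity: phi(Q) (phi(K)^T V), costing O(N * d^2) instead of O(N^2 * d).
    """
    # phi(x) = elu(x) + 1 keeps features positive (a common illustrative choice).
    q = F.elu(q) + 1.0
    k = F.elu(k) + 1.0

    # Right product first: a (head_dim x head_dim) summary of keys and values.
    kv = torch.einsum('bhnd,bhne->bhde', k, v)          # (b, h, d, d)
    # Row-wise normalizer, analogous to the softmax denominator.
    z = torch.einsum('bhnd,bhd->bhn', q, k.sum(dim=2))  # (b, h, n)
    out = torch.einsum('bhnd,bhde->bhne', q, kv) / (z.unsqueeze(-1) + eps)
    return out


if __name__ == "__main__":
    q = torch.randn(2, 4, 128, 32)
    k = torch.randn(2, 4, 128, 32)
    v = torch.randn(2, 4, 128, 32)
    print(linear_attention(q, k, v).shape)  # torch.Size([2, 4, 128, 32])
```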
