Transformer-Evolution-Paper

LongConv

• Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks
• Parallelizing Legendre Memory Unit Training
• Simplified State Space Layers for Sequence Modeling
• Pretraining Without Attention
• What Makes Convolutional Models Great on Long Sequence Modeling?
• Hungry Hungry Hippos: Towards Language Modeling with State Space Models
• Hyena Hierarchy: Towards Larger Convolutional Language Models
• RWKV
• Simple Hardware-Efficient Long Convolutions for Sequence Modeling
• Time-aware Large Kernel Convolutions
• Resurrecting Recurrent Neural Networks for Long Sequences
• CKConv: Continuous Kernel Convolution For Sequential Data
• FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes
• Towards a General Purpose CNN for Long Range Dependencies in ND