DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling & DeLighT: Deep and Light-weight Transformer