Transformer-Evolution-Paper

LLM

• LLM Details Summary
• What Language Model to Train if You Have One Million GPU Hours?