Transformer-Evolution-Paper
  • README
  • Mathematical Notation
  • Act
  • Arch
  • FFN
  • Head
  • Memory
  • MHA
  • Normalize_And_Residual
  • Pe
  • Pretrain
  • Softmax
  • Others
  • LongConv
  • Rnn
  • CrossAttention
  • Inference
  • Peft
  • LLM
    • LLM Details Summary
    • What Language Model to Train if You Have One Million GPU Hours?

LLM

  • LLM Details Summary
  • What Language Model to Train if You Have One Million GPU Hours?
