Transformer-Evolution-Paper

LLM

• LLM Details Summary
• What Language Model to Train if You Have One Million GPU Hours?