Category: Machine Learning

2023

07-12

Transformer Inference Acceleration: KV Cache

07-11

PyTorch FSDP Data Parallelism (Fully Sharded Data Parallel) Explained

07-10

MegatronLM Sequence Model Parallel Training (Sequence Parallel) Explained

07-09

MegatronLM Tensor Model Parallel Training (Tensor Parallel) Explained

07-08

MegatronLM Pipeline Model Parallel Training (Pipeline Parallel) Explained

07-01

Megatron-LM Source Code Series (1): Model Parallel Initialization

06-29

FlashAttention Explained: A Powerful Tool for Accelerating LLM Training

06-27

The LoRA Adapter Fine-Tuning Method for Large Models Explained (Implementation Code Included)

06-24

The Prompt Tuning Fine-Tuning Method for Large Models Explained (Implementation Code Included)

06-19

Paper Reading: GPT-3 (Language Models are Few-Shot Learners)