Found 56 repositories (showing 30)
EleutherAI
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
bigscience-workshop
Ongoing research training transformer language models at scale, including: BERT & GPT-2
genggui001
No description available
ModelTC
Built upon Megatron-DeepSpeed and the HuggingFace Trainer, EasyLLM reorganizes the code with a focus on usability while preserving training efficiency.
FreedomIntelligence
Fast LLM training codebase with dynamic strategy selection [DeepSpeed + Megatron + FlashAttention + CUDA fused kernels + compiler]
DataStates
LLM checkpointing for DeepSpeed/Megatron
Anonymous1252022
No description available
yuguo-Jack
GLM-Pretrain in Megatron-Deepspeed for DCU
SulRash
Minimal yet high-performance code for pretraining LLMs. Attempts to implement some SOTA features. Implements training through DeepSpeed, Megatron-LM, and FSDP. WIP
kojimano
No description available
woojinsoh
Execute Megatron-DeepSpeed using Slurm for multi-node distributed training
okoge-kaz
Turing Tech Blog Repository
llm-jp
A fork of microsoft/Megatron-DeepSpeed.
Eugene29
Fork of Megatron-DeepSpeed with ViT bug fixes and model parallelism (TP, TP-SP, Ulysses, etc.) enabled for ViT. Pipeline parallelism is not yet enabled.
George614
GPU Memory Calculator for LLM Training - Calculate GPU memory requirements for training Large Language Models with support for multiple training engines including PyTorch DDP, DeepSpeed ZeRO, Megatron-LM, and FSDP.
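As a rough illustration of the kind of estimate such a calculator performs (a minimal sketch under common assumptions, not the repository's actual formula): mixed-precision Adam training is often approximated at ~16 bytes per parameter of model state (fp16 weights + fp16 gradients + fp32 master weights, momentum, and variance), with ZeRO stages sharding progressively more of these states across the data-parallel group.

```python
def model_state_bytes(num_params: int, zero_stage: int = 0, dp_degree: int = 1) -> float:
    """Rough per-GPU model-state memory for mixed-precision Adam training.

    Assumes the common 16-bytes/param breakdown:
      2 (fp16 weights) + 2 (fp16 grads) + 12 (fp32 master weights + Adam m, v).
    ZeRO stages shard progressively more state across dp_degree GPUs.
    Illustrative approximation only; excludes activations and fragmentation.
    """
    weights, grads, optim = 2.0, 2.0, 12.0
    if zero_stage >= 1:          # ZeRO-1 shards optimizer states
        optim /= dp_degree
    if zero_stage >= 2:          # ZeRO-2 also shards gradients
        grads /= dp_degree
    if zero_stage >= 3:          # ZeRO-3 also shards the fp16 weights
        weights /= dp_degree
    return num_params * (weights + grads + optim)

# Example: a 7B-parameter model with ZeRO-2 across 8 GPUs
gib = model_state_bytes(7_000_000_000, zero_stage=2, dp_degree=8) / 2**30
print(f"~{gib:.1f} GiB of model state per GPU")
```

Activation memory, which depends on batch size, sequence length, and checkpointing strategy, must be estimated separately.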
wangbluo
Build a LLaMA fine-tuning script from scratch using PyTorch and the transformers API. It needs to support four optional features: gradient checkpointing, mixed precision, data parallelism, and tensor parallelism. Do not use the ColossalAI/Megatron/DeepSpeed frameworks; you may refer to their code.
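The tensor-parallelism feature named above can be sketched in a single process by splitting a linear layer's weight columns across simulated ranks; concatenating the per-rank partial outputs stands in for the all-gather a real distributed run would perform (a minimal NumPy sketch, not code from the repository):

```python
import numpy as np

def column_parallel_matmul(x, w, world_size):
    """Simulate a column-split tensor-parallel linear layer on one process.

    The weight's output dimension is split into world_size shards; each
    simulated "rank" computes its partial output, and concatenation stands
    in for the all-gather of a real distributed run. Illustrative only.
    """
    shards = np.split(w, world_size, axis=1)      # shard weight columns
    partials = [x @ shard for shard in shards]    # per-rank local matmul
    return np.concatenate(partials, axis=-1)      # "all-gather" of outputs

# Sanity check: the sharded computation matches the unsharded one.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w = rng.normal(size=(8, 16))
assert np.allclose(column_parallel_matmul(x, w, world_size=4), x @ w)
```

A real implementation would replace the Python loop with one local matmul per GPU and a `torch.distributed.all_gather` collective, but the arithmetic decomposition is the same.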
henkdr
Megatron-Deepspeed benchmark on LUMI
okoge-kaz
For details on environment setup, see the link below.
kungfu-team
Checkpoint structure with Deepspeed and Megatron-LM
hannawong
No description available
Gabriel4256
No description available
jianbangzhang
Large language models with Chinese-language support
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Xuweijia-buaa
No description available
Eutenacity
No description available
zhenghh04
No description available
hannawong
No description available
quorvath
This repository is based on Megatron-DeepSpeed, incorporating the block coordinate descent method for training large-scale models.
jmerizia
A (WIP) lightweight implementation of DeepSpeed/Megatron-LM style 3D parallelism, along with some models and helpful utilities.
anilatambharii
LLM Pretraining Framework (100B+ Params): Megatron-LM + DeepSpeed + FSDP. Open-source, HPC-ready system with tiny GPT simulation, distributed training, tokenizer tools, dataset pipelines, and deployment scripts for Slurm, AWS, Azure, and Docker.