Found 9 repositories (showing 9)
deepspeedai
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. (A minimal usage sketch follows this list.)
hzg0601
dev for deepspeed-mii
heiko-hotz
Experiments with DeepSpeed MII library
heiko-hotz
Testing DeepSpeed MII
PenguinQwQ
DeepSpeed-MII for UniSpar Project
sfc-gh-vichan
test
tonyzhao-jt
Modification of MII
slinusc
Launch your own high-performance DeepSpeed-MII server for seamless local LLM deployment. This repository provides a Dockerized solution to serve Hugging Face models (e.g., Mistral-7B) with an OpenAI-compatible API, enabling GPU-accelerated, low-latency inference out of the box. (A client sketch follows this list.)
henryekeocha
Reproducible benchmarking suite for LLM inference stacks (vLLM, TGI, llama.cpp, Ollama, DeepSpeed-MII) on Kubernetes. Measures throughput, latency, GPU utilisation, and cost-per-token under production-grade K8s conditions including auto-scaling and pod scheduling overhead. (The cost-per-token arithmetic is sketched after this list.)
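
For orientation, the deepspeedai entry above is the MII library itself. A minimal, non-persistent inference sketch, assuming the post-FastGen `mii.pipeline` API; the model choice and generation parameters are illustrative, not taken from any of the repositories listed:

```python
# Minimal sketch of non-persistent DeepSpeed-MII inference.
# Assumes the post-FastGen `mii.pipeline` API; model id is illustrative.
import mii

# Load a Hugging Face model into an MII inference pipeline.
pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")

# Run batched generation; max_new_tokens caps the generated length.
responses = pipe(["DeepSpeed is", "Seattle is"], max_new_tokens=64)
for r in responses:
    print(r.generated_text)
```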
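The slinusc entry advertises an OpenAI-compatible API, so a standard OpenAI client should be able to query such a server once it is running. A hedged sketch, assuming the server is reachable at localhost:8000 and serves the model id shown; the address, port, and model id are assumptions, not details from the repository:

```python
# Hedged sketch: querying an OpenAI-compatible DeepSpeed-MII server.
# The base_url, port, and model id are assumptions for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local server address
    api_key="not-needed",                 # local servers typically ignore the key
)

completion = client.completions.create(
    model="mistralai/Mistral-7B-v0.1",    # assumed model id
    prompt="Explain DeepSpeed-MII in one sentence.",
    max_tokens=64,
)
print(completion.choices[0].text)
```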
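Finally, the henryekeocha entry reports cost-per-token, which reduces to simple arithmetic over GPU price and measured throughput. A sketch with made-up example numbers:

```python
# Hedged sketch of the cost-per-token arithmetic such a benchmark might report.
# Both figures below are made-up example numbers, not measured results.
gpu_cost_per_hour = 2.50          # assumed on-demand price for one GPU, USD
throughput_tokens_per_sec = 1500  # assumed measured generation throughput

cost_per_token = gpu_cost_per_hour / 3600 / throughput_tokens_per_sec
print(f"${cost_per_token:.2e} per token")  # ~4.6e-07 USD/token with these numbers
```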