Found 69 repositories (showing 30)
InternLM
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
PeterH0323
Streamer-Sales "Top Seller" — a livestream sales LLM 🛒🎁 that pitches products based on their given features, crafted to spark users' purchase intent. 🚀⭐ Includes a detailed data-generation pipeline❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG retrieval-augmented generation 📚, TTS text-to-speech 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, ASR speech-to-text 🎙️, a Vue-based front end 🍍, a FastAPI back end 🗝️, and Docker-compose packaging and deployment 🐋
SmartFlowAI
Llama3-Tutorial (XTuner, LMDeploy, OpenCompass)
tensorchord
Autoscale LLM inference (vLLM, SGLang, LMDeploy) on Kubernetes and other platforms
shell-nlp
gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, TTS, text-to-image, image editing, and text-to-video services.
JimmyMa99
A role-playing multi-LLM chat room powered by fine-tuned InternLM2, built from the original text of Journey to the West, its vernacular translation, and ChatGPT-generated data. This project covers everything about role-playing LLMs: from data acquisition and processing, to fine-tuning with XTuner and deploying to OpenXLab, then serving with LMDeploy through an OpenAI-style API into a simple chat room, where LLMs playing different characters can be watched conversing with, and bantering against, one another.
cavedweller509
Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function independently without continuous internet access.
Alannikos
In this fast-paced world, we all need a little something to spice up life. Whether you need a glass of sweet talk to lift your spirits or a dose of sharp retorts to let off steam, FunGPT has got you covered 🎉!
Deep-Spark
DeepSparkInference has selected 216 inference models of both small and large sizes. The small models cover fields such as computer vision, natural language processing, and speech recognition; the LLMs involve various frameworks including vLLM, TGI and LMDeploy. This repository is the mirror of Gitee.
bentoml
Self-host LLMs with LMDeploy and BentoML
zh-nj
This project is developed specifically for the V100, based on lmdeploy 0.12.1, and supports mainstream open-source models from Q4 2025 to Q1 2026. It does not account for compatibility with other architectures and has only been tested on an 8× V100 32GB setup.
zhyncs
Nightly Build for LMDeploy
slinusc
Bench360 is a modular benchmarking suite for local LLM deployments. It offers a full-stack, extensible pipeline to evaluate the latency, throughput, quality, and cost of LLM inference on consumer and enterprise GPUs. Bench360 supports flexible backends, tasks and scenarios, enabling fair and reproducible comparisons for researchers & practitioners.
nguyen599
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
LemonKAI
No description available
coolccds
Based on the official lmdeploy PR 4389 branch; modifies some files on top of its TurboMind engine to support sm75-architecture GPUs such as the 2080 Ti and T10. Text-only models are supported.
abs7798
Using lmdeploy
NagatoYuki0943
Run Llama3.1 with LMDeploy
wtlow003
Examples of serving LLM on Modal.
A multi-modal Large Language Model (LLM) application deployed on Modal.com that can process images and generate text descriptions using InternVL with LMDeploy.
ZCI-Tech
Predictive GPU failure detection for inference startups. S++9 Blackwell thermal: 12-min advance warning before throttling. 125M events/sec throughput. Framework-agnostic (vLLM, SGLang, LMDeploy).
Trangle
lmdeploy-dev
Isekai-Creation
No description available
galadriel-ai
No description available
hui1feng
No description available
sjzhou4
No description available
llxcfamily
No description available
Lb1002
Lesson 5 homework
LiyanJin
No description available
rsjeeva
No description available