Found 83 repositories (showing 30)
hellangleZ
A script that automatically switches Qwen3 between its thinking (reasoning) and non-thinking modes via an OpenAI-compatible API. The inference framework can be SGLang, or it can be adapted to use vLLM.
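The switch itself is just a request-level flag. A minimal sketch of the idea against an OpenAI-compatible endpoint, assuming Qwen3's `enable_thinking` chat-template flag is forwarded via `chat_template_kwargs` (the base URL and model name are placeholders, not taken from the repo):

```python
from openai import OpenAI

# Placeholder endpoint; works the same against SGLang or vLLM servers.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def ask(prompt: str, thinking: bool) -> str:
    resp = client.chat.completions.create(
        model="Qwen/Qwen3-8B",
        messages=[{"role": "user", "content": prompt}],
        # Extra body fields are passed through to the chat template;
        # Qwen3's template reads `enable_thinking` from there.
        extra_body={"chat_template_kwargs": {"enable_thinking": thinking}},
    )
    return resp.choices[0].message.content

print(ask("What is 17 * 23?", thinking=True))   # reasoning trace + answer
print(ask("What is 17 * 23?", thinking=False))  # direct answer
```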
igeniusai
Platform-agnostic toolkit to spin up vLLM endpoints and submit high-throughput jobs (DataFrame or scripts) across Slurm and DGX Cloud Lepton.
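The "high-throughput jobs" part typically amounts to fanning requests out concurrently and letting vLLM's continuous batching do the rest. A hedged sketch of that pattern, assuming an OpenAI-compatible vLLM endpoint (URL and model name are placeholders):

```python
import asyncio
from openai import AsyncOpenAI

# Placeholder endpoint, e.g. one spun up on Slurm or DGX Cloud Lepton.
client = AsyncOpenAI(base_url="http://vllm-endpoint:8000/v1", api_key="EMPTY")

async def complete(prompt: str) -> str:
    resp = await client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

async def run_batch(prompts: list[str]) -> list[str]:
    # vLLM batches concurrent requests server-side (continuous batching),
    # so client-side concurrency is enough to keep the GPUs busy.
    return await asyncio.gather(*(complete(p) for p in prompts))

print(asyncio.run(run_batch(["prompt one", "prompt two"])))
```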
oteroantoniogom
Automated Bash script to set up a high-performance environment on Ubuntu Linux with an RTX 5090, including installation of PyTorch, Unsloth, vLLM, Triton, and Xformers. The script handles system dependencies, creates a Python virtual environment, compiles libraries from source, and verifies the installations to ensure an optimal AI and deep learning setup.
sasha0552
CI scripts designed to build a Pascal-compatible version of vLLM.
JetBrains-Hardware
DGX Spark setup and vLLM deployment scripts for Qwen, GPT-OSS, and Nemotron 3.
A Terraform-based bootstrap script for standing up vLLM + Ray for Apple Silicon workloads
DFKI-NLP
Scripts to run large language models for text generation using vLLM.
diabloneo
My dev scripts and documents about vLLM development
brendanmckeag
Benchmarking scripts to run on RunPod to compare/contrast vLLM vs SGLang on the same prompts.
stt-anth
A script for converting Pixtral from the HF Transformers version to the Mistral/vLLM version
Hal9000AIML
Ubuntu Server edition: automated setup script for Intel Arc Pro B70 GPU LLM inference server with vLLM tensor parallelism. 140 tok/s on 2x B70, 540 tok/s on 4x B70. For Windows, see arc-pro-b70-inference-setup-windows.
ashleykleynhans
OpenAI Compatible API scripts for RunPod vLLM Worker
SURF-ML
vLLM inference scripts with SLURM and Apptainer
vashkelis
vLLM configuration script. Easily find an optimal configuration for your vLLM server and GPU, and evaluate VRAM usage and token throughput.
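For context, a VRAM estimate of this kind is roughly model weights plus KV cache. A back-of-the-envelope sketch of that arithmetic (all model dimensions below are illustrative assumptions, not taken from the repo):

```python
def estimate_vram_gb(
    n_params_b: float,      # parameters, in billions
    bytes_per_param: int,   # 2 for fp16/bf16, 1 for fp8/int8
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_len: int,
    kv_bytes: int = 2,      # fp16 KV cache
) -> float:
    weights = n_params_b * 1e9 * bytes_per_param
    # KV cache per token: 2 (K and V) * layers * kv_heads * head_dim * bytes
    kv_per_token = 2 * n_layers * n_kv_heads * head_dim * kv_bytes
    return (weights + kv_per_token * context_len) / 1e9

# e.g. an 8B Llama-like model in bf16 with a 32k-token KV cache:
print(f"{estimate_vram_gb(8, 2, 32, 8, 128, 32_768):.1f} GB")  # ~20.3 GB
```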
AI-DarwinLabs
🚀 Automated installation script for vLLM on HPC systems with ROCm support, optimized for AMD MI300X GPUs.
belalyahouni
Containerised NVIDIA Dynamo with vLLM Backend. Ready-to-use Docker environment for running NVIDIA’s Dynamo inference framework with vLLM. Includes pre-installed dependencies, service setup (etcd, nats-server), and example scripts for running prompts via batch commands. Streamlines LLM inference without local setup overhead.
minkim26
A simple, configurable Bash script to benchmark and compare inference performance between Llama.cpp and vLLM using the OpenAI-compatible `/v1/chat/completions` API
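The comparison boils down to timing one endpoint against the other. A Python rendering of the same idea (the repo itself is Bash; the ports and `model` field below are placeholders, and vLLM expects the served model's actual name):

```python
import time
import requests

SERVERS = {
    "llama.cpp": "http://localhost:8080/v1/chat/completions",
    "vLLM": "http://localhost:8000/v1/chat/completions",
}

payload = {
    "model": "default",  # placeholder; set per server
    "messages": [{"role": "user", "content": "Explain KV caching briefly."}],
    "max_tokens": 256,
}

for name, url in SERVERS.items():
    start = time.perf_counter()
    resp = requests.post(url, json=payload, timeout=300).json()
    elapsed = time.perf_counter() - start
    # Both servers report token counts in the OpenAI-style `usage` field.
    tokens = resp["usage"]["completion_tokens"]
    print(f"{name}: {tokens} tokens in {elapsed:.2f}s -> {tokens/elapsed:.1f} tok/s")
```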
openshift-psap
Scripts for vllm-model-bash efforts
bosung
Scripts for serving vLLM on multiple nodes
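Multi-node serving in vLLM generally means sharding the model across GPUs and nodes with tensor/pipeline parallelism on top of a Ray cluster. A hedged sketch, assuming a Ray cluster already spans the nodes (`ray start` on each); the model name and parallel sizes are illustrative:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-70B-Instruct",
    tensor_parallel_size=8,    # shard each layer across 8 GPUs
    pipeline_parallel_size=2,  # split layers across 2 nodes (16 GPUs total)
    distributed_executor_backend="ray",
)
out = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```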
g-jaffe
Scripts for programmatic launching of vLLM backends at TACC
teja-rao
Simple test scripts for RL between vLLM and torchtitan
clawd-xsl
SERA (AI2 Open Coding Agent) setup scripts - vLLM deployment for GPU servers
kevinbazira
Standalone LLM inference benchmarking pipelines on AMD GPUs using ROCm, vLLM, MAD, and data visualization scripts.
kunjcr2
This repo contains a fine-tuned LLaMA 3.2B model served using vLLM and Docker. The project includes a custom OpenAI-style API endpoint, benchmarking scripts, performance metrics, and monitoring setup. Designed for low-latency inference and production-ready LLM deployment.
leideng
vLLM and vllm-ascend scripts
tolgaakar
Small repo for storing vLLM server-related scripts and files.
tomasruizt
No description available
XinyiQiao
No description available
galtay
No description available
manekiyong
No description available