Found 59 repositories (showing 30)
lyogavin
AirLLM 70B inference with single 4GB GPU
nickzsche21
Run the OpenClaw AI agent with zero API cost, using a local LLM via AirLLM
ManuelSLemos
Run 70B+ LLMs on a single 4GB GPU — no quantization required.
lyogavin
Moved to https://github.com/lyogavin/airllm
Thirumurugan240
No description available
velumkai
Run 70B+ LLMs on consumer GPUs using layer streaming
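Several of these repositories describe the same core idea: layer streaming, where only one transformer layer's weights are resident on the GPU at a time. The sketch below is a minimal toy illustration of that residency pattern, not the AirLLM API; the layers here are stand-in functions, and `make_layer`/`stream_inference` are hypothetical names for illustration.

```python
# Toy sketch of layer streaming: load one layer, apply it, free it,
# then move to the next — so peak resident layers stays at 1.

def make_layer(weight):
    """Stand-in for loading one transformer layer's weights from disk."""
    return lambda x: [v * weight for v in x]

def stream_inference(x, layer_weights):
    resident = 0       # layers currently "on device"
    peak_resident = 0  # maximum layers resident at any point
    for w in layer_weights:
        layer = make_layer(w)   # load only this layer
        resident += 1
        peak_resident = max(peak_resident, resident)
        x = layer(x)            # run it on the activations
        del layer               # free before loading the next layer
        resident -= 1
    return x, peak_resident

out, peak = stream_inference([1.0, 2.0], [2.0, 3.0])
# out == [6.0, 12.0]; peak == 1 (only one layer resident at a time)
```

The trade-off is clear from the loop: GPU memory stays bounded by a single layer, but every forward pass pays the cost of reloading each layer's weights, which is why projects below (e.g. "super-block residency and deep prefetching") focus on hiding that I/O latency.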
howardleegeek
AirLLM++: Faster than AirLLM with super-block residency and deep prefetching
yhinsson
🚀 Optimize memory for large language models, enabling 70B models on a 4GB GPU and 405B Llama3.1 on 8GB VRAM without compression techniques.
vinay00011
This is my first personal project.
Web3Tester2023
AirLLM 70B inference with single 4GB GPU
nahharris
A TUI for AirLLM
JockDaRock
No description available
BretMcDanel
OpenAI compatible server for AirLLM
rileyseaburg
High-performance layer-wise LLM inference in Rust. Memory-efficient inference for models that don't fit in GPU memory.
MasterX1582
OpenAI-compatible API server for running Qwen3.5-397B-A17B locally using AirLLM.
demersaj
Quick test of AirLLM - running 405B parameter model on a laptop
den-rgb
AirLLM support for AMD graphics cards
rileyseaburg
Run GLM-4.7 (358B) on 8GB VRAM with AirLLM
erwinnicholas
Rust Layer-Streamed Inference Runtime (AirLLM-style control plane)
Meghanaths
No description available
davidjoffe
AirLLM test
priyanshjain117
No description available
SunilRathodRivulet
No description available
balic-AI-ML-R-D-Resources
No description available
xencon
AirLLM Docker container
Mahin-katariya
Local-first AI debugging assistant for hardware verification (Buscraft++)
MahinK
This is a sub-repo of the main project.
pratikbhande
No description available
vegeta03
No description available
chinkan
No description available