Found 5 repositories (showing 5)
BittnerPierre
Multi-agent AI research workflow with cloud API and llama.cpp support, retrieval via OpenAI vector_search or ChromaDB, Docker stacks (local & NVIDIA DGX Spark), and model benchmarking.
nerdpudding
Local LLM serving made manageable: llama.cpp in Docker with model profiles, interactive dashboard, benchmarking, and integration with Claude Code and AI tools.
shamily
Dockerized inference server and benchmarks for Gemma 4 26B on the NVIDIA DGX Spark (GB10). Features ARM64 CUDA 13 builds using llama.cpp.
A dockerized option to benchmark your llama.cpp server.
shuvanon
Run and benchmark Large Language Models (LLMs) locally with llama.cpp on GPU (Docker + WSL2). Includes helper scripts, quantisation benchmarks, and an OpenAI-compatible API server.
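Several of these repositories serve llama.cpp behind an OpenAI-compatible API (llama.cpp's `llama-server` exposes `/v1/chat/completions`). A minimal client sketch using only the Python standard library; the `http://localhost:8080` base URL and the `model` value are assumptions, since the actual port and model name depend on each repo's Docker configuration:

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint.

    base_url and model are assumptions: llama-server listens on
    localhost:8080 by default, and typically serves whatever model
    it was launched with regardless of the `model` field.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (requires a running server, so not executed here):
# req = build_chat_request("http://localhost:8080", "local", "Hello")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```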
All 5 repositories loaded