Found 470 repositories (showing 30)
abetlen
Python bindings for llama.cpp
Woolverine94
A self-hosted web UI for 30+ generative AI tasks
nomic-ai
Officially supported Python bindings for llama.cpp + GPT4All
TheBlewish
A Python-based web-assisted large language model (LLM) search assistant using Llama.cpp
jasonacox
Set up and run a local LLM and chatbot using consumer-grade hardware.
thomasantony
Python bindings for llama.cpp
ddh0
Python package wrapping llama.cpp for on-device LLM inference
Wheels for llama-cpp-python compiled with cuBLAS support
Aesthisia
Gradio-based tool to run open-source LLM models directly from Hugging Face
unixwzrd
Information on optimizing Python libraries for oobabooga to take advantage of Apple Silicon and the Accelerate framework.
HairlessPrimate
Bridging wrapper for llama-cpp-python within ComfyUI
absadiki
Python bindings for llama.cpp
mlc-delgado
An open source, Gradio-based chatbot app that combines the best of retrieval augmented generation and prompt engineering into an intelligent assistant for modern professionals.
dougeeai
Pre-built wheels for llama-cpp-python across platforms and CUDA versions
herrera-luis
Demo Python script to interact with a llama.cpp server using the Whisper API, a microphone, and a webcam.
blackcon
This project provides a web UI for Vicuna-13B (using llama-cpp-python and chatbot-ui)
vatsalsaglani
A GenAI search engine powered by llama-cpp-python and the small language model Phi-3
nicholasyager
A guidance compatibility layer for llama-cpp-python
ossirytk
Local character-AI chatbot with Chroma vector store memory and scripts to process documents for Chroma
fidecastro
Super simple Python connectors for llama.cpp, including vision models (Gemma 3, Qwen2-VL). Compile llama.cpp and run!
Talnz007
GPU-accelerated LLaMA inference wrapper for legacy Vulkan-capable systems: a Pythonic way to run AI with knowledge (Ilm) on fire (Vulkan).
daskol
Python bindings to llama.cpp
notolog
Notolog Markdown Editor
viniciusarruda
Wrapper around llama-cpp-python for chat completion with LLaMA v2 models.
Granddyser
A comprehensive, step-by-step guide to installing and running llama-cpp-python with CUDA GPU acceleration on Windows, covering exact version requirements, environment setup, and troubleshooting tips for common installation problems.
BrunoArsioli
Lightweight Python tool using Optuna to tune llama.cpp flags for optimal tok/s on your machine
Run fast LLM inference using llama.cpp in Python
boneylizard
CUDA 12.8-accelerated prebuilt wheel of llama-cpp-python with full Gemma 3 model support for Windows 10/11 (x64). Built by boneylizard.
CuaOS
A CUA (computer-use agent) system that uses the Qwen3-VL model (in GGUF format) on Ubuntu to perform tasks on your behalf with the keyboard and mouse in a local sandbox environment, based on the commands you provide.
timopb
A simple inference web UI for llama.cpp / llama-cpp-python