Found 3,649 repositories (showing 30)
ggml-org
LLM inference in C/C++
h2oai
Private chat with a local GPT over documents, images, video, and more. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
abetlen
Python bindings for llama.cpp
intel
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
serge-chat
A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy-to-use API.
ngxson
Real-time webcam demo with SmolVLM and llama.cpp server
johnbean393
A native macOS app that lets users chat with a local LLM, which can respond with information from files, folders, and websites on your Mac, without installing any other software. Powered by llama.cpp.
mostlygeek
Reliable model swapping for any local OpenAI/Anthropic-compatible server (llama.cpp, vLLM, etc.)
Mobile-Artificial-Intelligence
Maid is a free and open source application for interfacing with llama.cpp models locally, and with Anthropic, DeepSeek, Ollama, Mistral and OpenAI models remotely.
withcatai
Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
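The JSON-schema enforcement described above constrains sampling so the model can only emit text that conforms to the schema. A minimal sketch of what that guarantee means, in Python with a made-up toy schema (the `conforms` helper is illustrative, not this library's API):

```python
import json

# Hypothetical schema of the kind that can be enforced at generation time.
schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"},
    },
    "required": ["answer", "confidence"],
}

def conforms(text: str, schema: dict) -> bool:
    """Check that model output parses as JSON and carries the required keys."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return False
    type_map = {"string": str, "number": (int, float)}
    return all(
        key in obj and isinstance(obj[key], type_map[spec["type"]])
        for key, spec in schema["properties"].items()
        if key in schema["required"]
    )

# A schema-constrained sampler can only ever produce the first kind of output:
print(conforms('{"answer": "Paris", "confidence": 0.9}', schema))  # True
print(conforms('The answer is Paris.', schema))                    # False
```

Enforcing the grammar during generation, rather than validating (and retrying) afterwards, is what makes the guarantee cheap: invalid tokens are never sampled in the first place.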
ikawrakow
llama.cpp fork with additional SOTA quants and improved performance
gotzmann
llama.go is like llama.cpp in pure Golang!
RahulSChand
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
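The memory estimate such a calculator produces can be sketched with back-of-envelope arithmetic: weight memory is roughly parameters times bits per weight, and the KV cache grows with layers, context length, and KV heads. This is an illustrative approximation, not the repository's exact method (real quantized files also carry per-block scales and non-quantized layers):

```python
def model_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB: parameters x bits per weight."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1024**3

def kv_cache_gb(n_layers: int, n_ctx: int, n_kv_heads: int,
                head_dim: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV cache in GiB: 2 (K and V) x layers x context x KV heads x head dim."""
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem / 1024**3

# A 7B model at ~4.5 bits/weight (a Q4-style quantization) needs roughly:
print(round(model_memory_gb(7.0, 4.5), 2))   # ≈ 3.67 GiB of weights

# An fp16 KV cache for a Llama-2-7B-style config (32 layers, 32 KV heads,
# head dim 128) at 4096 context:
print(kv_cache_gb(32, 4096, 32, 128))        # 2.0 GiB
```

Note how the KV cache scales linearly with context length, which is why long-context runs can need more memory for cache than for weights.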
milanglacier
💃 Dance with Intelligence in Your Code. Minuet offers code completion as-you-type from popular LLMs including OpenAI, Gemini, Claude, Ollama, Llama.cpp, Codestral, and more.
ngxson
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
nomic-ai
Officially supported Python bindings for llama.cpp + gpt4all
mybigday
React Native binding of llama.cpp
go-skynet
llama.cpp Golang bindings
Atome-FE
Believe in AI democratization: llama for Node.js, backed by llama-rs, llama.cpp, and rwkv.cpp; works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.
devoxx
DevoxxGenie is a plugin for IntelliJ IDEA that uses local LLMs (Ollama, LMStudio, GPT4All, Jan, and Llama.cpp) and cloud-based LLMs to help review, test, and explain your project code. The latest version also supports Spec-Driven Development with CLI Runners.
Maximilian-Winter
The llama-cpp-agent framework is a tool designed for easy interaction with large language models (LLMs). It lets users chat with LLMs, execute structured function calls, and get structured output. It also works with models not fine-tuned for JSON output and function calling.
keldenl
A llama.cpp drop-in replacement for OpenAI's GPT endpoints, allowing GPT-powered apps to run off local llama.cpp models instead of OpenAI.
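The "drop-in" part of a replacement like this means a GPT-powered app keeps sending the standard OpenAI chat-completions request body and only changes the base URL. A stdlib-only sketch of such a request, assuming a local server at `http://localhost:8080` (the port, model name, and endpoint path here are placeholders):

```python
import json

# Assumed local endpoint exposing the OpenAI-compatible chat-completions route.
base_url = "http://localhost:8080/v1/chat/completions"

# The request body is the same shape an app would send to OpenAI.
payload = {
    "model": "local-model",  # placeholder; the server routes to its loaded model
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.7,
}
body = json.dumps(payload)

# Actually sending it requires a running server, so the call is left commented:
# import urllib.request
# req = urllib.request.Request(base_url, data=body.encode(),
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())

print(sorted(payload.keys()))
```

Because only the base URL differs, existing OpenAI client libraries can typically be pointed at the local server without touching the rest of the application.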
michaelneale
Reference implementation of llama.cpp compiled for distributed inference across machines, with a real end-to-end demo.
kelindar
Go library for embedded vector search and semantic embeddings using llama.cpp
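The core of an embedded vector search like this is nearest-neighbor lookup by cosine similarity over in-memory embeddings. A minimal brute-force sketch (the library above is Go; this Python version with made-up toy vectors only illustrates the idea):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def search(query, index, top_k=1):
    """Return the top_k (doc_id, score) pairs most similar to the query vector."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda p: p[1], reverse=True)[:top_k]

# Toy 3-dimensional "embeddings"; real ones come from an embedding model.
index = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
print(search([1.0, 0.0, 0.0], index))  # "cat" ranks first
```

Brute force is linear in corpus size, which is fine for embedded use cases with thousands of vectors; larger corpora usually call for an approximate index instead.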
utilityai
No description available
psugihara
llama.cpp based AI chat app for macOS
lxe
A simple "Be My Eyes" web app with a llama.cpp/llava backend
hecrj
A local AI chat app powered by 🦀 Rust, 🧊 iced, 🤗 Hugging Face, and 🦙 llama.cpp
mdrokz
llama.cpp Rust bindings
kherud
Java bindings for llama.cpp, a port of Facebook's LLaMA model in C/C++