Found 98 repositories (showing 30)
freelw
A C++ implementation of Transformer without special library dependencies, supporting CPU, CUDA, and Apple Metal, including training and inference.
A comprehensive guide for running Large Language Models on your local hardware using popular frameworks like llama.cpp, Ollama, HuggingFace Transformers, vLLM, and LM Studio. Includes optimization techniques, performance comparisons, and step-by-step setup instructions for privacy-focused, cost-effective AI without cloud dependencies.
dianhsu
Swin Transformer C++ Implementation
go-skynet
Binding to transformers in ggml
dianhsu
A simple Transformer model implemented in C++. Attention Is All You Need.
KrishM123
TransformerCPP is a minimal C++ machine learning library with autograd and tensor ops, inspired by PyTorch. It includes a from-scratch Transformer model demo, optimized for CPU and multithreaded performance.
samolego
Android app for running transformers locally using LLama.cpp & Whisper.cpp
innightwolfsleep
Connect llama-cpp, transformers or text-generation-webui to telegram bot api.
DAMO-NLP-SG
A chatbot UI for RAG, multimodal input, and text completion. (Supports Transformers, llama.cpp, MLX, vLLM.)
Peter-Chou
transformer tokenizers (e.g. BERT tokenizer) in C++ (WIP)
linksplatform
LinksPlatform's Platform.RegularExpressions.Transformer.CSharpToCpp Class Library
ollewelin
Minimal transformer in C/C++ with no dependency on any machine learning library, just plain C/C++ code
Braxvang
Repository that demonstrates how to build a digital assistant using the Mistral Instruct Large Language Model (LLM) and llama.cpp. The system combines the LLM with Sentence Transformers (embeddings) in a Retrieval-Augmented Generation (RAG) approach.
projektjoe
LLaMA2 model inference in pure C++
nailtu30
Diffusion Transformer in pure C++
A low-level educational experiment to understand how Large Language Models (LLMs) work "under the hood". This project implements a GPT-3-style architecture written purely in JavaScript/Node.js, with no external machine learning dependencies.
jwjohns
LoRA/QLoRA fine-tuning pipeline for NVIDIA Nemotron-H hybrid Mamba2+Transformer models using gguf from the llama.cpp integration
linksplatform
LinksPlatform's Platform.RegularExpressions.Transformer.CppToJava Class Library
venomous-maker
SAT solver using a Tseitin transformation, written in C++
JPaulDuncan
A pure C# LLM inference engine built from scratch — no Python, no llama.cpp bindings, no ONNX Runtime. SharpInfer loads GGUF and Safetensors models directly, dequantizes weights in managed code, and runs the full transformer forward pass natively on .NET 8.
hammercui
A tribute to the original qmd (https://github.com/tobi/qmd), a hybrid search engine implemented in TypeScript. This is a Python rewrite, built for Windows stability and higher-quality retrieval. Why rewrite? The original qmd uses node-llama-cpp, which suffers serious stability problems (random crashes) on Windows; this project switches to a transformers + PyTorch stack instead.
ServiceX Transformer that converts ATLAS xAOD files into columnwise data
Dodesimo
CPP transformer
ChRotsides
This project explores the transformer model architecture, focusing on a C++ implementation of the feed-forward network component.
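For context, the position-wise feed-forward block these projects implement is just a two-layer MLP applied to each token vector independently. A minimal sketch (dimensions, weights, and the `ffn` name are illustrative, not taken from the repository above):

```cpp
#include <algorithm>
#include <vector>

// Position-wise feed-forward: FFN(x) = W2 * relu(W1 * x + b1) + b2,
// applied to one token embedding x. All shapes are illustrative.
std::vector<double> ffn(const std::vector<double>& x,
                        const std::vector<std::vector<double>>& W1,
                        const std::vector<double>& b1,
                        const std::vector<std::vector<double>>& W2,
                        const std::vector<double>& b2) {
  // Hidden layer: h = relu(W1 * x + b1)
  std::vector<double> h(W1.size());
  for (size_t i = 0; i < W1.size(); ++i) {
    double s = b1[i];
    for (size_t j = 0; j < x.size(); ++j) s += W1[i][j] * x[j];
    h[i] = std::max(0.0, s);  // ReLU
  }
  // Output layer: y = W2 * h + b2
  std::vector<double> y(W2.size());
  for (size_t i = 0; i < W2.size(); ++i) {
    double s = b2[i];
    for (size_t j = 0; j < h.size(); ++j) s += W2[i][j] * h[j];
    y[i] = s;
  }
  return y;
}
```

With identity weight matrices and zero biases, `ffn({2.0, -3.0}, ...)` returns `{2.0, 0.0}`: the negative component is clipped by the ReLU.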
chensongpoixs
A complete Transformer training pipeline implemented in C++ for Chinese-English translation, covering model distillation, fine-tuning, LoRA, RAG, vector databases, and more.
PRITHIVSAKTHIUR
A C++ CLI tool for downloading, resharding, and re-uploading large Hugging Face models. It uses pybind11 to connect with Python libraries like transformers, huggingface_hub, and torch, enabling version conversion and configurable shard sizes.
A simple implementation of CLIPTokenizer in the transformer Python library
Azaria-Yonas
No description available
adithyanraj03
GPU-accelerated Transformer decoder built entirely from scratch in C++/CUDA: no PyTorch, no TensorFlow. Hand-coded multi-head attention, BPE tokenizer, AdamW optimizer, full backprop, and custom CUDA kernels with memory pooling. Trained on Shakespeare for text generation. Pure mathematical implementation.
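To illustrate the kind of hand-coded attention these from-scratch projects implement, here is a minimal single-head scaled dot-product attention over plain `std::vector` data (the `attention` name and all shapes are illustrative; this is a CPU sketch, not code from the repository above):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// Scaled dot-product attention for one head:
// out[i] = sum_j softmax_j(Q[i]·K[j] / sqrt(d)) * V[j]
Mat attention(const Mat& Q, const Mat& K, const Mat& V) {
  const size_t n = Q.size(), m = K.size(), d = Q[0].size();
  const double scale = 1.0 / std::sqrt(static_cast<double>(d));
  Mat out(n, std::vector<double>(V[0].size(), 0.0));
  for (size_t i = 0; i < n; ++i) {
    // Raw scores: Q[i]·K[j] / sqrt(d), tracking the max for stability.
    std::vector<double> s(m, 0.0);
    double mx = -1e300;
    for (size_t j = 0; j < m; ++j) {
      for (size_t k = 0; k < d; ++k) s[j] += Q[i][k] * K[j][k];
      s[j] *= scale;
      mx = std::max(mx, s[j]);
    }
    // Numerically stable softmax over the scores.
    double z = 0.0;
    for (size_t j = 0; j < m; ++j) { s[j] = std::exp(s[j] - mx); z += s[j]; }
    // Output row is the softmax-weighted sum of value rows.
    for (size_t j = 0; j < m; ++j)
      for (size_t k = 0; k < out[i].size(); ++k)
        out[i][k] += (s[j] / z) * V[j][k];
  }
  return out;
}
```

Since the softmax weights sum to one, each output row is a convex combination of the value rows; a query aligned with one key gets the larger share of that key's value.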
No description available