Found 76,785 repositories (showing 30)
huggingface
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.
ggml-org
LLM inference in C/C++
vllm-project
A high-throughput and memory-efficient inference and serving engine for LLMs
meta-llama
Inference code for Llama models
facebookresearch
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
ggml-org
Port of OpenAI's Whisper model in C/C++
colinhacks
TypeScript-first schema validation with static type inference
deepspeedai
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
hpcaitech
Making large AI models cheaper, faster and more accessible
microsoft
Official inference framework for 1-bit LLMs
huggingface
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
google-ai-edge
Cross-platform, customizable ML solutions for live and streaming media.
sgl-project
SGLang is a high-performance serving framework for large language models and multimodal models.
black-forest-labs
Official inference repo for FLUX.1 models
Tencent
ncnn is a high-performance neural network inference framework optimized for mobile platforms
SYSTRAN
Faster Whisper transcription with CTranslate2
FunAudioLLM
Multilingual large voice-generation model, providing full-stack inference, training, and deployment capabilities.
microsoft
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
karpathy
Inference Llama 2 in one file of pure C
facebookresearch
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
NVIDIA
Run OpenClaw more securely inside NVIDIA OpenShell with managed inference
meta-llama
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family and how to run the models on various provider services
cheahjs
A list of free LLM inference resources accessible via API.
mlc-ai
High-performance in-browser LLM inference engine
stas00
Machine Learning Engineering Open Book
kvcache-ai
A flexible framework for experimenting with heterogeneous LLM inference and fine-tuning optimizations
meta-llama
Inference code for CodeLlama models
lyogavin
AirLLM: 70B-model inference on a single 4GB GPU
gvergnaud
🎨 The exhaustive Pattern Matching library for TypeScript, with smart type inference.
alibaba
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering high-performance on-device LLMs and Edge AI.