Found 77 repositories (showing 30)
b4rtaz
Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.
michaelneale
Reference implementation using llama.cpp, compiled for distributed inference across machines, with a real end-to-end demo
ADT109119
A distributed LLM inference program based on llama.cpp that lets multiple computers on a local network collaborate on distributed inference of large language models, with a cross-platform desktop UI built with Electron.
tsaol
🚀 Fine-tune Large Language Models on AWS SageMaker using LLaMA Factory - End-to-end pipeline for distributed LLM training, evaluation & deployment
Github-Scalers-AI
Serve Llama 2 (7B/13B/70B) Large Language Models efficiently at scale by leveraging heterogeneous Dell™ PowerEdge™ Rack servers in a distributed manner.
aws-samples
End-to-end solution for cold-start recommendations using vLLM, DeepSeek Llama (8B & 70B), and FAISS on AWS Trainium (Trn1) with the Neuron SDK and NeuronX Distributed. Includes LLM-based interest expansion, embedding comparisons (T5 & SentenceTransformers), and scalable retrieval workflows.
sajosam
Self-spawning AI agents born from tasks. Zero pre-built agents. Distributed memory. 4-layer guardrails. Fossil record. Groq + LLaMA.
zosma-ai
Task Manager for Distributed LLaMA 2 inference network
arseniy0924
Web UI for orchestrating distributed llama.cpp RPC GPU clusters with auto node discovery, telemetry, and one-click deployment.
HichamAgueny
LLM course for distributed fine-tuning and inference on HPC systems using PyTorch and LLaMA model for summarization & QA.
Siritao
Deploy Llama 2 serving on multiple GPUs via Flask
zosma-ai
LLAMA-2 inference node that works with distributed cluster
Ptchwir3
Turn any Kubernetes cluster into a private LLM endpoint. One Helm command deploys distributed inference across commodity hardware: Raspberry Pis, old servers, mixed architectures. OpenAI-compatible API powered by llama.cpp RPC
LambdaLabsML
No description available
saakethtypes
No description available
Romyull-Islam
No description available
stillandcalm
Full FineTuning of Llama-3-8B on distributed GPU nodes using Deepspeed
fabiofalopes
No description available
stafel
A distributed language model service for Alpaca / Llama
himanishpuri
Local RAG chatbot with semantic search (ChromaDB), Redis-backed caching and queues, distributed workers, and real-time streaming via SSE — powered by llama.cpp
mnouira02
A distributed, local-first AI Race Engineer for F1 202x. Uses Computer Vision, UDP Telemetry, and Llama 3.2 to provide real-time strategy without cloud latency.
bar6132
AI-powered distributed video platform using FastAPI, RabbitMQ, and Next.js 16. Features a local Generative AI pipeline (Llama 3.2 + Moondream) for video summarization, zero-hallucination analysis, and dynamic transcoding.
rammaruboina-rgb
Production-grade LLaMA fine-tuning framework with Go orchestration. Features LoRA/QLoRA adapters, 4-bit quantization, distributed training, and seamless deployment via vLLM, BentoML, and cloud platforms. Optimized for domain-specific AI models.
rinoScremin
High-performance distributed matrix computation for AI workloads. Supports CPUs, Vulkan/Metal GPUs, PyTorch CUDA nodes, and LLaMA/ggml backends. Uses shard-based distribution with ZeroMQ networking, RAM/disk storage, and flexible environment-based configuration for multi-node clusters.
ssr9857
Distributed LLaMA inference
rzredg
No description available
llamasearchai
No description available
vedantjh2
No description available
fromthefox
No description available