Found 6 repositories (showing 6)
Ali-Meh619
An interactive reference guide for software engineers cracking System Design & ML interviews. Covers 43 topics from foundational distributed systems to modern LLM architectures.
tuanthi
Production ML Engineering: The Complete Guide to Distributed LLM Training & Serving. Master the art of building, optimizing, and deploying large-scale ML systems in production environments. This repository is your complete handbook for becoming a production LLM machine engineer.
dlorp
S.Y.N.A.P.S.E. ENGINE is a distributed orchestration platform for local language models that coordinates multiple quantized LLMs. The system implements Contextually Guided Retrieval Augmented Generation (CGRAG) and integrated web search, and features a dense, terminal-inspired UI with real-time visualizations.
rohit07cf
A comprehensive guide and implementation of distributed LLM fine-tuning (Full, LoRA, QLoRA) using LLaMA-Factory and DeepSpeed ZeRO stages. Includes memory optimization strategies and multi-GPU launch configurations.
Full-stack system for simulating distributed scheduling, API workflows, and latency optimization with LLM-guided decision support.
sahilagr123
Building a multi-agent RL system where two LLMs learn mathematical reasoning through peer discussion, guided by a coach model. Implemented disagreement detection, GRPO-based distributed training, and verifiable reward pipelines using PyTorch on cloud GPUs (H100s).