Found 1,723 repositories (showing 30)
THUDM
slime is an LLM post-training framework for RL Scaling.
hiyouga
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
PeterGriffinJin
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
hkust-nlp
Simple RL training for reasoning
yifan123
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
langfengQ
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"
gensyn-ai
A fully open source framework for creating RL training swarms over the internet.
PRIME-RL
[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
meta-pytorch
An interface library for RL post training with environments.
microsoft
TextWorld is a sandbox learning environment for the training and evaluation of reinforcement learning (RL) agents on text-based games.
Improbable-AI
Sim-to-real RL training and deployment tools for the Unitree Go1 robot.
PrimeIntellect-ai
Async RL Training at Scale
araffin
A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
Osilly
[ICLR 2026] This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reasoning capability.
williamFalcon
Hacks for training RL systems from John Schulman's lecture at Deep RL Bootcamp (Aug 2017)
huawei-noah
Scalable Multi-Agent RL Training School for Autonomous Driving
JudgmentLabs
The open source post-building layer for agents. Our environment data and evals power agent post-training (RL, SFT) and monitoring.
BytedTsinghua-SIA
A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.
AgileRL
Streamlining reinforcement learning with RLOps. State-of-the-art RL algorithms and tools, with 10x faster training through evolutionary hyperparameter optimization.
TideDra
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
pat-jj
[EMNLP'25] s3 - ⚡ Efficient & Effective Search Agent Training via RL for RAG (RLVR for Search with Minimal Data)
NVIDIA-NeMo
Build RL environments for LLM training
NVlabs
ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows.
WooooDyy
Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.
SalesforceAIResearch
MCP-Universe is a comprehensive framework designed for RL training, benchmarking, and developing AI agents for general tool-use.
humanplane
RL agent fusing real-time Binance futures data into Polymarket prediction markets. On-device training with MLX on Apple Silicon.
antonilo
Code for training locomotion policies with RL
LeslieTrue
Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
yaof20
Implementation for FP8/INT8 Rollout for RL training without performance drop.
facebookresearch
Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks