Search Results

Found 1,723 repositories (showing 30)

slime

THUDM

💛72

slime is an LLM post-training framework for RL Scaling.

5.1k

690

Apache-2.0

Python

Updated 2 hours ago

EasyR1

hiyouga

🧡58

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

4.8k

366

Apache-2.0

Python

Updated 1 minute ago

ai, deepseek, gpt, +5

Search-R1

PeterGriffinJin

💛78

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

4.4k

375

Apache-2.0

Python

Updated 1 hour ago

simpleRL-reason

hkust-nlp

💛72

Simple RL training for reasoning

3.8k

289

MIT

Python

Updated 2 days ago

flow_grpo

yifan123

💛74

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

2.2k

146

MIT

Python

Updated 5 hours ago

verl-agent

langfengQ

💛73

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

1.8k

163

Apache-2.0

Python

Updated 8 hours ago

agent-framework, deepseek-r1, gigpo, +5

rl-swarm

gensyn-ai

💛73

A fully open source framework for creating RL training swarms over the internet.

1.7k

624

MIT

Python

Updated 2 days ago

SimpleVLA-RL

PRIME-RL

🧡68

[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

1.6k

MIT

Python

Updated 17 hours ago

reasoning, rl, vla

OpenEnv

meta-pytorch

🧡64

An interface library for RL post training with environments.

1.5k

289

BSD-3-Clause

Python

Updated 30 minutes ago

TextWorld

microsoft

🧡68

TextWorld is a sandbox learning environment for the training and evaluation of reinforcement learning (RL) agents on text-based games.

1.4k

194

NOASSERTION

Jupyter Notebook

Updated 1 day ago

reinforcement-learning, text-based-adventure, text-based-game

walk-these-ways

Improbable-AI

💛73

Sim-to-real RL training and deployment tools for the Unitree Go1 robot.

1.3k

216

NOASSERTION

Python

Updated 15 hours ago

go1, reinforcement-learning, robotics, +2

prime-rl

PrimeIntellect-ai

🧡69

Async RL Training at Scale

1.3k

252

Apache-2.0

Python

Updated 1 hour ago

rl-baselines-zoo

araffin

💛73

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.

1.2k

210

MIT

Python

Updated 1 day ago

gym, hyperparameter-optimization, hyperparameter-search, +10

Vision-R1

[ICLR 2026] This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reasoning capability.

1.1k

Python

Updated 1 hour ago

DeepRLHacks

williamFalcon

🧡57

Hacks for training RL systems from John Schulman's lecture at Deep RL Bootcamp (Aug 2017)

1.1k

119

Updated 4 weeks ago

SMARTS

huawei-noah

🧡53

Scalable Multi-Agent RL Training School for Autonomous Driving

1.1k

218

MIT

Python

Updated 1 day ago

autonomous-driving, python, reinforcement-learning, +1

judgeval

JudgmentLabs

💛72

The open source post-building layer for agents. Our environment data and evals power agent post-training (RL, SFT) and monitoring.

1.0k

Apache-2.0

Python

Updated 3 hours ago

agent, agentic-ai, agents, +12

MemAgent

BytedTsinghua-SIA

🧡67

A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.

976

Apache-2.0

Python

Updated 13 hours ago

AgileRL

💛72

Streamlining reinforcement learning with RLOps. State-of-the-art RL algorithms and tools, with 10x faster training through evolutionary hyperparameter optimization.

905

Apache-2.0

Python

Updated 13 hours ago

agilerl, automl, deep-learning, +17

lmm-r1

TideDra

💛71

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

845

Apache-2.0

Python

Updated 2 days ago

s3

pat-jj

💛72

[EMNLP'25] s3 - ⚡ Efficient & Effective Search Agent Training via RL for RAG (RLVR for Search with Minimal Data)

829

139

Apache-2.0

Python

Updated 1 day ago

agentic-ai, efficiency, gpt-5, +6

Gym

NVIDIA-NeMo

🧡52

Build RL environments for LLM training

803

106

Apache-2.0

Python

Updated 7 hours ago

gym, gym-environment, reinforcement-learning, +4

ToolOrchestra

NVlabs

💛72

ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows.

697

Apache-2.0

Python

Updated 1 day ago

agent, agentic-ai, deep-learning, +1

AgentGym-RL

WooooDyy

🧡66

Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.

666

MIT

Python

Updated 6 hours ago

agent, llm, llm-based-agent, +1

MCP-Universe

SalesforceAIResearch

💛71

MCP-Universe is a comprehensive framework designed for RL training, benchmarking, and developing AI agents for general tool-use.

577

Apache-2.0

Python

Updated 1 day ago

cross-market-state-fusion

humanplane

💛71

RL agent fusing real-time Binance futures data into Polymarket prediction markets. On-device training with MLX on Apple Silicon.

366

MIT

Python

Updated 18 minutes ago

rl_locomotion

antonilo

💛71

Code for training locomotion policies with RL

330

GPL-3.0

Python

Updated 2 days ago

SFTvsRL

LeslieTrue

🧡66

Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

320

Python

Updated 4 days ago

Flash-RL

yaof20

💛71

Implementation for FP8/INT8 Rollout for RL training without performance drop.

299

MIT

Python

Updated 1 day ago

reinforcement-learning, vllm

sweet_rl

facebookresearch

🧡55

Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

264

NOASSERTION

Python

Updated 1 week ago
