This repository provides a production-grade implementation of the Reinforcement Learning from Human Feedback (RLHF) pipeline. It mirrors the post-training infrastructure used by major research labs, but is optimized for consumer hardware, including CPU-only environments with no GPU requirement.
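To make the pipeline's core idea concrete, here is a minimal sketch of the RLHF inner loop: sample candidate responses, score them with a reward model, and nudge the policy toward higher-reward outputs (a REINFORCE-style update). All names here (`reward_model`, `rlhf_step`) are illustrative stand-ins, not this repository's API; the actual `rlhf.py` may structure this very differently.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def reward_model(response):
    # Stand-in for a learned reward model; here it simply prefers
    # longer responses so the example is self-contained.
    return float(len(response))

def rlhf_step(logits, responses, lr=0.1):
    """One policy-gradient step over a fixed set of candidate responses."""
    probs = softmax(logits)
    rewards = [reward_model(r) for r in responses]
    baseline = sum(p * r for p, r in zip(probs, rewards))  # variance reduction
    # d E[r] / d logit_i under softmax is p_i * (r_i - baseline)
    return [l + lr * p * (r - baseline)
            for l, p, r in zip(logits, probs, rewards)]

responses = ["ok", "a longer, more helpful answer", "short"]
logits = [0.0, 0.0, 0.0]
for _ in range(50):
    logits = rlhf_step(logits, responses)
probs = softmax(logits)
# After training, probability mass concentrates on the highest-reward response.
```

Note that this toy loop runs entirely on CPU with the standard library, which is the spirit of the repository's zero-GPU claim; real RLHF replaces the logit list with a language model's parameters and the toy reward with a trained preference model.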
Recent commits (20 total):
d48fc72  Experimental internal RLHF pipeline now pushed; docs updated. Adds MCTS, A*, hidden CoT, and final test-time compute paradigms.
9cc9e78  Inference optimizations are now RL-tunable at runtime for use with model merging into one set of base weights. Currently tested on, and to be released for, Qwen3-1.7B f16, full MaggiePie 300k, and rlhf.py.
1023259  Refactor checkpoint-saving logic to ensure final checkpoints are created for all training runs; add memory-safe configuration for Qwen3-1.7B VPS and update RLHFOrchestrator to support auto-saving final models.
f9e8f6a  Enhance PagedKVCache with sequence-length tracking and improve MCTSGenerator device handling; add SFT model testing script.
5e91001  Merge branch 'main' of https://github.com/calisweetleaf/Reinforcement-Learning-Full-Pipeline
899e3b5  Fix README badges and remove duplicate DOI badge.
69c5aea  Update README.md after new Zenodo edit fixing the license.
d9f72f3  Merge branch 'main' of https://github.com/calisweetleaf/Reinforcement-Learning-Full-Pipeline