GitHub Explorer

by Alexey Ratnikov

Copyright (c) 2026 Alexey Ratnikov

Dylsimple60/RLHF_learn

RLHF_learn

Dylsimple60 • PUBLIC

🤖 Enhance reinforcement learning stability and efficiency with advanced algorithms like TRPO, PPO, DPO, GRPO, DAPO, and GSPO for optimized policy training.

Topics: ai-safety, attention-mechanisms, datasets, deep-learning, deep-reinforcement-learning, gpt

Created on Aug 16, 2025

Updated on Apr 6, 2026

Stars: 5

Forks: 0

Watchers: 5

Open Issues: 0

Repository Health Score

65/100 (Fair)

Overall repository health assessment

Score Breakdown

Activity: 30/30 (100%), active development, updated this week

Topics (continued): human-feedback, large-language-models, openai-o1, python, raylib, regular-expression, reinforcement-learning, reinforcement-learning-from-human-feedback, rlhf, safe-rlhf, safety, transformer, transformers, vllm

Community: 0/30 (0%), 5 stars, 0 forks

Documentation: 15/20 (75%), has description and wiki

Maintenance: 20/20 (100%), 0.0% issue ratio

The health score is calculated from activity, community engagement, documentation quality, and maintenance practices.
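As a quick arithmetic check, the four sub-scores listed in the Score Breakdown add up to the reported 65/100. The sketch below assumes the overall score is a plain sum of the sub-scores; the page does not state the exact formula, and the names `breakdown`, `health_score`, and `percentages` are hypothetical helpers introduced only for illustration.

```python
# Sub-scores as (earned, maximum) pairs, copied from the Score Breakdown.
breakdown = {
    "Activity": (30, 30),
    "Community": (0, 30),
    "Documentation": (15, 20),
    "Maintenance": (20, 20),
}


def health_score(parts):
    """Return (earned, maximum), assuming the total is a plain sum."""
    earned = sum(e for e, _ in parts.values())
    maximum = sum(m for _, m in parts.values())
    return earned, maximum


def percentages(parts):
    """Return each category's earned share of its maximum, in percent."""
    return {name: round(100 * e / m) for name, (e, m) in parts.items()}


earned, maximum = health_score(breakdown)
print(f"Overall: {earned}/{maximum}")
for name, pct in percentages(breakdown).items():
    print(f"{name}: {pct}%")
```

Under this assumption, the only category losing a large number of points is Community, which accounts for 30 of the 35 missing points.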

Languages

Python: 100.0%

Dependencies

No package.json found

This might not be a Node.js project

Top Contributors

1. haohaoXhang (User): 57 commits

2. Dylsimple60 (User): 5 commits

Recent Commits

e0fdb86 • Update README.md • Dylsimple60 • 1 hour ago

092269a • Update README.md • Dylsimple60 • 1 month ago

b10c49d • Update README.md • Dylsimple60 • 1 month ago

0e95018 • Update README.md • Dylsimple60 • 2 months ago

4aba393 • Update README.md • Dylsimple60 • 2 months ago

48f2181 • Update RLHF训练框架总结.md • BIG_MOUSE • 3 months ago

a8c6457 • Update WorkerGroup explanation in RLHF framework • BIG_MOUSE • 3 months ago

14d4ee3 • Update README.md • BIG_MOUSE • 3 months ago

f86729a • Update README.md • BIG_MOUSE • 3 months ago

86c2439 • Update README.md • BIG_MOUSE • 3 months ago

145afee • Revise sequence-level importance ratio definition • BIG_MOUSE • 3 months ago

6a755b1 • Fix typo in README.md for 'Critic' network • BIG_MOUSE • 3 months ago

e84c86e • Update README.md • BIG_MOUSE • 3 months ago

a9362d6 • Update README.md • BIG_MOUSE • 3 months ago

c03b549 • Update README.md • BIG_MOUSE • 3 months ago
