Found 15 repositories (showing 15)
GanjinZero
[NIPS2023] RRHF & Wombat
DaehanKim
EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets
chengq1001
[COLING'25] RRHF-V: Ranking Responses to Mitigate Hallucinations in Multimodal Large Language Models with Human Feedback
ajxv
Python Library for interfacing with the RRHFOEM04/RRHFOEM07-USB RFID Reader
ssbuild
No description available
sillylilfox
C library for RRHFOEM04 Card readers, renewed
annakijas1
Rock and Roll Hall of Fame Inductee Data (1986-2018)
nameswer
fghfghf
rrhfemsxkdlrj
No description available
lizet96
No description available
AngelL327
No description available
marwan-fahad
No description available
LRCHub
Hikaru Utada "First Love" (HIKARU UTADA SCIENCE FICTION TOUR 2024)
No description available
lamalmeida
Surveyed Reinforcement Learning from Human Feedback (RLHF) techniques to understand how AI systems can better align with human values. Reviewed and implemented key techniques, including AIHF, Christiano's method, and RRHF, on the CartPole-v1 task, and analyzed the strengths of each with respect to scalability, efficiency, and robustness.