Found 186 repositories (showing 30)
WooooDyy
Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.
VsonicV
This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"
benstaf
Code for the paper "FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents" arXiv:2502.07393
qianlima-lab
TPAMI 2026 | This repository collects awesome surveys, resources, and papers for lifelong learning LLM agents
knoveleng
[AAAI 2026] - Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"
XiaoxinHe
Official Implementation of ICLR 2024 paper "Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning"
yxf203
Survey and paper list on efficiency-guided LLM agents (memory, tool learning, planning).
LaVi-Lab
[CVPR 2025] The code for paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".
WxxShirley
[NeurIPS 2024] Official implementation for paper "Can Graph Learning Improve Planning in LLM-based Agents?"
AGI-Edgerunners
Must-read Papers on Large Language Model (LLM) Continual Learning
expectorlin
Code of the paper "NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning" (TPAMI 2025)
This paper list focuses on the theoretical and empirical analysis of language models, especially large language models (LLMs). The papers in this list investigate the learning behavior, generalization ability, and other properties of language models through theoretical analysis, empirical analysis, or a combination of both.
WooooDyy
Codes for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.
yxbian23
[ICML 2024] Official repo for paper "Multi-Patch Prediction: Adapting LLMs for Time Series Representation Learning"
declare-lab
Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
NineAbyss
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
IBM
Code for our NeurIPS2023 accepted paper: RADAR: Robust AI-Text Detection via Adversarial Learning. We tested RADAR on 8 LLMs including Vicuna and LLaMA. The results show that RADAR can attain good detection performance on LLM-generated AI-text while being robust against paraphrasing.
fzp0424
[EMNLP'25] Code for paper "MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning"
A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward models and learning strategies across training, inference, and post-inference stages.
langfengQ
Official code for paper "TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning"
GeminiLight
[WWW '25 Oral - GenMentor] Official code of our paper "LLM-powered Multi-agent Framework for Goal-oriented Learning in Intelligent Tutoring System", accepted by WWW 2025 (Industry Track) as an Oral Presentation.
microsoft
official repo for the paper "Learning From Mistakes Makes LLM Better Reasoner"
elated-sawyer
Official code for the paper: WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
mingyin0312
Official implementation for the paper "Toward Scientific Reasoning in LLMs: Training from Expert Discussions via Reinforcement Learning"
minghchen
Code for NeurIPS 2024 paper "AutoManual: Constructing Instruction Manuals by LLM Agents via Interactive Environmental Learning"
ryuryukke
[AAAI 2024] The official repository for our paper, "OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples"
gszfwsb
Code for ACL 2025 Main paper "Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning".
Shenzhi-Wang
The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning."
YangRui2015
Code for NeurIPS 2024 paper "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs"
gao-g
Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".