Search Results

Found 186 repositories (showing 30)

AgentGym-RL

WooooDyy

🧡66

Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.

674

MIT

Python

Updated 25 minutes ago

agent, llm, llm-based-agent, +1

es-fine-tuning-paper

VsonicV

💛71

This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"

343

NOASSERTION

Python

Updated 20 hours ago

FinRL_DeepSeek

benstaf

🧡61

Code for the paper "FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents" arXiv:2502.07393

317

104

MIT

Jupyter Notebook

Updated 2 weeks ago

awesome-lifelong-llm-agent

qianlima-lab

🧡60

TPAMI 2026 | This repository collects awesome survey, resource, and paper for lifelong learning LLM agents

295

Python

Updated 1 hour ago

agent, continual-learning, incremental-learning, +4

open-rs

knoveleng

🧡66

[AAAI 2026] - Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"

276

MIT

Python

Updated 5 days ago

low-resource, reasoning-language-models, reinforcement-learning

TAPE

XiaoxinHe

💛71

Official Implementation of ICLR 2024 paper "Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning"

268

MIT

Python

Updated 6 days ago

Awesome-Efficient-Agents

yxf203

🧡65

Survey and paper list on efficiency-guided LLM agents (memory, tool learning, planning).

223

Updated 1 hour ago

agent-memory, awesome, efficiency, +3

Video-3D-LLM

LaVi-Lab

🧡55

[CVPR 2025] The code for paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".

207

Apache-2.0

Python

Updated 1 week ago

GNN4TaskPlan

WxxShirley

❤️45

[NeurIPS 2024] Official implementation for paper "Can Graph Learning Improve Planning in LLM-based Agents?"

151

MIT

Python

Updated 1 month ago

autonomous-agents, graph-learning, graph-neural-networks, +3

LLM-Continual-Learning-Papers

AGI-Edgerunners

🧡55

Must-read Papers on Large Language Model (LLM) Continual Learning

149

Updated 1 week ago

awesome-list, continual-learning, large-language-models, +5

NavCoT

expectorlin

🧡55

Code of the paper "NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning" (TPAMI 2025)

134

Python

Updated 2 days ago

awesome-language-model-analysis

Furyton

💛70

This paper list focuses on the theoretical and empirical analysis of language models, especially large language models (LLMs). The papers in this list investigate the learning behavior, generalization ability, and other properties of language models through theoretical analysis, empirical analysis, or a combination of both.

CC0-1.0

Python

Updated 1 hour ago

ai, analysis, analytics, +9

BAPO

WooooDyy

🧡55

Codes for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.

Python

Updated 1 week ago

llm, reasoning, rl, +1

aLLM4TS

yxbian23

🧡50

[ICML 2024] Official repo for paper "Multi-Patch Prediction: Adapting LLMs for Time Series Representation Learning"

Python

Updated 2 weeks ago

trust-align

declare-lab

🧡50

Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

Python

Updated 2 weeks ago

rag, retrieval-augmented-generation

S2R

NineAbyss

🧡50

This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"

MIT

Python

Updated 1 month ago

RADAR

IBM

🧡55

Code for our NeurIPS2023 accepted paper: RADAR: Robust AI-Text Detection via Adversarial Learning. We tested RADAR on 8 LLMs including Vicuna and LLaMA. The results show that RADAR can attain good detection performance on LLM-generated AI-text while being robust against paraphrasing.

Apache-2.0

Jupyter Notebook

Updated 1 week ago

ai-text-detector

MT-R1-Zero

fzp0424

🧡60

[EMNLP'25] Code for paper "MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning"

Apache-2.0

Python

Updated 2 weeks ago

learning-from-rewards-llm-papers

bobxwu

💛70

A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward models and learning strategies across training, inference, and post-inference stages.

MIT

Updated 1 day ago

guided-decoding, large-language-models, llm, +9

TimeMaster

langfengQ

🧡65

Official code for paper "TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning"

Apache-2.0

Python

Updated 22 hours ago

multimodal-large-language-models, reasoning, reinforcement-learning, +1

gen-mentor

GeminiLight

💛70

[WWW '25 Oral - GenMentor] Official code of our paper "LLM-powered Multi-agent Framework for Goal-oriented Learning in Intelligent Tutoring System", accepted by WWW 2025 (Industry Track) as an Oral Presentation.

CC0-1.0

Python

Updated 23 hours ago

education, goal-learning, intelligent-tutoring-system, +2

LEMA

microsoft

❤️35

official repo for the paper "Learning From Mistakes Makes LLM Better Reasoner"

MIT

Python

Updated 3 months ago

WALL-E

elated-sawyer

🧡65

Official code for the paper: WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

Python

Updated 1 day ago

RL4GenomeBench

mingyin0312

🧡55

Official implementation for the paper "Toward Scientific Reasoning in LLMs: Training from Expert Discussions via Reinforcement Learning"

Python

Updated 1 week ago

automanual

minghchen

🧡55

Code for NeurIPS 2024 paper "AutoManual: Constructing Instruction Manuals by LLM Agents via Interactive Environmental Learning"

PDDL

Updated 3 weeks ago

OUTFOX

ryuryukke

🧡60

[AAAI 2024] The official repository for our paper, "OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples"

Apache-2.0

Python

Updated 1 week ago

aaai2024, adversarial-learning, ai-generated-content, +9

Data-Whisperer

gszfwsb

🧡60

Code for ACL 2025 Main paper "Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning".

Python

Updated 1 day ago

data-pruning, data-selection, efficient-training, +5

Beyond-the-80-20-Rule-RLVR

Shenzhi-Wang

❤️40

The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning."

Apache-2.0

Python

Updated 2 weeks ago

Generalizable-Reward-Model

YangRui2015

🧡55

Code for NeurIPS 2024 paper "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs"

MIT

Python

Updated 1 week ago

prelude

gao-g

🧡60

Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".

MIT

Python

Updated 1 week ago

alignment, edits, gpt4, +7
