Found 79 repositories (showing 30)
Unakar
Reproduce R1 Zero on Logic Puzzle
DolbyUUU
Lightweight replication study of DeepSeek-R1-Zero. Interesting findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".
jiangnan7
MLIR-based HLS and RL-driven logic synthesis co-optimization.
(WIP) Training an RL model to produce synthesis recipes for logic optimization.
DongmingShenDS
This repository contains Dongming Shen's code and documentation for research projects conducted at the AIDyS Lab, USC. The project focuses on using Reinforcement Learning (RL) to solve partially observable Markov decision processes (POMDPs) under finite linear temporal logic (LTL) constraints.
anshulsawant
This project provides a hands-on tutorial for understanding and implementing the Proximal Policy Optimization (PPO) algorithm to fine-tune Large Language Models (LLMs) using Reinforcement Learning (RL). It is inspired by the logic found in the TinyZero repository but significantly simplified for pedagogical purposes.
waifuai
Integrates LLM conversational AI characters into Godot game engine projects. ✨ Manages character personality, state, and interaction logic using a generative LLM. 🧠 Connects to a running Godot instance for seamless communication via sockets or godot-rl. 🎮 Includes an optional class for controlling VRM model animations and expressions directly.
dotty-cps-async
Monadic infrastructure for reinforcement learning.
Trae1ounG
Exploring R1 on Logic Puzzle in Chinese
saibot007
NavBot-X Isaac Lab is a premium robotics simulation project focused on building an autonomous inspection rover in Isaac Sim / Isaac Lab. It brings together cinematic environment design, checkpoint-based task logic, omnidirectional rover control concepts, and gesture-control integration, with future expansion toward RL and Physical AI.
andretosi
This project is an open-source version of a robot dog, complete with an advanced training environment and custom controller logic. The project aims to train the robot using RL algorithms and plans to support multiple backends for simulation, from PyBullet and MuJoCo to Gazebo, Unity, or Unreal Engine using ROS2 for communication.
Nyrus-Y
No description available
phyzhenli
No description available
jbarnes850
Verifiable math/logic environments for slime RL training
Ongoing OpenFAST project: implementing an innovative Reinforcement Learning (RL) controller (IDHP-IPC) for load mitigation on an ITI Barge floating wind turbine. External logic is scripted in Python and linked via a C bridge to the DISCON library.
jonberliner
rlsquare algorithm!
Dutch-voyage
No description available
k191105
No description available
sy-shi
A document describing the logic of RL with Gym.
jacobarrio
Interactive guide to understanding Tensor Logic - designed for RL engineers
umd-xlab
Robust safety verification of RL agents using Signal Temporal Logic (STL)
Nozidoali
Pre-trained RL agent that synthesizes tautology or near-tautology logic
richardzhangatuoe
Safe RL framework integrating Linear Temporal Logic (LTL) constraints into PPO via a logic-to-cost mechanism.
AHartNtkn
An attempt to create an RL environment so that RL algorithms can do math and logic, treated as a sort of game.
rupeshsjce
AI and ML based Projects | Minimax Algorithm | Alpha-Beta Pruning | Go Game | MNIST DIGIT RECOGNISER | LOGIC | FOL | RL
sagar0x0
Optimizing RL training with GRPO. This repo implements the RL pipeline logic and optimizes it with different techniques, including custom kernels, mainly reducing overhead in the GRPO step. Features detailed Nsight profiling and benchmarking.
rahulpanchall7
Virtual robotics racing simulation using Isaac Sim and ROS2. Two Robotnik Summit robots race using PID and RL-based control. Customize logic and experiment with robot behavior!
Johnny95420
This project uses Reinforcement Learning (RL) and LoRA to boost Qwen2-0.5B's reasoning, based on ProRL. The model surpasses instruction-tuning on math benchmarks and learns general problem-solving logic.
thundivalappil
Autonomous driving is a sequential decision-making problem under uncertainty. Instead of relying solely on rule-based logic, Reinforcement Learning (RL) learns driving behavior through interaction: the agent observes the environment state, takes actions, and improves a policy by maximizing long-term reward.
Horese07
A small reinforcement learning project that trains an agent to play Snake. The repository contains game logic, an RL agent and model code, a replay buffer implementation, utilities, and a script to play the game as a human. A training progress image is included to show example training results.