Search Results

Found 90 repositories (showing 30)

Mixed_Input_PPO_CNN_LSTM_Car_Navigation

ZHONGJunjie86

❤️35

Accepted by AROB 2021. A car agent navigates complex traffic conditions using the Mixed_Input_PPO_CNN_LSTM model.

Python

Updated 9 months ago

a2c, cnn, gama, +5

PPO_PyBullet_Minitaur

EricChen0104

🧡60

This project implements a Proximal Policy Optimization (PPO) reinforcement learning agent to train the Minitaur robot to walk in the MinitaurBulletEnv-v0 environment using PyBullet. The agent uses a multilayer perceptron (MLP) to model the policy and value networks and learns to control the robot in a continuous action space.

MIT

Python

Updated 1 week ago

gymlocalmolocomotion+9

Humanoid-Robot-Reinforcement-Learning-PPO

mominalix

❤️45

This repository contains a project that leverages reinforcement learning to make a humanoid robot walk in a PyBullet simulation. It uses a custom Gym environment, a Proximal Policy Optimization (PPO) agent, and a provided URDF file for the robot model. The training process prints rewards per generation and visualizes the robot's behavior.

Python

Updated 1 month ago

gym-environment, humanoid-robot, ppo, +6

kukahusky_pybullet_ppo

Zunchao

❤️35

training ppo for mm kuka + husky, models and agent from pybullet

Python

Updated 2 years ago

ProjectGitGud

Ali-Najar

❤️35

**This is a reinforcement learning model using PPO to train an agent to defeat Dark Souls 3 enemies.**

Python

Updated 3 months ago

adversarial-coevolution

Nikelroid

🧡65

Adversarial Co-Evolution of RL and LLM Agents: A framework for training high-performance PPO agents against Large Language Models in Gin Rummy, utilizing curriculum learning and knowledge distillation.

Python

Updated 18 hours ago

curriculum-learning, gin-rummy, knowledge-distillation, +8

RL_Adaptive_Learning_Final_Project

AdamQin1

❤️40

A reinforcement learning–based adaptive learning system that trains DQN and PPO teacher agents for question selection, using a Deep Knowledge Tracing (DKT) student model and alternative parametric student models.

MIT

Jupyter Notebook

Updated 3 months ago

rl-mpc-lane-keeping

Yuanzhe-Nikola-Chen

🧡60

A benchmark comparing MPC and PPO for safe autonomous lane-keeping using a kinematic bicycle model. Includes a Gym-style environment, shooting-based MPC, and a PyTorch PPO agent. Not affiliated with Mobileye, but inspired by industry ADAS/AV planning practices.

MIT

Python

Updated 2 weeks ago

AI-Economics

justiNNovick

❤️20

A research project that experiments with integrating Reinforcement Learning into the Real Business Cycle economic model. This multi-agent environment features learning via Proximal Policy Optimization (PPO) and tests different utility functions from economic theory as reward signals.

Python

Updated 8 months ago

RNN-Powered-Trading-Agent-using-PPO

SruthiVihitha

❤️45

An end-to-end project that trains, evaluates, and visualizes RL trading agents which use RNNs (LSTM / GRU) or MLPs as market-state encoders. Includes an ablation study (LSTM vs GRU vs MLP), a production-grade LSTM agent trained with PPO, and a Streamlit GUI to load models, backtest and generate live signals.

Python

Updated 2 months ago

PPO-Agent-with-Generative-Modeling-in-Leduc-Poker

kailiu0712

❤️45

Uses a generative neural network as an opponent model in Reinforcement Learning to improve a PPO agent's performance in heads-up Leduc poker. The opponent predictor is trained synchronously with the policy and value networks. Two-stage training with a random opponent and self-play. Successfully increased reward and win rate.

Python

Updated 1 month ago

CybORG-CAGE-1-Public

Mel-Meijer

❤️35

BSc Computer Science with Security and Forensics Final Project. Explores the capability of autonomous deep reinforcement learning models to defend a network against a simulated attacker in the CAGE challenge 1 environment scenario. Trains and compares PPO, DQN, and A2C in their ability to defend a network from the two different attacking agents.

Python

Updated 11 months ago

DRL_Stock_Analyzer

Shafwansafi06

❤️40

Use historical stock data and train a Deep Reinforcement Learning (DRL) agent using PPO to model market trends.

MIT

Python

Updated 6 months ago

Dark-Thermodynamic-Mind

Devanik21

💛70

Dark Zero Point Genesis: PPO Latent World Models Under Thermodynamic Scarcity. 256 agents. 128D Latent Manifolds. Zero supervision. Agents utilize PPO-clipped surrogate objectives. Survival = Predictive Error Coding (PEC) × Energy Efficiency across a 50/15 Seasonal Cycle.

MIT

Python

Updated 1 day ago

autopoietic-systems, cognitive-architecture, energy-efficiency, +7

NASA-Space-Apps-Commercialising-LEO-by-OptimAI

green-hat-001

❤️30

2D orbital rocket sim with PPO in PyTorch. Models thrust, drag, gravity, fuel; agent learns efficient ascent. Includes telemetry & visualization

Python

Updated 2 months ago

ai, ppo-algorithm, python3, +1

next-financial-decision-model

Hanbry

❤️30

Next Financial Decision Model: A reinforcement learning project that develops a trading agent for financial environments. The agent is implemented using the PPO algorithm from TensorFlow's tf-agents library. The project includes a custom financial environment and a data-driven reward system.

Python

Updated 2 years ago

flappy-dqn

Labeeb-coder

❤️35

Flappy Bird Reinforcement Learning Agent. This project trains an AI agent to play the Flappy Bird game using Deep Reinforcement Learning techniques (DQN, PPO, and Stable-Baselines3). It includes game logic, training scripts, evaluation tools, and pre-trained models.

Python

Updated 5 months ago

Super-Mario-Bros-AI-Training-and-Evaluation

Patenro

❤️35

This repository contains code for training an AI agent to play the Super Mario Bros game using reinforcement learning algorithms such as DQN, A2C, and PPO. It also includes evaluation of the trained models and testing of the best-performing model.

Updated 1 year ago

UWSN-PPO-Routing-Optimization

BENKRIMEN

❤️45

Reinforcement Learning-based Underwater Wireless Sensor Network Routing using PPO. This project implements a complete UWSN environment with acoustic propagation, energy models, path-loss, and a Proximal Policy Optimization agent for optimal routing.

Python

Updated 1 month ago

RL-Robotic-Arm-Control

Bautistao2

❤️40

This project implements a robotic arm control system using Deep Reinforcement Learning. The model is trained using Proximal Policy Optimization (PPO) in a simulated environment built with PyBullet. The goal is to achieve precise target-reaching motions through optimized agent training.

MIT

Python

Updated 7 months ago

DRL-Trading-Agents

varshil247

❤️35

An analysis of Actor-Critic based A2C and PPO trading agents in a custom-built OpenAI Gym environment. Evaluates the effects of environment knowledge (OHLCV vs Technical Indicators), reward functions (Log Returns vs Sharpe), and model architecture (MLP vs LSTM) on profitability and model stability metrics. Backtested against the SPY index.

Jupyter Notebook

Updated 9 months ago

advance-time-series-forecasting-with-deep-reinforcement-learning

gavisangavi2502-max

❤️35

Deep Reinforcement Learning is used to train a trading agent on synthetic financial time-series data. A custom Gym environment models Buy, Sell, Hold decisions. PPO learns to maximize returns vs a moving-average baseline using Sharpe ratio, drawdown, and cumulative profit metrics.

Python

Updated 4 months ago

LLM-Guided-Reinforcement-Learning-for-BipedalWalker-v3

abhaydwived

🧡65

Automated LLM-Guided Reinforcement Learning Testbed. This project leverages the modern BipedalWalker-v3 environment from Gymnasium to orchestrate a continuous cycle of agent training and intelligent reward shaping. By combining Stable Baselines3's PPO algorithm with the reasoning capabilities of Large Language Models (LLMs)

Python

Updated 5 days ago

biped, bipedalwalker-v3, llm-guided-rl, +3

Sentiment-Augmented-Deep-RL-Factor-Portfolio

pragyan2905

🧡60

A hybrid quantitative framework fusing Bi-LSTM networks for latent alpha factor extraction and Proximal Policy Optimization (PPO) for model-free execution. The system processes high-dimensional sentiment and technical signals to drive a stochastic policy gradient agent, optimizing dynamic portfolio allocation for maximal risk-adjusted returns.

GPL-3.0

Python

Updated 2 weeks ago

AlgoAgent

Licensed-Driver

🧡55

A template for training a single‑ticker reinforcement learning (RL) trader entirely via backtesting. It pulls OHLCV data from the Alpaca API, adds many technical indicators, simulates realistic execution with bid/ask spread and IBKR commission models, and trains an LSTM PPO agent (Stable‑Baselines3) inside a Gymnasium environment.

Python

Updated 2 weeks ago

Cartpole-v1_RL

soudeepan

❤️35

Welcome to my repository showcasing my adventure into Reinforcement Learning with the CartPole-v1 environment using the powerful Proximal Policy Optimization (PPO) model. Here, you'll find the code and resources detailing my journey as I trained an AI agent to balance a pole on a moving cart.

Jupyter Notebook

Updated 1 year ago

Sovereign-Risk-and-Protfolio-Allocation

thylinao1

❤️35

Sovereign default prediction using World Bank and FRED macro data. Compares two-tower neural embeddings vs tree-based models, then applies PPO reinforcement learning for risk-aware bond allocation across 117 countries. RL agent achieves 13-20% improvement over equal-weight baseline. Temporal validation on 34 years of data from 1990-2023.

Jupyter Notebook

Updated 3 months ago

AI-Lunar-Lander

Acrazt03

🧡50

A neural network is trained to land the Eagle lander on a virtual moon. It starts at an altitude of 80m at a random position and lands without crashing or running out of fuel. The model was trained using PPO and the ml-agents library, and the environment was built in the Unity game engine.

Apache-2.0

ASP.NET

Updated 1 month ago

agentic-ppo-model

ignius299792458

🧡55

Implemented and benchmarked a PPO agent in PyTorch across OpenAI Gym environments (CartPole, LunarLander, MountainCar), studying policy gradient convergence, reward shaping, and hyperparameter sensitivity under continuous and discrete action spaces.

Python

Updated 3 weeks ago

MARL-Traffic

feiLinX

❤️35

Deep Multi-agent Reinforcement Learning Model: DQN, PPO, ACKTR

Updated 4 years ago
