Found 240 repositories (showing 30)
vietnh1009
Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
taherfattahi
Proximal Policy Optimization (PPO) algorithm using PyTorch to train an agent for a rocket landing task in a custom environment
vietnh1009
Proximal Policy Optimization (PPO) algorithm for Contra
henrycharlesworth
Application of the proximal policy optimization algorithm to the card game Big 2 using TensorFlow
elsheikh21
Implementation of a deep reinforcement learning algorithm, Proximal Policy Optimization (SOTA), on a continuous-action-space OpenAI Gym environment (Box2D/Car Racing v0)
Stock trading strategies play a critical role in investment. However, it is challenging to design a profitable strategy in a complex and dynamic stock market. In this paper, we propose a deep ensemble reinforcement learning scheme that automatically learns a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using the three actor-critic based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). The ensemble strategy inherits and integrates the best features of the three algorithms, thereby robustly adjusting to different market conditions. In order to avoid the large memory consumption in training networks with continuous action space, we employ a load-on-demand approach for processing very large data. We test our algorithms on the 30 Dow Jones stocks which have adequate liquidity. The performance of the trading agent with different reinforcement learning algorithms is evaluated and compared with both the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy. The proposed deep ensemble scheme is shown to outperform the three individual algorithms and the two baselines in terms of the risk-adjusted return measured by the Sharpe ratio.
ZhihanLee
A PyTorch implementation of constrained reinforcement learning algorithms, including Constrained Soft Actor Critic (Soft Actor Critic Lagrangian) and Proximal Policy Optimization Lagrangian
AIResearcherHZ
This is a MATLAB-based reinforcement learning framework that includes the Proximal Policy Optimization (PPO) algorithm and its multi-agent extension (MAPPO). It supports GPU acceleration and parallel computing, making it suitable for research and engineering applications in control systems.
akjayant
This repository has code for the paper "Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm" accepted at NeurIPS 2022.
vietnh1009
Proximal Policy Optimization (PPO) algorithm for Sonic the Hedgehog
Solrikk
TradeWhisperer is a sophisticated cryptocurrency trading bot that leverages advanced Reinforcement Learning techniques, specifically the Proximal Policy Optimization (PPO) algorithm, to navigate the complex world of crypto markets. Built with a focus on adaptability and risk management, this bot combines technical analysis with machine learning.
dragen1860
PyTorch implementation of the Proximal Policy Optimization algorithm
adi3e08
A clean and minimal implementation of the PPO (Proximal Policy Optimization) algorithm in PyTorch, for continuous action spaces.
ImmanuelXIV
Reinforcement Learning | Multi-Agent RL | Self-Play | Proximal Policy Optimization Algorithm (PPO) agent | Unity Tennis environment
Chris-hughes10
A clean, modular implementation of the Proximal Policy Optimization (PPO) algorithm in PyTorch, written with a strong focus on readability and educational value, as well as performance.
maitchison
Example implementation of the Proximal Policy Optimization algorithm
dhyeythumar
Implementation of Proximal Policy Optimization algorithm on a custom Unity environment.
Jiankai-Sun
Proximal Policy Optimization (PPO) algorithm and its distributed implementation in PyTorch
Landing a SpaceX Falcon Heavy rocket in simulation using reinforcement learning. Reinforcement learning is a technique that lets an agent learn how best to act in an environment using rewards as its signal. OpenAI released a library called Gym that makes it easy to train AI agents. We'll use the Stable Baselines and Gym libraries to build an RL agent capable of landing a rocket perfectly. The specific algorithm we will use is proximal policy optimization, an improved version of the actor-critic algorithm.
shareeff
TensorFlow implementation of the proximal policy optimization (PPO) algorithm
This is a deterministic TensorFlow 2.0 (Keras) implementation of OpenAI's Proximal Policy Optimization (PPO) actor-critic algorithm.
reinai
Implementation of Trust Region Policy Optimization and Proximal Policy Optimization algorithms on the objective of Robot Walk.
ian0
Teaching the Donkey car to drive a track in the simulator using State Representation Learning and different Reinforcement Learning Algorithms including Deep Q-Network, Soft Actor-Critic and Proximal Policy Optimization Algorithms.
sidharthmohannair
Comparative Study of Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) for Autonomous Drone Navigation using ROS and Gazebo. This project explores the performance of these algorithms in complex environments, focusing on navigation efficiency, obstacle avoidance, and learning efficiency.
jamesliu
An efficient implementation of the Proximal Policy Optimization (PPO) algorithm with linear and attention policy for reinforcement learning.
LiubovSobolevskaya
Implementation of the Advantage Actor Critic (A2C) and Proximal Policy Optimization (PPO) algorithms that uses the advantages of TensorFlow 2.x.
s1ddh-rth
This project explores the application of reinforcement learning (RL) to train humanoid robots for dynamic rock climbing movements, focusing on achieving the challenging "dyno" maneuver. Using the Proximal Policy Optimization (PPO) algorithm, the simulation integrates physics-based environments to model realistic climbing scenarios.
iuliagroza
A Proximal Policy Optimization Approach to Detect Spoofing in Algorithmic Trading
rossettisimone
Proximal Policy Optimization algorithm applied to Pong in a discrete environment
mycode2021
Deep reinforcement learning project that uses OpenAI's Retro environment, proximal policy optimization, and random network distillation to play Contra.