hai zhonghai1995

10 followers · 150 following

Stars

google-deepmind / bsuite

bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent

Python 1,508 182 Updated Apr 13, 2024

McGill-NLP / VinePPO

Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"

Python 76 6 Updated Oct 24, 2024

gxywy / rl-plotter

✨ A plotter for reinforcement learning (RL)

Python 207 30 Updated Dec 8, 2021

google-deepmind / alphastar

Python 410 55 Updated Sep 8, 2022

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.

5,007 279 Updated Nov 1, 2024

histmeisah / Large-Language-Models-play-StarCraftII

TextStarCraft2, a pure-language environment that supports LLMs playing StarCraft II

Python 207 14 Updated Oct 18, 2024

Michael-Beukman / RobocupGym

Reinforcement Learning inside a 3D soccer simulation

Python 24 Updated Sep 15, 2024

tinkoff-ai / CORL

High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC

Python 1,083 131 Updated Aug 3, 2023

cor3bit / bertsekas-marl

PyTorch Implementation of the Sequential Multiagent Rollout algorithm

Python 10 3 Updated Jun 28, 2024

dunnolab / xland-minigrid

JAX-accelerated Meta-Reinforcement Learning Environments Inspired by XLand and MiniGrid 🏎️

Python 196 15 Updated Oct 11, 2024

proroklab / VectorizedMultiAgentSimulator

VMAS is a vectorized differentiable simulator designed for efficient Multi-Agent Reinforcement Learning benchmarking. It comprises a vectorized 2D physics engine written in PyTorch and a set …

Python 335 69 Updated Nov 5, 2024

Farama-Foundation / Metaworld

Collections of robotics environments geared towards benchmarking multi-task and meta reinforcement learning

Python 1,266 273 Updated Nov 5, 2024

Haichao-Zhang / PEX

Policy Expansion for Bridging Offline-to-Online Reinforcement Learning (ICLR23)

Python 47 5 Updated Apr 4, 2023

PKU-MARL / DexterousHands

This is a library that provides dual dexterous hand manipulation tasks through Isaac Gym

Python 655 80 Updated Jun 20, 2024

shariqiqbal2810 / maddpg-pytorch

PyTorch Implementation of MADDPG (Lowe et al., 2017)

Python 573 129 Updated Nov 26, 2019

twni2016 / pomdp-baselines

Simple (but often Strong) Baselines for POMDPs in PyTorch, ICML 2022

Python 302 40 Updated Aug 22, 2024

ikostrikov / rlpd

Python 222 25 Updated Feb 13, 2023

vitchyr / viskit

rllab's viskit with some added features

Python 73 35 Updated May 1, 2023

google-deepmind / distrax

Python 535 32 Updated Sep 18, 2024

my-yy / s2v_rc

Speech2Vec Reality Check

Python 75 4 Updated Feb 21, 2023

RLE-Foundation / rllte

Long-Term Evolution Project of Reinforcement Learning

Python 468 86 Updated Aug 26, 2024

shadps4-emu / shadPS4

PS4 emulator for Windows, Linux, and macOS

C++ 10,765 660 Updated Nov 7, 2024

google-deepmind / optax

Optax is a gradient processing and optimization library for JAX.

Python 1,687 190 Updated Nov 5, 2024

google / flax

Flax is a neural network library for JAX that is designed for flexibility.

Jupyter Notebook 6,105 643 Updated Nov 7, 2024

minitorch / minitorch

The full minitorch student suite.

Python 1,909 402 Updated Aug 17, 2024

facebookresearch / Pearl

A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.

Jupyter Notebook 2,660 163 Updated Nov 6, 2024

karpathy / LLM101n

LLM101n: Let's build a Storyteller

29,638 1,620 Updated Aug 1, 2024

jayeshs999 / sapg

Code for SAPG: Split and Aggregate Policy Gradients (ICML 2024)

Jupyter Notebook 41 2 Updated Sep 17, 2024

Emerge-Lab / gpudrive

GPU-acceleration of Nocturne via Madrona

Jupyter Notebook 225 20 Updated Nov 7, 2024

mantle2048 / rlplot

rlplot is an easy-to-use, highly encapsulated RL plotting library (including a basic error-bar line plot and a wrapper for "rliable").

Python 26 3 Updated Dec 8, 2023