Reinforcement Learning Short Course
reinforcement-learning q-learning ridesharing policy-gradient dynamic-programming deep-q-network markov-decision-processes policy-iteration value-iteration monte-carlo-methods temporal-differencing-learning model-based-rl policy-based-method fitted-q-iteration off-policy-evaluation offline-rl order-dispatch-recommendation
Updated Dec 16, 2024 - Jupyter Notebook