My submission for Project 3 from Udacity's Deep Reinforcement Learning Nanodegree Program.
I've posted a demo video of a trained agent in action, which shows how well it solves the task, along with a plot of the scores during training.
In the reacher environment for the Continuous Control Project, a double-jointed arm can move to target locations. A reward of +0.1 is provided for each step that the agent's hand is in the goal location. Thus, the goal of the agent is to maintain its position at the target location for as many time steps as possible.
The observation space consists of 33 variables corresponding to position, rotation, velocity, and angular velocities of the arm. Each action is a vector with four numbers, corresponding to torque applicable to two joints. Every entry in the action vector should be a number between -1 and 1.
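As a rough illustration of the action constraint described above, the sketch below samples a noisy four-element action and clips every entry to the range [-1, 1]. The helper name `random_action`, the constant names, and the use of Gaussian noise are assumptions made for this example only; they are not taken from the notebook.

```python
import numpy as np

STATE_SIZE = 33   # observation size stated in the description above
ACTION_SIZE = 4   # action size stated in the description above


def random_action(rng, scale=1.0):
    """Hypothetical helper: sample a noisy action and clip it to [-1, 1]."""
    raw = rng.normal(loc=0.0, scale=scale, size=ACTION_SIZE)
    # Clipping guarantees every entry respects the environment's bounds.
    return np.clip(raw, -1.0, 1.0)


rng = np.random.default_rng(0)
action = random_action(rng, scale=2.0)
print(action.shape, action.min() >= -1.0, action.max() <= 1.0)
```

Clipping is only one way to keep actions in range. A policy network could instead end in a `tanh` layer, which bounds its outputs to the same interval.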
The task is episodic, and in order to solve the environment, your agent must get an average score of +30 over 100 consecutive episodes.
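The solve criterion above can be checked with a rolling window over episode scores. The following is a minimal sketch, assuming the scores are available as a plain list of floats. The function name `first_solved_episode` and the constants are hypothetical, and the notebook may track this differently.

```python
from collections import deque

TARGET_SCORE = 30.0  # average score required, per the description above
WINDOW = 100         # number of consecutive episodes averaged


def first_solved_episode(scores, target=TARGET_SCORE, window=WINDOW):
    """Return the 1-based episode at which the mean of the most recent
    `window` scores first reaches `target`, or None if it never does."""
    recent = deque(maxlen=window)  # old scores drop out automatically
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None


# Example with made-up scores: 50 poor episodes followed by 100 good ones.
example_scores = [0.0] * 50 + [40.0] * 100
print(first_solved_episode(example_scores))
```

Requiring a full window before testing the average avoids declaring the environment solved after only a handful of lucky episodes.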
If you would like to run this code locally, follow the instructions below.
- Set up your Python environment as described in the dependencies section of the readme from the Deep Reinforcement Learning Nanodegree program.
- Clone this repository.
- Select the environment that matches your operating system from the list below:
- Place the file in the root folder of the cloned repo.
- Unzip (or decompress) the file.
You can train an agent to solve the reacher environment by executing the cells in the Continuous Control notebook. The code in that notebook is self-contained except for a few simple utility functions, which are saved in the Python module util.py.