This package provides a debugging tool for Deep Reinforcement Learning (DRL) frameworks, designed to detect and address DNN and RL issues that may arise during training. The tool allows you to monitor your training process in real-time, identifying any potential flaws and making it easier to improve the performance of your DRL models.

The implementation is clean and simple, with research-friendly features. The key features of DRLDebugger are:

  • 📜 Straightforward integration
    • DRLDebugger can be integrated into your project with a few lines of code.
  • 🗳️ DNN + RL checks
  • 🛃 Custom checks
  • 🖥️ Real-time warnings
  • 📈 Monitoring using Weights and Biases

Get started


  • Python >=3.7.7,<3.10 (not yet tested on 3.10)

Step 1. Install the debugger in your python environment:

git clone <repository-url> && cd RLDebugger
pip install -e .

Step 2. Create a .yml config file and copy the following lines:

debugger:
  name: 'Debugger'
  kwargs:
    observed_params:
      constant: []
      variable: []
    check_type:
      - name: #Checker_Name_1
      - name: #Checker_Name_2

Step 3. Set up the config and run the debugger:

from debugger import rl_debugger

rl_debugger.set_config(config_path="the path to your debugger config.yml file")
# Then call rl_debugger.debug(...) during training with the parameters your Checkers need.


For detailed steps on how to integrate the debugger, please refer to the "Integrating the Debugger" section below.

Checkers Included in this Package

The checkers included in this package fall into two categories: DNN-specific checkers and RL-specific checkers. The DNN-specific checkers were adapted from the paper "Testing Feedforward Neural Networks Training Programs" [1]. We would like to thank the paper's authors for providing the code for these checks. The checks have been adapted to work in the DRL context and migrated from TensorFlow 1 to PyTorch. The following table lists all the checkers, with a link to the location of each in the package, where you can find more details.

Category     Check                  Description
DNN Checks   Activation             Link to Description
             Loss                   Link to Description
             Weight                 Link to Description
             Bias                   Link to Description
             Gradient               Link to Description
             ProperFitting          Link to Description
DRL Checks   Action                 Link to Description
             Agent                  Link to Description
             Environment            Link to Description
             ExplorationParameter   Link to Description
             Reward                 Link to Description
             Step                   Link to Description
             State                  Link to Description
             UncertaintyAction      Link to Description
             ValueFunction          Link to Description

Integrating the Debugger tool in your DRL PyTorch Application

To integrate the tool, follow the steps below:

1. Prepare the config file

Create a .yml config file and copy the following lines:

debugger:
  name: 'Debugger'
  kwargs:
    observed_params:
      constant: []
      variable: []
    check_type:
      - name: #Checker_Name_1
        period: #Checker_period_value
        skip_run_threshold: #Checker_skip_run_value
      - name: #Checker_Name_2

The debugger config must keep this structure and includes the following elements:

  • observed_params (only fill in the two lists when you develop a new Checker): the elements observed by the debugger. A list of default observed params (reward, actions, ...) already exists, so please only add non-default params. The list of default params can be found here. constant or variable indicates the nature of the observed param.
  • check_type: the names of the checks you want to perform. Replace #Checker_Name_1 and #Checker_Name_2 with Checker names from the Check column of the table above (you can add as many Checkers as you want).
    • period: the period at which the Checker is called. If you don't specify a period, the default value from the Checker's config data class is used automatically.
    • skip_run_threshold: the number of skipped steps over which the Checker is called. If you don't specify a value, the default value from the Checker's config data class is used automatically.
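
As an illustration, a filled-in config can be generated from Python. This is only a sketch: `render_config` is a hypothetical helper that is not part of the package, the period and threshold values are arbitrary, the Checker names are taken from the table above, and the top-level `debugger` / `kwargs` layout is an assumption that you should adapt to the structure the package expects.

```python
from pathlib import Path
import tempfile


def render_config(checkers):
    """Render a debugger config as YAML text (hypothetical helper)."""
    lines = [
        "debugger:",            # assumed top-level layout
        "  name: 'Debugger'",
        "  kwargs:",
        "    observed_params:",
        "      constant: []",
        "      variable: []",
        "    check_type:",
    ]
    for checker in checkers:
        lines.append(f"      - name: {checker['name']}")
        # Optional keys: when omitted, the package defaults would apply.
        for key in ("period", "skip_run_threshold"):
            if key in checker:
                lines.append(f"        {key}: {checker[key]}")
    return "\n".join(lines) + "\n"


config_text = render_config([
    {"name": "Weight", "period": 10, "skip_run_threshold": 2},
    {"name": "Reward"},  # relies on the default period and threshold
])

# Write the config to a temporary location for the sake of the example.
config_path = Path(tempfile.mkdtemp()) / "debugger_config.yml"
config_path.write_text(config_text)
print(config_text)
```

The resulting file path is what you would pass as config_path to rl_debugger.set_config.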

2. Installation and Importing

Please note that this step is temporary, as the project is still in development.

To set up the debugger in your Python environment, follow these steps:
  1. Clone the repository
  2. cd 'path to RLDebugger repo'
  3. Run the command pip install -e .
  4. Import the debugger in your code with the following line:
from debugger import rl_debugger
  5. Set up the configuration using the following line:
rl_debugger.set_config(config_path="the path to your debugger config.yml file")

3. Configuring each Checker (Optional)

This step is useful if you want to change a Checker's thresholds or deactivate one of its checks. You can modify the configuration of each Checker by changing the parameters in its config data class or by creating your own instance.

For instance, the Weight Checker performs several different checks on the weights during training, and you may want to modify the thresholds or deactivate a specific check. To do so, modify the parameters of WeightConfig, which you will find here, as shown below:

from dataclasses import dataclass

@dataclass
class WeightConfig:
    start: int = 100
    period: int = 10
    skip_run_threshold: int = 2
    numeric_ins: NumericIns = NumericIns()
    neg: Neg = Neg()
    dead: Dead = Dead()
    div: Div = Div()
    initial_weight: InitialWeight = InitialWeight()
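
The "create your own instance" option can be sketched with stand-in data classes. The classes below are simplified copies written for this example, and the `disabled` and `threshold` fields of `Dead` are hypothetical; check the package's config module for the real field names before relying on them.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Dead:
    # Hypothetical fields, used here only to illustrate the pattern.
    disabled: bool = False
    threshold: float = 0.5


@dataclass
class WeightConfig:
    start: int = 100
    period: int = 10
    skip_run_threshold: int = 2
    dead: Dead = field(default_factory=Dead)


default_cfg = WeightConfig()

# Build a modified copy: run less often and switch one check off.
custom_cfg = replace(default_cfg, period=50, dead=Dead(disabled=True))
print(custom_cfg)
```

Using `dataclasses.replace` leaves the default instance untouched, so the original values stay available for comparison.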

4. Run the debugging

To run the debugging, use the following code:

from debugger import rl_debugger

rl_debugger.debug(...)  # pass the observed parameters as keyword arguments

This function runs the Checkers selected in the config file. It can be called from any class or file in your project (you can think of it as a global function).

For example, to run the Action Checker, the debug function should receive three parameters: actions_probs, max_total_steps, and max_reward. Moreover, debug can be called from different parts of the code with whichever parameters are available at that point; the Checker waits until it has received all of its parameters before it starts running.
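
The waiting behaviour described above can be pictured with a toy class. This is an illustration of the idea, not the package's implementation: `WaitingChecker` and its attributes are invented for this sketch, and the split into constant and variable parameters mirrors the two lists in the config file.

```python
class WaitingChecker:
    """Toy checker that runs only once all of its parameters have arrived."""

    def __init__(self, constant, variable):
        self.constant = set(constant)   # sent once, kept for the whole run
        self.variable = set(variable)   # expected again before every run
        self.received = {}
        self.runs = 0

    def debug(self, **params):
        # Store only the parameters this checker is interested in.
        for key, value in params.items():
            if key in self.constant or key in self.variable:
                self.received[key] = value
        if (self.constant | self.variable).issubset(self.received):
            self.runs += 1
            # Forget the variable parameters so that fresh values are required.
            for key in self.variable:
                del self.received[key]
            return True
        return False


checker = WaitingChecker(
    constant=["max_total_steps", "max_reward"],
    variable=["actions_probs"],
)
first = checker.debug(max_total_steps=10_000, max_reward=500)  # still waiting
second = checker.debug(actions_probs=[0.7, 0.3])               # everything received
third = checker.debug(actions_probs=[0.6, 0.4])                # constants are remembered
print(first, second, third, checker.runs)
```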

It is important to note that the environment is the only parameter required for the debugger to properly operate. Also, the environment needs to be passed at the beginning of your code and in the first call of ".debug()".

import gym
from debugger import rl_debugger

env = gym.make("CartPole-v1")
# Pass env to the debugger here, in the first call to rl_debugger.debug().

It is also important to note that, when calling debug, the keys of the parameters (args) must exactly match the names listed in the configuration file under observed_params. Parameters that are not required (i.e., not used by any of the Checkers) can be omitted.

5. How to interpret the results

When you run the training, the debugging process will generate warning messages indicating any errors that have occurred and the elements that caused them. To help you better understand the results of the debugging process, it's important to carefully review these messages and take the necessary actions to resolve any issues that they highlight.

6. Data visualization

To help you better understand the results of the debugging process, we added visualization options for some checkers (Action and Reward checkers) using Wandb. You can follow the same logic in your custom checkers if you want to add visualization to them. Please explore here for more details.

Creating a new Checker

1. Create the Checker Class

To create a new checker, you can follow the structure outlined in the code snippet below:

from debugger import DebuggerInterface
from dataclasses import dataclass

@dataclass
class CustomCheckerConfig:
    period: int = ...
    other_data: float = ...

class CustomChecker(DebuggerInterface):
    def __init__(self):
        super().__init__(check_type="CustomChecker", config=CustomCheckerConfig)
        # you can add other attributes

    #  You can define other functions

    def run(self, observed_param):
        if self.check_period():
            # Do some instructions ....
            self.error_msg.append("your error message")

To create a new Checker, you need to include the following elements in your class:

  1. CustomCheckerConfig data class: This class is mandatory and defines all the configurations necessary for running your custom Checker. It is necessary to include the period element, which determines the periodicity of the debugging (if you want the Checker to run only before the training, set its value to 0).

  2. Your Checker class: Your Checker should inherit from the DebuggerInterface class and initialize itself by calling super() with the name of your Checker and its config.

  3. The run function: The function should include three important elements: the parameters you need (as mentioned in the observed_params in the config.yml file), the periodicity check using the predefined check_period() function, and appending the messages you want to display to the self.error_msg list.


  • If you need to know the number of times the Checker has run, you can check it using the self.iter_num variable.
  • If you need to know the number of steps the RL algorithm has performed, you can check it using the self.step_num variable.
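
To make the three elements concrete, here is a small self-contained sketch. The `DebuggerInterface` class below is a stand-in written for this example so that it runs without the package; its `check_period` logic (fire every `period` calls) is an assumption, and `RewardRangeChecker` with its `max_reward` setting is a hypothetical Checker. In a real project you would import DebuggerInterface from debugger instead.

```python
from dataclasses import dataclass


class DebuggerInterface:
    """Stand-in for the package's base class, so the sketch runs on its own."""

    def __init__(self, check_type, config):
        self.check_type = check_type
        self.config = config
        self.iter_num = 0
        self.error_msg = []

    def check_period(self):
        # Assumed behaviour: count the calls and fire every `period` calls.
        self.iter_num += 1
        return self.iter_num % self.config.period == 0


@dataclass
class RewardRangeConfig:
    period: int = 2          # mandatory element: how often the check runs
    max_reward: float = 1.0  # hypothetical setting for this example


class RewardRangeChecker(DebuggerInterface):
    def __init__(self):
        super().__init__(check_type="RewardRange", config=RewardRangeConfig())

    def run(self, reward):
        if self.check_period():
            if reward > self.config.max_reward:
                self.error_msg.append(
                    f"reward {reward} exceeds max_reward {self.config.max_reward}"
                )


checker = RewardRangeChecker()
for reward in [0.5, 3.0, 0.2, 0.9]:
    checker.run(reward)
print(checker.error_msg)
```

With a period of 2, only the second and fourth rewards are inspected, so a single message is recorded for the value 3.0.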

2. Register the Checker

To run your new Checker, you need to register it by adding the following line to your code:

rl_debugger.register(checker_name="CustomChecker", checker_class=CustomChecker)
# the register method should be called before the set_config method

Finally, to run your Checker, all you have to do is add "CustomChecker" to the config.yml file and run the training.
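
One plausible reason for the ordering requirement is that Checker names are resolved when the config is set. The toy registry below illustrates that idea only; `ToyDebugger` is invented for this sketch and is not how the package is necessarily implemented.

```python
class ToyDebugger:
    """Toy registry: names are looked up at the moment the config is set."""

    def __init__(self):
        self.registry = {}
        self.active = []

    def register(self, checker_name, checker_class):
        self.registry[checker_name] = checker_class

    def set_config(self, checker_names):
        # An unknown name raises KeyError here, hence "register first".
        self.active = [self.registry[name]() for name in checker_names]


class CustomChecker:
    pass


toy = ToyDebugger()
toy.register(checker_name="CustomChecker", checker_class=CustomChecker)
toy.set_config(["CustomChecker"])
print(toy.active)
```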

Important Notes

  1. The environment is the only parameter required for the debugger to properly operate. Also, the environment needs to be passed at the beginning of your code and in the first call of ".debug()".
import gym
from debugger import rl_debugger

env = gym.make("CartPole-v1")
# Pass env to the debugger here, in the first call to rl_debugger.debug().
  2. Every observed parameter (e.g., model, target_model, action_probs, etc.) needs to be sent to the debugger through ".debug()" from a single code location. The following code snippet shows incorrect usage:
from debugger import rl_debugger

state, reward, done, _ = env.step(action)
qvals = qnet(state)
rl_debugger.debug(model=qnet)  # first call site
batch = replay_buffer.sample(batch_size=32)
qvals = qnet(batch["state"])
rl_debugger.debug(model=qnet)  # second call site: the model is sent again

The code above misbehaves because the model is sent to the debugger twice. Avoid sending the same observed parameter from two different code locations.
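
During development, a small guard can help spot this mistake early. The sketch below is a hypothetical helper, not part of the package: `CallSiteGuard` records the file and line from which each parameter name was first sent and raises if the same name later arrives from a different location.

```python
import inspect


class CallSiteGuard:
    """Hypothetical helper: remembers where each observed parameter was first sent."""

    def __init__(self):
        self.sites = {}

    def check(self, **params):
        caller = inspect.currentframe().f_back
        site = (caller.f_code.co_filename, caller.f_lineno)
        for key in params:
            first_site = self.sites.setdefault(key, site)
            if first_site != site:
                raise RuntimeError(
                    f"'{key}' sent from {site}, but first sent from {first_site}"
                )


guard = CallSiteGuard()
for _ in range(3):
    guard.check(model="qnet")  # same location on every iteration: accepted
print(guard.sites)
```

Calling guard.check with the same keyword arguments right before each debug call would flag a second call site as soon as it is reached.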

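  3. If you run tests during the learning process, you have to turn the debugger off before the test run and back on afterwards. Otherwise, unexpected behavior may arise.
from debugger import rl_debugger

def run_testing(self):
    # Turn the debugger off before the test run.
    results = super().run_testing()
    # Turn the debugger back on before training resumes.
    return results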
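
One way to make sure the debugger is always switched back on is a context manager. This is a sketch under assumptions: the method names `turn_off` and `turn_on` are assumed here, and `StubDebugger` stands in for rl_debugger so that the example runs without the package.

```python
from contextlib import contextmanager


class StubDebugger:
    """Stand-in for rl_debugger; turn_off/turn_on are assumed method names."""

    def __init__(self):
        self.active = True

    def turn_off(self):
        self.active = False

    def turn_on(self):
        self.active = True


@contextmanager
def debugger_paused(debugger):
    debugger.turn_off()
    try:
        yield debugger
    finally:
        debugger.turn_on()  # re-enable even if the test run raises


stub = StubDebugger()
with debugger_paused(stub):
    state_inside = stub.active  # the test run would happen here
state_after = stub.active
print(state_inside, state_after)
```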
  4. It is recommended to pass all the constant observed params (e.g., max_reward, loss_fn, max_total_steps, ...) at the beginning of your code, in the first call to ".debug()". These params need to be observed only once, and many Checkers rely on them, so providing them once at the beginning of your code is more efficient.
import gym
from debugger import rl_debugger

env = gym.make("CartPole-v1")
# Pass env and the constant params here, in the first call to rl_debugger.debug().

Support and get involved

Feel free to ask questions. Posts in GitHub Issues and PRs are also welcome.



