All code should be covered by unit tests with at least 70% coverage, and by regression tests where appropriate. When a bug is found and fixed, new test(s) should be added that would catch this bug. Please use the pykale discussions on testing to talk about tests and ask for help. The overall and per-commit coverage can be checked conveniently by browsing the respective branch in the codecov report, e.g. check commits under the rewrite-gripnet branch by clicking on specific commits.
These guidelines will help you write tests that target sufficiently compact pieces of code, so that causes of failure are easy to identify, and that also cover larger workflows, so that confidence can be built in the reproducibility of outputs. We use pytest (see the tutorials python testing software carpentry (alpha) and tutorialspoint pytest tutorial). There is some subjectivity in deciding how much of your code's potential behaviour to check.
- Compact pytest tutorial, pytest fixtures, pytest exceptions
- Example unit tests for deep learning code of a variational autoencoder in PyTorch and the related post How to Trust Your Deep Learning Code. (It uses `unittest` but we use `pytest`. To convert a `unittest` to a `pytest`, unittest2pytest is a good starting point.)
- Learn from existing pykale tests, pytorch tests, torchvision tests, and pytest+pytorch examples: fastai1 tests and Kornia tests
- Use GitHub code links to find out definitions and references
- Use Python Test Explorer for Visual Studio Code or pytest in PyCharm to run tests conveniently.
- fastai testing is a good high-level reference. We adapt its recommendations on writing tests below:
- Think about how to create a test of the real functionality that runs quickly, e.g. based on our `examples`.
- Use module scope fixtures to run initial code that can be shared amongst tests. When using fixtures, make sure the test does not modify the global object it received. If there is a risk of modifying a broadly scoped fixture, clone it with a more tightly scoped fixture or create a fresh fixture/object instead (see the fixture sketch after this list).
- Avoid pretrained models, since they have to be downloaded from the internet to run the test.
- Create some minimal data for your test, or use data already in the repo's `data/` directory.
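As a rough illustration of the fixture advice above (the fixture names and configuration values are made up, not existing pykale fixtures): a module-scoped fixture is built once and shared, and a function-scoped copy protects it from modification.

```python
import copy

import pytest


@pytest.fixture(scope="module")
def base_config():
    # Hypothetical shared configuration, built once per test module.
    return {"lr": 0.01, "epochs": 2}


@pytest.fixture
def config(base_config):
    # Function-scoped copy: a test that mutates it cannot affect other tests.
    return copy.deepcopy(base_config)


def test_defaults(base_config):
    # Read-only use of the shared fixture is safe.
    assert base_config["epochs"] == 2


def test_override(config):
    # Modify the per-test copy, not the module-scoped object.
    config["epochs"] = 5
    assert config["epochs"] == 5
```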
See more details below, particularly test data, common parameters, and running tests locally.
The test workflow is defined in `pykale/.github/workflows/test.yml`. It is triggered on every pull request and every push to the main branch, and we also run it at midnight (UK time) each day.
To minimize additional downloading time, test data and pip packages, once downloaded, are stored in the GitHub Actions Caches for future reuse. Test data caching uses strict matching of the cache key, meaning any change in `tests/download_test_data.py` will update the cache. For efficiency, pip package caching uses soft matching, so cache updates are driven by both the date tag and any modifications in `**/setup.py`.
Data needed for testing should be uploaded to pykale/data (preferred) or other external sources. All data downloading processes should be included in `tests/download_test_data.py`. It is recommended to use `download_file_by_url` from `kale.utils.download` for downloading data during tests to `tests/test_data`, as defined by `download_path` in `tests/conftest.py`. More complex test data requirements for your pull request can be discussed in the motivating issue or pykale discussions on testing.
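For illustration only (the URL and file name below are placeholders, and the exact signature of `download_file_by_url` should be checked against `kale.utils.download`), a download step of this kind might look like:

```python
from kale.utils.download import download_file_by_url


def download_demo_data(download_path):
    # Hypothetical entry of the sort that lives in tests/download_test_data.py;
    # the URL and file name are placeholders, not real pykale test data.
    url = "https://github.com/pykale/data/raw/main/demo/demo_data.zip"
    download_file_by_url(url, download_path, "demo_data.zip", "zip")
```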
Consider adding parameters (or objects, etc.) that may be useful to multiple tests as fixtures in a `conftest.py` file, either in `tests/` or the appropriate sub-module.
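For instance (the fixture name and value are illustrative, not existing pykale fixtures), a sub-module `conftest.py` could expose a parameter shared by several tests:

```python
# e.g. tests/loaddata/conftest.py (illustrative)
import pytest


@pytest.fixture(scope="module")
def batch_size():
    # A parameter reused by several tests in this sub-module.
    return 4
```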
To run tests locally, you will need to have installed `pykale` with the development requirements:
git clone https://github.com/pykale/pykale
cd pykale
pip install -e .[dev]
then run:
pytest
A unit test checks that a small "unit" of software (e.g. a function) performs correctly. It might, for example, check that the function `add` returns the number `2` when the list `[1, 1]` is the input.
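For example, a minimal unit test of such a (hypothetical) `add` function might look like:

```python
def add(numbers):
    # A toy function for illustration: add a list of numbers together.
    return sum(numbers)


def test_add():
    assert add([1, 1]) == 2


def test_add_empty_list():
    assert add([]) == 0
```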
Within the `tests/` folder is a folder structure that mimics that of the `kale` Python module. Unit tests for code in a given file in `kale/` should be placed in their equivalent file in `tests/`, e.g. unit tests for a function in `pykale/kale/loaddata/image_access.py` should be located in `pykale/tests/loaddata/test_image_access.py`.
Philosophically, the author of a "unit" of code knows exactly what it should do and can write the test criteria accordingly.
A regression test checks that software produces the same results after a change is made. In `pykale`, we expect regression tests to achieve this by testing several different parts of the software at once (in effect, an integration test). A single regression test might test loading some input files, setting up a model and generating a plot based on the model. This could be achieved by running the software with previously stored baseline inputs and checking the output is the same as previously stored baseline outputs.
Regression tests should be placed in `tests/regression`. Further subfolders can be added as required. We plan to add regression tests covering existing functionality based on examples in the `examples/` folder.
Philosophically, regression tests treat the "past as truth" - the correct output / behaviour is the way it worked before a change.
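A sketch of this pattern (the file paths and the `run_pipeline` helper are hypothetical, not part of pykale):

```python
import numpy as np


def run_pipeline(input_path):
    # Hypothetical stand-in for loading data, fitting a model and producing an output array.
    data = np.loadtxt(input_path, delimiter=",")
    return data.mean(axis=0)


def test_pipeline_matches_baseline():
    # "Past as truth": the baseline output was generated by a previous, trusted run.
    output = run_pipeline("tests/regression/baseline_input.csv")
    baseline = np.loadtxt("tests/regression/baseline_output.csv", delimiter=",")
    np.testing.assert_allclose(output, baseline)
```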
Comparisons / assertions involving `pandas` `DataFrames` (or other `pandas` objects) should be made using the `pandas` utility functions `pandas.testing.assert_frame_equal`, `pandas.testing.assert_series_equal`, `pandas.testing.assert_index_equal` and `pandas.testing.assert_extension_array_equal`.
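For example (with trivially constructed DataFrames rather than real pykale output):

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def test_dataframe_output():
    result = pd.DataFrame({"subject": [1, 2], "score": [0.5, 0.7]})
    expected = pd.DataFrame({"subject": [1, 2], "score": [0.5, 0.7]})
    # Raises an informative AssertionError on any mismatch in values, dtypes or index.
    assert_frame_equal(result, expected)
```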
Comparisons / assertions involving `numpy` arrays (or other `numpy` objects) should be made using the `numpy` testing routines. `numpy`'s response to floating point errors is left at its default (see below).
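For example, comparing floating point arrays within a tolerance:

```python
import numpy as np


def test_array_output():
    result = np.array([0.1 + 0.2, 1.0])
    expected = np.array([0.3, 1.0])
    # assert_allclose compares within a tolerance, unlike a plain element-wise == check.
    np.testing.assert_allclose(result, expected)
```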
Random numbers in pykale are generated using base Python, numpy and pytorch. Prior to making an assertion where objects that make use of random numbers are compared, the `set_seed()` function from `kale.utils.seed` should be called, e.g. in `__init__.py` or `test_<modulename>.py`:
`from kale.utils.seed import set_seed`
then in the test, before the assertion:
`set_seed()`
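Putting this together, a test comparing objects that depend on random numbers might look like the following sketch (assuming `set_seed()` seeds base Python, numpy and pytorch with a default seed, as in `kale.utils.seed`):

```python
import torch

from kale.utils.seed import set_seed


def test_random_tensor_is_reproducible():
    set_seed()  # seed base Python, numpy and pytorch before drawing random numbers
    first = torch.randn(3)
    set_seed()  # re-seed so the second draw repeats the first
    second = torch.randn(3)
    assert torch.equal(first, second)
```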
`pytest` captures log messages of level WARNING or above and outputs them to the terminal.
`numpy` can be configured to respond differently to floating point errors. `pykale` normally uses the default configuration.
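If a particular test does need stricter behaviour, the error response can be changed locally without affecting the rest of the suite, e.g. (a minimal sketch):

```python
import numpy as np
import pytest


def test_divide_by_zero_raises_when_configured():
    # Temporarily make numpy raise on division by zero inside this block only;
    # pykale itself keeps numpy's default behaviour.
    with np.errstate(divide="raise"):
        with pytest.raises(FloatingPointError):
            np.array([1.0]) / np.array([0.0])
```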
Be aware that the code for which you are adding a test may have side effects (e.g. a function changing something in a file or database, as well as returning a variable). For example:
import math

def add(numbers):
    math.pi += numbers[0]  # Side effect: add the first number in the list to math.pi
    return sum(numbers)  # Add the list of numbers together

print("Sum:", add([1, 1]))  # Show the numbers are added
print("pi:", math.pi)  # Show the side effect
will output:
Sum: 2
pi: 4.141592653589793
...having redefined the value of `math.pi`! `math.pi` will be redefined each time the function is run, and nothing returned by the function gives any indication this has happened.
Minimising side effects makes code easier to test. Try to minimise side effects and ensure that, where present, they are covered by tests.
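Where a side effect cannot be removed, a test can at least make it explicit. A sketch, reusing a side-effect-free version of the `add` example above:

```python
import math


def add(numbers):
    # A side-effect-free version of the earlier example.
    return sum(numbers)


def test_add_does_not_change_pi():
    before = math.pi
    assert add([1, 1]) == 2
    # Make the absence of the earlier side effect explicit.
    assert math.pi == before
```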