Benchmarking dvc with pytest-benchmark.
Visit bench.dvc.org
Trigger a dispatch workflow with the desired dataset and revisions, then see the results at bench.dvc.org/run_ID_ATTEMPT.html, where ID is github.run_id and ATTEMPT is github.run_attempt. For example, for https://github.com/iterative/dvc-bench/actions/runs/7119039172/attempts/2 it would be http://bench.dvc.org/run_7119039172_2.html
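The mapping from a workflow run to its results page can be sketched in a few lines of Python. This is only an illustration of the naming scheme described above: the helper names `result_url` and `result_url_from_run` are hypothetical and not part of dvc-bench, and the regular expression assumes the Actions URL includes an explicit `/attempts/N` suffix.

```python
import re


def result_url(run_id, run_attempt):
    """Build the results page URL from github.run_id and github.run_attempt."""
    return f"http://bench.dvc.org/run_{run_id}_{run_attempt}.html"


def result_url_from_run(actions_url):
    """Derive the results page URL from a GitHub Actions run URL.

    Assumes the URL ends in .../actions/runs/<ID>/attempts/<ATTEMPT>.
    """
    match = re.search(r"/actions/runs/(\d+)/attempts/(\d+)", actions_url)
    if match is None:
        raise ValueError(f"no run ID and attempt found in: {actions_url}")
    run_id, run_attempt = match.groups()
    return result_url(run_id, run_attempt)


print(result_url_from_run(
    "https://github.com/iterative/dvc-bench/actions/runs/7119039172/attempts/2"
))
```

For the run in the example above, this prints http://bench.dvc.org/run_7119039172_2.html.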
$ uv pip install -r requirements.txt
$ dvc pull # optional, otherwise will pull datasets dynamically
$ pytest --pyargs dvc.testing.benchmarks
$ pytest --pyargs dvc.testing.benchmarks.cli.commands.test_add
$ pytest -h
...
--dataset=DATASET
        Dataset name to use in tests (e.g. tiny/small/large/mnist/etc)
--dvc-bin=DVC_BIN
        Path to dvc binary
--dvc-revs=DVC_REVS
        Comma-separated list of DVC revisions to test (overrides `--dvc-bin`)
--dvc-repo=DVC_GIT_REPO
        Path or url to dvc git repo
--dvc-bench-repo=DVC_BENCH_GIT_REPO
        Path or url to dvc-bench git repo (for loading benchmark dataset)
--dvc-install-deps=DVC_INSTALL_DEPS
        Comma-separated list of DVC installation packages
--project-rev=PROJECT_REV
        Project revision to test
--project-repo=PROJECT_GIT_REPO
        Path or url to dvc project
...
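As a rough sketch of how these options combine, the Python snippet below assembles a single benchmark invocation that compares two revisions on one dataset. The helper `bench_command` is hypothetical, and the revision names `main` and `3.0.0` are placeholders chosen for illustration; only the `--dataset` and `--dvc-revs` flags come from the help output above.

```python
import shlex


def bench_command(target, dataset=None, dvc_revs=None):
    """Assemble a pytest argument list for a dvc benchmark run."""
    args = ["pytest", "--pyargs", target]
    if dataset:
        args.append(f"--dataset={dataset}")
    if dvc_revs:
        # --dvc-revs takes a single comma-separated value.
        args.append("--dvc-revs=" + ",".join(dvc_revs))
    return args


cmd = bench_command(
    "dvc.testing.benchmarks.cli.commands.test_add",
    dataset="small",
    dvc_revs=["main", "3.0.0"],  # placeholder revisions
)
# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

Keeping the arguments as a list means the same value can be passed directly to `subprocess.run` without shell quoting concerns.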
$ pytest-benchmark compare --histogram histograms/ --group-by name --sort name --csv results.csv
And if you want beautiful plots:
$ dvc repro
$ dvc plots show
Benchmark test definitions are now part of dvc.testing.