Fixes #583 #584

vfdev-5 · 2019-08-23T06:22:47Z

Fixes #583

* [WIP] Added cifar10 distributed example
* [WIP] Metric with all reduce decorator and tests
* [WIP] Added tests for accumulation metric
* [WIP] Updated with reinit_is_reduced
* [WIP] Distrib adaptation for other metrics
* [WIP] Warnings for EpochMetric and Precision/Recall when distrib
* Updated metrics and tests to run on distributed configuration
  - Test on 2 GPUS single node
  - Added cmd in .travis.yml to indicate how to test locally
  - Updated travis to run tests in 4 processes
* Minor fixes and cosmetics
* Fixed bugs and improved contrib/cifar10 example
* Updated docs
* Fixes issue #543 (#572)
* Fixes issue #543

  Previous CM implementation suffered from the problem if target contains non-contiguous indices. New implementation is almost taken from torchvision's https://github.com/pytorch/vision/blob/master/references/segmentation/utils.py#L75-L117

  This commit also removes the case of targets as (batchsize, num_categories, ...) where num_categories excludes background class. Confusion matrix computation is possible almost similarly for (batchsize, ...), but when target is all zero (0, ..., 0) = no classes (background class), then confusion matrix does not count any true/false predictions.
* Update confusion_matrix.py
* Update metrics.rst
* Updated docs and set device as "cuda" in distributed instead of raising error
* [WIP] Fix missing _is_reduced in precision/recall with tests
* Updated other tests
* Added mlflow logger (#558)
* Added mlflow logger without tests
* Added mlflow tests, updated mlflow logger code and other tests
* Updated docs and added mlflow in travis
* Added tests for mlflow OptimizerParamsHandler
  - additionally added OptimizerParamsHandler for plx with tests
* Update to PyTorch v1.2.0 (#580)
* Update .travis.yml
* Update .travis.yml
* Fixed tests and improved travis
* Fix SSL problem of failing travis (#581)
* Update .travis.yml
* Update .travis.yml
* Fixed tests and improved travis
* Fixes SSL problem to download model weights
* Fixed travis for deploy and nightly
* Fixes #583 (#584)
* Fixes docs build warnings (#585)
* Return removable handle from Engine.add_event_handler(). (#588)
* Add tests for event removable handle.

  Add feature tests for engine.add_event_handler returning removable event handles.
* Return RemovableEventHandle from Engine.add_event_handler.
* Fixup removable event handle test in python 2.7.

  Explicitly trigger gc, allowing cycle detection between engine and state, in removable handle weakref test. Python 2.7 cycle detection appears to be less aggressive than python 3+.
* Add removable event handler docs.

  Add autodoc configuration for RemovableEventHandler, expand "concepts" documentation with event remove example following event add example.
* Update concepts.rst
* Updated travis and renamed tbptt test gpu -> cuda
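
The confusion-matrix entry above (issue #543) says the new implementation is "almost taken from" torchvision's segmentation reference. As a rough illustration of that counting idea, here is a minimal pure-Python sketch: each (target, prediction) pair is mapped to a flat index `n * target + prediction`, and the flat counts are reshaped into an `n x n` table. The function name `confusion_matrix` and the sample labels are hypothetical, predictions are assumed to lie in `[0, num_classes)`, and this is not the code merged in the PR, which operates on tensors.

```python
def confusion_matrix(targets, predictions, num_classes):
    """Count (target, prediction) pairs; rows are targets, columns are predictions.

    Illustrative sketch only (hypothetical helper, not Ignite's implementation).
    Assumes every prediction is already in [0, num_classes).
    """
    n = num_classes
    flat = [0] * (n * n)
    for t, p in zip(targets, predictions):
        # Skip targets outside the valid class range (e.g. an "ignore" label).
        if 0 <= t < n:
            flat[n * t + p] += 1
    # Reshape the flat counts into n rows of n columns.
    return [flat[r * n:(r + 1) * n] for r in range(n)]


# Class 1 never appears in the targets and 255 is an out-of-range label.
cm = confusion_matrix([0, 2, 2, 255], [0, 2, 1, 1], num_classes=3)
print(cm)
```

Because counts are keyed by the flat index rather than by the set of labels that happen to appear in a batch, a class that is missing from the targets simply leaves its row at zero.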

* [WIP] Added cifar10 distributed example
* [WIP] Metric with all reduce decorator and tests
* [WIP] Added tests for accumulation metric
* [WIP] Updated with reinit_is_reduced
* [WIP] Distrib adaptation for other metrics
* [WIP] Warnings for EpochMetric and Precision/Recall when distrib
* Updated metrics and tests to run on distributed configuration
  - Test on 2 GPUS single node
  - Added cmd in .travis.yml to indicate how to test locally
  - Updated travis to run tests in 4 processes
* Minor fixes and cosmetics
* Fixed bugs and improved contrib/cifar10 example
* Updated docs
* Update metrics.rst
* Updated docs and set device as "cuda" in distributed instead of raising error
* [WIP] Fix missing _is_reduced in precision/recall with tests
* Updated other tests
* Updated travis and renamed tbptt test gpu -> cuda
* Distrib (#573)
* [WIP] Added cifar10 distributed example
* [WIP] Metric with all reduce decorator and tests
* [WIP] Added tests for accumulation metric
* [WIP] Updated with reinit_is_reduced
* [WIP] Distrib adaptation for other metrics
* [WIP] Warnings for EpochMetric and Precision/Recall when distrib
* Updated metrics and tests to run on distributed configuration
  - Test on 2 GPUS single node
  - Added cmd in .travis.yml to indicate how to test locally
  - Updated travis to run tests in 4 processes
* Minor fixes and cosmetics
* Fixed bugs and improved contrib/cifar10 example
* Updated docs
* Fixes issue #543 (#572)
* Fixes issue #543

  Previous CM implementation suffered from the problem if target contains non-contiguous indices. New implementation is almost taken from torchvision's https://github.com/pytorch/vision/blob/master/references/segmentation/utils.py#L75-L117

  This commit also removes the case of targets as (batchsize, num_categories, ...) where num_categories excludes background class. Confusion matrix computation is possible almost similarly for (batchsize, ...), but when target is all zero (0, ..., 0) = no classes (background class), then confusion matrix does not count any true/false predictions.
* Update confusion_matrix.py
* Update metrics.rst
* Updated docs and set device as "cuda" in distributed instead of raising error
* [WIP] Fix missing _is_reduced in precision/recall with tests
* Updated other tests
* Added mlflow logger (#558)
* Added mlflow logger without tests
* Added mlflow tests, updated mlflow logger code and other tests
* Updated docs and added mlflow in travis
* Added tests for mlflow OptimizerParamsHandler
  - additionally added OptimizerParamsHandler for plx with tests
* Update to PyTorch v1.2.0 (#580)
* Update .travis.yml
* Update .travis.yml
* Fixed tests and improved travis
* Fix SSL problem of failing travis (#581)
* Update .travis.yml
* Update .travis.yml
* Fixed tests and improved travis
* Fixes SSL problem to download model weights
* Fixed travis for deploy and nightly
* Fixes #583 (#584)
* Fixes docs build warnings (#585)
* Return removable handle from Engine.add_event_handler(). (#588)
* Add tests for event removable handle.

  Add feature tests for engine.add_event_handler returning removable event handles.
* Return RemovableEventHandle from Engine.add_event_handler.
* Fixup removable event handle test in python 2.7.

  Explicitly trigger gc, allowing cycle detection between engine and state, in removable handle weakref test. Python 2.7 cycle detection appears to be less aggressive than python 3+.
* Add removable event handler docs.

  Add autodoc configuration for RemovableEventHandler, expand "concepts" documentation with event remove example following event add example.
* Update concepts.rst
* Updated travis and renamed tbptt test gpu -> cuda
* Compute IoU, Precision, Recall based on CM on CPU
* Fixes incomplete merge with 1856c8e
* Update distrib branch and CIFAR10 example (#647)
* Added tests with gloo, minor updates and fixes
* Added single/multi node tests with gloo and [WIP] with nccl
* Added tests for multi-node nccl, improved examples/contrib/cifar10 example
* Experiments: 1n1gpu, 1n2gpus, 2n2gpus
* Fix flake8
* Fixes #645 (#646)
  - fix CI and improve create_lr_scheduler_with_warmup
* Fix tests for python 2.7
* Finalized Cifar10 example (#649)
* Added gcp tb logger image and updated README
* Added gcp ai platform scripts to run trainings
* Improved docs and readmes
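
The commit list above includes "Return removable handle from Engine.add_event_handler(). (#588)" and mentions a weakref test between the handle and the engine. The general pattern can be sketched with a toy engine, shown below. `ToyEngine`, `ToyRemovableHandle`, and `fire` are hypothetical names made up for this illustration; this is not Ignite's implementation, only a guess at the shape of the idea: registering a handler returns an object whose `remove()` unregisters it, and the handle holds only a weak reference to the engine.

```python
import weakref


class ToyRemovableHandle:
    """Hypothetical handle: remembers what was registered so it can undo it."""

    def __init__(self, engine, event_name, handler):
        # Weak reference, so the handle alone does not keep the engine alive.
        self._engine = weakref.ref(engine)
        self._event_name = event_name
        self._handler = handler

    def remove(self):
        engine = self._engine()
        if engine is None:
            return  # engine already garbage collected, nothing to do
        handlers = engine._handlers.get(self._event_name, [])
        if self._handler in handlers:
            handlers.remove(self._handler)


class ToyEngine:
    """Hypothetical stand-in for an engine with named events."""

    def __init__(self):
        self._handlers = {}

    def add_event_handler(self, event_name, handler):
        self._handlers.setdefault(event_name, []).append(handler)
        return ToyRemovableHandle(self, event_name, handler)

    def fire(self, event_name):
        # Iterate over a copy so handlers may be removed while firing.
        for handler in list(self._handlers.get(event_name, [])):
            handler(self)


calls = []
engine = ToyEngine()
handle = engine.add_event_handler("completed", lambda e: calls.append("done"))
engine.fire("completed")   # handler runs once
handle.remove()
engine.fire("completed")   # handler is gone, nothing is appended
handle.remove()            # removing twice is a no-op in this sketch
```

Returning a handle means the caller does not have to keep the handler function and event name around just to unregister later, which is convenient for temporary handlers.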

Fixes #583

27931ba

vfdev-5 merged commit 3c1cc89 into master Aug 23, 2019

vfdev-5 deleted the issue_583 branch August 23, 2019 06:46
