Speed up Model tests by 20% #5574

Merged
merged 9 commits into from
Mar 9, 2022

Contributor

@datumbox datumbox commented Mar 9, 2022

We focus on the test_classification_model, test_detection_model, test_quantized_classification_model, test_segmentation_model, test_video_model tests and improve their execution times by 20%.

This is achieved by:

  • Reducing the input size for very large models
  • Avoiding re-estimation of model outputs when possible
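
The two changes listed above can be illustrated with a small, framework-free sketch. It is not the code from this PR: `INPUT_SHAPE_OVERRIDES`, `input_shape_for`, `CountingModel` and `check_against_reference` are hypothetical names, the shapes are made up, and a plain Python callable stands in for a real model so that forward passes can be counted.

```python
from typing import Callable, Dict, List, Optional, Tuple

Shape = Tuple[int, int, int, int]

# Hypothetical defaults and overrides; the names and sizes are illustrative only.
DEFAULT_INPUT_SHAPE: Shape = (1, 3, 224, 224)
INPUT_SHAPE_OVERRIDES: Dict[str, Shape] = {"very_large_model": (1, 3, 64, 64)}


def input_shape_for(model_name: str) -> Shape:
    """Use a smaller input for models listed in the override table."""
    return INPUT_SHAPE_OVERRIDES.get(model_name, DEFAULT_INPUT_SHAPE)


class CountingModel:
    """Stand-in for an expensive model that counts its forward passes."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, x: List[float]) -> List[float]:
        self.calls += 1
        return [2 * v for v in x]


def check_against_reference(
    model: Callable[[List[float]], List[float]],
    other_path: Callable[[List[float]], List[float]],
    x: List[float],
    eager_out: Optional[List[float]] = None,
) -> bool:
    """Compare a second execution path with the eager output.

    The eager output is recomputed only when the caller did not supply it.
    """
    if eager_out is None:
        eager_out = model(x)
    return other_path(x) == eager_out


model = CountingModel()
x = [1.0, 2.0]
out = model(x)  # the test already needs this output for its own assertions
ok = check_against_reference(model, lambda v: [2 * e for e in v], x, eager_out=out)
print(ok, model.calls)  # True 1
```

Passing the already computed output keeps the forward-pass count at one per test, which is the saving the second bullet refers to.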

Before: 629.21 sec

Run: https://app.circleci.com/pipelines/github/pytorch/vision/15497/workflows/51d7e30b-86ff-4c71-af68-7cb3438f25c9/jobs/1252313

62.17s call     test/test_models.py::test_quantized_classification_model[mobilenet_v3_large]
46.86s call     test/test_models.py::test_quantized_classification_model[mobilenet_v2]
43.47s call     test/test_models.py::test_quantized_classification_model[resnext101_32x8d]
32.26s call     test/test_models.py::test_quantized_classification_model[shufflenet_v2_x0_5]
22.26s call     test/test_models.py::test_classification_model[cpu-regnet_y_128gf]
18.74s call     test/test_models.py::test_quantized_classification_model[googlenet]
16.09s call     test/test_models.py::test_classification_model[cpu-efficientnet_v2_l]
13.48s call     test/test_models.py::test_quantized_classification_model[shufflenet_v2_x1_0]
12.84s call     test/test_models.py::test_classification_model[cpu-efficientnet_b7]
11.11s call     test/test_models.py::test_classification_model[cpu-efficientnet_v2_m]
10.93s call     test/test_models.py::test_classification_model[cpu-vit_l_16]
10.48s call     test/test_models.py::test_classification_model[cpu-efficientnet_b6]
10.15s call     test/test_models.py::test_classification_model[cpu-vit_l_32]
9.12s call     test/test_models.py::test_classification_model[cpu-densenet201]
8.58s call     test/test_models.py::test_classification_model[cpu-efficientnet_b5]
8.48s call     test/test_models.py::test_detection_model[cpu-maskrcnn_resnet50_fpn]
8.40s call     test/test_models.py::test_classification_model[cpu-densenet161]
7.45s call     test/test_models.py::test_classification_model[cpu-densenet169]
7.27s call     test/test_models.py::test_classification_model[cpu-regnet_y_32gf]
7.18s call     test/test_models.py::test_detection_model[cpu-keypointrcnn_resnet50_fpn]
7.10s call     test/test_models.py::test_classification_model[cpu-efficientnet_v2_s]
6.94s call     test/test_models.py::test_classification_model[cpu-efficientnet_b4]
6.66s call     test/test_models.py::test_classification_model[cpu-convnext_large]
6.49s call     test/test_models.py::test_detection_model[cpu-fasterrcnn_mobilenet_v3_large_320_fpn]
6.02s call     test/test_models.py::test_detection_model[cpu-fasterrcnn_resnet50_fpn]
5.99s call     test/test_models.py::test_quantized_classification_model[resnet18]
5.81s call     test/test_models.py::test_classification_model[cpu-efficientnet_b3]
5.37s call     test/test_models.py::test_classification_model[cpu-regnet_y_1_6gf]
5.35s call     test/test_models.py::test_classification_model[cpu-regnet_y_16gf]
5.30s call     test/test_models.py::test_detection_model[cpu-ssdlite320_mobilenet_v3_large]
5.24s call     test/test_models.py::test_classification_model[cpu-regnet_x_32gf]
5.24s call     test/test_models.py::test_detection_model[cpu-ssd300_vgg16]
5.12s call     test/test_models.py::test_classification_model[cpu-wide_resnet101_2]
5.01s call     test/test_models.py::test_classification_model[cpu-efficientnet_b2]
4.99s call     test/test_models.py::test_classification_model[cpu-densenet121]
4.87s call     test/test_models.py::test_classification_model[cpu-efficientnet_b1]
4.61s call     test/test_models.py::test_classification_model[cpu-regnet_y_3_2gf]
4.45s call     test/test_models.py::test_classification_model[cpu-resnet152]
4.44s call     test/test_models.py::test_detection_model[cpu-fasterrcnn_mobilenet_v3_large_fpn]
4.38s call     test/test_models.py::test_classification_model[cpu-regnet_y_8gf]
4.25s call     test/test_models.py::test_classification_model[cpu-resnext101_32x8d]
4.20s call     test/test_models.py::test_classification_model[cpu-regnet_x_16gf]
4.11s call     test/test_models.py::test_classification_model[cpu-convnext_base]
4.09s call     test/test_models.py::test_classification_model[cpu-regnet_y_400mf]
3.82s call     test/test_models.py::test_classification_model[cpu-vit_b_16]
3.69s call     test/test_models.py::test_quantized_classification_model[inception_v3]
3.63s call     test/test_models.py::test_classification_model[cpu-efficientnet_b0]
3.63s call     test/test_models.py::test_detection_model[cpu-fcos_resnet50_fpn]
3.55s call     test/test_models.py::test_segmentation_model[cpu-deeplabv3_resnet101]
3.52s call     test/test_models.py::test_classification_model[cpu-vit_b_32]
3.45s call     test/test_models.py::test_classification_model[cpu-vgg19_bn]
3.40s call     test/test_models.py::test_classification_model[cpu-regnet_x_8gf]
3.40s call     test/test_models.py::test_detection_model[cpu-retinanet_resnet50_fpn]
3.20s call     test/test_models.py::test_segmentation_model[cpu-fcn_resnet101]
3.17s call     test/test_models.py::test_classification_model[cpu-vgg16_bn]
3.15s call     test/test_models.py::test_classification_model[cpu-convnext_small]
3.13s call     test/test_models.py::test_classification_model[cpu-vgg19]
3.12s call     test/test_models.py::test_classification_model[cpu-inception_v3]
3.05s call     test/test_models.py::test_classification_model[cpu-resnet101]
3.02s call     test/test_models.py::test_classification_model[cpu-vgg13_bn]
3.00s call     test/test_models.py::test_classification_model[cpu-regnet_y_800mf]
2.98s call     test/test_models.py::test_classification_model[cpu-vgg16]
2.95s call     test/test_models.py::test_classification_model[cpu-vgg11_bn]
2.92s call     test/test_models.py::test_classification_model[cpu-regnet_x_3_2gf]
2.89s call     test/test_models.py::test_classification_model[cpu-vgg13]
2.87s call     test/test_models.py::test_classification_model[cpu-vgg11]
2.83s call     test/test_models.py::test_classification_model[cpu-wide_resnet50_2]
2.71s call     test/test_models.py::test_quantized_classification_model[resnet50]
2.63s call     test/test_models.py::test_classification_model[cpu-googlenet]
2.60s call     test/test_models.py::test_classification_model[cpu-mobilenet_v3_large]
2.56s call     test/test_models.py::test_classification_model[cpu-shufflenet_v2_x0_5]
2.56s call     test/test_models.py::test_classification_model[cpu-regnet_x_400mf]
2.49s call     test/test_models.py::test_segmentation_model[cpu-deeplabv3_mobilenet_v3_large]
2.33s call     test/test_models.py::test_classification_model[cpu-shufflenet_v2_x2_0]
2.30s call     test/test_models.py::test_classification_model[cpu-mobilenet_v2]
2.29s call     test/test_models.py::test_classification_model[cpu-mnasnet1_3]
2.29s call     test/test_models.py::test_segmentation_model[cpu-lraspp_mobilenet_v3_large]
2.25s call     test/test_models.py::test_classification_model[cpu-mobilenet_v3_small]
2.20s call     test/test_models.py::test_classification_model[cpu-regnet_x_1_6gf]
2.19s call     test/test_models.py::test_classification_model[cpu-shufflenet_v2_x1_0]
2.16s call     test/test_models.py::test_classification_model[cpu-resnext50_32x4d]
2.15s call     test/test_models.py::test_classification_model[cpu-shufflenet_v2_x1_5]
2.15s call     test/test_models.py::test_classification_model[cpu-mnasnet0_75]
2.12s call     test/test_models.py::test_segmentation_model[cpu-deeplabv3_resnet50]
2.12s call     test/test_models.py::test_classification_model[cpu-mnasnet1_0]
2.04s call     test/test_models.py::test_classification_model[cpu-mnasnet0_5]
2.04s call     test/test_models.py::test_video_model[cpu-r2plus1d_18]
2.03s call     test/test_models.py::test_video_model[cpu-r3d_18]
2.01s call     test/test_models.py::test_classification_model[cpu-regnet_x_800mf]
1.95s call     test/test_models.py::test_segmentation_model[cpu-fcn_resnet50]
1.92s call     test/test_models.py::test_classification_model[cpu-convnext_tiny]
1.75s call     test/test_models.py::test_classification_model[cpu-resnet50]
1.49s call     test/test_models.py::test_video_model[cpu-mc3_18]
1.29s call     test/test_models.py::test_classification_model[cpu-resnet34]
0.99s call     test/test_models.py::test_classification_model[cpu-alexnet]
0.95s call     test/test_models.py::test_classification_model[cpu-resnet18]
0.54s call     test/test_models.py::test_classification_model[cpu-squeezenet1_0]
0.39s call     test/test_models.py::test_classification_model[cpu-squeezenet1_1]

After: 507.28 sec

Run: https://app.circleci.com/pipelines/github/pytorch/vision/15520/workflows/2cc25bdc-fe10-4705-b942-c206a9d12fad/jobs/1254266

44.83s call     test/test_models.py::test_quantized_classification_model[mobilenet_v3_large]
33.70s call     test/test_models.py::test_quantized_classification_model[mobilenet_v2]
33.10s call     test/test_models.py::test_quantized_classification_model[resnext101_32x8d]
27.79s call     test/test_models.py::test_quantized_classification_model[shufflenet_v2_x1_0]
26.34s call     test/test_models.py::test_quantized_classification_model[shufflenet_v2_x0_5]
17.15s call     test/test_models.py::test_classification_model[cpu-regnet_y_128gf]
13.76s call     test/test_models.py::test_quantized_classification_model[googlenet]
10.88s call     test/test_models.py::test_classification_model[cpu-efficientnet_v2_l]
10.11s call     test/test_models.py::test_classification_model[cpu-vit_l_16]
8.01s call     test/test_models.py::test_classification_model[cpu-vit_l_32]
7.95s call     test/test_models.py::test_classification_model[cpu-efficientnet_b7]
7.34s call     test/test_models.py::test_classification_model[cpu-densenet201]
7.09s call     test/test_models.py::test_classification_model[cpu-efficientnet_v2_m]
6.89s call     test/test_models.py::test_classification_model[cpu-efficientnet_b6]
6.56s call     test/test_models.py::test_classification_model[cpu-densenet169]
6.56s call     test/test_models.py::test_classification_model[cpu-densenet161]
6.48s call     test/test_models.py::test_classification_model[cpu-convnext_large]
5.76s call     test/test_models.py::test_detection_model[cpu-fasterrcnn_mobilenet_v3_large_fpn]
5.47s call     test/test_models.py::test_classification_model[cpu-efficientnet_b5]
5.27s call     test/test_models.py::test_classification_model[cpu-regnet_y_32gf]
4.80s call     test/test_models.py::test_classification_model[cpu-efficientnet_v2_s]
4.78s call     test/test_models.py::test_quantized_classification_model[resnet18]
4.71s call     test/test_models.py::test_detection_model[cpu-maskrcnn_resnet50_fpn]
4.67s call     test/test_models.py::test_classification_model[cpu-densenet121]
4.58s call     test/test_models.py::test_classification_model[cpu-resnet152]
4.49s call     test/test_models.py::test_classification_model[cpu-regnet_x_32gf]
4.44s call     test/test_models.py::test_classification_model[cpu-efficientnet_b4]
4.38s call     test/test_models.py::test_classification_model[cpu-wide_resnet101_2]
4.17s call     test/test_models.py::test_detection_model[cpu-keypointrcnn_resnet50_fpn]
4.08s call     test/test_models.py::test_quantized_classification_model[inception_v3]
4.07s call     test/test_models.py::test_detection_model[cpu-fasterrcnn_mobilenet_v3_large_320_fpn]
3.90s call     test/test_models.py::test_detection_model[cpu-fasterrcnn_resnet50_fpn]
3.90s call     test/test_models.py::test_detection_model[cpu-ssd300_vgg16]
3.87s call     test/test_models.py::test_classification_model[cpu-regnet_y_16gf]
3.86s call     test/test_models.py::test_classification_model[cpu-convnext_base]
3.80s call     test/test_models.py::test_classification_model[cpu-efficientnet_b3]
3.68s call     test/test_models.py::test_classification_model[cpu-regnet_y_1_6gf]
3.52s call     test/test_models.py::test_classification_model[cpu-resnext101_32x8d]
3.52s call     test/test_models.py::test_classification_model[cpu-vgg19_bn]
3.41s call     test/test_models.py::test_classification_model[cpu-regnet_y_8gf]
3.40s call     test/test_models.py::test_detection_model[cpu-ssdlite320_mobilenet_v3_large]
3.34s call     test/test_models.py::test_classification_model[cpu-efficientnet_b2]
3.32s call     test/test_models.py::test_classification_model[cpu-convnext_small]
3.21s call     test/test_models.py::test_classification_model[cpu-vit_b_16]
3.17s call     test/test_models.py::test_classification_model[cpu-vgg19]
3.14s call     test/test_models.py::test_classification_model[cpu-vgg16_bn]
3.14s call     test/test_models.py::test_classification_model[cpu-efficientnet_b1]
3.11s call     test/test_models.py::test_segmentation_model[cpu-deeplabv3_resnet101]
3.10s call     test/test_models.py::test_classification_model[cpu-regnet_y_3_2gf]
3.06s call     test/test_models.py::test_classification_model[cpu-regnet_x_8gf]
3.06s call     test/test_models.py::test_classification_model[cpu-vgg16]
3.02s call     test/test_models.py::test_classification_model[cpu-resnet101]
3.01s call     test/test_models.py::test_classification_model[cpu-regnet_x_16gf]
2.99s call     test/test_models.py::test_classification_model[cpu-vgg13_bn]
2.99s call     test/test_models.py::test_segmentation_model[cpu-fcn_resnet101]
2.85s call     test/test_models.py::test_classification_model[cpu-inception_v3]
2.82s call     test/test_models.py::test_classification_model[cpu-regnet_y_400mf]
2.79s call     test/test_models.py::test_classification_model[cpu-vgg11]
2.77s call     test/test_models.py::test_classification_model[cpu-vgg13]
2.76s call     test/test_models.py::test_classification_model[cpu-vit_b_32]
2.75s call     test/test_models.py::test_classification_model[cpu-vgg11_bn]
2.74s call     test/test_models.py::test_detection_model[cpu-retinanet_resnet50_fpn]
2.68s call     test/test_models.py::test_classification_model[cpu-regnet_x_3_2gf]
2.53s call     test/test_models.py::test_classification_model[cpu-wide_resnet50_2]
2.46s call     test/test_models.py::test_classification_model[cpu-efficientnet_b0]
2.44s call     test/test_models.py::test_classification_model[cpu-regnet_x_400mf]
2.43s call     test/test_models.py::test_detection_model[cpu-fcos_resnet50_fpn]
2.39s call     test/test_models.py::test_quantized_classification_model[resnet50]
2.26s call     test/test_models.py::test_classification_model[cpu-regnet_y_800mf]
2.24s call     test/test_models.py::test_classification_model[cpu-shufflenet_v2_x0_5]
2.23s call     test/test_models.py::test_classification_model[cpu-resnext50_32x4d]
2.20s call     test/test_models.py::test_classification_model[cpu-googlenet]
2.17s call     test/test_models.py::test_segmentation_model[cpu-deeplabv3_mobilenet_v3_large]
2.17s call     test/test_models.py::test_video_model[cpu-r2plus1d_18]
2.04s call     test/test_models.py::test_classification_model[cpu-regnet_x_1_6gf]
2.03s call     test/test_models.py::test_classification_model[cpu-convnext_tiny]
2.00s call     test/test_models.py::test_classification_model[cpu-mnasnet1_3]
1.97s call     test/test_models.py::test_classification_model[cpu-mobilenet_v2]
1.95s call     test/test_models.py::test_classification_model[cpu-mobilenet_v3_large]
1.94s call     test/test_models.py::test_classification_model[cpu-shufflenet_v2_x1_0]
1.93s call     test/test_models.py::test_segmentation_model[cpu-deeplabv3_resnet50]
1.92s call     test/test_models.py::test_segmentation_model[cpu-fcn_resnet50]
1.90s call     test/test_models.py::test_classification_model[cpu-regnet_x_800mf]
1.84s call     test/test_models.py::test_classification_model[cpu-resnet50]
1.79s call     test/test_models.py::test_classification_model[cpu-shufflenet_v2_x2_0]
1.79s call     test/test_models.py::test_video_model[cpu-r3d_18]
1.72s call     test/test_models.py::test_classification_model[cpu-mnasnet0_75]
1.70s call     test/test_models.py::test_segmentation_model[cpu-lraspp_mobilenet_v3_large]
1.68s call     test/test_models.py::test_classification_model[cpu-mnasnet0_5]
1.66s call     test/test_models.py::test_classification_model[cpu-mnasnet1_0]
1.63s call     test/test_models.py::test_classification_model[cpu-shufflenet_v2_x1_5]
1.49s call     test/test_models.py::test_video_model[cpu-mc3_18]
1.47s call     test/test_models.py::test_classification_model[cpu-mobilenet_v3_small]
1.32s call     test/test_models.py::test_classification_model[cpu-resnet34]
1.13s call     test/test_models.py::test_classification_model[cpu-alexnet]
0.94s call     test/test_models.py::test_classification_model[cpu-resnet18]
0.65s call     test/test_models.py::test_classification_model[cpu-squeezenet1_0]
0.38s call     test/test_models.py::test_classification_model[cpu-squeezenet1_1]
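
As a rough cross-check of the headline figure, the sketch below parses a few lines copied from the two listings above (they look like pytest `--durations` output) and computes the relative saving per test and overall. The helper names `parse_durations` and `relative_saving` are hypothetical, and only three tests are included for brevity.

```python
import re
from typing import Dict

# A few lines copied verbatim from the "Before" and "After" listings.
BEFORE = """\
62.17s call     test/test_models.py::test_quantized_classification_model[mobilenet_v3_large]
22.26s call     test/test_models.py::test_classification_model[cpu-regnet_y_128gf]
16.09s call     test/test_models.py::test_classification_model[cpu-efficientnet_v2_l]
"""
AFTER = """\
44.83s call     test/test_models.py::test_quantized_classification_model[mobilenet_v3_large]
17.15s call     test/test_models.py::test_classification_model[cpu-regnet_y_128gf]
10.88s call     test/test_models.py::test_classification_model[cpu-efficientnet_v2_l]
"""

# Matches "<seconds>s call <test id>".
LINE = re.compile(r"^\s*(\d+\.\d+)s\s+call\s+(\S+)$")


def parse_durations(text: str) -> Dict[str, float]:
    """Map each test id to its duration in seconds."""
    durations = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            durations[match.group(2)] = float(match.group(1))
    return durations


def relative_saving(before: float, after: float) -> float:
    """Fraction of the original time that was saved."""
    return (before - after) / before


before = parse_durations(BEFORE)
after = parse_durations(AFTER)
for test_id, seconds in before.items():
    if test_id in after:
        print(f"{relative_saving(seconds, after[test_id]):6.1%}  {test_id}")

# Totals reported in the description: 629.21 sec before, 507.28 sec after.
total_saving = relative_saving(629.21, 507.28)
print(f"overall: {total_saving:.1%}")
```

With the reported totals the overall saving works out to about 19.4%, in line with the rounded 20% in the title.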

@datumbox datumbox marked this pull request as draft March 9, 2022 11:39

facebook-github-bot commented Mar 9, 2022

💊 CI failures summary and remediations

As of commit 01be6a5 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@datumbox datumbox force-pushed the hackathon/speedup_tests branch from 9214973 to 17593a5 Compare March 9, 2022 14:12
@datumbox datumbox force-pushed the hackathon/speedup_tests branch from db77373 to 1221e12 Compare March 9, 2022 18:11
@datumbox datumbox changed the title [WIP] Speed up CI tests Speed up Model tests by 20% Mar 9, 2022
@datumbox datumbox marked this pull request as ready for review March 9, 2022 18:17
Member

@NicolasHug NicolasHug left a comment

Thanks @datumbox, I only took a brief look. Minor question, but otherwise LGTM.

eager_out = nn_module(*args)
if eager_out is None:
    with torch.no_grad(), freeze_rng_state():
        if unwrapper:

Member

Is this line needed? It looks like eager_out wasn't unwrapped before.

Contributor Author

It's not strictly required for most of the existing models, but that is only due to implementation details. Since this is a general-purpose tool for checking JIT-scriptability, I opted to unwrap the output consistently in all places.

FYI the reason many models don't have to get unwrapped is due to idioms like this:

@torch.jit.unused
def eager_outputs(self, x: Tensor, aux2: Tensor, aux1: Optional[Tensor]) -> GoogLeNetOutputs:
    if self.training and self.aux_logits:
        return _GoogLeNetOutputs(x, aux2, aux1)
    else:
        return x  # type: ignore[return-value]

def forward(self, x: Tensor) -> GoogLeNetOutputs:
    x = self._transform_input(x)
    x, aux1, aux2 = self._forward(x)
    aux_defined = self.training and self.aux_logits
    if torch.jit.is_scripting():
        if not aux_defined:
            warnings.warn("Scripted GoogleNet always returns GoogleNetOutputs Tuple")
        return GoogLeNetOutputs(x, aux2, aux1)
    else:
        return self.eager_outputs(x, aux2, aux1)

New non-detection models don't use this idiom any more (returning different output depending on jit/training flag), so I think it's safer to handle it explicitly.
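
To make the point about explicit unwrapping concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the helper from the test suite: `Outputs` stands in for a type like GoogLeNetOutputs, and `eager_forward`, `scripted_forward`, `unwrap` and `outputs_match` are hypothetical names.

```python
from collections import namedtuple
from typing import Any, Callable, Optional

# Hypothetical output type, standing in for something like GoogLeNetOutputs.
Outputs = namedtuple("Outputs", ["logits", "aux"])


def eager_forward(x: float) -> Any:
    # Eager path in eval mode: returns the bare value.
    return x * 2


def scripted_forward(x: float) -> Outputs:
    # Scripted path: always returns the tuple type.
    return Outputs(logits=x * 2, aux=None)


def unwrap(out: Any) -> Any:
    """Normalise either return style to the main output."""
    return out.logits if isinstance(out, Outputs) else out


def outputs_match(x: float, unwrapper: Optional[Callable[[Any], Any]] = None) -> bool:
    eager_out = eager_forward(x)
    script_out = scripted_forward(x)
    if unwrapper is not None:
        # Unwrap both sides so the check does not depend on the eager path
        # happening to return a bare value.
        eager_out = unwrapper(eager_out)
        script_out = unwrapper(script_out)
    return eager_out == script_out
```

In this toy setup `outputs_match(3.0, unwrap)` succeeds, while `outputs_match(3.0)` compares a bare value with a tuple and fails, which is the kind of mismatch that consistent unwrapping guards against.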

Contributor

@jdsgomes jdsgomes left a comment

LGTM! Thanks for this change - great improvement!

@datumbox datumbox merged commit d0dede0 into pytorch:main Mar 9, 2022
@datumbox datumbox deleted the hackathon/speedup_tests branch March 9, 2022 19:34

github-actions bot commented Mar 9, 2022

Hey @datumbox!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

facebook-github-bot pushed a commit that referenced this pull request Mar 15, 2022
Summary:
* Measuring execution times of models.

* Speed up models by avoiding re-estimation of eager output

* Fixing linter

* Reduce input size for big models

* Speed up jit check method.

* Add simple jitscript fallback check for flaky models.

* Restore pytest filtering

* Fixing linter

Reviewed By: vmoens

Differential Revision: D34878998

fbshipit-source-id: 37bfa05aac0d28d59d3320119147446006bff75c