
[v1.8.0] Getting signal for release/1.8 #51995

Closed · wants to merge 61 commits

Conversation

seemethere
Member

No description provided.

Rong Rong and others added 2 commits February 9, 2021 10:16
Summary:
Fixes #50695

I checked locally that the concatenated license file appears at `torch-<version>.dist-info/LICENSE` in the wheel.

Pull Request resolved: #51634

Reviewed By: zhangguanheng66

Differential Revision: D26225550

Pulled By: walterddr

fbshipit-source-id: 830c59fb7aea0eb50b99e295edddad9edab6ba3a

Co-authored-by: mattip <matti.picus@gmail.com>
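The "checked locally" step above can be reproduced with the standard library alone, since wheels are ordinary zip archives. This is a hypothetical helper, not part of the PR:

```python
import zipfile

def find_license(wheel_path):
    """Return the .dist-info/LICENSE entries inside a wheel.

    Wheels are plain zip archives, so no packaging tooling is needed
    to verify where the concatenated license file ended up.
    """
    with zipfile.ZipFile(wheel_path) as wf:
        return [name for name in wf.namelist()
                if name.endswith(".dist-info/LICENSE")]
```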
…el (#51864) (#51890)

Summary:
Test begins to fail after the driver update

See #51863

Pull Request resolved: #51864

Reviewed By: bertmaher

Differential Revision: D26304018

Pulled By: malfet

fbshipit-source-id: bb7ade2f28d8cf8f847159d4ce92391f0794c258

Co-authored-by: Nikita Shulga <nshulga@fb.com>
@facebook-github-bot
Contributor

facebook-github-bot commented Feb 9, 2021


❌ 4 New Failures

As of commit 56b43f4 (more details on the Dr. CI page):

  • 4/4 failures introduced in this PR

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See CircleCI build pytorch_windows_vs2019_py36_cuda11.1_build (1/3)

Step: "Build"

ModuleNotFoundError: No module named 'yaml'
Building wheel torch-1.8.0a0+56b43f4
-- Building version 1.8.0a0+56b43f4
Traceback (most recent call last):
  File "C:\Users\circleci\project\setup.py", line 368, in check_pydep
    importlib.import_module(importname)
  File "C:\Jenkins\Miniconda3\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'yaml'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\circleci\project\setup.py", line 818, in <module>
    build_deps()
  File "C:\Users\circleci\project\setup.py", line 313, in build_deps
    check_pydep('yaml', 'pyyaml')
  File "C:\Users\circleci\project\setup.py", line 370, in check_pydep
    raise RuntimeError(missing_pydep.format(importname=importname, module=module))

See CircleCI build pytorch_windows_vs2019_py36_cuda10.1_build (2/3)

Step: "Build"

ModuleNotFoundError: No module named 'yaml' (same traceback as above)

See CircleCI build pytorch_windows_vs2019_py36_cpu_build (3/3)

Step: "Build"

ModuleNotFoundError: No module named 'yaml' (same traceback as above)
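The failing builds above all die in setup.py's `check_pydep`, which verifies build-time Python dependencies before compiling. A minimal sketch of that pattern (the message text here is an assumption, not PyTorch's exact wording):

```python
import importlib

# Sketch of the dependency check that raises in the tracebacks above;
# the error-message wording is illustrative, not PyTorch's.
MISSING_PYDEP = (
    "Missing build dependency: could not `import {importname}`.\n"
    "Please install it, e.g. `pip install {module}`."
)

def check_pydep(importname, module):
    try:
        importlib.import_module(importname)
    except ImportError:
        raise RuntimeError(
            MISSING_PYDEP.format(importname=importname, module=module))
```

In the CI logs, `check_pydep('yaml', 'pyyaml')` fails because the Windows build image is missing the PyYAML package.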

🕵️‍♀️ 1 failure not recognized by patterns:

The following CI failures may be due to changes from the PR
Job: CircleCI pytorch_macos_10_13_py3_test · Step: Test

This comment was automatically generated by Dr. CI.


James Reed and others added 4 commits February 9, 2021 15:44
Summary:
tries to fix doc_test

Pull Request resolved: #51825

Reviewed By: bertmaher

Differential Revision: D26295583

Pulled By: ngimel

fbshipit-source-id: 13f6e7f1675d810adfd4abd2d579e2812fe54c80
(cherry picked from commit 6c0bf28)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Co-authored-by: Natalia Gimelshein <ngimel@fb.com>
Summary:
Fixes issue: #49728
========
A ternary if expression fails in TorchScript when the condition variable is annotated as Final.

Tests:
=======
pytest -k test_ternary_static_if test/test_jit.py

Pull Request resolved: #51789

Reviewed By: gmagogsfm

Differential Revision: D26278969

Pulled By: nikithamalgifb

fbshipit-source-id: 27d1383290211503188428fb2e8b7749f59ba16e

Co-authored-by: nikithamalgi <nikithamalgi@devvm146.prn0.facebook.com>
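In plain Python terms, the pattern that used to break under TorchScript compilation looks like this (a hedged sketch; the real repro lives in `test_ternary_static_if`, and the names here are illustrative):

```python
from typing import Final

USE_LEFT: Final[bool] = True  # condition variable annotated as Final

def pick(x: int, y: int) -> int:
    # The ternary whose TorchScript compilation failed when the
    # condition is Final; as plain Python it always worked.
    return x if USE_LEFT else y
```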
James Reed and others added 20 commits February 12, 2021 07:35
* Fix leaf modules in Transformer

[ghstack-poisoned]

* Fix tuple type annotations

[ghstack-poisoned]

* Generalize dict key check in `create-arg` (#51927)

Summary: Pull Request resolved: #51927

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D26329655

Pulled By: jamesr66a

fbshipit-source-id: a15e7d9564551521af12a8fde1c7524856f0cbc2
Summary:
Pull Request resolved: #51878

`fake_quantize_per_tensor_affine_cachemask` and
`fake_quantize_per_channel_affine_cachemask` are implementation
details of `fake_quantize_per_tensor_affine` and
`fake_quantize_per_channel_affine`, removing the
Python bindings for them since there is no need to
expose them.

Test Plan:
```
python test/test_quantization.py TestFakeQuantize
```

Imported from OSS

Reviewed By: albanD, bugra

Differential Revision: D26314173

fbshipit-source-id: 733c93a3951453e739b6ed46b72fbad2244f6e97
(cherry picked from commit 33afb5f)
Summary:
Move definition of copysign template and specialization for
bfloat16/half types before first use of copysign in that file

Add comment explaining why this is necessary

Fixes #51889

Pull Request resolved: #51900

Reviewed By: walterddr

Differential Revision: D26321741

Pulled By: malfet

fbshipit-source-id: 888858b11d9708fa140fe9c0570cc5a24599205b
Summary:
This frequently happens when PyTorch compiled with CUDA support is installed on a machine that does not have NVIDIA GPUs.

Fixes #47038

Pull Request resolved: #51806

Reviewed By: ezyang

Differential Revision: D26285827

Pulled By: malfet

fbshipit-source-id: 9fd5e690d0135a2b219c1afa803fb69de9729f5e
…ation hooks (#52215)

Co-authored-by: wayi <wayi@devgpu238.prn2.facebook.com>
Co-authored-by: Mike Ruberry <mruberry@devfair044.maas>
Summary:
Pull Request resolved: #50180

Resolves the regression in
#49819 by adding a copy over a background
stream, similar to scatter. For internal use cases, this is gated by an env var that maintains the previous behavior when it is off.

Test Plan: CI

Reviewed By: mrshenli, ngimel

Differential Revision: D25818170

fbshipit-source-id: e50c76c035504b2a44e2be084701cee45c90df75
Co-authored-by: Vitaly Fedyunin <vitaly.fedyunin@gmail.com>
torch.vmap is a prototype feature and should not be in the stable
binary. This PR:

- Removes the `torch.vmap` API
- Removes the documentation entry for torch.vmap
- Changes the vmap tests to use an internal API instead of torch.vmap.

Test Plan:
- Tested locally (test_torch, test_autograd, test_type_hints, test_vmap), but also wait
for CI.
Summary:
Necessary to ensure correct link order, especially if libraries are
linked statically. Otherwise, one might run into:
```
/usr/bin/ld: /usr/local/cuda/lib64/libcublasLt_static.a(libcublasLt_static.a.o): undefined reference to symbol 'cudaStreamWaitEvent@libcudart.so.11.0'
/usr/local/cuda/lib64/libcudart.so: error adding symbols: DSO missing from command line
```

Pull Request resolved: #52243

Reviewed By: seemethere, ngimel

Differential Revision: D26437159

Pulled By: malfet

fbshipit-source-id: 33b8bb5040bda10537833f3ad737f535488452ea
…#52406)

Summary:
Pull Request resolved: #52151

CUDA 11.2 might not be as performant as we thought, so let's downgrade to
something we think is more performant.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D26408314

Pulled By: seemethere

fbshipit-source-id: e2446aa0115e2c2a79718b1fdfd9fccf2072822d
(cherry picked from commit a11650b)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Summary:
First part of #49886 to at least properly warn users of the current state

Pull Request resolved: #52311

Reviewed By: soulitzer

Differential Revision: D26495644

Pulled By: albanD

fbshipit-source-id: 72abdfe41cdbcc1ac739a536eb85d1aa4ba90897
Summary:
Pull Request resolved: #52389

Fixes: #49159

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D26496319

Pulled By: gchanan

fbshipit-source-id: d385cd683ef09e0596a9875ce84d03e6e77acc93
Summary:
Fixes #39502

This PR adds support for exporting **fake_quantize_per_channel_affine** to a pair of QuantizeLinear and DequantizeLinear. Per tensor support was added by PR #39738.

`axis` attribute of QuantizeLinear and DequantizeLinear, which is required for per channel support, is added in opset13 added by onnx/onnx#2772.

[update 1/20/2021]: opset13 is being supported on master, the added function is now properly tested. Code also rebased to new master.

The function is also tested offline with the following code
```python
import torch
from torch import quantization

from torchvision import models
qat_resnet18 = models.resnet18(pretrained=True).eval().cuda()

qat_resnet18.qconfig = quantization.QConfig(
    activation=quantization.default_fake_quant, weight=quantization.default_per_channel_weight_fake_quant)
quantization.prepare_qat(qat_resnet18, inplace=True)
qat_resnet18.apply(quantization.enable_observer)
qat_resnet18.apply(quantization.enable_fake_quant)

dummy_input = torch.randn(16, 3, 224, 224).cuda()
_ = qat_resnet18(dummy_input)
for module in qat_resnet18.modules():
    if isinstance(module, quantization.FakeQuantize):
        module.calculate_qparams()
qat_resnet18.apply(quantization.disable_observer)

qat_resnet18.cuda()

input_names = [ "actual_input_1" ]
output_names = [ "output1" ]

torch.onnx.export(qat_resnet18, dummy_input, "quant_model.onnx", verbose=True, opset_version=13)
```
It can generate the desired graph.

Pull Request resolved: #42835

Reviewed By: houseroad

Differential Revision: D26293823

Pulled By: SplitInfinity

fbshipit-source-id: 300498a2e24b7731b12fa2fbdea4e73dde80e7ea

Co-authored-by: Hao Wu <skyw@users.noreply.github.com>
Summary:
This is getting tested by #52441.

Adds new config for macos arm64 to our binary builds.
Now stores artifacts for mac builds.

Pull Request resolved: #52443

Reviewed By: walterddr

Differential Revision: D26517330

Pulled By: janeyx99

fbshipit-source-id: 02774937a827bdd4c08486dc9f8fe63446917f1e
Co-authored-by: eellison <eellison@fb.com>
Co-authored-by: Nikita Shulga <nshulga@fb.com>
Co-authored-by: peterjc123 <peterghost86@gmail.com>
Co-authored-by: Jane Xu <janeyx@fb.com>
…llLoss (#52510)

Co-authored-by: Shubham Bhokare <32080845+shubhambhokare1@users.noreply.github.com>
Summary:
Fixes #{issue number}

Pull Request resolved: #51847

Reviewed By: albanD

Differential Revision: D26405678

Pulled By: malfet

fbshipit-source-id: 073b675225b48d1732771583f8f2473e0fdcf35c

Co-authored-by: Joe Zhu <jozh@microsoft.com>
@seemethere seemethere added this to the 1.8.0 milestone Feb 22, 2021
James Reed and others added 23 commits March 9, 2021 17:23
* [FX] Cherrypick docs fixes

* Update code links to point to 1.8
Summary:
For enabling amp in torch/xla, see [this](pytorch/xla#2654).

Pull Request resolved: #48570

Reviewed By: ezyang

Differential Revision: D26120627

Pulled By: ailzhang

fbshipit-source-id: 32627b17c04bfdad128624676ea9bf6f117bc97d

Co-authored-by: Chengji Yao <yaochengji@hotmail.com>
…53675)

Summary:
For #47027.

Some progress has been made in #50665, but in my testing trying to unwrap the circular dependencies is turning into a neverending quest.

This PR explicitly exports things in the top-level torch module without any semantic effect, in accordance with this py.typed library guidance: https://github.com/microsoft/pyright/blob/master/docs/typed-libraries.md#library-interface

It may be possible to do some of the other fixes just using `__all__` where needed, but `__all__` has a semantic effect I would like to further review. This PR at least fixes simple completions like `torch.nn` in Pylance/pyright.

Pull Request resolved: #52339

Reviewed By: smessmer

Differential Revision: D26694909

Pulled By: malfet

fbshipit-source-id: 99f2c6d0bf972afd4036df988e3acae857dde3e1

Co-authored-by: Jake Bailey <5341706+jakebailey@users.noreply.github.com>
Summary:
Pull Request resolved: #53133

In light of some issues where users were having trouble installing CUDA
specific versions of pytorch we should no longer have special privileges
for CUDA 10.2.

Recently I added scripts/release/promote/prep_binary_for_pypi.sh (#53056) to make
it so that we could theoretically promote any wheel we publish to
download.pytorch.org to pypi

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: walterddr

Differential Revision: D26759823

Pulled By: seemethere

fbshipit-source-id: 2d2b29e7fef0f48c23f3c853bdca6144b7c61f22
(cherry picked from commit b8546bd)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Summary:
Pull Request resolved: #53508

closes #53501

Differential Revision: D26885263

Test Plan: Imported from OSS

Reviewed By: H-Huang

Pulled By: mrshenli

fbshipit-source-id: dd0493e6f179d93b518af8f082399cacb1c7cba6
Summary:
Fixes #51801
LSTMCell example updated

Pull Request resolved: #51983

Reviewed By: agolynski

Differential Revision: D26467104

Pulled By: zou3519

fbshipit-source-id: 31c8bf89b21cd2f748b2cc28a74169082d81503c

Co-authored-by: CarlosJose126 <43588143+CarlosJose126@users.noreply.github.com>
Summary:
Mitigates #53267

Pull Request resolved: #53274

Reviewed By: zhangguanheng66, ailzhang

Differential Revision: D26819702

Pulled By: cpuhrsch

fbshipit-source-id: 5b9b30db6f8fc414aa9f3c841429bf99bc927763

Co-authored-by: cpuhrsch <cpuhrsch@devvm2783.frc0.facebook.com>
* Add sample validation for LKJCholesky.log_prob

* Fix distributions which don't properly honor validate_args=False

A number of derived distributions use base distributions in their
implementation.

We add what we hope is a comprehensive test whether all distributions
actually honor skipping validation of arguments in log_prob and then
fix the bugs we found. These bugs are particularly cumbersome in
PyTorch 1.8 and master, where validate_args is turned on by default.
In addition, one might argue that validate_args is not performing
as well as it should when the default is not to validate but
validation is turned on at instantiation.

Arguably, there is another set of bugs or at least inconsistencies
when validation of inputs does not prevent invalid indices in
sample validation (when with validation an IndexError is raised
in the test). We would encourage the implementors to be more
ambitious when validation is turned on and amend sample validation
to throw a ValueError for consistency.

* additional fixes to distributions

* address failing tests

Co-authored-by: neerajprad <neerajprad@devvm903.atn0.facebook.com>
Co-authored-by: Thomas Viehmann <tv.code@beamnet.de>
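A plain-Python sketch (not the actual `torch.distributions` code) of the bug class being fixed: a derived distribution that does not forward `validate_args` to its base distribution keeps validating in `log_prob` even when the caller asked it not to.

```python
class Base:
    """Stand-in for a base distribution with argument validation."""

    def __init__(self, validate_args=True):
        self.validate_args = validate_args

    def log_prob(self, value):
        if self.validate_args and value < 0:
            raise ValueError("value out of support")
        return 0.0  # placeholder density


class BrokenDerived:
    """Bug: the flag is silently dropped, so validation still runs."""

    def __init__(self, validate_args=True):
        self.base = Base()  # validate_args not forwarded

    def log_prob(self, value):
        return self.base.log_prob(value)


class FixedDerived:
    """Fix: honor validate_args by forwarding it to the base."""

    def __init__(self, validate_args=True):
        self.base = Base(validate_args=validate_args)

    def log_prob(self, value):
        return self.base.log_prob(value)
```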
Summary:
Fixes #53368

Pull Request resolved: #53447

Reviewed By: albanD

Differential Revision: D26946284

Pulled By: jbschlosser

fbshipit-source-id: 54e5eec7da86fa02b1b6e4a235d66976a80764fc

Co-authored-by: kshitij12345 <kshitijkalambarkar@gmail.com>
- Support transferring >2GB over CMA
- Avoid loading stub version of CUDA driver
- Don't use unsupported mmap option on older kernels
- Don't join non-existing thread if CMA is not viable

The last two manifested as uncaught exceptions (hence crashes) when initializing RPC. The first one caused same-machine RPC requests to fail.
…scriptMethods (#53519) (#53548) (#54005)

Summary:
Pull Request resolved: #53548

fixes issue faced in #53506

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D26922415

Pulled By: malfet

fbshipit-source-id: b61842827bb14cef8c7a7089b2426fa53e642c90

Co-authored-by: BowenBao <bowbao@microsoft.com>
…53328) (#53529) (#54007)

Summary:
Pull Request resolved: #53529

Supported for ONNX export after opset 10.
This is not exportable to opsets < 10 due to
1. onnx::IsInf is introduced in opset 10
2. onnx::Equal does not accept float tensor prior to opset 11

Test Plan: Imported from OSS

Reviewed By: pbelevich, malfet

Differential Revision: D26922418

Pulled By: SplitInfinity

fbshipit-source-id: 69bcba50520fa3d69db4bd4c2b9f88c00146fca7

Co-authored-by: BowenBao <bowbao@microsoft.com>
Summary: Pull Request resolved: #52216

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D26427506

Pulled By: ailzhang

fbshipit-source-id: ba4f2f66794cb2843926e5566eb4d25582f7fb2b

Co-authored-by: Ailing Zhang <ailzhang@fb.com>
#52893) (#53311) (#54019)

Summary:
Pull Request resolved: #53311

Fixes dict output & nested tuple.

Test Plan: Imported from OSS

Reviewed By: pbelevich, malfet

Differential Revision: D26922426

Pulled By: SplitInfinity

fbshipit-source-id: c2c6b71c8d978b990181e0b025626dbf6ef2199e
Summary:
To be in-sync with #53447

Pull Request resolved: #53931

Reviewed By: ngimel

Differential Revision: D27026616

Pulled By: malfet

fbshipit-source-id: 4c50b29fa296c90aeeeb1757bdaada92cbba33d4
Summary:
Updating Kineto to include bugfixes for 1.8.1

Test Plan: CI
…= 24 * n (#54015)

* Disabling dispatch to OneDNN for group convolutions when group size is 24 * n

* Add condition to non-zero grps

Co-authored-by: Vitaly Fedyunin <vitaly.fedyunin@gmail.com>
Summary:
Follow-up of #53447

Reference: #53447 (comment)

Pull Request resolved: #53809

Reviewed By: bdhirsh

Differential Revision: D27049643

Pulled By: jbschlosser

fbshipit-source-id: 623a2a254783b86391dc2b0777b688506adb4c0e

Co-authored-by: kshitij12345 <kshitijkalambarkar@gmail.com>
Summary:
Since `char` is not guaranteed to be signed on all platforms (it is unsigned on ARM)
Fixes #52146

Pull Request resolved: #52616

Test Plan: Run ` python3 -c "import torch;a=torch.tensor([-1], dtype=torch.int8);print(a.tolist())"` on arm-linux system

Reviewed By: walterddr

Differential Revision: D26586678

Pulled By: malfet

fbshipit-source-id: 91972189b54f86add516ffb96d579acb0bc13311
Summary:
When compiled with OpenMP support, `ideep`'s computational_cache caches the max number of OpenMP workers.
This number can become stale after a `torch.set_num_threads` call, so clear the cache after the call.

Fixes #53565

Pull Request resolved: #53871

Reviewed By: albanD

Differential Revision: D27003265

Pulled By: malfet

fbshipit-source-id: 1d84c23070eafb3d444e09590d64f97f99ae9d36
Co-authored-by: Joel Benjamin Schlosser <jbschlosser@fb.com>
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
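The ideep fix above follows a general pattern: invalidate any cache keyed on a global setting whenever that setting changes. A stdlib-only analogy (the names here are illustrative, not ideep's or PyTorch's API):

```python
import functools

_num_threads = 4  # stand-in for the global OpenMP thread count

@functools.lru_cache(maxsize=None)
def max_workers():
    # Cached on first use, like ideep's computational_cache.
    return _num_threads

def set_num_threads(n):
    global _num_threads
    _num_threads = n
    # The fix: clear the cache so a stale thread count is not reused.
    max_workers.cache_clear()
```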
@codecov

codecov bot commented Mar 20, 2021

Codecov Report

Merging #51995 (b6f4980) into orig/release/1.8 (9112f4e) will increase coverage by 12.61%.
The diff coverage is 67.96%.

❗ Current head b6f4980 differs from pull request most recent head f3c950e. Consider uploading reports for the commit f3c950e to get more accurate results

@@                  Coverage Diff                  @@
##           orig/release/1.8   #51995       +/-   ##
=====================================================
+ Coverage             67.88%   80.49%   +12.61%     
=====================================================
  Files                  1790     1949      +159     
  Lines                181216   213390    +32174     
=====================================================
+ Hits                 123025   171776    +48751     
+ Misses                58191    41614    -16577     
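The percentages in the diff above are hits ÷ lines; matching the reported 67.88% and 80.49% requires truncating (not rounding) to two decimals, which is an assumption about Codecov's formatting:

```python
def coverage_pct(hits, lines):
    # Two-decimal truncation; matches the 67.88% / 80.49% shown above,
    # assuming Codecov truncates rather than rounds.
    return int(hits / lines * 10000) / 100

base = coverage_pct(123025, 181216)   # base branch coverage
head = coverage_pct(171776, 213390)   # this PR's coverage
delta = round(head - base, 2)         # the +12.61% in the report
```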

mattip and others added 5 commits March 23, 2021 11:23
Summary:
Benchmark of
```python
%timeit torch.randperm(100000, device='cuda'); torch.cuda.synchronize()
```
thrust:
```
5.76 ms ± 42.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```
cub:
```
3.02 ms ± 32.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

sync in thrust sort is removed

Warning:
Thrust supports 64bit indexing, but cub doesn't, so this is a functional regression. However, `torch.randperm(2**31, device='cuda')` fails with OOM on 40GB A100, and `torch.randperm(2**32, device='cuda')` fails with OOM on 80GB A100, so I think this functional regression has low impact and is acceptable.

Pull Request resolved: #53841

Reviewed By: albanD

Differential Revision: D26993453

Pulled By: ngimel

fbshipit-source-id: 39dd128559d53dbb01cab1585e5462cb5f3cceca

Co-authored-by: Xiang Gao <qasdfgtyuiop@gmail.com>
Some users building PyTorch from source on old glibc versions hit an issue where TensorPipe uses the process_vm_readv syscall, which their glibc does not wrap. This PR checks for that condition in CMake and disables that backend in such cases.

This should have no effect on PyTorch's official builds, it should just help people who are building from source.
* [CI]Install older cmath during Windows build (#54431)

Summary:
Based on peterjc123's analysis, `cmath` after microsoft/STL@26bbe2a#diff-3fa97ceb95d524432661f01d4b34509c6d261a2f7f45ddcf26f79f55b3eec88a causes a lot of CUDA code to fail to compile with:
```
error: calling a __host__ function("__copysignf") from a __host__ __device__ function("c10::guts::detail::apply_impl< ::at::native::AUnaryFunctor< ::>  &,     ::std::tuple<float >  &, (unsigned long long)0ull > ") is not allowed
```
Workaround for #54382

Pull Request resolved: #54431

Reviewed By: anjali411

Differential Revision: D27234299

Pulled By: malfet

fbshipit-source-id: b3f1fef941341222cc10cb27346fcf4a1d522a0c

* [CI] Install compatible cmath for Win binary builds (#54527)

Summary: Pull Request resolved: #54527

Reviewed By: walterddr

Differential Revision: D27269528

Pulled By: malfet

fbshipit-source-id: 4afdc706598f3a6ad296468dfb77a70433ae7d0f
…ad. (#53929) (#54358)

Summary:
Pull Request resolved: #53929

The local autograd engine performs appropriate stream synchronization
between autograd nodes in the graph to ensure a consumer's stream is
synchronized with the producer's stream before executing the consumer.

However in case of distributed autograd, the SendRpcBackward function receives
gradients over the wire and TensorPipe uses its own pool of streams for this
purpose. As a result, the tensors are received on TensorPipe's stream pool but
SendRpcBackward runs on a different stream during the backward pass and there
is no logic to synchronize these streams.

To fix this, I've enhanced DistEngine to synchronize these streams
appropriately when it receives grads over the wire.
ghstack-source-id: 124055277

(Note: this ignores all push blocking failures!)

Test Plan:
1) Added unit test which reproduced the issue.
2) waitforbuildbot.

Reviewed By: walterddr, wanchaol

Differential Revision: D27025307

fbshipit-source-id: 2944854e688e001cb3989d2741727b30d9278414

Co-authored-by: Pritam Damania <pritam.damania@fb.com>
@seemethere seemethere closed this Apr 1, 2021