
[ONNX] enable some RNN tests in scripting mode #57082

Closed
wants to merge 1 commit into from

Conversation

garymm
Collaborator

@garymm garymm commented Apr 28, 2021

Previously they were all disabled, even though many of them actually
pass today.

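For context, here is a minimal, hedged sketch (plain Python, hypothetical names, no torch dependency) of what enabling a test "in scripting mode" amounts to at the harness level: each exporter test runs under tracing, and additionally under scripting when its flag is set.

```python
# Hypothetical sketch of the harness behavior this PR changes.
# run_export_test and dummy_rnn_test are illustrative stand-ins,
# not the actual PyTorch ONNX test APIs.

def run_export_test(test_fn, script_enabled):
    """Run an exporter test in tracing mode, and in scripting mode
    as well when the scripting variant is enabled."""
    results = {"trace": test_fn(mode="trace")}
    if script_enabled:
        # Previously this flag was effectively hard-coded off for every
        # RNN test; the PR turns it on for tests that already pass.
        results["script"] = test_fn(mode="script")
    return results

def dummy_rnn_test(mode):
    # Stand-in for a real RNN export test.
    return "exported via " + mode
```

With the flag enabled, `run_export_test(dummy_rnn_test, True)` exercises both modes; passing `False` preserves the old tracing-only behavior.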
@facebook-github-bot
Contributor

facebook-github-bot commented Apr 28, 2021

💊 CI failures summary and remediations

As of commit 455d1ee (more details on the Dr. CI page):


  • 4/5 failures possibly* introduced in this PR
    • 1/4 non-scanned failure(s)
  • 1/5 broken upstream at merge base 0cb84f1 on Apr 27 from 3:48pm to 6:19pm

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_jit_legacy_test (1/3)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Apr 28 01:39:23 sccache: error: couldn't connect to server
Apr 28 01:39:23 +++ eval 'extract_trap_cmd '
Apr 28 01:39:23 ++++ extract_trap_cmd
Apr 28 01:39:23 ++++ printf '%s\n' ''
Apr 28 01:39:23 +++ printf '%s\n' cleanup
Apr 28 01:39:23 ++ trap -- '
Apr 28 01:39:23 cleanup' EXIT
Apr 28 01:39:23 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-jit_legacy-test != *pytorch-win-* ]]
Apr 28 01:39:23 ++ which sccache
Apr 28 01:39:23 ++ sccache --stop-server
Apr 28 01:39:23 Stopping sccache server...
Apr 28 01:39:23 sccache: error: couldn't connect to server
Apr 28 01:39:23 sccache: caused by: Connection refused (os error 111)
Apr 28 01:39:23 ++ true
Apr 28 01:39:23 ++ rm /var/lib/jenkins/sccache_error.log
Apr 28 01:39:23 ++ [[ -n '' ]]
Apr 28 01:39:23 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-jit_legacy-test == *rocm* ]]
Apr 28 01:39:23 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log
Apr 28 01:39:23 ++ SCCACHE_IDLE_TIMEOUT=1200
Apr 28 01:39:23 ++ RUST_LOG=sccache::server=error
Apr 28 01:39:23 ++ sccache --start-server
Apr 28 01:39:23 sccache: Starting the server...

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (2/3)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

(Identical sccache startup log as in build 1/3 above.)

See CircleCI build pytorch_linux_backward_compatibility_check_test (3/3)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

(Identical sccache startup log as in build 1/3 above.)

🚧 1 fixed upstream failure:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch:

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

@skipIfUnsupportedMinOpsetVersion(9)
def f(self):
    self.is_script_test_enabled = is_script_test_enabled
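The reviewed line above stamps a per-test flag on the test case. As a hedged sketch (hypothetical names, standalone Python, no torch), the pattern looks roughly like this: a factory builds each test method and records whether its scripting variant should run.

```python
# Illustrative reconstruction of the flag-stamping pattern in the diff;
# make_rnn_test and FakeTestCase are hypothetical stand-ins.

def make_rnn_test(is_script_test_enabled):
    def f(self):
        # The harness later reads this attribute to decide whether to
        # also run the test through scripting before ONNX export.
        self.is_script_test_enabled = is_script_test_enabled
    return f

class FakeTestCase:
    pass

case = FakeTestCase()
make_rnn_test(True)(case)
```

After the call, `case.is_script_test_enabled` is `True`, so the harness would run the scripting variant for that test.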
Contributor

@neginraoof neginraoof May 5, 2021


Is this PR still needed after #57564 ?

@garymm
Collaborator Author

garymm commented May 5, 2021

Not needed after #57564

@garymm garymm closed this May 5, 2021
@garymm garymm deleted the aten-as_tensor branch May 5, 2021 19:14
BowenBao pushed a commit that referenced this pull request May 17, 2021
Note the first commit in this PR has its own pull request here since it seemed self-contained:
#57082

* [ONNX] simplify batch_first logic in RNN tests

* [ONNX] support GRU with packed input in scripting mode

This required two changes:
* Add as_tensor to symbolic_opset9.py
* Change torch::jit::pushPackingPastRnn to recognize and properly
  replace another use of the batch_sizes output of prim::PackPadded.
  Previously the code assumed that the first use was as input to the
  RNN operator. However in some cases, it is also used to compute
  max_batch_size. For example in this code:
  https://github.com/pytorch/pytorch/blob/febff45/torch/nn/modules/rnn.py#L815-L815

With these changes the GRU tests now pass in scripting mode for opset
version >= 11.

Co-authored-by: Gary Miguel <garymiguel@microsoft.com>
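The second change above hinges on how `max_batch_size` is derived from the `batch_sizes` output of `prim::PackPadded` in `torch/nn/modules/rnn.py`. A minimal sketch in plain Python (no torch; the helper name is hypothetical) of that derivation, which is the extra use of `batch_sizes` that `pushPackingPastRnn` now has to recognize in addition to the RNN input:

```python
# pack_padded_sequence orders sequences longest-first, so batch_sizes[t]
# is the number of sequences still active at time step t. The sequence is
# non-increasing, so its first entry is the full batch size -- this is
# the computation pushPackingPastRnn previously did not account for.

def max_batch_size_from_batch_sizes(batch_sizes):
    # Mirrors `max_batch_size = int(batch_sizes[0])` in rnn.py.
    return int(batch_sizes[0])

# Three sequences of lengths 4, 2, 2: all three are active at t=0 and
# t=1, then only the longest remains at t=2 and t=3.
batch_sizes = [3, 3, 1, 1]
```

Here `max_batch_size_from_batch_sizes(batch_sizes)` yields 3, the batch size the scripted model computes from `batch_sizes` rather than from the input tensor's shape.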
BowenBao pushed a commit that referenced this pull request May 19, 2021
(Same commit message as the May 17 commit above.)
BowenBao pushed a commit that referenced this pull request May 19, 2021
(Same commit message as the May 17 commit above.)
BowenBao pushed a commit that referenced this pull request May 20, 2021
(Same commit message as the May 17 commit above.)
BowenBao added a commit that referenced this pull request May 20, 2021
(Same commit message as the May 17 commit above.)
BowenBao added a commit that referenced this pull request May 26, 2021
(Same commit message as the May 17 commit above.)

Differential Revision: [D28714805](https://our.internmc.facebook.com/intern/diff/D28714805)
facebook-github-bot pushed a commit that referenced this pull request May 27, 2021
Summary:
Pull Request resolved: #58691

(Same commit message as the May 17 commit above.)
Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D28714805

Pulled By: SplitInfinity

fbshipit-source-id: f19647a04533d9ec76399a8793b3f712ea0337d2

Co-authored-by: Gary Miguel <garymiguel@microsoft.com>
deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
(Same commit message and metadata as the commit above, with pytorch#-prefixed references.)
4 participants