
[ONNX] enable some RNN tests in scripting mode #57082

Closed
wants to merge 1 commit into from

Conversation

garymm
Collaborator

@garymm garymm commented Apr 28, 2021

Previously they were all disabled, even though many of them actually
pass today.

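For context, here is a minimal, hedged sketch (plain Python, hypothetical names, no torch dependency) of what enabling a test "in scripting mode" amounts to at the harness level: each exporter test runs under tracing, and additionally under scripting when its flag is set.

```python
# Hypothetical sketch of the harness behavior this PR changes.
# run_export_test and dummy_rnn_test are illustrative stand-ins,
# not the actual PyTorch ONNX test APIs.

def run_export_test(test_fn, script_enabled):
    """Run an exporter test in tracing mode, and in scripting mode
    as well when the scripting variant is enabled."""
    results = {"trace": test_fn(mode="trace")}
    if script_enabled:
        # Previously this flag was effectively hard-coded off for every
        # RNN test; the PR turns it on for tests that already pass.
        results["script"] = test_fn(mode="script")
    return results

def dummy_rnn_test(mode):
    # Stand-in for a real RNN export test.
    return "exported via " + mode
```

With the flag enabled, `run_export_test(dummy_rnn_test, True)` exercises both modes; passing `False` preserves the old tracing-only behavior.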
@facebook-github-bot
Contributor

facebook-github-bot commented Apr 28, 2021

💊 CI failures summary and remediations

As of commit 455d1ee (more details on the Dr. CI page):


  • 4/5 failures possibly* introduced in this PR
    • 1/4 non-scanned failure(s)
  • 1/5 broken upstream at merge base 0cb84f1 on Apr 27 from 3:48pm to 6:19pm

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_jit_legacy_test (1/3)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Apr 28 01:39:23 sccache: error: couldn't connect to server
Apr 28 01:39:23 +++ eval 'extract_trap_cmd '
Apr 28 01:39:23 ++++ extract_trap_cmd
Apr 28 01:39:23 ++++ printf '%s\n' ''
Apr 28 01:39:23 +++ printf '%s\n' cleanup
Apr 28 01:39:23 ++ trap -- '
Apr 28 01:39:23 cleanup' EXIT
Apr 28 01:39:23 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-jit_legacy-test != *pytorch-win-* ]]
Apr 28 01:39:23 ++ which sccache
Apr 28 01:39:23 ++ sccache --stop-server
Apr 28 01:39:23 Stopping sccache server...
Apr 28 01:39:23 sccache: error: couldn't connect to server
Apr 28 01:39:23 sccache: caused by: Connection refused (os error 111)
Apr 28 01:39:23 ++ true
Apr 28 01:39:23 ++ rm /var/lib/jenkins/sccache_error.log
Apr 28 01:39:23 ++ [[ -n '' ]]
Apr 28 01:39:23 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-jit_legacy-test == *rocm* ]]
Apr 28 01:39:23 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log
Apr 28 01:39:23 ++ SCCACHE_IDLE_TIMEOUT=1200
Apr 28 01:39:23 ++ RUST_LOG=sccache::server=error
Apr 28 01:39:23 ++ sccache --start-server
Apr 28 01:39:23 sccache: Starting the server...

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (2/3)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

(Identical sccache startup log as in build 1/3 above.)

See CircleCI build pytorch_linux_backward_compatibility_check_test (3/3)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

(Identical sccache startup log as in build 1/3 above.)

🚧 1 fixed upstream failure:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch:

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

@skipIfUnsupportedMinOpsetVersion(9)
def f(self):
    self.is_script_test_enabled = is_script_test_enabled
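The reviewed line above stamps a per-test flag on the test case. As a hedged sketch (hypothetical names, standalone Python, no torch), the pattern looks roughly like this: a factory builds each test method and records whether its scripting variant should run.

```python
# Illustrative reconstruction of the flag-stamping pattern in the diff;
# make_rnn_test and FakeTestCase are hypothetical stand-ins.

def make_rnn_test(is_script_test_enabled):
    def f(self):
        # The harness later reads this attribute to decide whether to
        # also run the test through scripting before ONNX export.
        self.is_script_test_enabled = is_script_test_enabled
    return f

class FakeTestCase:
    pass

case = FakeTestCase()
make_rnn_test(True)(case)
```

After the call, `case.is_script_test_enabled` is `True`, so the harness would run the scripting variant for that test.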
Contributor

@neginraoof neginraoof May 5, 2021


Is this PR still needed after #57564 ?

@garymm
Collaborator Author

garymm commented May 5, 2021

Not needed after #57564

@garymm garymm closed this May 5, 2021
@garymm garymm deleted the aten-as_tensor branch May 5, 2021 19:14
BowenBao pushed a commit that referenced this pull request May 17, 2021
Note the first commit in this PR has its own pull request here since it seemed self-contained:
#57082

* [ONNX] simplify batch_first logic in RNN tests

* [ONNX] support GRU with packed input in scripting mode

This required two changes:
* Add as_tensor to symbolic_opset9.py
* Change torch::jit::pushPackingPastRnn to recognize and properly
  replace another use of the batch_sizes output of prim::PackPadded.
  Previously the code assumed that the first use was as input to the
  RNN operator. However in some cases, it is also used to compute
  max_batch_size. For example in this code:
  https://github.com/pytorch/pytorch/blob/febff45/torch/nn/modules/rnn.py#L815-L815

With these changes the GRU tests now pass in scripting mode for opset
version >= 11.

Co-authored-by: Gary Miguel <garymiguel@microsoft.com>
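The second change above hinges on how `max_batch_size` is derived from the `batch_sizes` output of `prim::PackPadded` in `torch/nn/modules/rnn.py`. A minimal sketch in plain Python (no torch; the helper name is hypothetical) of that derivation, which is the extra use of `batch_sizes` that `pushPackingPastRnn` now has to recognize in addition to the RNN input:

```python
# pack_padded_sequence orders sequences longest-first, so batch_sizes[t]
# is the number of sequences still active at time step t. The sequence is
# non-increasing, so its first entry is the full batch size -- this is
# the computation pushPackingPastRnn previously did not account for.

def max_batch_size_from_batch_sizes(batch_sizes):
    # Mirrors `max_batch_size = int(batch_sizes[0])` in rnn.py.
    return int(batch_sizes[0])

# Three sequences of lengths 4, 2, 2: all three are active at t=0 and
# t=1, then only the longest remains at t=2 and t=3.
batch_sizes = [3, 3, 1, 1]
```

Here `max_batch_size_from_batch_sizes(batch_sizes)` yields 3, the batch size the scripted model computes from `batch_sizes` rather than from the input tensor's shape.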
BowenBao pushed a commit that referenced this pull request May 19, 2021
(Same commit message as the May 17 commit above.)
BowenBao pushed a commit that referenced this pull request May 19, 2021
(Same commit message as the May 17 commit above.)
BowenBao pushed a commit that referenced this pull request May 20, 2021
(Same commit message as the May 17 commit above.)
BowenBao added a commit that referenced this pull request May 20, 2021
(Same commit message as the May 17 commit above.)
BowenBao added a commit that referenced this pull request May 26, 2021
(Same commit message as the May 17 commit above.)

Differential Revision: [D28714805](https://our.internmc.facebook.com/intern/diff/D28714805)
facebook-github-bot pushed a commit that referenced this pull request May 27, 2021
Summary:
Pull Request resolved: #58691

(Same commit message as the May 17 commit above.)
Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D28714805

Pulled By: SplitInfinity

fbshipit-source-id: f19647a04533d9ec76399a8793b3f712ea0337d2

Co-authored-by: Gary Miguel <garymiguel@microsoft.com>
deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
(Same commit message and metadata as the commit above, with pytorch#-prefixed references.)
4 participants