SoftmaxCrossEntropyLoss-12 forward and backward kernel implementation. #3465

codemzs · 2020-04-09T09:46:31Z

SoftmaxCrossEntropyLoss-12 forward and backward kernel implementation.

Graph transformer to populate log probability output for SoftmaxCrossEntropyLoss if it has not been requested, this is needed by the gradient builder and PyTorch does not have this output as a required type so the converter does not add it.
Verification/testing:
- 48 grad tests exercising 2-D, 4-D inputs, all reduction modes such as sum, mean and none, ignore_index with/without weights.
- 43 tests to ensure CPU and GPU outputs match.
- Several tests were written in ONNX repo that created this loss function using PyTorch, recorded the output, then converted the model to ONNX and then ran on ONNX Runtime to ensure output matches with PyTorch.
- HuggingFace model converted from PyTorch to ONNX and then trained on ONNX Runtime Training to ensure numbers match.
Enables some of the disabled ONNX tests that were previously failing.

This reverts commit 847cb50.

onnxruntime/test/onnx/main.cc

orttraining/orttraining/test/gradient/gradient_ops_test.cc

SherlockNoMad · 2020-04-14T21:54:56Z

Also need some tests in cross_entropy_test.cc #Resolved

… into softmaxcrossentropy_nllloss

codemzs · 2020-04-15T04:10:54Z

Done.

In reply to: 613700754 [](ancestors = 613700754)

orttraining/orttraining/core/graph/gradient_builder.cc

orttraining/orttraining/core/optimizer/insert_output_rewriter.cc

orttraining/orttraining/core/optimizer/insert_output_rewriter.h

orttraining/orttraining/test/gradient/gradient_ops_test.cc

orttraining/orttraining/training_ops/cpu/loss/softmax_cross_entropy_loss.cc

… into softmaxcrossentropy_nllloss

codemzs · 2020-04-16T03:06:30Z

/azp run #Resolved

azure-pipelines · 2020-04-16T03:06:35Z

You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list. #Resolved

codemzs · 2020-04-16T03:07:02Z

/azp run orttraining-linux-gpu-ci-pipeline #Resolved

azure-pipelines · 2020-04-16T03:07:11Z

Azure Pipelines successfully started running 1 pipeline(s). #Resolved

codemzs · 2020-04-16T03:08:00Z

/azp run orttraining-linux-ci-pipeline
/azp run orttraining-linux-gpu-ci-pipeline
/azp run orttraining-linux-gpu-inference-only-ci
#Resolved

azure-pipelines · 2020-04-16T03:08:05Z

No pipelines are associated with this pull request. #Resolved

codemzs · 2020-04-16T03:08:17Z

/azp run orttraining-linux-ci-pipeline #Resolved

codemzs · 2020-04-16T03:08:25Z

/azp run orttraining-linux-gpu-ci-pipeline #Resolved

azure-pipelines · 2020-04-16T03:08:26Z

Azure Pipelines successfully started running 1 pipeline(s). #Resolved

azure-pipelines · 2020-04-16T03:08:35Z

Azure Pipelines successfully started running 1 pipeline(s). #Resolved

codemzs · 2020-04-16T03:08:36Z

/azp run orttraining-linux-gpu-inference-only-ci #Resolved

azure-pipelines · 2020-04-16T03:08:46Z

Azure Pipelines successfully started running 1 pipeline(s). #Resolved

orttraining/orttraining/test/gradient/gradient_ops_test.cc

SherlockNoMad · 2020-04-16T03:28:43Z

orttraining/orttraining/training_ops/cuda/loss/softmax_cross_entropy_loss_impl.cu

+      weight_data_nd,
+      label,
+      weight,
+      count,


Sorry I meant to use N_D here.
Otherwise N_D is not used in this function #Resolved

orttraining/orttraining/training_ops/cuda/loss/softmax_cross_entropy_loss_impl.cu

orttraining/orttraining/test/gradient/gradient_ops_test.cc

orttraining/orttraining/test/training_ops/cuda/cross_entropy_test.cc

azure-pipelines · 2020-04-16T04:26:44Z