[Caffe2]Enable fusion for IDEEP in optimizeForIdeep #8105
Conversation
@yinghai
@@ -75,25 +73,10 @@ class IDEEPConvFusionOp final : public IDEEPConvPoolOpBase {
        "*",
        group_);

    bool weights_changed =
@@ -36,21 +34,6 @@ class IDEEPConvOp final : public IDEEPConvPoolOpBase {
        "*",
        group_);

    bool weights_changed =
caffe2/opt/optimize_ideep.cc (outdated)

    // We only want to fuse for IDEEP convs
    if (op->device_option().device_type() != DeviceType::IDEEP) {
      return false;

bool fuseConvBNHelperForIdeep(repr::NNModule* nn, caffe2::Workspace* ws) {
    }
  }

void fuseConvSumForIdeep(repr::NNModule* nn, caffe2::Workspace* ws) {
    }
  }

void preConvertFiltersFormat(repr::NNModule* nn, caffe2::Workspace* ws) {
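The fuseConvBNHelperForIdeep pass above folds a following batch-norm into the convolution it consumes. The folding arithmetic can be sketched in NumPy; this is a minimal sketch of the math only, with hypothetical names, not the caffe2 implementation:

```python
import numpy as np

def fold_bn_into_conv(W, b, gamma, beta, mean, var, eps=1e-5):
    # Fold BatchNorm(gamma, beta, running mean, running var) into
    # conv weights W (OIHW layout) and bias b, per output channel.
    scale = gamma / np.sqrt(var + eps)           # per-channel scale
    W_folded = W * scale[:, None, None, None]    # rescale each output-channel filter
    b_folded = (b - mean) * scale + beta         # absorb mean/beta into the bias
    return W_folded, b_folded
```

Because conv is linear in its weights, conv(x; W_folded, b_folded) equals BN(conv(x; W, b)), which is why the fused graph needs no separate SpatialBN op.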
Add @jgong5
@@ -189,6 +190,34 @@ def test_convolution_sum_fusion(self, stride, pad, kernel, size,
            print(S0.flatten())
            print(np.max(np.abs(S1 - S0)))
            self.assertTrue(False)

        # Auto fusion for Conv + Sum
        workspace.ResetWorkspace()
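The test above validates the fused Conv+Sum output against an unfused reference and prints the maximum absolute difference on failure. That acceptance check can be sketched as follows; the helper name and tolerance are hypothetical, not the actual test code:

```python
import numpy as np

def check_fusion_output(S_ref, S_fused, atol=1e-2):
    # Compare the fused-graph output against the unfused reference,
    # mirroring the max-abs-diff check the test prints on failure.
    max_err = float(np.max(np.abs(S_fused - S_ref)))
    if max_err > atol:
        print(S_ref.flatten())
        raise AssertionError("fusion mismatch: max abs err %g" % max_err)
    return max_err
```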
hi @yinghai
@gujinghui Sorry for the late reply. I'm working on verifying the fix for group convolution, and the next item is to check the memory-increase issue. I'll get back to this one once those outstanding issues are resolved, as it is not a blocker for anything. Does that make sense?
@yinghai Please let us know if there is anything we can do to help troubleshoot the memory-footprint issue. I guess we can reproduce the problem with any model, like ResNet-50, right?
@jgong5 @gujinghui yeah, to start, just try resnet50 with ideep and mkl and check the difference.
@gujinghui Actually, do you have a Mask R-CNN case to run? That might repro the memory issue better.
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Use optimizeForIdeep to convert filter format, instead. Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
@yinghai
@@ -304,7 +304,7 @@ if (USE_MKL AND USE_IDEEP)

  if (MKLDNN_INCLUDE_DIR)
    list(APPEND IDEEP_INCLUDE_DIR ${MKLDNN_INCLUDE_DIR})
-   list(APPEND __ideep_looked_for ${MKLDNN_INCLUDE_DIR})
+   list(APPEND __ideep_looked_for MKLDNN_INCLUDE_DIR)
Thanks for pushing this. I have some comments that need to be addressed. In addition, I think this PR is losing focus: it is supposed to be a transformation, but I saw
- Transformation
- Fixes in group size and FC
- Fixes in CMakefile
I suggest we split this PR into 3. Actually, let me take the CMakefile fix, and you can work on splitting the rest into 2. Sounds good?
@@ -21,41 +19,23 @@ class IDEEPConvOp final : public IDEEPConvPoolOpBase {
    const auto& X = Input(INPUT);
    const auto& filter = Input(FILTER);
    auto* Y = Output(OUTPUT);
-   auto Y_dims = CalcOutputDims(X, filter.get_dim(0));
+   auto grouped = filter.is_grouped() ? 1 : 0;
+   auto Y_dims = CalcOutputDims(
@@ -18,16 +20,39 @@ class IDEEPFullyConnectedOp final : public IDEEPOperator {
    const auto& filter = Input(FILTER);
    auto* Y = Output(OUTPUT);

    auto newDims = [&](itensor::dims adims, size_t axis) {
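The newDims lambda in the hunk above reshapes a tensor's dims to 2-D around an axis, the usual lowering for fully-connected ops. A hedged Python sketch of that collapse (hypothetical helper, not the itensor API):

```python
from functools import reduce
from operator import mul

def fc_2d_shape(dims, axis=1):
    # Collapse an N-d shape into (M, K): dimensions before `axis`
    # multiply into M, the remaining dimensions into K.
    M = reduce(mul, dims[:axis], 1)
    K = reduce(mul, dims[axis:], 1)
    return (M, K)
```

For example, a (2, 3, 4, 5) input with axis=1 is treated by FC as a (2, 60) matrix.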
@yinghai has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: The reason is that we are referencing `__ideep_looked_for` here: https://github.com/pytorch/pytorch/blob/77484d91db052dfcfa22a38408349853b6246f8a/cmake/Modules/FindMKL.cmake#L350 This was first flushed out in #8105 and probably can help with #9024 Pull Request resolved: #9217 Reviewed By: houseroad Differential Revision: D8754491 Pulled By: yinghai fbshipit-source-id: 70aecc2d60684b9ea522403dc98a0a1a2c3db7e6
@gujinghui Is this PR still relevant? If yes, could you rebase on master?
@yinghai This PR is not needed now. Will close it.
Enable fusion for IDEEP in optimizeForIdeep, including Conv+ReLU, Conv+Sum, Conv+Sum+ReLU, Conv+BN, and pre-convert filter format