
Added an option to use the pure python implementation. #1137

Merged (17 commits, merged Mar 7, 2020)

Conversation

@gabrieldemarmiesse (Member) commented Feb 23, 2020

See #1114

What is left to do:

  • Add a try-except: if there are problems loading the .so, fall back to
    the pure Python implementation and print a warning explaining
    TF_ADDONS_PY_OPS and which Addons versions are compatible with the
    current TF.
    Maybe also flip the TF_ADDONS_PY_OPS switch automatically?
  • Recommend in CONTRIBUTING.md that contributors adding new ops also
    provide a pure Python equivalent.

I think macOS and Windows users will be happy to use the activation functions on GPU :)

I also believe that the description of TF Addons discourages Windows and macOS users from using Addons when they want GPU support. That's not good, because the majority of Addons components work on Windows and Mac with a GPU.
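
For context, a pure Python fallback for an activation like hardshrink only needs public TensorFlow ops. A minimal sketch, assuming the standard hardshrink definition (not necessarily the exact code in this PR):

import tensorflow as tf

def hardshrink_py(x, lower=-0.5, upper=0.5):
    # Hardshrink keeps values outside [lower, upper] and zeroes the rest.
    # Built only from public TF ops, so it runs on CPU and GPU on any
    # platform, including Windows and macOS.
    x = tf.convert_to_tensor(x)
    mask = tf.logical_or(x < lower, x > upper)
    return tf.where(mask, x, tf.zeros_like(x))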

@gabrieldemarmiesse (Member Author) commented Feb 23, 2020

Bazel is really not easy for Python devs.
Locally, after running

TF_ADDONS_NO_BUILD=1 pip install -e ./

I can do:

from tensorflow_addons.activations import hardshrink

and it works.

Is there a flag to make bazel respect basic Python import rules? It is really frustrating to spend so much time on this PR just to add a single Python file.

@Squadrick (Member):

@gabrieldemarmiesse yeah, that happens with bazel due to the way it resolves path imports. Haven't found a fix for it yet.

@googlebot commented (comment minimized)

@seanpmorgan (Member):

@googlebot I consent.

@googlebot:

CLAs look good, thanks!

@@ -16,6 +16,7 @@ py_library(
"tanhshrink.py",
],
data = [
"//tensorflow_addons:options.py",

Member:

Agree that this is painful and unintuitive. Sorry for any wasted time. We could raise this with the Bazel team, but I believe this is fundamental to how its resolving works.


Member Author:

Yeah, it's linked to how bazel works, and that makes sense for monorepos like tensorflow. I'm not convinced the added value of making the wheels and running the tests with bazel outweighs the costs in Addons. It makes total sense to use it to build the .so, though. I know we had a similar conversation a while ago; maybe I'll open an issue so that we can discuss it more. Anyway, thanks for the fix! It helps a lot!


Member:

Yeah, I think this can be discussed further. Typical contributions to Addons will/should fall into a subpackage that has the BUILD already set up. This PR is a change that affects the entirety of the project, so it's a bit more complicated. I agree we should encourage contributors to make these types of changes, but we can look at the pros and cons again to see if it's worth splitting the C++ tests/builds from the Python tests/builds.

@seanpmorgan (Member):

  • Add a try-except: if there are problems loading the .so, fall back to
    the pure Python implementation and print a warning explaining
    TF_ADDONS_PY_OPS and which Addons versions are compatible with the
    current TF.

This should be doable since it raises a Python error (tensorflow.python.framework.errors_impl.NotFoundError) instead of crashing.
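
A hypothetical sketch of such a fallback (the function name is illustrative, not the actual Addons code):

import traceback
import warnings

import tensorflow as tf

def try_load_op_library(path):
    # Try the compiled custom op first; on failure, warn and let the
    # caller dispatch to the pure Python implementation instead.
    try:
        return tf.load_op_library(path)
    except tf.errors.NotFoundError:
        warnings.warn(
            "Could not load " + path + "; falling back to the pure Python "
            "implementation. Set TF_ADDONS_PY_OPS=1 to silence this "
            "warning.\n" + traceback.format_exc(),
            RuntimeWarning,
        )
        return None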

May also flip the TF_ADDONS_PY_OPS switch automatically?

  • Recommend in CONTRIBUTING.md that contributors adding new ops also
    provide a pure Python equivalent.

Good idea, but I'm not sure this is always feasible, so maybe it depends on the subpackage. For example, image kernels may be hard to reproduce without depending on OpenCV or something similar.

I think macOS and Windows users will be happy to use the activation functions on GPU :)

Agree, but I think macOS users are out of luck unless they're using a ROCm-supported TF. AFAIK there hasn't been an NVIDIA GPU on a macOS system for a long time. I could be wrong on the current state of this:
https://github.com/ROCmSoftwarePlatform/tensorflow-upstream

@gabrieldemarmiesse (Member Author) commented Feb 25, 2020

I'll implement the fallback in this PR. We can change CONTRIBUTING.md in another one.

There seems to be a lot of activity around ROCm in the tensorflow/tensorflow repo. Do you guys know what the status is? It'd be cool if TF were to support AMD officially :)

@gabrieldemarmiesse (Member Author):

Here is the warning message:

2020-02-25 16:58:17.446948: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-02-25 16:58:17.447611: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-02-25 16:58:17.448055: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (CRE1-L11685): /proc/driver/nvidia/version does not exist
2020-02-25 16:58:17.458128: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-02-25 16:58:17.507655: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2712000000 Hz
2020-02-25 16:58:17.510852: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fffe1fe08d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-02-25 16:58:17.511249: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
/mnt/c/Users/gdemarmi/Desktop/projects/addons/tensorflow_addons/options.py:48: RuntimeWarning: Traceback (most recent call last):
  File "/mnt/c/Users/gdemarmi/Desktop/projects/addons/tensorflow_addons/activations/hardshrink.py", line 49, in hardshrink
    return _hardshrink_custom_op(x, lower, upper)
  File "/mnt/c/Users/gdemarmi/Desktop/projects/addons/tensorflow_addons/activations/hardshrink.py", line 58, in _hardshrink_custom_op
    return _activation_so.ops.addons_hardshrink(x, lower, upper)
  File "/mnt/c/Users/gdemarmi/Desktop/projects/addons/tensorflow_addons/utils/resource_loader.py", line 49, in ops
    self._ops = tf.load_op_library(get_path_to_datafile(self.relative_path))
  File "/home/gdemarmi/softwares/python/anaconda/lib/python3.7/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /mnt/c/Users/gdemarmi/Desktop/projects/addons/tensorflow_addons/custom_ops/activations/_activation_ops.so: cannot open shared object file: No such file or directory


The hardshrink C++/CUDA custom op could not be loaded.
For this reason, Addons will fall back to an implementation written
in Python with public TensorFlow ops. The worst you might experience with
this is a moderate slowdown on GPU. There can be multiple
reasons for this loading error; one of them may be an ABI incompatibility between
the TensorFlow installed on your system and the TensorFlow used to compile
TensorFlow Addons' custom ops. The stacktrace generated when loading the
shared object file was displayed above.

If you want this warning to disappear, either make sure the TensorFlow installed
is compatible with this version of Addons, or tell TensorFlow Addons to
prefer using Python implementations and not custom C++/CUDA ones. You can do that
by changing the TF_ADDONS_PY_OPS flag,
either with the environment variable:

TF_ADDONS_PY_OPS=1 python my_script.py

or in your code, after your imports:

import tensorflow_addons as tfa
import ...
import ...

tfa.options.TF_ADDONS_PY_OPS = True


  warnings.warn(warning_msg, RuntimeWarning)
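
For reference, honoring such a flag from the environment only takes a line or two. A minimal sketch of what an options module could do (the real options.py may differ):

import os

# Read the switch at import time; users can still override the
# module-level flag in code after importing tensorflow_addons.
TF_ADDONS_PY_OPS = bool(int(os.environ.get("TF_ADDONS_PY_OPS", "0")))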

@boring-cyborg (bot) added the build label Feb 25, 2020
@seanpmorgan (Member) left a review:

Looks very close to LGTM. Thanks for the effort on this. A few comments: I think we need to clarify that this is not entirely a GPU issue; the slowdown will occur when CPU ops fail to load as well.

@seanpmorgan (Member) left a review:

Almost LGTM. Minor edits.

One last question I have is about the implications of serializing and deserializing these ops. For example, if someone trained a model with custom ops and then someone else loaded it using py_ops, what guarantees do we have that they behave the same? I suppose it's fine as long as we have test cases ensuring they're the same?

gabrieldemarmiesse and others added 3 commits February 27, 2020 05:57
Co-Authored-By: Sean Morgan <seanmorgan91@gmail.com>
Co-Authored-By: Sean Morgan <seanmorgan91@gmail.com>
Co-Authored-By: Sean Morgan <seanmorgan91@gmail.com>
@gabrieldemarmiesse (Member Author):

I'm unsure about serialization. I think we need another opinion on this one.

@bot-of-gabrieldemarmiesse

@facaiy @seanpmorgan @tensorflow/sig-addons-maintainers

You are owners of some files modified in this pull request.
Would you kindly review the changes whenever you have the time?
Thank you very much.

@gabrieldemarmiesse (Member Author):

Sorry about the spam. My bot is training.

@Squadrick (Member):

Almost LGTM. Minor edits.

One last question I have is about the implications of serializing and deserializing these ops. For example, if someone trained a model with custom ops and then someone else loaded it using py_ops, what guarantees do we have that they behave the same? I suppose it's fine as long as we have test cases ensuring they're the same?

That's the only way I can see it working. Also, in verify_funcs_are_equivalent, atol=1e-4 is a significant error bound that can't be blamed on floating-point approximation; the default value for atol is 1e-6, and I think we should aim lower.
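
For illustration, an equivalence check between the two implementations could look like the sketch below (a hypothetical helper; verify_funcs_are_equivalent's real signature may differ):

import numpy as np
import tensorflow as tf

def check_funcs_equivalent(func_custom, func_py, x, atol=1e-6):
    # Run both implementations on the same input; a tight atol catches
    # real divergences rather than floating-point noise.
    np.testing.assert_allclose(
        func_custom(x).numpy(), func_py(x).numpy(), atol=atol
    )

x = tf.random.uniform((64,), minval=-2.0, maxval=2.0)
# e.g. check_funcs_equivalent(hardshrink_custom_op, hardshrink_py, x)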

@seanpmorgan (Member) left a review:

LGTM. Thanks!
