Modernize OpenVINO-based Nuclio functions and allow them to run on Kubernetes #6129

Merged: SpecLad merged 11 commits into cvat-ai:develop from modernize-openvino-functions on May 15, 2023

SpecLad
Contributor

@SpecLad SpecLad commented May 11, 2023

Motivation and context

Currently, OpenVINO-based functions assume that a local directory will be mounted into the container. In Kubernetes, that isn't possible, so implement an alternate approach: create a separate base image and inherit the function image from it.

In addition, implement some modernizations:

  • Upgrade the version of OpenVINO to the latest (2022.3). Make the necessary updates to the code. Note that 2022.1 introduced an entirely new inference API, but I haven't switched to it yet to minimize changes.

  • Use the runtime version of the Docker image as the base instead of the dev version. This significantly reduces the size of the final image (by ~3GB).

  • Replace the faster_rcnn_inception_v2_coco model with faster_rcnn_inception_resnet_v2_atrous_coco, as the former has been removed from OMZ.

  • Ditto with person-reidentification-retail-0300 -> 0277.

  • The IRs used in the DEXTR function are not supported by OpenVINO anymore (format too old), so rewrite the build process to create them from the original code/weights instead.
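The base-image approach implies a build order: the shared base image has to exist locally before any function image that inherits from it is built. A minimal sketch of that ordering, assuming the image tag `cvat.openvino.base` and the `openvino/base` build context that appear in `deploy_cpu.sh`; the `ensure_base_image` helper and the `DOCKER` override are hypothetical and exist only for illustration:

```shell
# DOCKER can be overridden (for example with a stub) when experimenting.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: build the shared base image only if it is missing.
ensure_base_image() {
    image="$1"
    context="$2"
    if "$DOCKER" image inspect "$image" >/dev/null 2>&1; then
        echo "base image $image already present"
    else
        "$DOCKER" build -t "$image" "$context"
    fi
}

# Example call (commented out so the sketch has no side effects):
# ensure_base_image cvat.openvino.base ./openvino/base
```

If the base image is missing, Docker tries to pull it from a registry when the function image is built, which fails for a local-only tag.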

How has this been tested?

I manually tried each affected function to make sure they still work.

Checklist

  • I submit my changes into the develop branch
  • I have added a description of my changes into the CHANGELOG file
  • [ ] I have updated the documentation accordingly
  • [ ] I have added tests to cover my changes
  • [ ] I have linked related issues (see GitHub docs)
  • [ ] I have increased versions of npm packages if it is necessary (cvat-canvas, cvat-core, cvat-data and cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

@SpecLad SpecLad force-pushed the modernize-openvino-functions branch 2 times, most recently from 0ac3fa6 to 96eae99 Compare May 11, 2023 09:11
@SpecLad SpecLad requested a review from bsekachev May 11, 2023 09:42
@SpecLad SpecLad marked this pull request as ready for review May 11, 2023 09:42
@SpecLad SpecLad marked this pull request as draft May 11, 2023 10:22
@SpecLad
Contributor Author

SpecLad commented May 11, 2023

Converting to draft until I figure out the Helm pipeline failures.

@@ -4,19 +4,23 @@
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
FUNCTIONS_DIR=${1:-$SCRIPT_DIR}

nuctl create project cvat
docker build -t cvat.openvino.base "$SCRIPT_DIR/openvino/base"
Member

Our documentation includes some commands for building Nuclio functions without the deploy_cpu.sh script:

https://opencv.github.io/cvat/docs/contributing/setup-additional-components/

Don't they work anymore?

Contributor Author

I updated the docs. One of the nuctl deploy commands uses a TF function, which is unaffected by these changes, so I kept it as-is. The rest I replaced with calls to deploy_cpu.sh.

Comment on lines +15 to +20
func_root="$(dirname "$func_config")"
func_rel_path="$(realpath --relative-to="$SCRIPT_DIR" "$(dirname "$func_root")")"
Member

Shouldn't we update deploy_gpu.sh the same way?

Contributor Author

Right, thanks for the reminder. I ported relevant changes to deploy_gpu.sh.
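For readers following the review, the two quoted lines (`func_root` and `func_rel_path`) can be tried in isolation. The sketch below builds a throwaway directory tree in place of the real serverless directory; the file name `function.yaml` and the `openvino/dextr/nuclio` layout are assumptions based on the paths seen in this thread, and `realpath --relative-to` requires GNU coreutils:

```shell
# Throwaway directory tree standing in for the serverless directory.
SCRIPT_DIR="$(mktemp -d)"
mkdir -p "$SCRIPT_DIR/openvino/dextr/nuclio"
func_config="$SCRIPT_DIR/openvino/dextr/nuclio/function.yaml"
touch "$func_config"

# Directory that holds the function config.
func_root="$(dirname "$func_config")"
# Parent of that directory, expressed relative to SCRIPT_DIR.
func_rel_path="$(realpath --relative-to="$SCRIPT_DIR" "$(dirname "$func_root")")"

echo "$func_rel_path"   # prints: openvino/dextr
rm -rf "$SCRIPT_DIR"
```

The relative path has the same form as the name in the "Deploying openvino/dextr function..." log line further down.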

@bsekachev
Member

./deploy_cpu.sh openvino/omz/intel/face-detection-0205/nuclio/ causes:

23.05.12 04:13:31.372 cessor.healthcheck.server (I) Listening {"listenAddress": ":8082"}
23.05.12 04:13:31.372            processor.http (D) Creating worker pool {"num": 2}
23.05.12 04:13:31.372 sor.http.w0.python.logger (D) Creating listener socket {"path": "/tmp/nuclio-rpc-chevaaucno8m37gurf20.sock"}
23.05.12 04:13:31.372 sor.http.w1.python.logger (D) Creating listener socket {"path": "/tmp/nuclio-rpc-chevaaucno8m37gurf1g.sock"}
23.05.12 04:13:31.372 sor.http.w0.python.logger (W) Python 3.6 runtime is deprecated and will soon not be supported. Please migrate your code and use Python 3.7 runtime (`python:3.7`) or higher
23.05.12 04:13:31.372 sor.http.w0.python.logger (D) Using Python wrapper script path {"path": "/opt/nuclio/_nuclio_wrapper.py"}
23.05.12 04:13:31.372 sor.http.w0.python.logger (D) Using Python handler {"handler": "main:handler"}
23.05.12 04:13:31.372 sor.http.w1.python.logger (W) Python 3.6 runtime is deprecated and will soon not be supported. Please migrate your code and use Python 3.7 runtime (`python:3.7`) or higher
23.05.12 04:13:31.372 sor.http.w1.python.logger (D) Using Python wrapper script path {"path": "/opt/nuclio/_nuclio_wrapper.py"}
23.05.12 04:13:31.372 sor.http.w1.python.logger (D) Using Python handler {"handler": "main:handler"}
23.05.12 04:13:31.372 sor.http.w1.python.logger (E) Can't find Python exe {"error": "exec: \"/opt/nuclio/common/openvino/python3\": stat /opt/nuclio/common/openvino/python3: no such file or directory"}
23.05.12 04:13:31.372 sor.http.w0.python.logger (E) Can't find Python exe {"error": "exec: \"/opt/nuclio/common/openvino/python3\": stat /opt/nuclio/common/openvino/python3: no such file or directory"}

Error - exec: "/opt/nuclio/common/openvino/python3": stat /opt/nuclio/common/openvino/python3: no such file or directory
    ...//nuclio/pkg/processor/runtime/rpc/abstract.go:239

Call stack:
Can't run wrapper
    ...//nuclio/pkg/processor/runtime/rpc/abstract.go:239
Failed to run wrapper
    ...//nuclio/pkg/processor/runtime/rpc/abstract.go:106
Failed to start runtime
    /nuclio/pkg/processor/worker/factory.go:100
Failed to create worker
    /nuclio/pkg/processor/worker/factory.go:119
Failed to create workers
    /nuclio/pkg/processor/worker/factory.go:129

    /nuclio/pkg/platform/local/platform.go:1168
Failed to deploy function
    ...//nuclio/pkg/platform/abstract/platform.go:197
  NAMESPACE |               NAME               | PROJECT |  STATE   | REPLICAS | NODE PORT
  nuclio    | openvino-dextr                   | cvat    | ready    | 1/1      |     32770
  nuclio    | openvino-omz-face-detection-0205 | cvat    | building | 1/1      |
  nuclio    | pth-facebookresearch-sam-vit-h   | cvat    | ready    | 1/1      |     32769

@bsekachev
Member

./deploy_cpu.sh openvino/dextr/nuclio/ causes another error:

Step 10/15 : ADD export.py adaptive-pool.patch .
When using ADD with more than one source file, the destination must be a directory and end with a /
Deploying openvino/dextr function...
23.05.12 11:16:41.155                     nuctl (I) Deploying function {"name": ""}
23.05.12 11:16:41.155                     nuctl (I) Building {"builderKind": "docker", "versionInfo": "Label: 1.8.14, Git commit: cbb0774230996a3eb4621c1a2079e2317578005b, OS: linux, Arch: amd64, Go version: go1.17.8", "name": ""}
23.05.12 11:16:41.553                     nuctl (I) Staging files and preparing base images
23.05.12 11:16:41.554                     nuctl (W) Python 3.6 runtime is deprecated and will soon not be supported. Please migrate your code and use Python 3.7 runtime (`python:3.7`) or higher
23.05.12 11:16:41.555                     nuctl (I) Building processor image {"registryURL": "", "imageName": "cvat.openvino.dextr:latest"}
23.05.12 11:16:41.555     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/handler-builder-python-onbuild:1.8.14-amd64"}
23.05.12 11:16:44.556     nuctl.platform.docker (W) Docker command outputted to stderr - this may result in errors {"workingDir": "/tmp/nuclio-build-1967427100/staging", "cmd": "docker build --network host --force-rm -t nuclio-onbuild-chevbr12gbapuflbhibg -f /tmp/nuclio-build-1967427100/staging/Dockerfile.onbuild   --build-arg NUCLIO_LABEL=1.8.14 --build-arg NUCLIO_ARCH=amd64 --build-arg NUCLIO_BUILD_LOCAL_HANDLER_DIR=handler  .", "stderr": "DEPRECATED: The legacy builder is deprecated and will be removed in a future release.\n            Install the buildx component to build images with BuildKit:\n            https://docs.docker.com/go/buildx/\n\n"}
23.05.12 11:16:45.236     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/uhttpc:0.0.1-amd64"}
23.05.12 11:16:50.682     nuctl.platform.docker (W) Docker command outputted to stderr - this may result in errors {"workingDir": "/tmp/nuclio-build-1967427100/staging", "cmd": "docker build --network host --force-rm -t nuclio-onbuild-chevbsh2gbapuflbhic0 -f /tmp/nuclio-build-1967427100/staging/Dockerfile.onbuild   --build-arg NUCLIO_LABEL=1.8.14 --build-arg NUCLIO_ARCH=amd64 --build-arg NUCLIO_BUILD_LOCAL_HANDLER_DIR=handler  .", "stderr": "DEPRECATED: The legacy builder is deprecated and will be removed in a future release.\n            Install the buildx component to build images with BuildKit:\n            https://docs.docker.com/go/buildx/\n\n"}
23.05.12 11:16:50.992            nuctl.platform (I) Building docker image {"image": "cvat.openvino.dextr:latest"}
23.05.12 11:16:54.636     nuctl.platform.docker (W) Docker command outputted to stderr - this may result in errors {"workingDir": "/tmp/nuclio-build-1967427100/staging", "cmd": "docker build --network host --force-rm -t cvat.openvino.dextr:latest -f /tmp/nuclio-build-1967427100/staging/Dockerfile.processor   --build-arg NUCLIO_LABEL=1.8.14 --build-arg NUCLIO_ARCH=amd64 --build-arg NUCLIO_BUILD_LOCAL_HANDLER_DIR=handler  .", "stderr": "DEPRECATED: The legacy builder is deprecated and will be removed in a future release.\n            Install the buildx component to build images with BuildKit:\n            https://docs.docker.com/go/buildx/\n\npull access denied for cvat.openvino.dextr.base, repository does not exist or may require 'docker login': denied: requested access to the resource is denied\n"}
23.05.12 11:16:54.640                     nuctl (W) Failed to create a function; setting the function status {"err": "Failed to build processor image", "errVerbose": "\nError - exit status 1\n    /nuclio/pkg/cmdrunner/shellrunner.go:96\n\nCall stack:\nstdout:\nSending build context to Docker daemon  50.08MB\r\r\nStep 1/12 : FROM cvat.openvino.dextr.base\n\nstderr:\nDEPRECATED: The legacy builder is deprecated and will be removed in a future release.\n            Install the buildx component to build images with BuildKit:\n            https://docs.docker.com/go/buildx/\n\npull access denied for cvat.openvino.dextr.base, repository does not exist or may require 'docker login': denied: requested access to the resource is denied\n\n    /nuclio/pkg/cmdrunner/shellrunner.go:96\nFailed to build\n    /nuclio/pkg/dockerclient/shell.go:117\nFailed to build docker image\n    .../pkg/containerimagebuilderpusher/docker.go:54\nFailed to build processor image\n    /nuclio/pkg/processor/build/builder.go:263\nFailed to build processor image"}

Error - exit status 1
    /nuclio/pkg/cmdrunner/shellrunner.go:96

Call stack:
stdout:
Sending build context to Docker daemon  50.08MB
Step 1/12 : FROM cvat.openvino.dextr.base

stderr:
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            Install the buildx component to build images with BuildKit:
            https://docs.docker.com/go/buildx/

pull access denied for cvat.openvino.dextr.base, repository does not exist or may require 'docker login': denied: requested access to the resource is denied

    /nuclio/pkg/cmdrunner/shellrunner.go:96
Failed to build
    /nuclio/pkg/dockerclient/shell.go:117
Failed to build docker image
    .../pkg/containerimagebuilderpusher/docker.go:54
Failed to build processor image
    /nuclio/pkg/processor/build/builder.go:263
Failed to deploy function
    ...//nuclio/pkg/platform/abstract/platform.go:197
  NAMESPACE |               NAME               | PROJECT |  STATE   | REPLICAS | NODE PORT
  nuclio    | openvino-dextr                   | cvat    | error    | 1/1      |
  nuclio    | openvino-omz-face-detection-0205 | cvat    | building | 1/1      |
  nuclio    | pth-facebookresearch-sam-vit-h   | cvat    | ready    | 1/1      |     32769

@SpecLad
Contributor Author

SpecLad commented May 12, 2023

Error - exec: "/opt/nuclio/common/openvino/python3": stat /opt/nuclio/common/openvino/python3: no such file or directory ...//nuclio/pkg/processor/runtime/rpc/abstract.go:239

I don't understand how that could be possible. I'm pretty sure I removed all references to /opt/nuclio/common/openvino/python3 from the repo. Maybe an earlier step failed?

When using ADD with more than one source file, the destination must be a directory and end with a /

This must be a slight difference between BuildKit and the old Docker builder. I made a change to explicitly enable BuildKit - could you try again? I also made the script stop after the first error.
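The two script changes mentioned here (stopping at the first error and enabling BuildKit explicitly) are commonly done with a short preamble. This is a sketch of one way to do it, not necessarily what the PR's script contains; the `run_step` helper is hypothetical:

```shell
#!/usr/bin/env bash
# Abort on the first failing command and on use of unset variables.
set -eu

# Ask the Docker CLI to use BuildKit instead of the legacy builder.
export DOCKER_BUILDKIT=1

# Hypothetical helper: echo each step before running it, so the failing
# command is easy to spot in the output.
run_step() {
    echo "+ $*"
    "$@"
}

run_step echo "building base image (placeholder step)"
```

Independently of the builder, writing the `ADD` destination with a trailing slash (`./` instead of `.`) satisfies the rule quoted in the legacy builder's error message.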

@bsekachev
Member

I don't understand how that could be possible. I'm pretty sure I removed all references to /opt/nuclio/common/openvino/python3 from the repo. Maybe an earlier step failed?

Maybe it is related to #5603. I am trying to build on WSL.

@SpecLad
Contributor Author

SpecLad commented May 12, 2023

Maybe it is related to #5603.

I doubt it.

The only reason Nuclio would be trying to execute /opt/nuclio/common/openvino/python3 is the NUCLIO_PYTHON_EXE_PATH variable, and... I removed it from everywhere. So there should really be no way for this string to even occur.

I am trying to build on WSL.

I've been developing on WSL this entire time. 🤷‍♂️
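One way to double-check that no reference to `NUCLIO_PYTHON_EXE_PATH` survives in a checkout is a recursive search. A minimal sketch; the `find_stale_refs` helper name is hypothetical, and the directory to search is left to the caller:

```shell
# Hypothetical helper: list every occurrence of a string under a directory.
# grep exits with status 1 when nothing matches; "|| true" turns that into
# an empty result instead of an error.
find_stale_refs() {
    pattern="$1"
    dir="$2"
    grep -rn -- "$pattern" "$dir" || true
}

# Example call (commented out; point it at the directory you want to check):
# find_stale_refs NUCLIO_PYTHON_EXE_PATH .
```

An empty result means the checked-out files no longer contain the string. It says nothing about images or functions that were built or deployed earlier.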

@SpecLad SpecLad force-pushed the modernize-openvino-functions branch from e658952 to 7177f13 Compare May 12, 2023 12:56
@SpecLad SpecLad marked this pull request as ready for review May 12, 2023 12:57
@bsekachev
Member

  • We probably need to add a step for installing docker-buildx-plugin to the user guide.

@bsekachev
Member

Something is wrong with the DEXTR masks. It should not work this way:
[screenshot]

@bsekachev
Member

Now all the changed functions (except the Deep extreme cut issue above) work for me.

@SpecLad
Contributor Author

SpecLad commented May 12, 2023

We probably need to add a step for installing docker-buildx-plugin to the user guide.

Why?

Something is wrong with the DEXTR masks. It should not work this way:

Could you upload the original image, so I could test it myself?

SpecLad added 8 commits May 14, 2023 20:15
…bernetes

Currently, OpenVINO-based functions assume that a local directory will be
mounted into the container. In Kubernetes, that isn't possible, so implement
an alternate approach: create a separate base image and inherit the function
image from it.

In addition, implement some modernizations:

* Upgrade the version of OpenVINO to the latest (2022.3). Make the necessary
  updates to the code. Note that 2022.1 introduced an entirely new inference
  API, but I haven't switched to it yet to minimize changes.

* Use the runtime version of the Docker image as the base instead of the dev
  version. This significantly reduces the size of the final image (by ~3GB).

* Replace the `faster_rcnn_inception_v2_coco` model with
  `faster_rcnn_inception_resnet_v2_atrous_coco`, as the former has been
  removed from OMZ.

* Ditto with `person-reidentification-retail-0300` -> `0277`.

* The IRs used in the DEXTR function are not supported by OpenVINO anymore
  (format too old), so rewrite the build process to create them from the
  original code/weights instead.
* fail early
* use Docker Buildkit
…_cpu.sh

I didn't migrate the changes related to `docker build` commands, since none
of the GPU-based functions have separate Dockerfiles yet.
SpecLad added 3 commits May 14, 2023 21:32
Remove the sample output for the nuclio deployment step(s), because a) it's
a long wall of text that doesn't really demonstrate anything, and b) it gets
even longer after recent changes, because the `docker build` commands log
all build steps.
It's no longer needed, since after recent changes, the common files are baked
into each image (that requires them).

This also decouples our Helm chart from the rest of the repository, which I
think is nice.
By default, Model Downloader downloads all available precisions.
@SpecLad SpecLad force-pushed the modernize-openvino-functions branch from 7177f13 to 244e997 Compare May 14, 2023 17:34
@SpecLad
Contributor Author

SpecLad commented May 14, 2023

Something is wrong with the DEXTR masks.

I think I figured it out: there were some postprocessing layers baked into the old IRs. I've now added those to export.py. Could you try again?

@bsekachev
Member

I think I figured it out: there were some postprocessing layers baked into the old IRs. I've now added those to export.py. Could you try again?

Now it works.

@bsekachev
Member

Why?

Because it didn't work for me without installing it explicitly.

@SpecLad SpecLad merged commit 98616c7 into cvat-ai:develop May 15, 2023
@SpecLad SpecLad deleted the modernize-openvino-functions branch May 15, 2023 10:17
@azhavoro azhavoro mentioned this pull request May 18, 2023
nmanovic added a commit that referenced this pull request May 18, 2023
### Added
- Introduced a new configuration option for controlling the invocation of Nuclio functions.
  (<#6146>)

### Changed
- Relocated SAM masks decoder to frontend operation.
  (<#6019>)
- Switched `person-reidentification-retail-0300` and `faster_rcnn_inception_v2_coco` Nuclio functions with `person-reidentification-retail-0277` and `faster_rcnn_inception_resnet_v2_atrous_coco` respectively.
  (<#6129>)
- Upgraded OpenVINO-based Nuclio functions to utilize the OpenVINO 2022.3 runtime.
  (<#6129>)

### Fixed
- Resolved issues with tracking multiple objects (30 and more) using the TransT tracker.
  (<#6073>)
- Addressed azure.core.exceptions.ResourceExistsError: The specified blob already exists.
  (<#6082>)
- Corrected image scaling issues when transitioning between images of different resolutions.
  (<#6081>)
- Fixed inaccurate reporting of completed job counts.
  (<#6098>)
- Allowed OpenVINO-based Nuclio functions to be deployed to Kubernetes.
  (<#6129>)
- Improved skeleton size checks after drawing.
  (<#6156>)
- Fixed HRNet CPU serverless function.
  (<#6150>)
- Prevented sending of empty list of events.
  (<#6154>)
mikhail-treskin pushed a commit to retailnext/cvat that referenced this pull request Jul 1, 2023
…bernetes (cvat-ai#6129)

@Keramblock
Contributor

Hi @SpecLad, does this PR mean that I can run SAM via the CVAT Helm chart, or is that not possible right now? I checked the docs but could not find any information on how to do that.

@SpecLad
Contributor Author

SpecLad commented Jul 25, 2023

@Keramblock You cannot deploy serverless functions themselves using the Helm chart, but you can deploy Nuclio with the chart (set the nuclio.enabled value to true), and then deploy SAM to that Nuclio instance using nuctl.

EDIT: Also, I should note that this PR doesn't affect SAM, as SAM is not OpenVINO-based.
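The two-step flow described above (enable Nuclio through the chart, then deploy the function with `nuctl`) might look roughly like the following. This is a dry-run sketch that only prints the commands by default. The release name, namespace, chart path `./helm-chart`, function directory, and the `run` and `deploy_function_to_cluster` helpers are all placeholders or assumptions, and a real cluster will likely also need a container registry passed to `nuctl`:

```shell
# Print commands instead of executing them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper wrapping the two steps from the comment above.
deploy_function_to_cluster() {
    release="$1"
    namespace="$2"
    func_dir="$3"
    # Step 1: install or upgrade the chart with Nuclio enabled.
    run helm upgrade --install "$release" ./helm-chart \
        --namespace "$namespace" --set nuclio.enabled=true
    # Step 2: create the project and deploy the function with nuctl.
    run nuctl create project cvat --platform kube --namespace "$namespace"
    run nuctl deploy --project-name cvat --path "$func_dir" \
        --platform kube --namespace "$namespace"
}

deploy_function_to_cluster cvat cvat path/to/function/nuclio
```

Running it as-is prints the three commands for review; setting `DRY_RUN=0` executes them.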

@Keramblock
Contributor

Oh, ok, thanks a lot for the clarification!
