Preetha/weight sharing fix #545

Merged: 3 commits into openvino/ep-weight-sharing from preetha/weight-sharing-fix, Jan 27, 2025

preetha-intel
Description

* Move model-specific variables from the subgraph context to the session context.
* Remove a check in the backend manager that is redundant with OVEP GetCapability.
* Fix duplicate outputs with epctx models.
* Fall back to the default compile-time device option when no runtime device option is provided.
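
The last item in the description is a fallback rule: use the device chosen at runtime if there is one, otherwise use the device fixed at build time. Below is a minimal sketch of that rule in C++17. It is not taken from the OVEP source; `ResolveDeviceType` and `kBuildTimeDefaultDevice` are hypothetical names, and the `"CPU"` default is an assumption made only for illustration.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for a device type fixed when the project is built.
// The real default and where it is defined are not shown in this PR page.
constexpr const char* kBuildTimeDefaultDevice = "CPU";

// Hypothetical helper: return the device requested at runtime, or the
// build-time default when the runtime option is absent or empty.
std::string ResolveDeviceType(const std::optional<std::string>& runtime_device) {
  if (runtime_device.has_value() && !runtime_device->empty()) {
    return *runtime_device;
  }
  return kBuildTimeDefaultDevice;
}
```

Treating an empty string the same as a missing option is a design choice of this sketch; the actual provider may validate or reject empty values instead.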

@ankitm3k ankitm3k force-pushed the openvino/ep-weight-sharing branch 2 times, most recently from a7fdeca to c9fb7e0 Compare January 24, 2025 05:29
@preetha-intel preetha-intel force-pushed the preetha/weight-sharing-fix branch from dbe14b5 to 4fc48eb Compare January 24, 2025 08:36
@sfatimar
LGTM

@ankitm3k ankitm3k force-pushed the openvino/ep-weight-sharing branch from 48ee137 to 6d1f1cf Compare January 27, 2025 08:46
@preetha-intel preetha-intel force-pushed the preetha/weight-sharing-fix branch from dc68414 to 42812a3 Compare January 27, 2025 13:30
@preetha-intel preetha-intel force-pushed the preetha/weight-sharing-fix branch from 42812a3 to df46642 Compare January 27, 2025 14:08
@preetha-intel preetha-intel merged commit 7179a0b into openvino/ep-weight-sharing Jan 27, 2025
6 of 16 checks passed
sfatimar added a commit that referenced this pull request Jan 27, 2025
* Rename EP instance context as session_context

* Add support for GetEpContextNodes

* enable config option for ovep weight sharing

* add config option for ovep weight sharing

* Refactor the conditional blocks in OVEP for compilation

* Convert initializers with external data to graph inputs

* create, store and export metadata for ovep weight sharing

* fix error handling in weight sharing

* fix crash issue while setting up inputs for wai model

* pass weight sharing option to OVEP qdq stripping pass

* Aligning OVEP variable names to match the session option value they hold

* Add plumbing for context sharing plus refactoring around option handling

* Store metadata in shared context

* fix: fix provider options

* create ov tensor from meta data and external data

* create ov tensor

* Add support for binding weight as input tensors

* Fix for mapping subgraph to ov compiled network arguments

* Fix for using so_share_ep_contexts without ep.context* flags

* Add remote tensor support for NPU weight sharing

* Use a single ov::Core copy across OVEP

* Decouple provider option cache_dir from session option ep.context_file_path

* Add support for serialization and deserialization of metadata to disk

* Load blobs from relative path stored in ep_cache_context

* Use remote L0 tensors for shared weights

* fix linux ci issues

* fix ci issues

* Fix Windows build failure

* Use ifstream to load weights instead of mmaped file

* Fix for epctx models made up entirely of OVEP epctx nodes

* Limit ov::Core lifetime to that of provider object

* Enforce shared tensors cleanup on shutdown

* Add support for default device type based on project configuration

* fix: Fixed concrete_backend_ pointer double free issue on Linux

* Preetha/weight sharing fix (#545)

* Move variables from subgraph to session context for model specific properties

* Fix for redundant subgraph creation

* Remove unused variable

---------

Co-authored-by: Javier E. Martinez <javier.e.martinez@intel.com>
Co-authored-by: saurabhkale117 <saurabh1.kale@intel.com>
Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: ankitm3k <ankit.maheshkar@intel.com>
Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>
sfatimar added a commit that referenced this pull request Jan 31, 2025
sfatimar added a commit that referenced this pull request Jan 31, 2025
ankitm3k added a commit that referenced this pull request Jan 31, 2025
ankitm3k added a commit that referenced this pull request Jan 31, 2025
3 participants