[CoreML] support coreml model cache #23065

Merged
wejoncy merged 36 commits into main from jicwen/coreml_cache
Dec 31, 2024
@wejoncy wejoncy commented Dec 10, 2024

Description

Refactor compute plan profiling

Support caching the CoreML model to speed up session initialization. Caching is only enabled through a user-provided entry, and the user is responsible for managing the cache.
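The idea of a user-managed cache can be illustrated with a small, library-independent sketch. This is not the ONNX Runtime or CoreML API: `load_or_compile`, `compile_fn`, `fake_compile`, the directory layout, and the `.compiled` suffix are hypothetical names chosen for illustration. It only shows the general pattern: the caller supplies the cache location and key, the expensive conversion runs only on a miss, and nothing is evicted automatically.

```python
import tempfile
from pathlib import Path
from typing import Callable


def load_or_compile(
    model_bytes: bytes,
    cache_dir: Path,
    cache_key: str,
    compile_fn: Callable[[bytes], bytes],
) -> tuple[bytes, bool]:
    """Return (compiled_model, cache_hit).

    Hypothetical helper: the caller chooses cache_dir and cache_key and is
    responsible for deleting stale entries; nothing here expires the cache.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{cache_key}.compiled"
    if entry.exists():
        # Cache hit: skip the expensive conversion entirely.
        return entry.read_bytes(), True
    # Cache miss: run the conversion once and store the result for next time.
    compiled = compile_fn(model_bytes)
    entry.write_bytes(compiled)
    return compiled, False


def fake_compile(model_bytes: bytes) -> bytes:
    # Stand-in for the expensive model conversion step (assumption).
    return model_bytes[::-1]


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "coreml_cache"
    first, hit_first = load_or_compile(b"model", cache, "yolo11", fake_compile)
    second, hit_second = load_or_compile(b"model", cache, "yolo11", fake_compile)
    print(hit_first, hit_second)
```

Because the key is supplied by the caller, changing the model without changing the key would return a stale entry in this sketch, which is why cache management is left to the user.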

With the cache, session initialization time can be reduced by 50% or more:

| model | before | after |
|--|--|--|
| yolo11.onnx | 0.6s | 0.1s |
| yolo11-fp16.onnx | 1.8s | 0.1s |
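
As a quick check of the "50% or more" figure, the reduction implied by the two rows above can be worked out directly. The snippet below is only arithmetic on the reported timings; `reduction_percent` is a helper name introduced here for illustration.

```python
# Timings in seconds, copied from the table above: (before, after).
timings = {
    "yolo11.onnx": (0.6, 0.1),
    "yolo11-fp16.onnx": (1.8, 0.1),
}


def reduction_percent(before: float, after: float) -> float:
    """Percentage by which session initialization time dropped."""
    return (before - after) / before * 100.0


reductions = {name: reduction_percent(b, a) for name, (b, a) in timings.items()}
for name, pct in reductions.items():
    print(f"{name}: {pct:.0f}% reduction in session initialization time")
```

Both reported models come out well above the 50% mark, at roughly 83% and 94% respectively.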

Motivation and Context

@wejoncy wejoncy requested a review from skottmckay December 10, 2024 10:26
@wejoncy wejoncy marked this pull request as ready for review December 10, 2024 10:26
@wejoncy wejoncy linked an issue Dec 10, 2024 that may be closed by this pull request
@wejoncy wejoncy force-pushed the jicwen/coreml_cache branch from 5bfc8eb to fc9db07 Compare December 10, 2024 11:10
@wejoncy wejoncy force-pushed the jicwen/coreml_cache branch from d539da2 to 1d1c874 Compare December 10, 2024 12:42
@wejoncy wejoncy force-pushed the jicwen/coreml_cache branch from 81c2b9e to 7b11848 Compare December 16, 2024 06:16
…ory.h

Co-authored-by: Scott McKay <skottmckay@gmail.com>
…ory.h

Co-authored-by: Scott McKay <skottmckay@gmail.com>
wejoncy and others added 5 commits December 20, 2024 16:50
@wejoncy wejoncy requested a review from skottmckay December 23, 2024 07:57
@skottmckay

Are there any unit tests for the new code? We should be able to test that the expected cache files are created in the right places and that things like invalid cache key values are rejected.

wejoncy commented Dec 24, 2024

> Are there any unit tests for the new code? We should be able to test that the expected cache files are created in the right places and that things like invalid cache key values are rejected.

Added unit tests for three cases covering valid and invalid hash values.
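
The kind of check being discussed might look like the sketch below. The conversation does not state what makes a cache key valid, so the rule used here (ASCII letters and digits only, 1 to 63 characters) is an assumption made for illustration, and `is_valid_cache_key` is a hypothetical helper rather than a function from the PR.

```python
import re

# Assumed rule, for illustration only: ASCII letters and digits, 1-63 chars.
_KEY_PATTERN = re.compile(r"[A-Za-z0-9]{1,63}")


def is_valid_cache_key(key: str) -> bool:
    """Return True if the key satisfies the assumed validity rule."""
    return _KEY_PATTERN.fullmatch(key) is not None


def test_cache_key_validation() -> None:
    # A plain alphanumeric key is accepted.
    assert is_valid_cache_key("a1b2c3d4")
    # Empty keys, path-like keys, and overlong keys are rejected.
    assert not is_valid_cache_key("")
    assert not is_valid_cache_key("../escape")
    assert not is_valid_cache_key("x" * 64)


test_cache_key_validation()
```

Rejecting path separators matters in a scheme like this because the key typically ends up as part of a directory or file name under the cache location.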

wejoncy added 2 commits December 24, 2024 15:18
@github-actions github-actions bot left a comment

@microsoft microsoft deleted a comment from github-actions bot Dec 24, 2024
skottmckay previously approved these changes Dec 29, 2024
@skottmckay skottmckay left a comment

:shipit:

@wejoncy wejoncy merged commit 8687011 into main Dec 31, 2024
96 checks passed
@wejoncy wejoncy deleted the jicwen/coreml_cache branch December 31, 2024 01:29
tarekziade pushed a commit to tarekziade/onnxruntime that referenced this pull request Jan 10, 2025
Co-authored-by: wejoncy <wejoncy@.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
guschmue pushed a commit that referenced this pull request Jan 12, 2025
Successfully merging this pull request may close these issues.

CoreML - Writing CoreML Model on every inference session creation
2 participants