StaticLLMPipeline: Introduced NPUW_UNFOLD_IREQ for hint FAST_COMPILE #1275
OPENVINO_THROW("Unsupported \"GENERATE_HINT\" provided: " +
str + ". Please select either \"FAST_COMPILE\" or \"BEST_PERF\".");
str + ". Please select either \"" + to_string(GenerateHint::BEST_PERF) + "\" or \"" + to_string(GenerateHint::FAST_COMPILE) +"\".");
Just in case: if operator<<(ostream&) is provided for the hint type, OPENVINO_THROW concatenates its arguments into a string itself, so it could be just
OPENVINO_THROW("Unsupported \"GENERATE_HINT\" provided: ", str, ". Please select either \"", GenerateHint::BEST_PERF, "\" or \"", GenerateHint::FAST_COMPILE, "\".");
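The suggestion above can be illustrated with a minimal, self-contained sketch. It does not use OpenVINO: the local GenerateHint enum only mirrors the names seen in the diff, and concat / throw_concat are hypothetical stand-ins for a macro that streams each of its arguments into one message. The point is only that once operator<< exists for the enum, no explicit to_string() calls or + concatenation are needed.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Assumption: a local mirror of the hint enum named in the diff.
enum class GenerateHint { FAST_COMPILE, BEST_PERF };

// Stream operator so the enum can be written directly into any ostream.
std::ostream& operator<<(std::ostream& os, GenerateHint hint) {
    switch (hint) {
        case GenerateHint::FAST_COMPILE: return os << "FAST_COMPILE";
        case GenerateHint::BEST_PERF:    return os << "BEST_PERF";
    }
    return os << "UNKNOWN";
}

// Hypothetical helper: streams every argument into one string
// (C++17 fold expression), the way a variadic throw macro might.
template <typename... Args>
std::string concat(const Args&... args) {
    std::ostringstream oss;
    (oss << ... << args);
    return oss.str();
}

// Hypothetical stand-in for a variadic throw macro.
template <typename... Args>
[[noreturn]] void throw_concat(const Args&... args) {
    throw std::runtime_error(concat(args...));
}

// Builds the same message as in the review suggestion, without to_string().
std::string unsupported_hint_message(const std::string& str) {
    return concat("Unsupported \"GENERATE_HINT\" provided: ", str,
                  ". Please select either \"", GenerateHint::BEST_PERF,
                  "\" or \"", GenerateHint::FAST_COMPILE, "\".");
}
```

With this in place, throw_concat("Unsupported hint: ", str, ", expected ", GenerateHint::BEST_PERF) would build the message from mixed argument types in one call, which is the simplification the comment proposes.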
@@ -523,6 +534,9 @@ ov::AnyMap get_default_generate_config(const std::shared_ptr<ov::Model>& model,
if (npudesc.has_value() && npudesc->arch == "4000") {
config.emplace("NPU_DPU_GROUPS", 4);
}
if (hint == GenerateHint::FAST_COMPILE) {
Let's check that it doesn't break anything on arch < 4000.
It looks like there are no regressions on the MTL system, while the performance (token rate) of Llama 2 increased by at least 10%.
build_jenkins
copy of PR : #1275 to release/2024/5
E-149055