StaticLLMPipeline: Introduced NPUW_UNFOLD_IREQ for hint FAST_COMPILE #1275

Merged: 2 commits, Dec 3, 2024

@esmirno esmirno commented Nov 28, 2024

E-149055

@github-actions github-actions bot added the category: LLM LLM pipeline (stateful, static) label Nov 28, 2024
@dmatveev dmatveev self-assigned this Nov 29, 2024
@dmatveev dmatveev added this to the 2024.6 milestone Nov 29, 2024
@dmatveev dmatveev added the port to LTS PR needs to be ported to LTS label Nov 29, 2024
Comment on lines 254 to +255
  OPENVINO_THROW("Unsupported \"GENERATE_HINT\" provided: " +
- str + ". Please select either \"FAST_COMPILE\" or \"BEST_PERF\".");
+ str + ". Please select either \"" + to_string(GenerateHint::BEST_PERF) + "\" or \"" + to_string(GenerateHint::FAST_COMPILE) +"\".");
Just in case: if operator<<(ostream&) is provided for the hint type, OPENVINO_THROW concatenates its arguments into a string itself, so the to_string calls and the + concatenation are not needed and it could be just

    OPENVINO_THROW("Unsupported \"GENERATE_HINT\" provided: ", str, ". Please select either \"", GenerateHint::BEST_PERF, "\" or \"", GenerateHint::FAST_COMPILE, "\".");
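To illustrate the suggestion without depending on OpenVINO headers, here is a minimal sketch of the same idea. `concat_message` and `throw_concat` are hypothetical helpers written for this example (they are not `OPENVINO_THROW` and do not claim to match its implementation), and the `GenerateHint` enum is a stand-in with the two values named in this PR. Once the enum has a streaming operator, a variadic formatter can accept it directly:

```cpp
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Stand-in for the hint enum discussed in this PR (assumed to have these two values).
enum class GenerateHint { FAST_COMPILE, BEST_PERF };

// Streaming operator: lets the enum be passed straight to a variadic formatter.
std::ostream& operator<<(std::ostream& os, GenerateHint hint) {
    switch (hint) {
    case GenerateHint::FAST_COMPILE: return os << "FAST_COMPILE";
    case GenerateHint::BEST_PERF:    return os << "BEST_PERF";
    }
    return os << "UNKNOWN";
}

// Hypothetical helper: streams every argument into a single message string.
template <typename... Args>
std::string concat_message(const Args&... args) {
    std::ostringstream ss;
    (ss << ... << args);  // C++17 fold expression over operator<<
    return ss.str();
}

// Hypothetical helper: throws with the concatenated message.
template <typename... Args>
[[noreturn]] void throw_concat(const Args&... args) {
    throw std::runtime_error(concat_message(args...));
}
```

With this in place, a call such as `throw_concat("Unsupported \"GENERATE_HINT\" provided: ", str, ". Please select either \"", GenerateHint::BEST_PERF, "\" or \"", GenerateHint::FAST_COMPILE, "\".")` builds the message without any explicit `to_string` or `+`.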

@@ -523,6 +534,9 @@ ov::AnyMap get_default_generate_config(const std::shared_ptr<ov::Model>& model,
if (npudesc.has_value() && npudesc->arch == "4000") {
config.emplace("NPU_DPU_GROUPS", 4);
}
if (hint == GenerateHint::FAST_COMPILE) {
Let's check if it doesn't break anything on arch<4000

Contributor Author

Looks like there are no regressions on the MTL system, while performance (token rate) of Llama 2 increased by at least 10%.
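For readers skimming the thread, the hint-dependent defaults in the `get_default_generate_config` hunk can be sketched roughly as below. This is an illustration under stated assumptions, not the merged code: `std::map<std::string, std::string>` stands in for `ov::AnyMap`, `NPUDesc` and `make_generate_config` are hypothetical names, and the value `"YES"` for `NPUW_UNFOLD_IREQ` is assumed because the hunk is cut off before the body of the new `if`.

```cpp
#include <map>
#include <optional>
#include <string>

// Stand-in for the hint enum discussed in this PR.
enum class GenerateHint { FAST_COMPILE, BEST_PERF };

// Hypothetical stand-in for the NPU descriptor referenced in the hunk.
struct NPUDesc {
    std::string arch;
};

// A plain string map stands in for ov::AnyMap in this sketch.
using Config = std::map<std::string, std::string>;

Config make_generate_config(const std::optional<NPUDesc>& npudesc, GenerateHint hint) {
    Config config;
    // Arch-specific default, as in the context lines of the hunk.
    if (npudesc.has_value() && npudesc->arch == "4000") {
        config.emplace("NPU_DPU_GROUPS", "4");
    }
    // The new branch applies to every arch, which is why the reviewer
    // asked for a check on arch < 4000.
    if (hint == GenerateHint::FAST_COMPILE) {
        // Assumption: "YES" is an illustrative value; the real body is not shown above.
        config.emplace("NPUW_UNFOLD_IREQ", "YES");
    }
    return config;
}
```

Because the `FAST_COMPILE` branch sits outside the arch check, the option is set regardless of the NPU generation, which matches the concern raised and the MTL result reported in this thread.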

@dmatveev dmatveev enabled auto-merge December 3, 2024 11:48
@ilya-lavrenov
build_jenkins

@dmatveev dmatveev added this pull request to the merge queue Dec 3, 2024
@ilya-lavrenov ilya-lavrenov removed the port to LTS PR needs to be ported to LTS label Dec 3, 2024
Merged via the queue into openvinotoolkit:master with commit e2fa0d0 Dec 3, 2024
53 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Dec 4, 2024
sungeunk pushed a commit to sungeunk/openvino.genai that referenced this pull request Dec 16, 2024