StaticLLMPipeline: Introduced NPUW_UNFOLD_IREQ for hint FAST_COMPILE #1275

esmirno · 2024-11-28T20:54:21Z

E-149055

dmatveev · 2024-11-29T14:48:32Z

src/cpp/src/llm_pipeline_static.cpp

    OPENVINO_THROW("Unsupported \"GENERATE_HINT\" provided: " +
-                   str + ". Please select either \"FAST_COMPILE\" or \"BEST_PERF\".");
+                   str + ". Please select either \"" + to_string(GenerateHint::BEST_PERF) + "\" or \"" + to_string(GenerateHint::FAST_COMPILE) +"\".");


Just in case: if operator<<(ostream&) is provided for the hint type, OPENVINO_THROW concatenates its arguments into a string itself, so this could be just

OPENVINO_THROW("Unsupported \"GENERATE_HINT\" provided: ", str, ". Please select either \"", GenerateHint::BEST_PERF, "\" or \"", GenerateHint::FAST_COMPILE, "\".");
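
For illustration, the suggestion above can be sketched without any OpenVINO dependency. This is a minimal sketch, not the actual implementation: the `GenerateHint` enum here is a stand-in for the type in llm_pipeline_static.cpp, and `concat` and `unsupported_hint_message` are hypothetical helpers that only mimic the variadic, stream-based concatenation the comment attributes to OPENVINO_THROW.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Stand-in for the pipeline's hint type (assumption: the real enum lives in
// llm_pipeline_static.cpp and may differ in detail).
enum class GenerateHint { FAST_COMPILE, BEST_PERF };

// Streaming operator for the hint type: with this in place the enum can be
// passed directly to anything that writes its arguments to a std::ostream.
std::ostream& operator<<(std::ostream& os, GenerateHint hint) {
    switch (hint) {
    case GenerateHint::FAST_COMPILE: return os << "FAST_COMPILE";
    case GenerateHint::BEST_PERF:    return os << "BEST_PERF";
    }
    return os << "UNKNOWN";
}

// Hypothetical helper: writes every argument to one string stream, in order.
// It only imitates the behavior described in the review comment.
template <typename... Args>
std::string concat(Args&&... args) {
    std::ostringstream oss;
    (oss << ... << std::forward<Args>(args));  // C++17 fold expression
    return oss.str();
}

// Hypothetical helper: builds the error text from the review suggestion
// without any manual to_string() calls or operator+ chains.
std::string unsupported_hint_message(const std::string& str) {
    return concat("Unsupported \"GENERATE_HINT\" provided: ", str,
                  ". Please select either \"", GenerateHint::BEST_PERF,
                  "\" or \"", GenerateHint::FAST_COMPILE, "\".");
}
```

The design point is that the enum-to-text mapping lives in one place (the streaming operator), so call sites list their arguments and never build the string by hand.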

dmatveev · 2024-11-29T14:49:38Z

src/cpp/src/llm_pipeline_static.cpp

@@ -523,6 +534,9 @@ ov::AnyMap get_default_generate_config(const std::shared_ptr<ov::Model>& model,
    if (npudesc.has_value() && npudesc->arch == "4000") {
        config.emplace("NPU_DPU_GROUPS", 4);
    }
+    if (hint == GenerateHint::FAST_COMPILE) {


Let's check that this doesn't break anything on arch < 4000.

It looks like there are no regressions on the MTL system, while performance (token rate) of Llama 2 increased by at least 10%.

ilya-lavrenov · 2024-12-03T19:24:16Z

build_jenkins

copy of PR : #1275 to release/2024/5

…penvinotoolkit#1275) E-149055

Introduced NPUW_UNFOLD_IREQ for fast_compile hint

0bd72fc

github-actions bot added the category: LLM LLM pipeline (stateful, static) label Nov 28, 2024

dmatveev self-assigned this Nov 29, 2024

dmatveev added this to the 2024.6 milestone Nov 29, 2024

dmatveev added the port to LTS PR needs to be ported to LTS label Nov 29, 2024

dmatveev reviewed Nov 29, 2024

Merge branch 'master' into master

c2fa294

dmatveev approved these changes Dec 3, 2024

dmatveev enabled auto-merge December 3, 2024 11:48

ilya-lavrenov mentioned this pull request Dec 3, 2024

Introduced NPUW_UNFOLD_IREQ for FAST_COMPILE hint #1288

Merged

dmatveev added this pull request to the merge queue Dec 3, 2024

ilya-lavrenov removed the port to LTS PR needs to be ported to LTS label Dec 3, 2024

Merged via the queue into openvinotoolkit:master with commit e2fa0d0 Dec 3, 2024
53 checks passed

github-merge-queue bot pushed a commit that referenced this pull request Dec 4, 2024

Introduced NPUW_UNFOLD_IREQ for FAST_COMPILE hint (#1288)

e42723a

copy of PR : #1275 to release/2024/5

sungeunk pushed a commit to sungeunk/openvino.genai that referenced this pull request Dec 16, 2024

StaticLLMPipeline: Introduced NPUW_UNFOLD_IREQ for hint FAST_COMPILE (o…

3db2d18

…penvinotoolkit#1275) E-149055
