StaticLLMPipeline: Optimize V-tensors layout by Anatolii Talamanov #1232
Conversation
Thanks @AsyaPronina for managing this!
// (8) Compile both model
// (6) Apply opt layout if applicable
// NB: Try to apply opt transpose only for Llama-2-7b-chat-hf model
if ( model_desc.name_or_path == "meta-llama/Llama-2-7b-chat-hf" ||
I think both the lhs and rhs of the == comparison should be lowercased first to make the comparison more robust.
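For illustration, a minimal sketch of the suggested case-insensitive comparison, assuming an ASCII-only lowercasing helper (the `to_lower` helper is hypothetical, not part of the PR):

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not in the PR): return a lowercased copy of an
// ASCII string; unsigned char avoids UB in std::tolower for negative chars.
static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return std::tolower(c); });
    return s;
}

// The check from the diff, lowercasing both sides:
// if (to_lower(model_desc.name_or_path) == to_lower("meta-llama/Llama-2-7b-chat-hf") || ...)
```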
Will create a ticket for this!
Do we need a hardcode for a specific model? Isn't model_desc.type == "llama" enough?
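A sketch of the suggested generalization, assuming `model_desc` carries a `type` field as the comment implies (only `name_or_path` appears in the quoted diff; the rest is hypothetical):

```cpp
// Hypothetical generalization of the quoted check: gate the optimized
// V-tensor transpose on the model family rather than one exact HF model id.
if (model_desc.type == "llama") {
    // (6) Apply opt layout if applicable
}
```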
@ilya-lavrenov we're not confident (yet) whether it will work for other Llama versions. @TolyaTalamanov will generalize this change when he gets back from vacation.
Retargeted #1177 to master, added the "port to LTS" label. Ticket: