
StaticLLMPipeline: Optimize V-tensors layout by Anatolii Talamanov #1232

Conversation

@AsyaPronina (Contributor) commented Nov 19, 2024

Retargeted #1177 to master and added the port to LTS label.

Ticket:

  • EISW-146537

@AsyaPronina added the port to LTS (PR needs to be ported to LTS) label Nov 19, 2024
@github-actions bot added the category: LLM (LLM pipeline: stateful, static), category: sampling (Sampling / Decoding algorithms), and category: cmake / build (CMake scripts) labels Nov 19, 2024
@AsyaPronina requested a review from dmatveev on November 19, 2024 at 15:13
@ilya-lavrenov added this to the 2025.0 milestone Nov 19, 2024
@ilya-lavrenov added this pull request to the merge queue Nov 19, 2024
Merged via the queue into openvinotoolkit:master with commit 9d6bca9 on Nov 19, 2024
52 checks passed
@dmatveev (Contributor) left a comment:

Thanks @AsyaPronina for managing this!

// (8) Compile both models
// (6) Apply opt layout if applicable
// NB: Try to apply opt transpose only for Llama-2-7b-chat-hf model
if (model_desc.name_or_path == "meta-llama/Llama-2-7b-chat-hf" ||
A contributor commented on this hunk:

I think both the lhs and rhs of the == should be lowercased first to make the comparison more robust.
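
For illustration, a minimal sketch of what such a case-insensitive comparison might look like (the to_lower helper below is hypothetical and not part of this PR):

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: return an ASCII-lowercased copy of the input string.
static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return std::tolower(c); });
    return s;
}

// The model-name check would then tolerate casing differences, e.g.:
// if (to_lower(model_desc.name_or_path) == to_lower("meta-llama/Llama-2-7b-chat-hf") || ...
```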

The PR author replied:

Will create a ticket for this!

@ilya-lavrenov (Contributor) asked:

Do we need a hardcode for a specific model? Isn't model_desc.type == "llama" enough?

Another contributor replied:

@ilya-lavrenov we're not confident (yet) that it will work for other Llama versions. @TolyaTalamanov will generalize this change when he gets back from vacation.
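
As a rough sketch of the generalization being discussed (assuming model_desc exposes a type field as suggested above; this is not the merged code):

```cpp
// Sketch: gate the optimized V-tensor layout on the model family rather than
// on one exact HuggingFace path. Neither branch below is the final implementation.
const bool apply_opt_layout =
    model_desc.type == "llama" ||                                 // family-based check (proposed)
    model_desc.name_or_path == "meta-llama/Llama-2-7b-chat-hf";   // exact match (current PR)
if (apply_opt_layout) {
    // ... apply the optimized (transposed) V-tensor layout here
}
```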

@ilya-lavrenov removed the port to LTS (PR needs to be ported to LTS) label Nov 20, 2024
4 participants