
[GPU] Relax SDPA head size limitations for LLMs #24930

Merged
merged 1 commit on Jun 11, 2024

Conversation

@sshlyapn (Contributor) commented on Jun 10, 2024

Details:

  • Relax the SDPA head size limitation for LLMs from the single supported value of 128 to a range of 64 to 256
  • Fix an accuracy issue in SDPA first-token processing for the `TARGET_SEQ_LEN_BLOCK_SIZE % SUBGROUPS_PER_WG != 0` case
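The first bullet replaces an exact-match check with a range check. A minimal sketch of such a gate (hypothetical helper names, not the actual plugin code) might look like:

```cpp
#include <cstddef>

// Hypothetical sketch: after this change, any head size in [64, 256]
// is accepted by the SDPA kernel, instead of exactly 128 as before.
constexpr std::size_t kMinHeadSize = 64;
constexpr std::size_t kMaxHeadSize = 256;

constexpr bool sdpa_head_size_supported(std::size_t head_size) {
    return head_size >= kMinHeadSize && head_size <= kMaxHeadSize;
}
```

Models whose head size falls outside the range would presumably fall back to a non-fused attention path, as they did before for any size other than 128.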

…nge of 64 to 256; fix accuracy issue in SDPA kernel when processing the first token
@sshlyapn sshlyapn requested review from a team as code owners June 10, 2024 12:03
@sshlyapn sshlyapn added the category: GPU OpenVINO GPU plugin label Jun 10, 2024
@sshlyapn sshlyapn added this to the 2024.3 milestone Jun 10, 2024
@p-durandin p-durandin added this pull request to the merge queue Jun 11, 2024
Merged via the queue into openvinotoolkit:master with commit 00f4c99 Jun 11, 2024
102 checks passed
allnes pushed a commit to allnes/openvino that referenced this pull request Jun 27, 2024
sshlyapn added a commit to sshlyapn/openvino that referenced this pull request Jun 27, 2024
akladiev pushed a commit that referenced this pull request Jun 27, 2024
….2 (#25261)

### Details:
This PR is a backport of the original #24930 to the OV 2024.2 version.

- Relax SDPA head size limitations for LLMs from 128 only to a range of 64 to 256
- Fix accuracy issue in SDPA first token processing for `TARGET_SEQ_LEN_BLOCK_SIZE % SUBGROUPS_PER_WG != 0` case
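The second fix concerns how a block of target-sequence rows is divided among subgroups in a work-group. When the block size is not a multiple of the subgroup count, an even split does not exist: the last subgroup must handle fewer rows, and a kernel that assumes equal shares can touch out-of-range rows and corrupt results. A host-side C++ illustration of the correct distribution (hypothetical helper, not the actual OpenCL kernel code):

```cpp
#include <cstddef>

// Hypothetical illustration of the uneven-split case fixed here.
// With TARGET_SEQ_LEN_BLOCK_SIZE = 17 and SUBGROUPS_PER_WG = 4, a
// ceil-divide gives 5 rows per subgroup, so the last subgroup must be
// clamped to the 2 remaining rows rather than processing a full share.
constexpr std::size_t rows_for_subgroup(std::size_t block_size,
                                        std::size_t subgroups,
                                        std::size_t sg_id) {
    // Ceil-divide the block over the subgroups...
    const std::size_t per_sg = (block_size + subgroups - 1) / subgroups;
    const std::size_t start = sg_id * per_sg;
    // ...and clamp the tail subgroup to the rows that actually remain.
    if (start >= block_size)
        return 0;
    const std::size_t remaining = block_size - start;
    return remaining < per_sg ? remaining : per_sg;
}
```

In the 17-rows-over-4-subgroups example, subgroups 0 to 2 each take 5 rows and subgroup 3 takes the remaining 2, so every row is processed exactly once.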