Fix embeddings memory corruption #6467

Merged
dhiltgen merged 2 commits into ollama:main on Aug 22, 2024

Conversation

@dhiltgen (Collaborator) commented Aug 22, 2024

The patch was leading to a buffer overrun corruption. Once it was removed, though, parallelism in server.cpp led to hitting an assert because slot/seq IDs could be >= the token count. To work around this, only slot 0 is used for embeddings.

Fixes #6435
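For illustration, here is a minimal Go sketch of the workaround's idea; the names (`pendingReq`, `pickSlot`) are hypothetical and the actual change in this PR lives in the scheduler/runner code and differs in detail:

```go
// Hypothetical sketch of the slot-0 workaround described above; the real
// change lives in ollama's scheduler/runner code and differs in detail.
package sched

type pendingReq struct {
	embedding bool // true for /api/embed-style requests
}

// pickSlot chooses which server slot handles a request. Completion
// requests may use any free slot; embedding requests are pinned to
// slot 0 to avoid tripping the slot/seq ID >= token count assert in
// server.cpp.
func pickSlot(req pendingReq, freeSlots []int) int {
	if req.embedding {
		return 0
	}
	if len(freeSlots) > 0 {
		return freeSlots[0]
	}
	return 0
}
```

Pinning embeddings to a single slot sidesteps the assert at the cost of not processing embedding requests in parallel.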

@dhiltgen marked this pull request as draft August 22, 2024 19:40
@dhiltgen (Collaborator, Author)

The embeddings before and after this change are the same, but the prompt_eval_count is off for some reason.

@dhiltgen (Collaborator, Author)

It turns out the prompt_eval_count has changed since 0.3.5, so this is unrelated to my change; the test's assumption no longer holds. I'll update the test.
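As a rough illustration of what the test update could look like (the type and field names below are assumptions for the sketch, not the actual integration test code):

```go
// Hypothetical Go test helper; the real integration test and the
// response type's field names may differ.
package integration

import "testing"

// embedResponse is a stand-in for the API embedding response; only the
// field relevant here is shown.
type embedResponse struct {
	PromptEvalCount int
}

func checkEmbedResponse(t *testing.T, resp embedResponse) {
	// Before: the test assumed a fixed prompt_eval_count, which broke once
	// the llama.cpp bump (0.3.5+) changed how prompt tokens are counted.
	// After: only require that some prompt tokens were evaluated.
	if resp.PromptEvalCount <= 0 {
		t.Fatalf("expected prompt_eval_count > 0, got %d", resp.PromptEvalCount)
	}
}
```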

@dhiltgen marked this pull request as ready for review August 22, 2024 20:08
Review comment on server/sched.go (outdated, resolved)
The patch was leading to a buffer overrun corruption. Once it was removed, though, parallelism
in server.cpp led to hitting an assert because slot/seq IDs could be >= the token count. To
work around this, only slot 0 is used for embeddings.

The token eval count has changed with recent llama.cpp bumps (0.3.5+).
@dhiltgen merged commit 90ca841 into ollama:main Aug 22, 2024
15 checks passed
@dhiltgen deleted the embeddings branch August 22, 2024 21:51
deep93333 pushed a commit to deep93333/ollama that referenced this pull request Sep 9, 2024
* Fix embeddings memory corruption
* Fix embed integration test assumption
Successfully merging this pull request may close these issues.

0.3.6 /api/embed return 500 if more items are provided in input
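For context, the linked issue concerns multi-item embedding requests. The sketch below shows the kind of /api/embed call with several inputs that the issue describes; it is a hedged reproduction example, and the model name is just an arbitrary placeholder:

```go
// Sketch of a multi-item /api/embed request of the kind the linked issue
// describes; the model name is just an example.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	body, _ := json.Marshal(map[string]any{
		"model": "all-minilm", // any embedding model
		"input": []string{"first sentence", "second sentence", "third sentence"},
	})
	resp, err := http.Post("http://localhost:11434/api/embed", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	// Before the fix this could come back as HTTP 500; afterwards it should be 200.
	fmt.Println("status:", resp.Status)
}
```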