
qwq instance crash #8309

Closed
wrpromail opened this issue Jan 5, 2025 · 2 comments
Labels: bug (Something isn't working)

Comments

@wrpromail

What is the issue?

Hardware: 8x RTX 3090
CPU: AMD EPYC 7402 24-Core Processor
CUDA 12.2
Driver version 535.183.06
Running the ollama Docker image (0.5.4)

I ran ollama run qwq and typed a few test questions; it crashed after the third question.
Shell output:
readlink -f /proc/1Error: an error was encountered while running the model: CUDA error: an illegal memory access was encountered
current device: 5, in function ggml_backend_cuda_synchronize at llama/ggml-cuda/ggml-cuda.cu:2317
cudaStreamSynchronize(cuda_ctx->stream())
llama/ggml-cuda/ggml-cuda.cu:96: CUDA error
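
For reference, one way to get a more precise fault location than the synchronize call above (a sketch, assuming the stock ollama/ollama image and that the extra per-kernel latency is acceptable) is to restart the container with synchronous CUDA launches:

docker run -d --gpus=all -e CUDA_LAUNCH_BLOCKING=1 \
    -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:0.5.4
# With launches serialized, the "illegal memory access" error is reported at the
# kernel that actually faulted instead of at a later cudaStreamSynchronize.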

container log output:
goroutine 158 gp=0xc000443180 m=nil [GC worker (idle)]:
runtime.gopark(0x5b357c2acf3bbd?, 0x1?, 0x37?, 0xd?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00044af38 sp=0xc00044af18 pc=0x556f7a50092e
runtime.gcBgMarkWorker(0xc00015d0a0)
runtime/mgc.go:1412 +0xe9 fp=0xc00044afc8 sp=0xc00044af38 pc=0x556f7a4ae209
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc00044afe0 sp=0xc00044afc8 pc=0x556f7a4ae0e5
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00044afe8 sp=0xc00044afe0 pc=0x556f7a508561
created by runtime.gcBgMarkStartWorkers in goroutine 21
runtime/mgc.go:1328 +0x105

goroutine 159 gp=0xc000443340 m=nil [GC worker (idle)]:
runtime.gopark(0x5b357c2acf1b00?, 0x1?, 0x15?, 0x62?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00044b738 sp=0xc00044b718 pc=0x556f7a50092e
runtime.gcBgMarkWorker(0xc00015d0a0)
runtime/mgc.go:1412 +0xe9 fp=0xc00044b7c8 sp=0xc00044b738 pc=0x556f7a4ae209
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc00044b7e0 sp=0xc00044b7c8 pc=0x556f7a4ae0e5
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00044b7e8 sp=0xc00044b7e0 pc=0x556f7a508561
created by runtime.gcBgMarkStartWorkers in goroutine 21
runtime/mgc.go:1328 +0x105

goroutine 161 gp=0xc0004436c0 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0xb?)
runtime/proc.go:424 +0xce fp=0xc00044bda8 sp=0xc00044bd88 pc=0x556f7a50092e
runtime.netpollblock(0x556f7a53c158?, 0x7a499186?, 0x6f?)
runtime/netpoll.go:575 +0xf7 fp=0xc00044bde0 sp=0xc00044bda8 pc=0x556f7a4c5697
internal/poll.runtime_pollWait(0x7f9443577e38, 0x72)
runtime/netpoll.go:351 +0x85 fp=0xc00044be00 sp=0xc00044bde0 pc=0x556f7a4ffc25
internal/poll.(*pollDesc).wait(0xc000232e80?, 0xc0002045e1?, 0x0)
internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00044be28 sp=0xc00044be00 pc=0x556f7a555a67
internal/poll.(*pollDesc).waitRead(...)
internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000232e80, {0xc0002045e1, 0x1, 0x1})
internal/poll/fd_unix.go:165 +0x27a fp=0xc00044bec0 sp=0xc00044be28 pc=0x556f7a5565ba
net.(*netFD).Read(0xc000232e80, {0xc0002045e1?, 0x0?, 0x0?})
net/fd_posix.go:55 +0x25 fp=0xc00044bf08 sp=0xc00044bec0 pc=0x556f7a5ce885
net.(*conn).Read(0xc000246000, {0xc0002045e1?, 0x0?, 0x0?})
net/net.go:189 +0x45 fp=0xc00044bf50 sp=0xc00044bf08 pc=0x556f7a5d8285
net.(*TCPConn).Read(0x0?, {0xc0002045e1?, 0x0?, 0x0?})
:1 +0x25 fp=0xc00044bf80 sp=0xc00044bf50 pc=0x556f7a5e5325
net/http.(*connReader).backgroundRead(0xc0002045d0)
net/http/server.go:690 +0x37 fp=0xc00044bfc8 sp=0xc00044bf80 pc=0x556f7a706077
net/http.(*connReader).startBackgroundRead.gowrap2()
net/http/server.go:686 +0x25 fp=0xc00044bfe0 sp=0xc00044bfc8 pc=0x556f7a705fa5
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00044bfe8 sp=0xc00044bfe0 pc=0x556f7a508561
created by net/http.(*connReader).startBackgroundRead in goroutine 58
net/http/server.go:686 +0xb6

BTW:
According to nvidia-smi, the ollama instance occupied about 43 GB of VRAM,
but ollama ps reported a much larger size:
root@c3b740f86306:/# ollama ps
NAME          ID              SIZE      PROCESSOR    UNTIL
qwq:latest    46407beda5c0    125 GB    100% GPU     19 minutes from now
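
One quick cross-check (a sketch using standard nvidia-smi query fields) is to sum the per-GPU usage and compare it with the SIZE column above:

nvidia-smi --query-gpu=index,name,memory.used,memory.total --format=csv
# Adding memory.used across the 8 GPUs gives the real resident footprint (~43 GB here),
# which does not match the 125 GB that ollama ps reports.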

OS: Linux
GPU: Nvidia
CPU: AMD
Ollama version: 0.5.4

@wrpromail added the bug (Something isn't working) label on Jan 5, 2025
@rick-github
Collaborator

Full server log will help in debugging.
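
In case it helps, a typical way to capture it from the Docker setup described above (a sketch; the container name ollama and the log file name are assumptions):

docker logs ollama > ollama-server.log 2>&1
# For more detail, recreate the container with debug logging enabled before reproducing:
# docker run -d --gpus=all -e OLLAMA_DEBUG=1 -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:0.5.4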

@wrpromail
Author

> Full server log will help in debugging.
Thank you. I noticed that when I run other models, the ollama ps command also shows incorrect VRAM usage, and similar crashes occur. It might be a hardware or driver issue, so I will close the issue for now. I will update the GPU server and perform the same validation on the new server while collecting complete logs. If the issue persists, I will create a new issue.
