fix: solve network disruption during downloads, add OLLAMA_DOWNLOAD_CONN setting #5683
Conversation
fix: solve network disruption during downloads, add OLLAMA_DOWNLOAD_CONN setting

The Ollama server now downloads models using a single connection. This change addresses the root cause of issue ollama#2006 by following best practices instead of relying on workarounds. Users have been reporting problems associated with model downloads since January 2024, describing issues such as "hogging the entire device", "reliably and repeatedly kills my connection", "freezes completely leaving no choice but to hard reset", "when I download models, everyone in the office gets a really slow internet", and "when downloading large models, it feels like my home network is being DDoSed."

The environment variable `OLLAMA_DOWNLOAD_CONN` can be set to control the number of concurrent connections, with a maximum value of 64 (the previous default, an aggressive value that is unsafe in some conditions). The new default value is 1, ensuring each Ollama download is given the same priority as other network activities. An entry in the FAQ describes how to use `OLLAMA_DOWNLOAD_CONN` for different use cases. This patch comes with a safe and unproblematic default value.

Changes include updates to the `envconfig/config.go`, `cmd/cmd.go`, `server/download.go`, and `docs/faq.md` files.
`ollama serve` instead of `ollama server`

Co-authored-by: Kim Hallberg <hallberg.kim@gmail.com>
lgtm

doesn't implement the whole of #2006, though
Correct, and that's intentional. The root cause of the problems reported in #2006 is a wildly excessive default of 64 simultaneous connections. This fix solves the root cause of the problem while still offering configurability via the `OLLAMA_DOWNLOAD_CONN` variable. It would be useful to be able to change runtime parameters like this one, model parallelism, and debug status via command-line parameters and API calls (like the one for pulling models); however, that is out of scope for this fix.
```diff
@@ -215,6 +219,23 @@ func LoadConfig() {
 		}
 	}
 
+	if dlp := clean("OLLAMA_DOWNLOAD_CONN"); dlp != "" {
+		const minDownloadConnections = 1
+		const maxDownloadConnections = 64
```
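The hunk is cut off at this point in the page. As an illustration only, here is a minimal sketch of the parse-and-clamp step the rest of the hunk presumably performs; the `DownloadConnections` variable, the function name, and the logging are assumptions, not the verbatim patch:

```go
// Sketch of the validation OLLAMA_DOWNLOAD_CONN likely receives.
// DownloadConnections is a hypothetical package-level setting; the
// actual patch may store and report the value differently.
package envconfig

import (
	"log/slog"
	"strconv"
)

// DownloadConnections holds the number of connections used per model
// download; 1 is the safe default this PR proposes.
var DownloadConnections = 1

func loadDownloadConnections(dlp string) {
	const minDownloadConnections = 1
	const maxDownloadConnections = 64

	n, err := strconv.Atoi(dlp)
	if err != nil {
		slog.Error("invalid OLLAMA_DOWNLOAD_CONN, keeping default", "value", dlp)
		return
	}
	// Clamp out-of-range values into [1, 64] rather than rejecting them.
	if n < minDownloadConnections {
		n = minDownloadConnections
	}
	if n > maxDownloadConnections {
		n = maxDownloadConnections
	}
	DownloadConnections = n
}
```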
```diff
-			const maxDownloadConnections = 64
+			const maxDownloadConnections = 1000
```
Some ollama users have really, really fast (multi-gigabit) networks; let them download many parts at once.
Thank you so much for contributing this. We have made some improvements to how network connections are handled when downloading models, including lowering the number of connections. Anyone still seeing consistent problems? Closing this for now, but please feel free to reopen.
I still experience 10-15% packet loss on a gigabit fiber connection during model pulls on v0.4.5. IMO, one stream is almost always the appropriate number of streams to default to, but I do appreciate the config option allowing more. This commit seems like the right solution to me, over any hard-coded value >1.
@mchiang0610 - Can we re-open this? I'm still seeing that fairly heavy packet loss in 0.4.7 as well.
Still a problem in 0.5.1 - I'd really like to see this one merged.
Managing bandwidth for model downloads has been an ongoing journey.

The situation left the Ollama server with unsafe network concurrency defaults, causing problems for many users and for people sharing the same network, whether they realize Ollama is the origin of their troubles or not.

In the associated issue, users describe at length the problems caused and their creative mitigations.

Fortunately, the root cause is simple: 64 concurrent connections per download, an extremely aggressive value guaranteed to challenge any network congestion algorithm. The fix is equally straightforward: default to one concurrent connection per model download.

This PR addresses the root cause while adding the ability to configure download concurrency when required, via the `OLLAMA_DOWNLOAD_CONN` setting. It deliberately avoids complex, ineffective, or hard-to-configure workarounds such as dynamic concurrency adjustments or manual bandwidth limiting.
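To make the mechanism concrete, here is a hedged sketch of what "N concurrent connections per download" means. This is not Ollama's actual downloader: the URL, sizes, and `fetchRange` helper are illustrative assumptions, and only the `SetLimit` call corresponds conceptually to what `OLLAMA_DOWNLOAD_CONN` controls.

```go
// Sketch: a blob is split into byte ranges, each fetched over its own
// HTTP connection; the errgroup limit caps how many run at once.
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"os"

	"golang.org/x/sync/errgroup"
)

// fetchRange downloads bytes [start, end] of url over one HTTP connection.
func fetchRange(ctx context.Context, url string, start, end int64) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	// A real downloader writes each range to its part of the model file.
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

func main() {
	const url = "https://example.com/model.bin" // placeholder blob URL
	const size = int64(1 << 30)                 // pretend 1 GiB model
	const partSize = int64(64 << 20)            // 64 MiB parts

	connections := 1 // the safe default this PR proposes; 64 was the old default
	g, ctx := errgroup.WithContext(context.Background())
	g.SetLimit(connections) // cap in-flight connections, like OLLAMA_DOWNLOAD_CONN

	for off := int64(0); off < size; off += partSize {
		start, end := off, off+partSize-1
		if end > size-1 {
			end = size - 1
		}
		g.Go(func() error { return fetchRange(ctx, url, start, end) })
	}
	if err := g.Wait(); err != nil {
		fmt.Fprintln(os.Stderr, "download failed:", err)
	}
}
```

With a limit of 1, the download behaves like any single TCP stream and shares bandwidth fairly with other traffic; raising the limit trades that fairness for throughput on very fast links.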
From the associated commit:
The Ollama server now downloads models using a single connection. This change addresses the root cause of issue #2006 by following best practices instead of relying on workarounds. Users have been reporting problems associated with model downloads since January 2024, describing issues such as "hogging the entire device", "reliably and repeatedly kills my connection", "freezes completely leaving no choice but to hard reset", "when I download models, everyone in the office gets a really slow internet", and "when downloading large models, it feels like my home network is being DDoSed."
The environment variable `OLLAMA_DOWNLOAD_CONN` can be set to control the number of concurrent connections, with a maximum value of 64 (the previous default, an aggressive value that is unsafe in some conditions). The new default value is 1, ensuring each Ollama download is given the same priority as other network activities. An entry in the FAQ describes how to use `OLLAMA_DOWNLOAD_CONN` for different use cases. This patch comes with a safe and unproblematic default value.

Changes include updates to the `envconfig/config.go`, `cmd/cmd.go`, `server/download.go`, and `docs/faq.md` files.