
Commit

fly example
mxyng committed Nov 1, 2023
1 parent c05ab9a commit b9a1174
Showing 4 changed files with 136 additions and 0 deletions.
1 change: 1 addition & 0 deletions examples/flyio/.gitignore
@@ -0,0 +1 @@
fly.toml
67 changes: 67 additions & 0 deletions examples/flyio/README.md
@@ -0,0 +1,67 @@
# Deploy Ollama to Fly.io

> Note: this example exposes a public endpoint and does not configure authentication. Use with care.

## Prerequisites

- Ollama: https://ollama.ai/download
- Fly.io account. Sign up for a free account: https://fly.io/app/sign-up

## Steps

1. Login to Fly.io

```bash
fly auth login
```

1. Create a new Fly app

```bash
fly launch --name <name> --image ollama/ollama --internal-port 11434 --vm-size shared-cpu-8x --now
```

1. Pull and run `orca-mini:3b`

```bash
OLLAMA_HOST=https://<name>.fly.dev ollama run orca-mini:3b
```
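Since the app exposes the standard Ollama HTTP API, you can also query it directly instead of going through the `ollama` CLI. A sketch using the `/api/generate` endpoint (`<name>` is the same placeholder as above; the prompt is just an example):

```shell
# Query the deployed Ollama API directly.
# "stream": false returns a single JSON object instead of a stream of chunks.
curl https://<name>.fly.dev/api/generate -d '{
  "model": "orca-mini:3b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```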

`shared-cpu-8x` is a free-tier-eligible machine type. For better performance, switch to a `performance` or `dedicated` machine type, or attach a GPU for hardware acceleration (see below).
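If you outgrow the free-tier machine, the VM size can also be changed after launch with `fly scale`. A sketch, where `performance-4x` and `8192` are example values, not recommendations:

```shell
# Switch the app's machines to a performance CPU preset (example size).
fly scale vm performance-4x

# Optionally raise memory (in MB); larger models need more RAM.
fly scale memory 8192
```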

## (Optional) Persistent Volume

By default, Fly Machines use ephemeral storage, which means a downloaded model is lost on restart and must be pulled again. Create and attach a persistent volume to store the downloaded models:

1. Create the Fly Volume

```bash
fly volume create ollama
```

1. Update `fly.toml` and add `[mounts]`

```toml
[mounts]
source = "ollama"
destination = "/mnt/ollama/models"
```

1. Update `fly.toml` and add `[env]`

```toml
[env]
OLLAMA_MODELS = "/mnt/ollama/models"
```

1. Deploy your app

```bash
fly deploy
```
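After the two `fly.toml` edits above, the relevant part of the file should look roughly like this (a sketch; `<name>` is a placeholder, and your generated file will contain other sections such as `[http_service]`):

```toml
app = "<name>"

[mounts]
source = "ollama"
destination = "/mnt/ollama/models"

[env]
OLLAMA_MODELS = "/mnt/ollama/models"
```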

## (Optional) Hardware Acceleration

Fly.io GPU access is currently waitlist-only. Sign up for the waitlist: https://fly.io/gpu

Once you've been accepted, create the app with the additional flag `--vm-gpu-kind a100-pcie-40gb` or `--vm-gpu-kind a100-pcie-80gb`.
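Combined with the launch command from the steps above, a GPU deployment might look like the following sketch (assumes waitlist acceptance and a GPU-enabled region; `<name>` is the same placeholder as before):

```shell
# Hypothetical GPU launch: same flags as the CPU launch,
# plus a GPU kind instead of a VM size.
fly launch --name <name> --image ollama/ollama --internal-port 11434 \
  --vm-gpu-kind a100-pcie-40gb --now
```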
1 change: 1 addition & 0 deletions examples/flyio/deploy-fly/.gitignore
@@ -0,0 +1 @@
fly.toml
67 changes: 67 additions & 0 deletions examples/flyio/deploy-fly/README.md
@@ -0,0 +1,67 @@
# Deploy Ollama to Fly.io

> Note: this example exposes a public endpoint and does not configure authentication. Use with care.

## Prerequisites

- Ollama: https://ollama.ai/download
- Fly.io account. Sign up for a free account: https://fly.io/app/sign-up

## Steps

1. Login to Fly.io

```bash
fly auth login
```

1. Create a new Fly app

```bash
fly launch --name <name> --image ollama/ollama --internal-port 11434 --vm-size shared-cpu-8x --now
```

1. Pull and run `orca-mini:3b`

```bash
OLLAMA_HOST=https://<name>.fly.dev ollama run orca-mini:3b
```
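Since the app exposes the standard Ollama HTTP API, you can also query it directly instead of going through the `ollama` CLI. A sketch using the `/api/generate` endpoint (`<name>` is the same placeholder as above; the prompt is just an example):

```shell
# Query the deployed Ollama API directly.
# "stream": false returns a single JSON object instead of a stream of chunks.
curl https://<name>.fly.dev/api/generate -d '{
  "model": "orca-mini:3b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```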

`shared-cpu-8x` is a free-tier-eligible machine type. For better performance, switch to a `performance` or `dedicated` machine type, or attach a GPU for hardware acceleration (see below).
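If you outgrow the free-tier machine, the VM size can also be changed after launch with `fly scale`. A sketch, where `performance-4x` and `8192` are example values, not recommendations:

```shell
# Switch the app's machines to a performance CPU preset (example size).
fly scale vm performance-4x

# Optionally raise memory (in MB); larger models need more RAM.
fly scale memory 8192
```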

## (Optional) Persistent Volume

By default, Fly Machines use ephemeral storage, which means a downloaded model is lost on restart and must be pulled again. Create and attach a persistent volume to store the downloaded models:

1. Create the Fly Volume

```bash
fly volume create ollama
```

1. Update `fly.toml` and add `[mounts]`

```toml
[mounts]
source = "ollama"
destination = "/mnt/ollama/models"
```

1. Update `fly.toml` and add `[env]`

```toml
[env]
OLLAMA_MODELS = "/mnt/ollama/models"
```

1. Deploy your app

```bash
fly deploy
```
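After the two `fly.toml` edits above, the relevant part of the file should look roughly like this (a sketch; `<name>` is a placeholder, and your generated file will contain other sections such as `[http_service]`):

```toml
app = "<name>"

[mounts]
source = "ollama"
destination = "/mnt/ollama/models"

[env]
OLLAMA_MODELS = "/mnt/ollama/models"
```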

## (Optional) Hardware Acceleration

Fly.io GPU access is currently waitlist-only. Sign up for the waitlist: https://fly.io/gpu

Once you've been accepted, create the app with the additional flag `--vm-gpu-kind a100-pcie-40gb` or `--vm-gpu-kind a100-pcie-80gb`.
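Combined with the launch command from the steps above, a GPU deployment might look like the following sketch (assumes waitlist acceptance and a GPU-enabled region; `<name>` is the same placeholder as before):

```shell
# Hypothetical GPU launch: same flags as the CPU launch,
# plus a GPU kind instead of a VM size.
fly launch --name <name> --image ollama/ollama --internal-port 11434 \
  --vm-gpu-kind a100-pcie-40gb --now
```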
