doc/user: reorg monitoring docs into "operations" section
This makes space for additional deployment instructions.

I also took the opportunity to reformat the monitoring docs to be more
consistent with the rest of the documentation.
benesch committed Jul 30, 2020
1 parent 709b23b commit 9fe686d
Showing 5 changed files with 95 additions and 193 deletions.
5 changes: 5 additions & 0 deletions doc/user/config.toml
@@ -34,6 +34,11 @@ identifier = "sql"
name = "SQL"
weight = 60

[[menu.main]]
identifier = "operations"
name = "Operations"
weight = 80

[[menu.main]]
identifier = "demos"
name = "Demos"
131 changes: 0 additions & 131 deletions doc/user/content/monitoring/_index.md

This file was deleted.

5 changes: 5 additions & 0 deletions doc/user/content/ops/_index.md
@@ -0,0 +1,5 @@
---
title: "Operations"
description: "Find details about running your Materialize instances"
disable_toc: true
---
23 changes: 23 additions & 0 deletions doc/user/content/ops/deployment.md
@@ -0,0 +1,23 @@
---
title: "Deployment"
description: "Find details about running your Materialize instances"
menu:
main:
parent: operations
---

_This page is a work in progress and will have more detail in the coming months.
If you have specific questions, feel free to [file a GitHub
issue](https://github.com/MaterializeInc/materialize/issues/new?labels=C-feature&template=feature.md)._

## Memory

Materialize stores the majority of its state in-memory, and works best when the streamed data
can be reduced in some way. For example, if you know that only a subset of your rows and columns
are relevant for your queries, it helps to avoid materializing sources or views until you've
expressed this to the system (we can avoid stashing that data, which can in some cases dramatically
reduce the memory footprint).

To minimize the chances that Materialize runs out of memory in a production environment,
we recommend you make additional memory available to Materialize via an SSD-backed
swap file or swap partition.
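
As a concrete sketch of that recommendation, the commands below provision a swap file on a Linux host. The `/swapfile` path and the 16 GiB size are illustrative assumptions, not part of this commit; size swap for your workload, place the file on SSD-backed storage, and note that `fallocate`-backed swap files are not supported on every filesystem.

```shell
# Illustrative sizing and path -- adjust for your workload. Requires root.
SWAP_FILE=/swapfile
SWAP_SIZE_GB=16

sudo fallocate -l "${SWAP_SIZE_GB}G" "$SWAP_FILE"  # reserve space on an SSD-backed filesystem
sudo chmod 600 "$SWAP_FILE"                        # swap files must not be world-readable
sudo mkswap "$SWAP_FILE"                           # format the file as swap
sudo swapon "$SWAP_FILE"                           # enable it immediately
```

Add a matching `/etc/fstab` entry if the swap file should survive reboots.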
@@ -1,8 +1,11 @@
---
title: "Monitoring and Operations"
title: "Monitoring"
description: "Find details about running your Materialize instances"
menu: "main"
weight: 80
aliases:
- /monitoring
menu:
main:
parent: operations
---

_This page is a work in progress and will have more detail in the coming months.
@@ -11,27 +14,28 @@ issue](https://github.com/MaterializeInc/materialize/issues/new?labels=C-feature

Materialize supports integration with monitoring tools using HTTP endpoints.

### Quick monitoring dashboard
## Quick monitoring dashboard

Materialize provides a recommended Grafana dashboard and an all-inclusive Docker image
preconfigured to run the dashboard as [`materialize/dashboard`][simplemon-hub].
preconfigured to run it as [`materialize/dashboard`][simplemon-hub].

The only configuration required to get started with the Docker image is the
`MATERIALIZED_URL=<host>:<port>` environment variable.

As an example, if you are running `materialized` in a cloud instance at the IP address
`172.16.0.0`, you can launch the dashboard by running this command and
opening `http://localhost:3000` in your web browser:
As an example, if you are running Materialize in a cloud instance at the IP address
`172.16.0.0`, you can get a dashboard by running this command and
opening <http://localhost:3000> in your web browser:

```shell
# expose ports ______point it at materialize______
$ docker run -d -p 3000:3000 -e MATERIALIZED_URL=172.16.0.0:6875 materialize/dashboard
# expose ports ______point it at materialize______
```

See [Observing local Materialize](#observing-local-materialize) below if you want to run
the dashboard on the same machine on which you are running Materialize.
To instead run the dashboard on the machine on which you are running
Materialize, see the [Observing local Materialize](#observing-local-materialize)
section below.

The `materialize/dashboard` Docker image bundles Prometheus and Grafana together to make
The dashboard Docker image bundles Prometheus and Grafana together to make
getting insight into Materialize's performance easy. It is not especially
configurable; in particular, it is not designed to handle large metric volumes or long
uptimes. It will start truncating metrics history after about 1GB of storage, which
@@ -49,57 +53,25 @@ $ docker run -d \
materialize/dashboard
```

### Health check

Materialize supports a minimal health check endpoint at `<materialized
host>/status`.

### Prometheus

Materialize exposes [Prometheus](https://prometheus.io/) metrics at the default
path, `<materialized host>/metrics`.

Materialize broadly publishes the following types of data there:

- Materialize-specific data with a `mz_*` prefix. For example,
`rate(mz_responses_sent_total[10s])` will show you the number of responses
averaged over 10 second windows.
- Standard process metrics with a `process_*` prefix. For example, `process_cpu`.

### Grafana

Materialize provides a [recommended dashboard][dashboard-json] that you can [import into
Grafana][graf-import]. It relies on you having configured Prometheus to scrape
materialized.

### Datadog

Materialize metrics can be sent to Datadog via the
[OpenMetrics agent check](https://www.datadoghq.com/blog/monitor-prometheus-metrics/).
(Requires Datadog Agent 6 and above.) Simply configure _"prometheus_url"_ (i.e.,
`http://<materialized host>/metrics`), namespace, and metrics (i.e., `mz*`) in
_"openmetrics.d/conf.yaml"_.

## Other Setups

Even if you aren't running materialized at web scale, you can still use our web-scale
tools to observe it.

### Observing local Materialize

Using the dashboard to observe a Materialize instance running on the same
machine as the dashboard is complicated by Docker. The solution depends upon
your host platform.

#### Inside Docker Compose or Kubernetes

Local schedulers like Docker Compose (which we use for our demos) or Kubernetes will
typically expose running containers to each other using their service name as a public
DNS hostname, but _only_ within the network that they are running in.

The easiest way to use the dashboard inside a scheduler is to tell the scheduler to run
it. [Here is an example][dc-example] of configuring Docker Compose to run the dashboard.
it. Check out the [example configuration for Docker Compose][dc-example].

#### On MacOS, with materialized running outside of Docker
#### On macOS, with Materialize running outside of Docker

The problem with this is that `localhost` inside of Docker cannot, on Docker for Mac,
refer to the mac network. So instead you must use `host.docker.internal`:
The complication is that, on Docker for Mac, `localhost` inside of Docker does not
refer to the macOS host network. So instead you must use `host.docker.internal`:

```
docker run -p 3000:3000 -e MATERIALIZED_URL=host.docker.internal:6875 materialize/dashboard
@@ -108,7 +80,7 @@ docker run -p 3000:3000 -e MATERIALIZED_URL=host.docker.internal:6875 materializ
#### On Linux, with Materialize running outside of Docker

Docker containers use a different network than their host by default, but that is easy to
get around using the `--network` flag. Using the host network means that ports will be
override by using the `--network host` option. Using the host network means that ports will be
allocated from the host, so the `-p` flag is no longer necessary:

```
@@ -120,14 +92,42 @@ docker run --network host -e MATERIALIZED_URL=localhost:6875 materialize/dashboa
[graf-import]: https://grafana.com/docs/grafana/latest/reference/export_import/#importing-a-dashboard
[dc-example]: https://github.com/MaterializeInc/materialize/blob/d793b112758c840c1240eefdd56ca6f7e4f484cf/demo/billing/mzcompose.yml#L60-L70

## Memory

Materialize stores the majority of its state in-memory, and works best when the streamed data
can be reduced in some way. For example, if you know that only a subset of your rows and columns
are relevant for your queries, it helps to avoid materializing sources or views until you've
expressed this to the system (we can avoid stashing that data, which can in some cases dramatically
reduce the memory footprint).

To minimize the chances that Materialize runs out of memory in a production environment,
we recommend you make additional memory available to Materialize via an SSD-backed
swap file or swap partition.
## Health check

Materialize supports a minimal health check endpoint at `<materialized
host>/status`.

## Prometheus

Materialize exposes [Prometheus](https://prometheus.io/) metrics at the default
path, `<materialized host>/metrics`.

Materialize broadly publishes the following types of data there:

- Materialize-specific data with a `mz_*` prefix. For example,
`rate(mz_responses_sent_total[10s])` will show you the number of responses
averaged over 10 second windows.
- Standard process metrics with a `process_*` prefix. For example, `process_cpu`.
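
To spot-check the export before wiring up Prometheus, you can fetch the endpoint directly. The host and port below are assumptions (the default local listen address):

```shell
# Fetch the metrics endpoint and show a few of the Materialize-specific series.
curl -fsS http://localhost:6875/metrics | grep '^mz_' | head -n 5
```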

## Grafana

Materialize provides a [recommended dashboard][dashboard-json] that you can [import into
Grafana][graf-import]. It relies on you having configured Prometheus to scrape
Materialize.

## Datadog

Materialize metrics can be sent to Datadog via the
[OpenMetrics agent check](https://docs.datadoghq.com/integrations/openmetrics/),
which is bundled with recent versions of the Datadog agent.

Simply add the following configuration parameters to
`openmetrics.d/conf.yaml`:

Configuration parameter | Value
------------------------|------
`prometheus_url` | `http://<materialized host>/metrics`
`namespace` | Your choice
`metrics` | `[mz*]` to select all metrics, or a list of specific metrics
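
Put together, a minimal `openmetrics.d/conf.yaml` might look like the following sketch. The host, port, and `materialize` namespace are illustrative assumptions to adapt to your setup:

```yaml
instances:
  # Assumes materialized is reachable at its default local address.
  - prometheus_url: http://localhost:6875/metrics
    namespace: materialize
    metrics:
      - mz*
```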
