Nvidia GPU exporter for Prometheus, using the nvidia-smi binary to gather metrics.
There are many Nvidia GPU exporters out there; however, they have problems such as not being maintained, not providing pre-built binaries, depending on Linux and/or Docker, targeting enterprise setups (DCGM), and so on.
This is a simple exporter that uses the nvidia-smi(.exe) binary to collect, parse and export metrics.
This makes it possible to run it on Windows and get GPU metrics while gaming - no Docker or Linux required.
This project is based on a0s/nvidia-smi-exporter. However, this one is written in Go to produce a single, static binary.
If you are a gamer who's into monitoring, you are in for a treat.
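To make the collect, parse and export idea concrete, here is a minimal shell sketch of the general technique: take CSV rows in the shape that `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` prints and turn them into Prometheus text-format samples. It is not the exporter's actual implementation. The metric names (`example_gpu_*`), the chosen fields and the sample row are assumptions made up for illustration.

```shell
#!/bin/sh
# Turn nvidia-smi CSV rows into Prometheus text-format samples.
# Rows are assumed to come from a command such as:
#   nvidia-smi --query-gpu=uuid,temperature.gpu,utilization.gpu --format=csv,noheader,nounits
# The example_gpu_* metric names are hypothetical, not the exporter's real names.
csv_to_metrics() {
  # Fields are separated by a comma followed by optional spaces.
  awk -F', *' '
    NF >= 3 {
      printf "example_gpu_temperature_celsius{uuid=\"%s\"} %s\n", $1, $2
      printf "example_gpu_utilization_percent{uuid=\"%s\"} %s\n", $1, $3
    }'
}

# Made-up sample row standing in for real nvidia-smi output.
sample_row='GPU-00000000-0000-0000-0000-000000000000, 45, 12'

printf '%s\n' "$sample_row" | csv_to_metrics
```

On a machine with an Nvidia driver installed, the sample row could be replaced by piping the real `nvidia-smi` command from the comment into `csv_to_metrics`.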
- Will work on any system that has the nvidia-smi(.exe) binary: Windows, Linux, MacOS... No C bindings required
- Doesn't even need to run on the monitored machine: can be configured to execute the nvidia-smi command remotely
- No need for a Docker or Kubernetes environment
- Auto-discovery of the metric fields nvidia-smi can expose (future-compatible)
- Comes with its own Grafana dashboard
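As a rough illustration of how field auto-discovery can work, the sketch below pulls field names out of text shaped like the output of `nvidia-smi --help-query-gpu`. It assumes each field is introduced on a line that begins with a quoted name; the help excerpt is abridged and illustrative, and the exporter's own discovery logic may differ.

```shell
#!/bin/sh
# Extract queryable field names from `nvidia-smi --help-query-gpu` style text.
# Assumption: every field starts a line with a quoted name, for example
#   "name" or "gpu_name"
# Only the first name on such a line is kept.
discover_fields() {
  sed -n 's/^"\([^"]*\)".*/\1/p' | paste -sd, -
}

# Abridged, illustrative excerpt; the real help output is much longer.
sample_help='List of valid properties to query for the switch "--query-gpu=":

"timestamp"
The timestamp of when the query was made in format "YYYY/MM/DD HH:MM:SS.msec".

"name" or "gpu_name"
The official product name of the GPU.

"temperature.gpu"
Core GPU temperature. in degrees C.'

printf '%s\n' "$sample_help" | discover_fields
```

The comma-separated result has the same shape as the value accepted by `--query-field-names`, which is why discovering fields at startup keeps working when a newer driver adds fields.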
You can use the official Grafana dashboard to see your GPU metrics in a nicely visualized way.
Requirements:
- Scoop package manager
- NSSM (get the latest pre-release version)
Installation steps:
- Open a privileged PowerShell prompt (right-click, then Run as administrator)
- Run the following commands:
scoop bucket add nvidia_gpu_exporter https://github.com/utkuozdemir/scoop_nvidia_gpu_exporter.git
scoop install nvidia_gpu_exporter/nvidia_gpu_exporter --global
New-NetFirewallRule -DisplayName "Nvidia GPU Exporter" -Direction Inbound -Action Allow -Protocol TCP -LocalPort 9835
nssm install nvidia_gpu_exporter "C:\ProgramData\scoop\apps\nvidia_gpu_exporter\current\nvidia_gpu_exporter.exe"
Start-Service nvidia_gpu_exporter
- Go to the releases and download the latest release archive for your platform.
- Extract the archive.
- Move the binary to somewhere in your PATH.
Sample steps for Linux 64-bit:
$ VERSION=0.1.2
$ wget https://github.com/utkuozdemir/nvidia_gpu_exporter/releases/download/v${VERSION}/nvidia_gpu_exporter_${VERSION}_linux_x86_64.tar.gz
$ tar -xvzf nvidia_gpu_exporter_${VERSION}_linux_x86_64.tar.gz
$ mv nvidia_gpu_exporter /usr/local/bin
$ nvidia_gpu_exporter --help
The usage of the binary is the following:
usage: nvidia_gpu_exporter [<flags>]

Flags:
  -h, --help                Show context-sensitive help (also try --help-long and --help-man).
      --web.config.file=""  [EXPERIMENTAL] Path to configuration file that can enable TLS or authentication.
      --web.listen-address=":9835"
                            Address to listen on for web interface and telemetry.
      --web.telemetry-path="/metrics"
                            Path under which to expose metrics.
      --nvidia-smi-command="nvidia-smi"
                            Path or command to be used for the nvidia-smi executable
      --query-field-names="AUTO"
                            Comma-separated list of the query fields. You can find out possible fields by running `nvidia-smi --help-query-gpus`. The value `AUTO` will
                            automatically detect the fields to query.
      --log.level=info      Only log messages with the given severity or above. One of: [debug, info, warn, error]
      --log.format=logfmt   Output format of log messages. One of: [logfmt, json]
      --version             Show application version.
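The two web flags together determine where Prometheus (or you) can reach the metrics. The helper below is a small sketch, not part of the project: `metrics_url` is a hypothetical function that combines a host name with the values of `--web.listen-address` and `--web.telemetry-path`, falling back to the defaults listed in the usage text.

```shell
#!/bin/sh
# Build the metrics URL from a host plus the exporter's two web flags.
# Defaults mirror the flag defaults in the usage text (":9835" and "/metrics").
metrics_url() {
  host=$1
  listen=${2:-:9835}      # value of --web.listen-address
  path=${3:-/metrics}     # value of --web.telemetry-path
  port=${listen##*:}      # keep only the part after the last colon
  printf 'http://%s:%s%s\n' "$host" "$port" "$path"
}

metrics_url localhost
# With the exporter running, the endpoint could be checked with:
#   curl -s "$(metrics_url localhost)" | head
```

The same URL parts (host, port and path) are what a Prometheus scrape job for this exporter would need.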
The exporter can be configured to scrape metrics from a remote machine.
An example use case is running the exporter on a Raspberry Pi in your home network while scraping the metrics from your PC over SSH.
The exporter supports arbitrary commands with arguments, as long as they produce nvidia-smi-like output.
Therefore, configuration is pretty straightforward.
Simply override the --nvidia-smi-command command-line argument (replace SSH_USER and SSH_HOST with your SSH user name and host):
nvidia_gpu_exporter --nvidia-smi-command "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null SSH_USER@SSH_HOST nvidia-smi"
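If you monitor more than one machine, it can help to generate that command string instead of typing it. The following is a small sketch under that assumption: `remote_smi_command` is a hypothetical helper, and the user name and address in the example call are placeholders.

```shell
#!/bin/sh
# Compose the value for --nvidia-smi-command from an SSH user and host.
# The ssh options are the ones used in the command above.
remote_smi_command() {
  printf 'ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null %s@%s nvidia-smi\n' "$1" "$2"
}

# Hypothetical user and host; replace with your own.
remote_smi_command gamer 192.168.1.50
# Possible use:
#   nvidia_gpu_exporter --nvidia-smi-command "$(remote_smi_command gamer 192.168.1.50)"
```

Since the exporter runs the command non-interactively, this setup presumably needs key-based SSH authentication to the remote machine.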
See CONTRIBUTING for details.