echoblag/nvidia_gpu_exporter

nvidia_gpu_exporter

Nvidia GPU exporter for prometheus, using nvidia-smi binary to gather metrics.

Introduction

There are many Nvidia GPU exporters out there. However, they have problems such as being unmaintained, not providing pre-built binaries, depending on Linux and/or Docker, targeting enterprise setups (DCGM), and so on.

This is a simple exporter that uses the nvidia-smi(.exe) binary to collect, parse and export metrics. This makes it possible to run it on Windows and get GPU metrics while gaming - no Docker or Linux required.
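As a rough illustration of the collect, parse and export steps, the following Python sketch turns CSV text of the kind printed by `nvidia-smi --query-gpu=... --format=csv` into Prometheus-style lines. It is not the exporter's actual Go code: the function names, the `example_gpu` metric prefix and the sample row are all assumptions made for illustration.

```python
import csv
import io
import re

# Hypothetical sample in the shape printed by
# `nvidia-smi --query-gpu=name,temperature.gpu,utilization.gpu --format=csv`.
SAMPLE = """name, temperature.gpu, utilization.gpu [%]
NVIDIA GeForce RTX 3080, 45, 12 %
"""

# A cell counts as numeric if it is a number optionally followed by one unit token.
NUMERIC_CELL = re.compile(r"^(-?\d+(?:\.\d+)?)(?:\s+\S+)?$")


def parse_smi_csv(text):
    """Return one dict per GPU row, keyed by field name without the unit suffix."""
    rows = [row for row in csv.reader(io.StringIO(text), skipinitialspace=True) if row]
    # Header cells may look like "utilization.gpu [%]"; strip the bracketed unit.
    header = [re.sub(r"\s*\[[^\]]*\]$", "", cell.strip()) for cell in rows[0]]
    return [dict(zip(header, (cell.strip() for cell in row))) for row in rows[1:]]


def to_number(value):
    """Return the numeric part of cells such as "45" or "12 %", else None."""
    match = NUMERIC_CELL.match(value)
    return float(match.group(1)) if match else None


def to_metric_lines(gpus, prefix="example_gpu"):
    """Render numeric fields as `name{gpu="N"} value` lines (hypothetical naming)."""
    lines = []
    for index, gpu in enumerate(gpus):
        for field, value in gpu.items():
            number = to_number(value)
            if number is None:
                continue  # skip text fields such as the GPU name
            name = prefix + "_" + re.sub(r"[^a-zA-Z0-9]", "_", field)
            lines.append(f'{name}{{gpu="{index}"}} {number:g}')
    return lines
```

Calling `to_metric_lines(parse_smi_csv(SAMPLE))` yields one line per numeric field. Text fields such as the GPU name are skipped here, whereas a real exporter would more likely expose them as labels.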

This project is based on a0s/nvidia-smi-exporter. However, this one is written in Go to produce a single, static binary.

If you are a gamer who's into monitoring, you are in for a treat.

Highlights

  • Works on any system that has the nvidia-smi(.exe) binary: Windows, Linux, macOS... No C bindings required
  • Doesn't even need to run on the monitored machine: it can be configured to execute the nvidia-smi command remotely
  • No need for a Docker or Kubernetes environment
  • Auto-discovery of the metric fields nvidia-smi can expose (future-compatible)
  • Comes with its own Grafana dashboard
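One way the field auto-discovery mentioned above could work is by reading the field names out of the help text printed by `nvidia-smi --help-query-gpu`. The sketch below assumes that each field is introduced by a line starting with a double-quoted name (optionally followed by aliases) and that description lines follow. The sample text is an abbreviated, assumed shape rather than captured output, and `discover_fields` is a hypothetical helper, not the exporter's implementation.

```python
import re

# Abbreviated, assumed shape of the `nvidia-smi --help-query-gpu` help text.
SAMPLE_HELP = '''
"timestamp"
The timestamp of when the query was made in format "YYYY/MM/DD HH:MM:SS.msec".

"name" or "gpu_name"
The official product name of the GPU.

"temperature.gpu"
Core GPU temperature, in degrees C.
'''

QUOTED_NAME = re.compile(r'"([a-z0-9_.]+)"')


def discover_fields(help_text):
    """Collect the first quoted name from every line that starts with a quote."""
    fields = []
    for line in help_text.splitlines():
        line = line.strip()
        if not line.startswith('"'):
            continue  # description lines and blank lines
        names = QUOTED_NAME.findall(line)
        if names:
            fields.append(names[0])  # ignore aliases such as "gpu_name"
    return fields
```

Because the list is read from the installed binary at run time, fields added by newer driver versions would be picked up without changes to the parsing code, which is the sense in which such an approach is future-compatible.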

Visualization

You can use the official Grafana dashboard to see your GPU metrics in a nicely visualized way.

Here's what it looks like: [screenshot: Grafana dashboard]

Installation

Installing as a Windows Service

Requirements:

  • Scoop package manager
  • NSSM

Installation steps:

  1. Open a privileged PowerShell prompt (right click - Run as administrator)
  2. Run the following commands:
scoop bucket add nvidia_gpu_exporter https://github.com/utkuozdemir/scoop_nvidia_gpu_exporter.git
scoop install nvidia_gpu_exporter/nvidia_gpu_exporter --global
New-NetFirewallRule -DisplayName "Nvidia GPU Exporter" -Direction Inbound -Action Allow -Protocol TCP -LocalPort 9835
nssm install nvidia_gpu_exporter "C:\ProgramData\scoop\apps\nvidia_gpu_exporter\current\nvidia_gpu_exporter.exe"
Start-Service nvidia_gpu_exporter

By downloading the binaries (macOS/Linux/Windows)

  1. Go to the releases page and download the latest release archive for your platform.
  2. Extract the archive.
  3. Move the binary to somewhere in your PATH.

Sample steps for Linux 64-bit:

$ VERSION=0.1.2
$ wget https://github.com/utkuozdemir/nvidia_gpu_exporter/releases/download/v${VERSION}/nvidia_gpu_exporter_${VERSION}_linux_x86_64.tar.gz
$ tar -xvzf nvidia_gpu_exporter_${VERSION}_linux_x86_64.tar.gz
$ mv nvidia_gpu_exporter /usr/local/bin
$ nvidia_gpu_exporter --help

Usage

The usage of the binary is the following:

usage: nvidia_gpu_exporter [<flags>]

Flags:
  -h, --help                Show context-sensitive help (also try --help-long and --help-man).
      --web.config.file=""  [EXPERIMENTAL] Path to configuration file that can enable TLS or authentication.
      --web.listen-address=":9835"
                            Address to listen on for web interface and telemetry.
      --web.telemetry-path="/metrics"
                            Path under which to expose metrics.
      --nvidia-smi-command="nvidia-smi"
                            Path or command to be used for the nvidia-smi executable
      --query-field-names="AUTO"
                            Comma-separated list of the query fields. You can find out possible fields by running `nvidia-smi --help-query-gpu`. The value `AUTO` will
                            automatically detect the fields to query.
      --log.level=info      Only log messages with the given severity or above. One of: [debug, info, warn, error]
      --log.format=logfmt   Output format of log messages. One of: [logfmt, json]
      --version             Show application version.
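To show how the flags above fit together, here is a small Python sketch that assembles an argument list for the exporter and derives the URL a scraper would request. Only flags from the usage text are used. The helper names `build_args` and `metrics_url` and the `localhost` fallback are assumptions for illustration.

```python
def build_args(listen_address=":9835", telemetry_path="/metrics",
               fields=None, log_level="info"):
    """Assemble an argv list using flags from the usage text above."""
    args = [
        "nvidia_gpu_exporter",
        f"--web.listen-address={listen_address}",
        f"--web.telemetry-path={telemetry_path}",
        f"--log.level={log_level}",
    ]
    if fields:
        # Omitting the flag keeps the documented default, AUTO.
        args.append("--query-field-names=" + ",".join(fields))
    return args


def metrics_url(listen_address=":9835", telemetry_path="/metrics", host="localhost"):
    """Build the scrape URL; an empty bind host (":9835") falls back to `host`."""
    bind_host, _, port = listen_address.rpartition(":")
    return f"http://{bind_host or host}:{port}{telemetry_path}"


# Example (not executed here): start the exporter with two fields only.
#   import subprocess
#   subprocess.Popen(build_args(fields=["name", "temperature.gpu"]))
```

With the defaults, `metrics_url()` points at port 9835 and the `/metrics` path, matching the default values listed in the usage text.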

Remote scraping configuration

The exporter can be configured to scrape metrics from a remote machine.

An example use case is running the exporter on a Raspberry Pi in your home network while scraping the metrics from your PC over SSH.

The exporter supports arbitrary commands with arguments to produce nvidia-smi-like output. Therefore, configuration is pretty straightforward.

Simply override the --nvidia-smi-command command-line argument (replace SSH_USER and SSH_HOST with your SSH user name and host):

nvidia_gpu_exporter --nvidia-smi-command "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null SSH_USER@SSH_HOST nvidia-smi"
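The idea behind a command override like this can be sketched in Python: split the configured command string into an argument list, append the query arguments, and run the result, so that a local `nvidia-smi` and an SSH-wrapped one are handled the same way. How the exporter itself splits and executes the command is not described here, so the use of `shlex` and the helper names below are assumptions.

```python
import shlex
import subprocess


def build_command(smi_command, fields):
    """Split the configured command and append nvidia-smi query arguments."""
    argv = shlex.split(smi_command)  # honours quoting, like a POSIX shell
    argv.append("--query-gpu=" + ",".join(fields))
    argv.append("--format=csv")
    return argv


def run_query(smi_command, fields, timeout=10):
    """Run the (possibly remote) command and return its stdout as text."""
    result = subprocess.run(
        build_command(smi_command, fields),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if ssh or nvidia-smi exits with a non-zero status
    )
    return result.stdout
```

For example, `build_command("ssh SSH_USER@SSH_HOST nvidia-smi", ["name"])` produces an `ssh` invocation whose trailing arguments are passed on to `nvidia-smi` on the remote host.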

Contributing

See CONTRIBUTING for details.
