Closed
Description
Motivation
- Users currently cannot monitor their GPU/NPU utilization in the BAI console. Exporting resource usage as Prometheus metrics would let external tools such as Grafana display utilization data, giving users visibility into how their accelerators are actually being used.
Required Features
- Export Prometheus Metrics:
  - Enable resource usage metrics (e.g., GPU/NPU utilization) to be exported via Prometheus for external monitoring.
    - GPU/NPU real-time usage
    - GPU/NPU cumulative usage
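
As a rough illustration of the two requested metrics, the sketch below renders one gauge (real-time usage) and one counter (cumulative usage) in the Prometheus text exposition format, using only the Python standard library. The metric names `accelerator_utilization_ratio` and `accelerator_busy_seconds_total`, the `device` label, and the `DeviceSample` type are hypothetical and chosen only for this example; they are not taken from the BAI codebase, and a real implementation would more likely use an existing Prometheus client library.

```python
from dataclasses import dataclass


@dataclass
class DeviceSample:
    """One reading for a single accelerator (hypothetical shape)."""
    device: str          # label value, e.g. "gpu0" or "npu1"
    utilization: float   # current utilization as a ratio in [0, 1]
    busy_seconds: float  # total busy time since the exporter started


def render_metrics(samples):
    """Render samples in the Prometheus text exposition format.

    Real-time usage is a gauge (it can go up and down); cumulative
    usage is a counter (it only increases), so rate() works on it.
    Label values are not escaped here, which a real exporter must do.
    """
    lines = [
        "# HELP accelerator_utilization_ratio Current GPU/NPU utilization (0-1).",
        "# TYPE accelerator_utilization_ratio gauge",
    ]
    for s in samples:
        lines.append(
            f'accelerator_utilization_ratio{{device="{s.device}"}} {s.utilization}'
        )
    lines += [
        "# HELP accelerator_busy_seconds_total Cumulative GPU/NPU busy time in seconds.",
        "# TYPE accelerator_busy_seconds_total counter",
    ]
    for s in samples:
        lines.append(
            f'accelerator_busy_seconds_total{{device="{s.device}"}} {s.busy_seconds}'
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metrics([DeviceSample("gpu0", 0.5, 120.0)]), end="")
```

Modeling cumulative usage as a counter rather than a second gauge lets the monitoring side derive average utilization over any window from the rate of increase, without the exporter having to pick a window itself.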
Impact
- Prometheus Integration
  - Metrics export functionality needs to be implemented to expose GPU/NPU utilization data.
- External Monitoring Tools
  - Enables tools like Grafana to visualize and monitor the metrics.
Testing Scenarios
- Integration with Grafana:
  - Test that the exported Prometheus metrics can be visualized in Grafana.
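
Before checking dashboards in Grafana, it may help to confirm that the scraped output contains the expected metric names at all. The following is a minimal sketch of such a check over Prometheus text exposition output; the helper names and the metric names in the usage example are hypothetical, and the parser is deliberately simplified (it does not handle label values that contain `}`).

```python
import re

# Matches a sample line: metric name, optional {labels}, then a value.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def exported_metric_names(text):
    """Return the set of metric names that have at least one sample."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match:
            names.add(match.group(1))
    return names


def missing_metrics(text, expected):
    """Return the expected metric names absent from the scrape, sorted."""
    return sorted(set(expected) - exported_metric_names(text))


if __name__ == "__main__":
    scraped = (
        "# TYPE accelerator_utilization_ratio gauge\n"
        'accelerator_utilization_ratio{device="gpu0"} 0.5\n'
    )
    print(missing_metrics(
        scraped,
        ["accelerator_utilization_ratio", "accelerator_busy_seconds_total"],
    ))
```

An empty result means every expected metric is present in the scrape; any names returned point to an export problem rather than a Grafana configuration problem.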