Victoria Metrics / Operator
Overview for VictoriaMetrics operator v0.25.0 or higher.
Tags
self-monitor, victoriametrics, vmoperator
Panels
Overview
| Name | Description | Thresholds | Repeat |
|---|---|---|---|
| Version | Victoria Metrics operator version. | ||
| CRD Objects count by controller | Number of objects in the Kubernetes cluster, per controller. | Default: Mode: absolute, Level 1: 80 ||
| Uptime | Victoria Metrics operator uptime. | Default: Mode: absolute, Level 1: 80 ||
| Reconciliation rate by controller | Total number of reconciliations per controller. | ||
| Log message rate | Victoria Metrics operator log message rate. | ||
Troubleshooting
| Name | Description | Thresholds | Repeat |
|---|---|---|---|
| Reconcile errors by controller | Non-zero values indicate errors in a CR object definition (typos or incorrect values) or problems connecting to the Kubernetes API. | ||
| Throttled reconciliation events | The operator limits reconciliation to 5 events per 2 seconds. Currently this limit applies only to the vmalert and vmagent controllers. It is intended to reduce load on the Kubernetes cluster and improve operator performance. | ||
| Working queue depth | Number of objects waiting in the queue for reconciliation. Non-zero values indicate that the operator cannot process CR object changes with the resources it has been given. | ||
| Reconciliation latency by controller | For controllers that manage a StatefulSet, latency greater than 3 seconds is acceptable. These controllers are vmalertmanager, vmcluster, and vmagent in statefulMode. For other controllers, latency greater than 1 second may indicate issues with the Kubernetes cluster or with the operator's performance. | ||
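
The rules of thumb in the table above can be written down as a small check. The following is only an illustrative sketch: the function names, argument names, and messages are hypothetical and are not part of the operator or the dashboard. It assumes you have already read the three values (error rate, queue depth, latency in seconds) off the panels for one controller.

```python
# Controllers the table describes as managing a StatefulSet.
# vmagent only belongs to this group when it runs in statefulMode.
STATEFUL_CONTROLLERS = {"vmalertmanager", "vmcluster"}


def latency_limit_seconds(controller, stateful_mode=False):
    """Return the latency (seconds) above which the table suggests a problem.

    Returns None for StatefulSet-based controllers, because the table gives
    no upper bound for them (latency above 3 seconds is acceptable).
    """
    if controller in STATEFUL_CONTROLLERS:
        return None
    if controller == "vmagent" and stateful_mode:
        return None
    return 1.0


def triage(controller, error_rate, queue_depth, latency_seconds, stateful_mode=False):
    """Collect findings for one controller's Troubleshooting panel readings."""
    findings = []
    if error_rate > 0:
        findings.append("reconcile errors: check the CR definition and Kubernetes API connectivity")
    if queue_depth > 0:
        findings.append("queue depth: the operator may need more resources")
    limit = latency_limit_seconds(controller, stateful_mode)
    if limit is not None and latency_seconds > limit:
        findings.append("latency: check the Kubernetes cluster and operator performance")
    return findings
```

For example, `triage("vmalert", 0.1, 2, 1.5)` reports all three findings, while `triage("vmcluster", 0, 0, 4.0)` reports none, because vmcluster manages a StatefulSet.
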
Resources
| Name | Description | Thresholds | Repeat |
|---|---|---|---|
| Memory usage ($instance) | Victoria Metrics operator memory usage. | ||
| CPU ($instance) | Victoria Metrics operator CPU usage. | ||
| Goroutines ($instance) | Total number of goroutines. | ||
| GC duration ($instance) | Victoria Metrics operator average GC duration. | ||