Victoria Metrics / Operator
Overview for VictoriaMetrics operator v0.25.0 or higher
Tags
self-monitor
victoriametrics
vmoperator
Panels
Overview
Name | Description | Thresholds | Repeat |
---|---|---|---|
Version | Victoria Metrics operator version. | ||
CRD Objects count by controller | Number of objects in the Kubernetes cluster per controller. | Default: Mode: absolute Level 1: 80 | |
Uptime | Victoria Metrics operator uptime. | Default: Mode: absolute Level 1: 80 | |
Reconciliation rate by controller | Total number of reconciliations per controller. | ||
Log message rate | Victoria Metrics operator log message rate. | ||
Troubleshooting
Name | Description | Thresholds | Repeat |
---|---|---|---|
reconcile errors by controller | Non-zero values indicate errors in the CR object definition (typos or incorrect values) or problems connecting to the Kubernetes API. | ||
throttled reconciliation events | The operator limits reconciliation to 5 events per 2 seconds. For now, this limit applies only to the vmalert and vmagent controllers. It should reduce load on the Kubernetes cluster and improve operator performance. | ||
Working queue depth | Number of objects waiting in the queue for reconciliation. Non-zero values indicate that the operator cannot process CR object changes with the given resources. | ||
Reconciliation latency by controller | For controllers that manage a StatefulSet, latency greater than 3 seconds is acceptable. These include vmalertmanager, vmcluster, and vmagent in statefulMode. For other controllers, latency greater than 1 second may indicate issues with the Kubernetes cluster or the operator's performance. | ||
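
The two numeric rules in the table above (5 reconciliation events per 2 seconds, and the 1-second latency guideline for controllers without a StatefulSet) can be illustrated with a small sketch. It assumes a fixed-window counter for throttling, which may not match how the operator actually implements its limit. The names `FixedWindowThrottle` and `latency_needs_attention` are hypothetical and not part of the operator.

```python
from dataclasses import dataclass

# Values taken from the panel descriptions; everything else here is an assumption.
LIMIT = 5                # reconciliation events allowed per window
WINDOW_SECONDS = 2.0     # window length in seconds
STATEFULSET_CONTROLLERS = {"vmalertmanager", "vmcluster"}


@dataclass
class FixedWindowThrottle:
    """Hypothetical fixed-window limiter: at most `limit` events per `window`."""
    limit: int = LIMIT
    window: float = WINDOW_SECONDS
    window_start: float = 0.0
    used: int = 0
    throttled: int = 0   # roughly what a "throttled events" panel would count

    def allow(self, now: float) -> bool:
        """Return True if an event at time `now` (in seconds) fits the budget."""
        if now - self.window_start >= self.window:
            # A new window begins: reset the budget.
            self.window_start = now
            self.used = 0
        if self.used < self.limit:
            self.used += 1
            return True
        self.throttled += 1
        return False


def latency_needs_attention(controller: str, seconds: float,
                            stateful_mode: bool = False) -> bool:
    """Apply the latency guideline: only controllers without a StatefulSet
    are flagged, and only when latency exceeds 1 second."""
    uses_statefulset = (
        controller in STATEFULSET_CONTROLLERS
        or (controller == "vmagent" and stateful_mode)
    )
    return (not uses_statefulset) and seconds > 1.0


throttle = FixedWindowThrottle()
decisions = [throttle.allow(t) for t in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 1.9, 2.0)]
```

In this sketch the first five events pass, the events at 0.5 s and 1.9 s are throttled, and the event at 2.0 s passes again because a new window has started.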
Resources
Name | Description | Thresholds | Repeat |
---|---|---|---|
Memory usage ($instance) | Victoria Metrics operator memory usage. | ||
CPU ($instance) | Victoria Metrics operator CPU usage. | ||
Goroutines ($instance) | Total number of goroutines. | ||
GC duration ($instance) | Victoria Metrics operator average GC duration. | ||