Overall Cloud Status

This dashboard shows the health status of applications deployed in the cloud platform, of Kubernetes/OpenShift nodes, and of applications deployed outside the cloud.

Tags

  • k8s
  • health

Panels

Kubernetes overview

| Name | Description | Thresholds | Repeat |
| --- | --- | --- | --- |
| API server status | Shows the status of the Kubernetes API server. | Default: mode absolute; level 1: 1 | |
| API servers | Shows the number of API servers. | Default: mode absolute; level 1: 2; level 2: 3 | |
| API server requests | Shows the number of requests to the API server, in requests per minute. | | |
| API server errors | Shows errors in requests to the API server. | Default: mode absolute; level 1: 1; level 2: 3 | |
| ETCD status | Shows the status of the etcd cluster. May contain no data for PaaS clouds. | | |
| ETCD servers | Shows the number of active etcd servers. May contain no data for PaaS clouds. | Default: mode absolute; level 1: 1; level 2: 3 | |
| ETCD requests | Shows the number of requests per second to etcd servers. May contain no data for PaaS clouds. | Default: mode absolute; level 1: 1; level 2: 500 | |
| ETCD server request error | Shows the percentage of failed requests to the etcd server. May contain no data for PaaS clouds. | Default: mode absolute; level 1: 1; level 2: 3 | |
| API server nodes status | Shows the status of each API server in the cluster: 1 = OK, 0 = problem. | | |
| API server failed requests | Shows errors in requests to the API server, in operations per minute. | | |
| Etcd nodes status | Shows the status of each etcd pod in the cluster: 1 = OK, 0 = problem. May contain no data for PaaS clouds. | | |
| ETCD failed requests | Shows the number of errors per minute in requests to the etcd server. May contain no data for PaaS clouds. | | |
| Total CPU usage | Shows overall CPU usage. | Default: mode absolute; level 1: 75; level 2: 90 | |
| Total Memory usage | Shows overall RAM usage across all nodes relative to the total RAM available on all nodes. | Default: mode absolute; level 1: 75; level 2: 90 | |
| Total Filesystem usage | Shows overall file system usage on Kubernetes cluster nodes. | Default: mode absolute; level 1: 75; level 2: 90 | |
| Used cores | Shows the cores used in the cloud, in cores (1 core = 1000 millicores). | | |
| Total cores | Shows the total cores available in the cloud. | | |
| Used memory | Shows the total memory used in the cloud. | | |
| Total memory | Shows the total memory available in the cloud. | | |
| Used space | Shows the sum of used space for directories and files on all nodes in the cloud where fstype == xfs \| ext.. File systems such as tmpfs and rootfs are therefore excluded from the value. | | |
| Total space | Shows the total available space for directories and files on all nodes in the cloud where fstype == xfs \| ext.. File systems such as tmpfs and rootfs are therefore excluded from the value. | | |
| Number of nodes | Shows the number of active Kubernetes cluster nodes. | Default: mode absolute; level 1: 1; level 2: 3 | |
| Nodes Unavailable | Shows the number of unavailable nodes. | Default: mode absolute; level 1: 1 | |
| Running Pods | Shows the total number of running pods in the cluster. Only pods with status = ready are counted. | | |
| Running containers | Shows the total number of running containers in pods. | | |
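
The Used space and Total space panels count only certain file system types, and the Used cores panel reports cores rather than millicores. The sketch below illustrates both rules in Python. It is not the dashboard's actual query: the sample record shape, the field names `used_bytes` and `size_bytes`, and the function names are hypothetical, and the reading of `ext..` as "any type starting with ext" is an assumption.

```python
def is_counted_fstype(fstype):
    """Return True for xfs and for types starting with 'ext' (assumed reading of `xfs | ext..`)."""
    return fstype == "xfs" or fstype.startswith("ext")


def sum_space(filesystems, field):
    """Sum `field` over all file systems whose type is counted; tmpfs, rootfs and similar are skipped."""
    return sum(fs[field] for fs in filesystems if is_counted_fstype(fs["fstype"]))


def millicores_to_cores(millicores):
    """Convert millicores to cores (1 core = 1000 millicores)."""
    return millicores / 1000


# Hypothetical samples: one record per mounted file system on a node.
filesystems = [
    {"node": "node-1", "fstype": "xfs", "used_bytes": 40, "size_bytes": 100},
    {"node": "node-1", "fstype": "tmpfs", "used_bytes": 5, "size_bytes": 10},
    {"node": "node-2", "fstype": "ext4", "used_bytes": 25, "size_bytes": 50},
    {"node": "node-2", "fstype": "rootfs", "used_bytes": 7, "size_bytes": 20},
]

used_space = sum_space(filesystems, "used_bytes")    # only the xfs and ext4 records contribute
total_space = sum_space(filesystems, "size_bytes")
used_cores = millicores_to_cores(2500)
```

With these samples, the tmpfs and rootfs records are ignored, so the totals come only from the xfs and ext4 entries.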

Node health

| Name | Description | Thresholds | Repeat |
| --- | --- | --- | --- |
| Node State | Shows the running state of all nodes in the selected cloud. | Default: mode absolute; level 1: 1; level 2: 2 | |
| Nodes Overview | Shows a cluster nodes overview: node uptime; total available CPU and RAM on each node; overall resource usage on each node. Can be grouped by node_label. | Default: mode absolute; level 1: 80 | |
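
Many panels on this dashboard list thresholds in absolute mode with one or two levels. The page does not define how a value is matched against those levels, so the following Python sketch rests on an assumption: a level counts as reached once the panel value is greater than or equal to it, and the highest reached level wins. The function name `threshold_level` is hypothetical.

```python
from bisect import bisect_right


def threshold_level(value, levels):
    """Return 0 if `value` is below every level, otherwise the number of the highest level reached.

    Assumption: level N is reached when value >= levels[N - 1].
    """
    return bisect_right(sorted(levels), value)


# Using the Level 1: 75 / Level 2: 90 defaults of the usage panels:
below = threshold_level(50, [75, 90])      # no level reached
warning = threshold_level(80, [75, 90])    # level 1 reached
critical = threshold_level(95, [75, 90])   # level 2 reached
```

For panels where a low value is the problem, such as a node count, the levels may need to be read in the opposite direction; the sketch does not cover that case.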

Applications health

| Name | Description | Thresholds | Repeat |
| --- | --- | --- | --- |
| Total pods | Shows the total number of pods in the cluster. | | |
| Running pods | Shows the total number of running pods in the cluster. Only pods with status = ready are counted. | | |
| Not running pods | Shows the total number of pods in the cluster that are not running or not healthy. | Default: mode absolute; level 1: 1 | |
| Help | Shows information about the panels in the current section. | | |
| Not Healthy Pods | Shows the reason why a container is currently in the waiting or terminated state. | Default: mode absolute; level 1: 80 | |
| Last Terminated Status | Shows the last reason why a container was in the terminated state. | Default: mode absolute; level 1: 80 | |
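
The Not Healthy Pods and Last Terminated Status panels report container state reasons. The page does not say how the panels obtain this data, so the Python sketch below only illustrates the idea. It assumes input shaped like the `state` and `lastState` fields of a Kubernetes container status; the function names and the sample values are hypothetical.

```python
def current_problem(container_status):
    """Return (state, reason) if the container is waiting or terminated, otherwise None."""
    state = container_status.get("state", {})
    for name in ("waiting", "terminated"):
        details = state.get(name)
        if details is not None:
            return name, details.get("reason", "")
    return None


def last_terminated_reason(container_status):
    """Return the reason of the previous termination, or None if there was none."""
    terminated = container_status.get("lastState", {}).get("terminated")
    if terminated is None:
        return None
    return terminated.get("reason", "")


# Hypothetical container status: currently waiting, previously terminated.
status = {
    "name": "app",
    "state": {"waiting": {"reason": "CrashLoopBackOff"}},
    "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
}

problem = current_problem(status)            # what Not Healthy Pods would describe
last_reason = last_terminated_reason(status)  # what Last Terminated Status would describe
```

A container whose `state` holds only a `running` entry yields `None` from `current_problem`, which corresponds to a pod that would not appear in the Not Healthy Pods panel.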