Overall Cloud Status

This dashboard shows the health status of applications deployed in the cloud platform, of k8s/OpenShift nodes, and of applications deployed outside the cloud.

Tags

k8s
health

Panels

Kubernetes overview

Name	Description	Thresholds
API server status	Shows status of Kubernetes API server.	Default: Mode: absolute Level 1: 1
API servers	Shows number of API servers.	Default: Mode: absolute Level 1: 2 Level 2: 3
API server requests	Shows the number of requests to the API server, in requests per minute.
API server errors	Shows errors in requests to API server.	Default: Mode: absolute Level 1: 1 Level 2: 3
ETCD status	Shows the status of the etcd cluster. May contain no data for PaaS clouds.
ETCD servers	Shows number of active ETCD servers. May contain no data for PaaS clouds.	Default: Mode: absolute Level 1: 1 Level 2: 3
ETCD requests	Shows number of requests per second to ETCD servers. May contain no data for PaaS clouds.	Default: Mode: absolute Level 1: 1 Level 2: 500
ETCD server request error	Shows percent of error requests to ETCD server. May contain no data for PaaS clouds.	Default: Mode: absolute Level 1: 1 Level 2: 3
API server nodes status	Shows status of each API server in the cluster. 1 - OK, 0 - Problem
API server failed requests	Shows errors in requests to API server, operations per minute.
Etcd nodes status	Shows status of each etcd pod in the cluster. 1 - OK, 0 - Problem. May contain no data for PaaS clouds.
ETCD failed requests	Shows number of errors per minute in requests to ETCD server. May contain no data for PaaS clouds.
Total CPU usage	Shows overall CPU usage	Default: Mode: absolute Level 1: 75 Level 2: 90
Total Memory usage	Shows overall RAM usage for all nodes against total available RAM on all nodes.	Default: Mode: absolute Level 1: 75 Level 2: 90
Total Filesystem usage	Shows summary file system usage on Kubernetes cluster nodes	Default: Mode: absolute Level 1: 75 Level 2: 90
Used cores	Shows the number of used cores in the cloud, in cores (1 core = 1000 millicores).
Total cores	Shows the total number of cores available in the cloud.
Used memory	Shows the total memory used in the cloud.
Total memory	Shows the total memory available in the cloud.
Used space	Shows the sum of used space for directories and files on all nodes in the cloud where `fstype == xfs \| ext.`. Other filesystems, such as `tmpfs` and `rootfs`, are excluded from the value.
Total space	Shows the total available space for directories and files on all nodes in the cloud where `fstype == xfs \| ext.`. Other filesystems, such as `tmpfs` and `rootfs`, are excluded from the value.
Number of nodes	Shows number of active Kubernetes cluster nodes	Default: Mode: absolute Level 1: 1 Level 2: 3
Nodes Unavailable	Shows number of unavailable nodes.	Default: Mode: absolute Level 1: 1
Running Pods	Shows the total number of running pods in the cluster. Counts only pods with `status = ready`.
Running containers	Shows the total number of running containers in pods.
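
The `fstype` filter used by the Used space and Total space panels can be illustrated with a minimal Python sketch. It assumes the filter is read as an anchored regular expression, as in Prometheus label matching, so that `ext.` matches `ext4` while `tmpfs` and `rootfs` do not match. The record layout, the field names, and the `sum_space` helper are hypothetical and are not taken from the dashboard itself.

```python
import re

# Assumption: the panel filter is an anchored regex, so "ext." matches
# ext2, ext3, ext4, and so on, while tmpfs and rootfs are rejected.
FSTYPE_PATTERN = re.compile(r"xfs|ext.")


def sum_space(filesystems, field):
    """Sum one numeric field over filesystems whose fstype passes the filter."""
    return sum(
        fs[field]
        for fs in filesystems
        if FSTYPE_PATTERN.fullmatch(fs["fstype"])
    )


# Hypothetical sample data: one record per mounted filesystem on any node.
filesystems = [
    {"node": "node-1", "fstype": "xfs", "size_bytes": 100, "used_bytes": 40},
    {"node": "node-1", "fstype": "tmpfs", "size_bytes": 16, "used_bytes": 1},
    {"node": "node-2", "fstype": "ext4", "size_bytes": 200, "used_bytes": 50},
    {"node": "node-2", "fstype": "rootfs", "size_bytes": 50, "used_bytes": 25},
]

used_space = sum_space(filesystems, "used_bytes")   # counts only xfs and ext4
total_space = sum_space(filesystems, "size_bytes")  # counts only xfs and ext4
```

In this sample only the `xfs` and `ext4` records contribute, so the `tmpfs` and `rootfs` entries do not inflate either value.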

Node health

Name	Description	Thresholds	Repeat
Node State	Shows the running state of all nodes in the selected cloud	Default: Mode: absolute Level 1: 1 Level 2: 2
Nodes Overview	Shows an overview of cluster nodes: node uptime, total available CPU and RAM on the node, and overall resource usage on the node. Can be grouped by node_label.	Default: Mode: absolute Level 1: 80
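
The Thresholds column in these tables lists levels in absolute mode. As a rough illustration, the sketch below assumes step-style thresholds, where a level applies once the panel value is greater than or equal to it. This is an assumption about how the levels are evaluated and is not stated in the tables. The `threshold_level` helper is hypothetical, and the meaning or color attached to each level depends on the panel configuration.

```python
def threshold_level(value, levels):
    """Return 0 for the base level, or the number of the highest level reached.

    `levels` holds the Level 1, Level 2, ... values in ascending order.
    Assumption: a level is reached when value >= that level's bound.
    """
    reached = 0
    for number, bound in enumerate(levels, start=1):
        if value >= bound:
            reached = number
    return reached


# Levels copied from the "Total CPU usage" row: Level 1: 75, Level 2: 90.
cpu_levels = [75, 90]

below_all = threshold_level(60, cpu_levels)   # base level
at_level_1 = threshold_level(80, cpu_levels)  # Level 1 reached
at_level_2 = threshold_level(95, cpu_levels)  # Level 2 reached
```

A panel with a single level, such as Nodes Unavailable, would use a one-element list like `[1]`.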

Applications health

Name	Description	Thresholds
Total pods	Shows the total number of pods in cluster.
Running pods	Shows the total number of running pods in the cluster. Counts only pods with `status = ready`.
Not running pods	Shows the total number of not running / not healthy pods in the cluster.	Default: Mode: absolute Level 1: 1
Help	Shows information about the panels in the current section
Not Healthy Pods	Shows the reason why a container is currently in the waiting or terminated state	Default: Mode: absolute Level 1: 80
Last Terminated Status	Shows the last reason why a container was in the terminated state	Default: Mode: absolute Level 1: 80
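
The Not Healthy Pods panel reports why containers are in the waiting or terminated state. The Python sketch below shows one way such reasons could be tallied. The record layout and the `count_state_reasons` helper are hypothetical, the sample reasons are common Kubernetes values chosen for illustration, and the dashboard's actual queries may work differently.

```python
from collections import Counter


def count_state_reasons(containers):
    """Count (state, reason) pairs for containers that are not running."""
    counts = Counter()
    for container in containers:
        if container["state"] in ("waiting", "terminated"):
            counts[(container["state"], container["reason"])] += 1
    return counts


# Hypothetical sample data: one record per container.
containers = [
    {"pod": "api-0", "state": "running", "reason": None},
    {"pod": "api-1", "state": "waiting", "reason": "CrashLoopBackOff"},
    {"pod": "db-0", "state": "waiting", "reason": "ImagePullBackOff"},
    {"pod": "job-7", "state": "terminated", "reason": "OOMKilled"},
    {"pod": "api-2", "state": "waiting", "reason": "CrashLoopBackOff"},
]

reasons = count_state_reasons(containers)
```

Running containers are skipped, so the result contains only the waiting and terminated entries, grouped by reason.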