# Hardware sizing guide

This document describes how to calculate resource requirements for Platform Monitoring components.

The Platform Monitoring application consists of a set of components, each of which requires a different amount of hardware resources to work smoothly. Some components need a constant amount of resources, while others need resources proportional to the system load. This document describes the overall hardware requirements for the whole application and detailed figures for each component.
## Components of Platform Monitoring stack

Platform Monitoring components can be combined into logical groups by the main factor that drives their resource usage:

- Count of metrics
- Cloud size
- Count of CR/CRD
- Count of users/requests
### 1. Count of metrics

Components in this group depend on the number of metrics per minute. Resource consumption grows with the amount of metrics that need to be collected and processed: points per minute, samples per minute, active targets, etc. all affect resource usage.

- prometheus
- vmSingle
- vmAgent
- prometheus-adapter
- graphite-remote-adapter
- cloudwatch-exporter
- blackbox-exporter
- stackdriver-exporter
- pushgateway
- promitor-agent-scrape
- json-exporter
### 2. Cloud size

Components in this group depend on the number of objects in the cloud. Resource consumption increases on clouds with many pods, nodes, ConfigMaps, Secrets, etc.

- kubeStateMetrics
- nodeExporter
- cert-exporter
- network-latency-exporter
- version-exporter
- alertManager
- vmAlertManager
- vmAlert
### 3. Count of CR/CRD

Components in this group depend on the number of CRs/CRDs. Resource consumption increases when components have to process many CRs/CRDs. Most components in this group are operators: operators work with CRDs and create CRs.

- monitoring-operator
- grafana-operator
- prometheus-operator
- prometheus-adapter-operator
- vmOperator
- configuration-streamer
### 4. Count of users/requests

Components in this group have a UI or handle requests. For example, Grafana has a UI and handles the requests generated by viewing dashboards.
## HWE profiles

Platform Monitoring provides several profiles: `small`, `medium` and `large`. Set the `global.profile` parameter to use one of these profiles; otherwise the `medium` values are used for each component of the monitoring stack.
|        | Nodes | Pods    | Points per minute |
|--------|-------|---------|-------------------|
| Small  | 1-6   | < 100   | < 1Mil            |
| Medium | 6-15  | 100-500 | 1-3Mil            |
| Large  | 15+   | 500+    | 3Mil+             |
- The `Small` profile is suitable for a small cloud with fewer than 6 nodes, fewer than 100 pods and fewer than 1Mil points per minute.
- The `Medium` profile is suitable for a cloud with 6-15 nodes, 100-500 pods and up to about 3Mil points per minute.
- The `Large` profile is suitable for a huge cloud with a big number of nodes, pods and metrics: more than 15 nodes, more than 500 pods, and up to 10Mil points per minute.
You can also specify the `resources` parameter for one or more components to override the values from the profile used for deployment.

For example, the following overrides resources for monitoring-operator and prometheus, while resources for the other components are taken from the `small` profile:
```yaml
global:
  profile: "small"
monitoringOperator:
  resources:
    limits:
      cpu: 100m
      memory: 150Mi
    requests:
      cpu: 50m
      memory: 50Mi
prometheus:
  install: true
  resources:
    requests:
      cpu: 1000m
      memory: 2Gi
    limits:
      cpu: 3000m
      memory: 8Gi
```
NOTE: These profiles don't guarantee stable operation of every component: they can't cover all cases for all clouds. You can increase or override the `resources` parameter if needed. See the examples of resource usage in different clouds below.
NOTE: If you do not set the `profile` parameter, the `medium` resource values are used.
## Hardware sizing

### monitoring-operator

The `monitoring-operator` is deployed as a single Pod with one container. Hardware sizing for this service is constant and does not depend on the system configuration, because all handled resources are processed sequentially.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 50m    | 70m   |
| CPU | limits   | 70m   | 100m   | 200m  |
| RAM | requests | 64Mi  | 64Mi   | 64Mi  |
| RAM | limits   | 256Mi | 256Mi  | 256Mi |

You can override the `resources` parameter for monitoring-operator.
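For instance, a minimal override sketch reusing the `monitoringOperator` key shown in the profile example above (the values here are the Small-profile defaults):

```yaml
monitoringOperator:
  resources:
    requests:
      cpu: 50m
      memory: 64Mi
    limits:
      cpu: 70m
      memory: 256Mi
```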
### prometheus

The `prometheus` server collects metrics from configured targets at given intervals, evaluates rule expressions and displays the results. Resource usage depends on the number of metrics that prometheus has to scrape and process.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 1000m | 2000m  | 2500m |
| CPU | limits   | 2000m | 3500m  | 4000m |
| RAM | requests | 4Gi   | 7Gi    | 15Gi  |
| RAM | limits   | 6Gi   | 12Gi   | 25Gi  |

You can override the `resources` parameter for prometheus.
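A sketch of such an override, using the `prometheus` key from the profile example above with the Small-profile values:

```yaml
prometheus:
  resources:
    requests:
      cpu: 1000m
      memory: 4Gi
    limits:
      cpu: 2000m
      memory: 6Gi
```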
### prometheus-operator

The `prometheus-operator` provides Kubernetes-native deployment and management of prometheus and related monitoring components. Resource usage for prometheus-operator depends on the number of prometheus custom resources: Prometheus, Alertmanager, ThanosRuler, ServiceMonitor, PodMonitor, Probe, PrometheusRule and AlertmanagerConfig.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 30m   | 50m    | 50m   |
| CPU | limits   | 100m  | 100m   | 100m  |
| RAM | requests | 100Mi | 150Mi  | 150Mi |
| RAM | limits   | 250Mi | 250Mi  | 300Mi |

You can override the `resources` parameter for prometheus-operator.
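A sketch with the Small-profile values; the `prometheus.operator` key is an assumption by analogy with the `prometheusAdapter.operator` example below, so check your chart's values file for the exact key:

```yaml
prometheus:
  operator:
    resources:
      requests:
        cpu: 30m
        memory: 100Mi
      limits:
        cpu: 100m
        memory: 250Mi
```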
### prometheus-adapter

The `prometheus-adapter` exposes Prometheus metrics through the Kubernetes metrics APIs, which makes it suitable for use with the autoscaling/v2 Horizontal Pod Autoscaler in Kubernetes 1.6+. It can also replace the metrics server on clusters that already run Prometheus and collect the appropriate metrics. Resource usage depends on the amount of metrics.

|     |          | Small  | Medium | Large  |
|-----|----------|--------|--------|--------|
| CPU | requests | 150m   | 400m   | 500m   |
| CPU | limits   | 250m   | 500m   | 700m   |
| RAM | requests | 1000Mi | 2000Mi | 3000Mi |
| RAM | limits   | 2000Mi | 3000Mi | 5000Mi |

You can override the `resources` parameter for prometheus-adapter.
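A sketch with the Small-profile values, assuming the adapter itself sits under `prometheusAdapter.resources` by analogy with the `prometheusAdapter.operator` key used below:

```yaml
prometheusAdapter:
  resources:
    requests:
      cpu: 150m
      memory: 1000Mi
    limits:
      cpu: 250m
      memory: 2000Mi
```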
### prometheus-adapter-operator

The `prometheus-adapter-operator` provides Kubernetes-native deployment and management of prometheus-adapter and related monitoring components.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 20m   | 30m    | 50m   |
| CPU | limits   | 50m   | 70m    | 100m  |
| RAM | requests | 20Mi  | 30Mi   | 30Mi  |
| RAM | limits   | 50Mi  | 70Mi   | 100Mi |

You can override the `resources` parameter for prometheus-adapter-operator:
```yaml
prometheusAdapter:
  operator:
    resources:
      requests:
        cpu: 20m
        memory: 20Mi
      limits:
        cpu: 50m
        memory: 100Mi
```
### vmOperator

The `vmOperator` manages VictoriaMetrics applications inside a Kubernetes cluster and simplifies this process: it installs, upgrades and manages VictoriaMetrics resources. Resource usage depends on the number of VictoriaMetrics custom resources, e.g. VMServiceScrape, VMPodScrape, VMProbe, etc.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 70m    | 150m  |
| CPU | limits   | 100m  | 150m   | 300m  |
| RAM | requests | 100Mi | 150Mi  | 300Mi |
| RAM | limits   | 200Mi | 300Mi  | 500Mi |

You can override the `resources` parameter for vmOperator:
```yaml
victoriametrics:
  vmOperator:
    resources:
      requests:
        cpu: 200m
        memory: 100Mi
      limits:
        cpu: 400m
        memory: 200Mi
```
### vmAgent

The `vmAgent` is a tiny agent that collects metrics from various sources, relabels and filters the collected metrics, and stores them in VictoriaMetrics. Resource usage depends on the number of metrics that vmAgent has to collect and process.

|     |          | Small | Medium | Large  |
|-----|----------|-------|--------|--------|
| CPU | requests | 100m  | 500m   | 1500m  |
| CPU | limits   | 200m  | 750m   | 2000m  |
| RAM | requests | 100Mi | 400Mi  | 2000Mi |
| RAM | limits   | 250Mi | 800Mi  | 3500Mi |

You can override the `resources` parameter for vmAgent:
```yaml
victoriametrics:
  vmAgent:
    resources:
      requests:
        cpu: 200m
        memory: 100Mi
      limits:
        cpu: 400m
        memory: 200Mi
```
### vmSingle

The `vmSingle` is a single-node VictoriaMetrics TSDB; resource usage depends on the amount of metrics it has to write and read.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 300m  | 1000m  | 2000m |
| CPU | limits   | 600m  | 1500m  | 3000m |
| RAM | requests | 1Gi   | 3Gi    | 7Gi   |
| RAM | limits   | 1.5Gi | 5Gi    | 10Gi  |

You can override the `resources` parameter for vmSingle:
```yaml
victoriametrics:
  vmSingle:
    resources:
      requests:
        cpu: 500m
        memory: 1000Mi
      limits:
        cpu: 1000m
        memory: 2000Mi
```
### vmAlert

The `vmAlert` executes a list of given alerting or recording rules. Resource usage depends on the number of rules in the cloud.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 100m   | 250m  |
| CPU | limits   | 100m  | 150m   | 400m  |
| RAM | requests | 150Mi | 250Mi  | 400Mi |
| RAM | limits   | 200Mi | 400Mi  | 700Mi |

You can override the `resources` parameter for vmAlert:
```yaml
victoriametrics:
  vmAlert:
    resources:
      requests:
        cpu: 50m
        memory: 200Mi
      limits:
        cpu: 200m
        memory: 500Mi
```
### vmAlertManager

The `vmAlertManager` is a deployment that uses Alertmanager to handle alerts sent by client applications.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 30m   | 100m   | 150m  |
| CPU | limits   | 70m   | 150m   | 200m  |
| RAM | requests | 50Mi  | 100Mi  | 150Mi |
| RAM | limits   | 100Mi | 150Mi  | 200Mi |

You can override the `resources` parameter for vmAlertManager:
```yaml
victoriametrics:
  vmAlertManager:
    resources:
      requests:
        cpu: 30m
        memory: 56Mi
      limits:
        cpu: 100m
        memory: 256Mi
```
### vmAuth

The `vmAuth` is a simple auth proxy, router and load balancer for VictoriaMetrics. Resource usage increases with the number of users and proxied requests.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 100m   | 200m  |
| CPU | limits   | 100m  | 200m   | 350m  |
| RAM | requests | 100Mi | 150Mi  | 250Mi |
| RAM | limits   | 200Mi | 250Mi  | 400Mi |

You can override the `resources` parameter for vmAuth:
```yaml
victoriametrics:
  vmAuth:
    resources:
      requests:
        cpu: 50m
        memory: 200Mi
      limits:
        cpu: 200m
        memory: 500Mi
```
### vmSelect

The `vmSelect` is the cluster-mode VictoriaMetrics TSDB component used for reading data. Resource usage depends on the amount of metrics that has to be read.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 30m   | 100m   | 150m  |
| CPU | limits   | 70m   | 150m   | 200m  |
| RAM | requests | 50Mi  | 100Mi  | 150Mi |
| RAM | limits   | 100Mi | 150Mi  | 200Mi |

You can override the `resources` parameter for vmSelect.
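A sketch with the Small-profile values; the `victoriametrics.vmSelect` key is an assumption following the pattern of the other VictoriaMetrics components in this document, so verify it against your chart's values:

```yaml
victoriametrics:
  vmSelect:
    resources:
      requests:
        cpu: 30m
        memory: 50Mi
      limits:
        cpu: 70m
        memory: 100Mi
```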
### vmInsert

The `vmInsert` is the cluster-mode VictoriaMetrics TSDB component used for writing data. Resource usage depends on the amount of metrics that has to be written.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 30m   | 100m   | 150m  |
| CPU | limits   | 70m   | 150m   | 200m  |
| RAM | requests | 50Mi  | 100Mi  | 150Mi |
| RAM | limits   | 100Mi | 150Mi  | 200Mi |

You can override the `resources` parameter for vmInsert.
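A sketch with the Small-profile values; the `victoriametrics.vmInsert` key is an assumption following the pattern of the other VictoriaMetrics components in this document:

```yaml
victoriametrics:
  vmInsert:
    resources:
      requests:
        cpu: 30m
        memory: 50Mi
      limits:
        cpu: 70m
        memory: 100Mi
```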
### vmStorage

The `vmStorage` is the cluster-mode VictoriaMetrics TSDB component used for storing data. Resource usage depends on the amount of metrics that has to be stored.

|     |          | Small | Medium | Large  |
|-----|----------|-------|--------|--------|
| CPU | requests | 300m  | 500m   | 1000m  |
| CPU | limits   | 300m  | 500m   | 1000m  |
| RAM | requests | 256Mi | 512Mi  | 1024Mi |
| RAM | limits   | 256Mi | 512Mi  | 1024Mi |

You can override the `resources` parameter for vmStorage:
```yaml
victoriametrics:
  vmStorage:
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
      limits:
        cpu: 500m
        memory: 512Mi
```
### grafana

The `grafana` component queries, alerts on and visualizes metrics collected by prometheus or VictoriaMetrics. The number and complexity of dashboards affect resource usage.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 250m  | 400m   | 700m  |
| CPU | limits   | 400m  | 500m   | 900m  |
| RAM | requests | 200Mi | 350Mi  | 600Mi |
| RAM | limits   | 300Mi | 450Mi  | 700Mi |

You can override the `resources` parameter for grafana.
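A sketch with the Small-profile values, assuming a top-level `grafana.resources` key that matches the `grafana.imageRenderer` example below:

```yaml
grafana:
  resources:
    requests:
      cpu: 250m
      memory: 200Mi
    limits:
      cpu: 400m
      memory: 300Mi
```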
### grafana-operator

The `grafana-operator` is a Kubernetes operator built to help you manage Grafana instances inside and outside of Kubernetes. Resource usage for grafana-operator depends on the number of Grafana custom resources: Grafana, GrafanaDashboard, GrafanaDataSource, GrafanaFolder, GrafanaNotificationChannel.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 30m   | 50m    | 150m  |
| CPU | limits   | 70m   | 100m   | 250m  |
| RAM | requests | 50Mi  | 150Mi  | 200Mi |
| RAM | limits   | 100Mi | 250Mi  | 350Mi |

You can override the `resources` parameter for grafana-operator.
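A sketch with the Small-profile values; the `grafana.operator` key is an assumption by analogy with `prometheusAdapter.operator`, so confirm the exact key in your chart's values:

```yaml
grafana:
  operator:
    resources:
      requests:
        cpu: 30m
        memory: 50Mi
      limits:
        cpu: 70m
        memory: 100Mi
```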
### grafana-image-renderer

The `grafana-image-renderer` renders panels and dashboards to PNGs using a headless browser (Chromium). Rendering images requires a lot of memory, mainly because Grafana creates browser instances in the background for the actual rendering. We recommend a minimum of 16GB of free memory on the system rendering images on clouds with a big number of dashboards. Rendering multiple images in parallel requires an even bigger memory footprint.

|     |          | Small | Medium | Large  |
|-----|----------|-------|--------|--------|
| CPU | requests | 100m  | 300m   | 500m   |
| CPU | limits   | 200m  | 500m   | 800m   |
| RAM | requests | 200Mi | 500Mi  | 1000Mi |
| RAM | limits   | 400Mi | 800Mi  | 2000Mi |

You can override the `resources` parameter for grafana-image-renderer:
```yaml
grafana:
  imageRenderer:
    resources:
      requests:
        cpu: 150m
        memory: 250Mi
      limits:
        cpu: 300m
        memory: 500Mi
```
### alertManager

The `alertmanager` handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping and routing them to the correct receiver integration. Resource usage depends on the number of alerts it has to watch and process.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 70m    | 150m  |
| CPU | limits   | 100m  | 120m   | 200m  |
| RAM | requests | 50Mi  | 100Mi  | 200Mi |
| RAM | limits   | 100Mi | 150Mi  | 300Mi |

You can override the `resources` parameter for alertManager.
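A sketch with the Small-profile values; the `alertManager` key is an assumption mirroring the component name used in this document:

```yaml
alertManager:
  resources:
    requests:
      cpu: 50m
      memory: 50Mi
    limits:
      cpu: 100m
      memory: 100Mi
```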
### kubeStateMetrics

The `kubeStateMetrics` is a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects. It is not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods. Resource usage depends on the number of objects in the cloud.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 70m    | 100m  |
| CPU | limits   | 100m  | 150m   | 200m  |
| RAM | requests | 50Mi  | 120Mi  | 200Mi |
| RAM | limits   | 100Mi | 200Mi  | 300Mi |

You can override the `resources` parameter for kubeStateMetrics.
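A sketch with the Small-profile values; the `kubeStateMetrics` key is an assumption mirroring the component name:

```yaml
kubeStateMetrics:
  resources:
    requests:
      cpu: 50m
      memory: 50Mi
    limits:
      cpu: 100m
      memory: 100Mi
```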
### pushgateway

The `pushgateway` exists to allow ephemeral and batch jobs to expose their metrics to Prometheus. Since these kinds of jobs may not live long enough to be scraped, they can instead push their metrics to a Pushgateway, which then exposes them to Prometheus.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 150m   | 250m  |
| CPU | limits   | 70m   | 250m   | 400m  |
| RAM | requests | 30Mi  | 100Mi  | 150Mi |
| RAM | limits   | 50Mi  | 150Mi  | 250Mi |

You can override the `resources` parameter for pushgateway.
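A sketch with the Small-profile values; the `pushgateway` key is an assumption mirroring the component name:

```yaml
pushgateway:
  resources:
    requests:
      cpu: 50m
      memory: 30Mi
    limits:
      cpu: 70m
      memory: 50Mi
```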
### promxy

The `promxy` is a Prometheus proxy that makes many shards of Prometheus appear as a single API endpoint to the user. This significantly simplifies operation and use of Prometheus at scale (when you have more than one Prometheus host). Promxy delivers this unified access endpoint without requiring any sidecars, custom builds or other changes to your Prometheus infrastructure.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 100m   | 200m  |
| CPU | limits   | 100m  | 150m   | 250m  |
| RAM | requests | 100Mi | 200Mi  | 300Mi |
| RAM | limits   | 150Mi | 250Mi  | 350Mi |

You can override the `resources` parameter for promxy.
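A sketch with the Small-profile values; the `promxy` key is an assumption mirroring the component name:

```yaml
promxy:
  resources:
    requests:
      cpu: 50m
      memory: 100Mi
    limits:
      cpu: 100m
      memory: 150Mi
```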
### promxy-configmap-reloader

The `promxy-configmap-reloader` is a sidecar that reloads the promxy configuration when its ConfigMap changes.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 10m   | 10m    | 15m   |
| CPU | limits   | 15m   | 15m    | 20m   |
| RAM | requests | 6Mi   | 10Mi   | 15Mi  |
| RAM | limits   | 15Mi  | 15Mi   | 20Mi  |

You can override the `resources` parameter for promxy-configmap-reloader.
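A sketch with the Small-profile values; the `promxy.configmapReload` key is purely an assumption for illustration, so check your chart's values for the actual key of the reloader sidecar:

```yaml
promxy:
  configmapReload:
    resources:
      requests:
        cpu: 10m
        memory: 6Mi
      limits:
        cpu: 15m
        memory: 15Mi
```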
### graphite-remote-adapter

The `graphite-remote-adapter` is a read/write adapter that receives samples via Prometheus's remote write protocol and stores them in remote storage such as Graphite. Resource usage depends on the number of samples that have to be read and written.

|     |          | Small | Medium | Large  |
|-----|----------|-------|--------|--------|
| CPU | requests | 100m  | 250m   | 500m   |
| CPU | limits   | 150m  | 400m   | 750m   |
| RAM | requests | 150Mi | 400Mi  | 1000Mi |
| RAM | limits   | 250Mi | 700Mi  | 1500Mi |

You can override the `resources` parameter for graphite-remote-adapter:
```yaml
graphite_remote_adapter:
  resources:
    requests:
      cpu: 200m
      memory: 300Mi
    limits:
      cpu: 500m
      memory: 1000Mi
```
### promitor-agent-scrape

The `promitor-agent-scrape` is an Azure Monitor scraper that makes metrics available through a scraping endpoint for Prometheus. Resource usage depends on the number of metrics that have to be collected and processed.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 70m   | 150m   | 200m  |
| CPU | limits   | 100m  | 200m   | 400m  |
| RAM | requests | 100Mi | 200Mi  | 250Mi |
| RAM | limits   | 150Mi | 200Mi  | 500Mi |

You can override the `resources` parameter for promitor-agent-scrape.
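A sketch with the Small-profile values; the `promitorAgentScrape` key is an assumption (the chart key naming is not uniform in this document, compare `graphite_remote_adapter`), so verify it in your values file:

```yaml
promitorAgentScrape:
  resources:
    requests:
      cpu: 70m
      memory: 100Mi
    limits:
      cpu: 100m
      memory: 150Mi
```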
### nodeExporter

The `nodeExporter` is a Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors. The number and size of nodes affect resource usage.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 30m   | 50m    | 50m   |
| CPU | limits   | 50m   | 70m    | 100m  |
| RAM | requests | 30Mi  | 50Mi   | 50Mi  |
| RAM | limits   | 50Mi  | 70Mi   | 100Mi |

You can override the `resources` parameter for nodeExporter.
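A sketch with the Small-profile values; the `nodeExporter` key is an assumption mirroring the component name:

```yaml
nodeExporter:
  resources:
    requests:
      cpu: 30m
      memory: 30Mi
    limits:
      cpu: 50m
      memory: 50Mi
```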
### cert-exporter

The `cert-exporter` is designed to parse certificates and export expiration information for Prometheus to scrape. Kubernetes uses PKI certificates for authentication between all major components. These certs are critical for the operation of your cluster but are often opaque to an administrator.
#### DaemonSet

The `cert-exporter` DaemonSet collects certs from files and/or kubeconfig.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 10m   | 20m    | 30m   |
| CPU | limits   | 20m   | 40m    | 50m   |
| RAM | requests | 20Mi  | 30Mi   | 50Mi  |
| RAM | limits   | 30Mi  | 50Mi   | 70Mi  |

You can override the `resources` parameter for the cert-exporter DaemonSet.
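A sketch with the Small-profile values; the `certExporter.daemonset` key is an assumption for illustration, so confirm the exact key in your chart's values:

```yaml
certExporter:
  daemonset:
    resources:
      requests:
        cpu: 10m
        memory: 20Mi
      limits:
        cpu: 20m
        memory: 30Mi
```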
#### Deployment

The `cert-exporter` Deployment collects certs from secrets.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 10m   | 20m    | 30m   |
| CPU | limits   | 20m   | 30m    | 50m   |
| RAM | requests | 30Mi  | 70Mi   | 100Mi |
| RAM | limits   | 50Mi  | 150Mi  | 200Mi |

You can override the `resources` parameter for the cert-exporter Deployment.
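A sketch with the Small-profile values; the `certExporter.deployment` key is an assumption for illustration, so confirm the exact key in your chart's values:

```yaml
certExporter:
  deployment:
    resources:
      requests:
        cpu: 10m
        memory: 30Mi
      limits:
        cpu: 20m
        memory: 50Mi
```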
### blackbox-exporter

The `blackbox-exporter` allows blackbox probing of endpoints over HTTP, HTTPS, DNS, TCP, ICMP and gRPC.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 20m   | 50m    | 100m  |
| CPU | limits   | 30m   | 70m    | 150m  |
| RAM | requests | 20Mi  | 50Mi   | 100Mi |
| RAM | limits   | 50Mi  | 100Mi  | 250Mi |

You can override the `resources` parameter for blackbox-exporter.
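A sketch with the Small-profile values; the `blackboxExporter` key is an assumption mirroring the component name:

```yaml
blackboxExporter:
  resources:
    requests:
      cpu: 20m
      memory: 20Mi
    limits:
      cpu: 30m
      memory: 50Mi
```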
### cloudwatch-exporter

The `cloudwatch-exporter` is an exporter for Amazon CloudWatch. The number of metrics (points, samples) affects resource usage for the `cloudwatch-exporter` deployment.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 100m   | 150m  |
| CPU | limits   | 70m   | 150m   | 250m  |
| RAM | requests | 100Mi | 150Mi  | 200Mi |
| RAM | limits   | 150Mi | 250Mi  | 300Mi |

You can override the `resources` parameter for cloudwatch-exporter.
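A sketch with the Small-profile values; the `cloudwatchExporter` key is an assumption mirroring the component name:

```yaml
cloudwatchExporter:
  resources:
    requests:
      cpu: 50m
      memory: 100Mi
    limits:
      cpu: 70m
      memory: 150Mi
```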
### json-exporter

The `json-exporter` is a Prometheus exporter that scrapes remote JSON by JSONPath.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 100m   | 200m  |
| CPU | limits   | 70m   | 150m   | 300m  |
| RAM | requests | 100Mi | 150Mi  | 250Mi |
| RAM | limits   | 150Mi | 200Mi  | 350Mi |

You can override the `resources` parameter for json-exporter.
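A sketch with the Small-profile values; the `jsonExporter` key is an assumption mirroring the component name:

```yaml
jsonExporter:
  resources:
    requests:
      cpu: 50m
      memory: 100Mi
    limits:
      cpu: 70m
      memory: 150Mi
```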
### network-latency-exporter

The `network-latency-exporter` is a service that collects RTT and TTL metrics for a list of target hosts and sends the collected data to Prometheus. UDP, TCP or ICMP can be used as the network protocol for probe packets. The service collects metrics with the `mtr` tool, which combines the functionality of `ping` and `traceroute`. Target hosts can be discovered automatically by retrieving all Kubernetes cluster nodes. Resource usage depends on the number of targets: more targets produce more metrics.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 70m   | 150m   | 200m  |
| CPU | limits   | 150m  | 250m   | 300m  |
| RAM | requests | 100Mi | 200Mi  | 250Mi |
| RAM | limits   | 200Mi | 300Mi  | 350Mi |

You can override the `resources` parameter for network-latency-exporter:
```yaml
networkLatencyExporter:
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 200m
      memory: 256Mi
```
### stackdriver-exporter

The `stackdriver-exporter` is a Prometheus exporter for Google Stackdriver Monitoring metrics. It acts as a proxy that requests the Stackdriver API for the metric's time series every time Prometheus scrapes it. The number of metrics (points, samples) affects resource usage for the `stackdriver-exporter` deployment.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 50m   | 100m   | 250m  |
| CPU | limits   | 100m  | 150m   | 350m  |
| RAM | requests | 70Mi  | 150Mi  | 300Mi |
| RAM | limits   | 150Mi | 200Mi  | 400Mi |

You can override the `resources` parameter for stackdriver-exporter.
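A sketch with the Small-profile values; the `stackdriverExporter` key is an assumption mirroring the component name:

```yaml
stackdriverExporter:
  resources:
    requests:
      cpu: 50m
      memory: 70Mi
    limits:
      cpu: 100m
      memory: 150Mi
```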
### version-exporter

The `version-exporter` is a tool that collects product, project and third-party versions of an application and stores the results in custom Prometheus metrics.

|     |          | Small | Medium | Large |
|-----|----------|-------|--------|-------|
| CPU | requests | 100m  | 150m   | 200m  |
| CPU | limits   | 150m  | 200m   | 300m  |
| RAM | requests | 200Mi | 250Mi  | 300Mi |
| RAM | limits   | 250Mi | 300Mi  | 400Mi |

You can override the `resources` parameter for version-exporter.
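A sketch with the Small-profile values; the `versionExporter` key is an assumption mirroring the component name:

```yaml
versionExporter:
  resources:
    requests:
      cpu: 100m
      memory: 200Mi
    limits:
      cpu: 150m
      memory: 250Mi
```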
## Examples

The examples below show that the number of nodes, pods and metrics doesn't affect all components equally: the resource usage of each component depends on many factors.

These are examples from real clouds. They show that an equal number of pods and nodes doesn't guarantee equal resource usage.

For example, several clouds can be compared:
| Cloud | Nodes | Pods | Active targets | Dashboards | Monitoring-operator (CPU / RAM) | Prometheus (CPU / RAM) | Grafana (CPU / RAM) |
|-------|-------|------|----------------|------------|---------------------------------|------------------------|---------------------|
| #1    | 6     | 1000 | 550            | 100        | 23m / 50Mi                      | 2200m / 12.5Gi         | 270m / 120Mi        |
| #2    | 21    | 400  | 170            | 30         | 21m / 56Mi                      | 450m / 7Gi             | 1100m / 240Mi       |
| #3    | 40    | 2000 | 220            | 30         | 25m / 70Mi                      | 530m / 7.5Gi           | 200m / 130Mi        |
These are quite different clouds. What does this mean?

- Cloud #1 has more pods than cloud #2 because the nodes on cloud #1 are bigger than on cloud #2.
- Cloud #1 has many active targets, so its resource usage is higher than on both other clouds.
- Clouds #2 and #3 have an equal number of dashboards but very different Grafana resource usage. This is because cloud #2 has more complex dashboards, which require more CPU and RAM.
- Monitoring-operator resource usage is almost equal across these clouds because all handled resources are processed sequentially.
- Prometheus on clouds #2 and #3 uses about the same amount of resources, because the numbers of active targets and metrics on them are similar.
### Example with prometheus

Each component cell shows CPU / RAM usage; `-` means the component was not deployed or no data was collected.

| Nodes | Pods | Active targets | Dashboards | Monitoring-operator | Prometheus | Prometheus-operator | Grafana | Grafana-operator | AlertManager | KubeStateMetrics | Graphite-remote-adapter | Cert-exporter | CloudWatch-exporter | Network-latency-exporter |
|-------|------|----------------|------------|---------------------|------------|---------------------|---------|------------------|--------------|------------------|-------------------------|---------------|---------------------|--------------------------|
| 6     | 500  | 289            | 65         | 7m / 70Mi           | 400m / 10Gi    | 4m / 120Mi  | 230m / 120Mi  | 20m / 100Mi | 6m / 50Mi  | 5m / 60Mi   | 410m / 3.5Gi  | 6m / 100Mi | 5m / 160Mi | 10m / 250Mi |
| 12    | 440  | 226            | 59         | 25m / 60Mi          | 1150m / 5.6Gi  | 10m / 50Mi  | 1150m / 85Mi  | 100m / 70Mi | 10m / 30Mi | 5m / 55Mi   | 325m / 800Mi  | -          | -          | -           |
| 16    | -    | 121            | -          | 21m / 105Mi         | 290m / 6.2Gi   | 15m / 250Mi | -             | -           | -          | 10m / 125Mi | 600m / 950Mi  | -          | -          | -           |
| 19    | -    | 153            | -          | 22m / 110Mi         | 450m / 5.5Gi   | 15m / 190Mi | -             | -           | -          | 5m / 100Mi  | 360m / 750Mi  | -          | -          | -           |
| 28    | 290  | 156            | 79         | 100m / 195Mi        | 1600m / 6.4Gi  | 5m / 100Mi  | 60m / 125Mi   | 30m / 115Mi | 10m / 40Mi | 5m / 90Mi   | 720m / 850Mi  | -          | -          | -           |
| 37    | 228  | 129            | 30         | 20m / 56Mi          | 750m / 7.2Gi   | 7m / 160Mi  | 430m / 165Mi  | 20m / 70Mi  | 10m / 50Mi | 10m / 140Mi | 450m / 1150Mi | -          | -          | -           |
### Example with victoriaMetrics

Each component cell shows CPU / RAM usage.

| Nodes | Pods | Samples per second | vmOperator | vmAgent | NodeExporter | KubeStateMetrics |
|-------|------|--------------------|------------|---------|--------------|------------------|
| 15    | 600  | 6K                 | 100m / 52Mi  | 90m / 200Mi    | 30m / 16Mi | 10m / 50Mi  |
| 16    | 370  | 4K                 | 100m / 53Mi  | 70m / 180Mi    | 20m / 25Mi | 10m / 35Mi  |
| 26    | 2700 | 43K                | 100m / 320Mi | 470m / 440Mi   | 20m / 55Mi | 10m / 140Mi |
| 41    | 3000 | 47K                | 100m / 180Mi | 480m / 550Mi   | 10m / 30Mi | 10m / 230Mi |
| 57    | 3700 | 41K                | 100m / 200Mi | 380m / 340Mi   | 10m / 35Mi | 10m / 180Mi |
| 65    | 4500 | 67K                | 100m / 270Mi | 710m / 1100Mi  | 10m / 19Mi | 40m / 270Mi |
| 73    | 7500 | 114K               | 100m / 340Mi | 1600m / 2200Mi | 20m / 22Mi | 30m / 520Mi |