This document provides information about the maintenance of the Monitoring deployment and the configuration of its services.
## Ways to Make Changes
The following sections describe the various ways to make changes in the Monitoring deployment.
### Redeploy/Update
#### Manual Deploy using Helm
NOTE: If you want to deploy the monitoring-operator into Kubernetes v1.15 or lower, or OpenShift v3.11 or lower, you must work with v1beta1 CRDs manually. For more information, see Work with legacy CRDs.
This chart installs the Monitoring Operator, which can create, configure, and manage Prometheus and its components in Kubernetes/OpenShift.
##### Installing the Chart
To install the chart with the `monitoring-operator` release name, use the following command.
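The command itself appears to have been lost in formatting. Assuming a Helm 3 client and that the chart lives in `charts/monitoring-operator` (both assumptions), it would look similar to:

```shell
# Hypothetical chart path and namespace; adjust to your environment
helm install monitoring-operator charts/monitoring-operator --namespace <namespace>
```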
The command deploys monitoring-operator on the Kubernetes/OpenShift cluster in the default configuration. The configuration section lists the parameters that can be configured during the installation.
The default installation includes Prometheus Operator, AlertManager, Exporters, and configuration for scraping the Kubernetes/OpenShift infrastructure.
##### Upgrading the Chart
To upgrade the chart with the `monitoring-operator` release name, use the following command.
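The upgrade command was lost in formatting. Assuming the same chart path and release name as for installation (both assumptions), it would look similar to:

```shell
# Hypothetical chart path and namespace; adjust to your environment
helm upgrade monitoring-operator charts/monitoring-operator --namespace <namespace>
```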
The command upgrades monitoring-operator on the Kubernetes/OpenShift cluster in the default configuration. The configuration section lists the parameters that can be configured during the upgrade.
##### Uninstalling the Chart
To uninstall or delete the `monitoring-operator` deployment, use the following command.
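The uninstall command was lost in formatting. Assuming a Helm 3 client (Helm 2 would use `helm delete --purge` instead), it would look similar to:

```shell
# Hypothetical namespace placeholder; adjust to your environment
helm uninstall monitoring-operator --namespace <namespace>
```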
The command removes all the Kubernetes/OpenShift components associated with the chart and deletes the release.
Warning: Note that this step removes the monitoring CRDs, and deleting these CRDs deletes all resources of their type. All resources like ServiceMonitor and GrafanaDashboard are removed from the applications.
CRDs created by this chart are not removed by default and should be manually cleaned up.
- For Kubernetes

  ```shell
  kubectl delete crd platformmonitoring.monitoring.qubership.org
  kubectl delete crd prometheuses.monitoring.coreos.com
  kubectl delete crd prometheusrules.monitoring.coreos.com
  kubectl delete crd servicemonitors.monitoring.coreos.com
  kubectl delete crd podmonitors.monitoring.coreos.com
  kubectl delete crd alertmanagers.monitoring.coreos.com
  kubectl delete crd grafana.integreatly.org
  kubectl delete crd grafanadashboard.integreatly.org
  kubectl delete crd grafanadatasource.integreatly.org
  ```

- For OpenShift

  ```shell
  oc delete crd platformmonitoring.monitoring.qubership.org
  oc delete crd prometheuses.monitoring.coreos.com
  oc delete crd prometheusrules.monitoring.coreos.com
  oc delete crd servicemonitors.monitoring.coreos.com
  oc delete crd podmonitors.monitoring.coreos.com
  oc delete crd alertmanagers.monitoring.coreos.com
  oc delete crd grafana.integreatly.org
  oc delete crd grafanadashboard.integreatly.org
  oc delete crd grafanadatasource.integreatly.org
  ```
### Change Parameters in Custom Resource in Runtime
For deployment and configuration, Monitoring uses the monitoring-operator, which is based on the operator-sdk.
This means that the monitoring-operator controls most of the Monitoring deployment. The settings used to create, update, or remove
any part of the deployment are taken from the Custom Resource (CR) of type `PlatformMonitoring`.

So you can make changes in the Monitoring deployment simply by changing the parameters in the `PlatformMonitoring` CR.
Note: To change the settings in the `PlatformMonitoring` CR, you must have the following permissions on the `PlatformMonitoring` CR:

```yaml
- apiGroups:
    - "monitoring.qubership.org"
  resources:
    - platformmonitorings
  verbs:
    - 'create'
    - 'delete'
    - 'get'
    - 'list'
    - 'update'
```
The monitoring-operator watches resources of type `PlatformMonitoring` and can handle the following events:

- `create` - If the operator receives such an event, it runs the Monitoring deploy in the namespace where the CR was created.
- `update` - If the operator receives such an event, it runs the update of the Monitoring deployment in the namespace where the CR was updated.
- `delete` - If the operator receives such an event, it removes all objects (Deployments, StatefulSets, ConfigMaps, and so on) that it earlier created in the namespace where the CR was removed.
Thus, using the operator, you can change the Monitoring deployment parameters at runtime. To change the parameters:

- Log in to Kubernetes through the `kubectl` CLI or using any UI.
- Find an object of type `PlatformMonitoring` with the name `platformmonitoring` in the namespace where Monitoring is deployed. For example, use the following CLI command.
- Use the following CLI command to open the object for editing.
- Change the necessary parameters and save the object.
- Wait while the operator applies the changes at runtime.
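The CLI commands referenced in the steps above were lost in formatting. Assuming `kubectl` and the default CR name `platformmonitoring`, they might look like:

```shell
# Find the PlatformMonitoring object in the Monitoring namespace
kubectl get platformmonitoring -n <namespace>

# Open the object for editing
kubectl edit platformmonitoring platformmonitoring -n <namespace>
```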
Using this procedure, you can:

- Change the components' images.
- Change any component-specific settings, like the retention period for Prometheus.
- Add or change persistent volumes for Prometheus and Grafana.
- Disable some components or stop the operator from tracking them, and so on.
All parameters that you can specify in the `PlatformMonitoring` objects are described in the API documents. For more details, refer to the Platform Monitoring chapter in the Cloud Platform Monitoring Guide.
### Work with legacy CRDs
If you use Kubernetes v1.15 or lower, or OpenShift v3.11 or lower, you must use CRDs with the v1beta1 API version. However, the Helm chart uses v1 CRDs, which are incompatible with v1beta1 CRDs.

So if you use Kubernetes or OpenShift of the specified version or lower, you have to work with the v1beta1 CRDs contained in the `crds` directory. You also have to use the features of your deployment tools to skip installing CRDs in order to avoid errors.
If you want to deploy the monitoring-operator into such a cluster (Kubernetes v1.15 or lower, or OpenShift v3.11 or lower) for the first time, you have to install the CRDs manually before deploying the other components.
To create the v1beta1 CRDs:

- Log in to the cluster using the `kubectl` client for Kubernetes or the `oc` client for OpenShift.
- Navigate to the `docs/crds` directory.
- Execute the following command for Kubernetes, or its OpenShift equivalent.
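The create commands were lost in formatting. Assuming the CRD manifests sit directly in `docs/crds` (an assumption), they might look like:

```shell
# For Kubernetes (run from the docs/crds directory)
kubectl create -f .

# For OpenShift
oc create -f .
```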
To update the v1beta1 CRDs:

- Log in to the cluster using the `kubectl` client for Kubernetes or the `oc` client for OpenShift.
- Navigate to the `docs/crds` directory.
- Execute the following command for Kubernetes, or its OpenShift equivalent.
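The update commands were lost in formatting. Assuming the CRD manifests sit directly in `docs/crds` (an assumption), they might look like:

```shell
# For Kubernetes (run from the docs/crds directory)
kubectl replace -f .

# For OpenShift
oc replace -f .
```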
To remove the CRDs, use the instructions from the Remove Procedure.
NOTE: If you use v1beta1 CRDs, you must set the following parameters during deploy to skip the CRD creation:

- For manual Helm installation, add `--skip-crds` to the `helm install` or `helm upgrade` command.
## Provided Procedures
The information about the procedures for monitoring deployment is described in the following sections.
### Update Procedure
Except for one important point, the update process is the same as the installation process.

Important: Helm cannot update or remove CRDs; it can only create them. This means that CRDs are created during the first deploy, but during any update or removal of the deployment, Helm will not update or remove the CRDs. For more details, read Issue: CRD update during helm upgrade and Design: CRD Handling in Helm.

During install/update, you need to:

- Create or use previously created configurations.
- Run any job for the update, or run the update manually using Helm.

However, before running the update, you must manually update the CRDs.
NOTE: If you use the monitoring-operator in Kubernetes v1.15 or lower, or OpenShift v3.11 or lower, you must work with v1beta1 CRDs manually. For more information, see Work with legacy CRDs.
To update the CRDs:

- Download the chart. How to find it is described below.
- Navigate to the `charts/monitoring-operator` directory.
- Execute the following command.
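The command itself was lost in formatting. Assuming the CRD manifests ship in the chart's `crds` subdirectory (an assumption), it might look like:

```shell
# Run from the charts/monitoring-operator directory
kubectl apply -f crds/
```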
### Remove Procedure
There are several ways to uninstall or delete the monitoring-operator deployment:

- Remove only the `PlatformMonitoring` CR.
- Remove the `PlatformMonitoring` CR and the operator.
- Remove all created CRDs.
#### Remove PlatformMonitoring CR
The monitoring-operator watches the state of the `PlatformMonitoring` CR and tracks all events. So when you remove this object, the operator removes all objects (Deployments, ConfigMaps, Secrets, and so on) that it created during the deployment.

This method removes Monitoring from the namespace but keeps the operator, which redeploys all the necessary Monitoring components when you decide to create the `PlatformMonitoring` CR again.

To remove the `PlatformMonitoring` CR, you can use the following command.
- For Kubernetes
- For OpenShift
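The delete commands were lost in formatting. Assuming the default CR name `platformmonitoring`, they might look like:

```shell
# Kubernetes
kubectl delete platformmonitoring platformmonitoring -n <namespace>

# OpenShift
oc delete platformmonitoring platformmonitoring -n <namespace>
```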
After executing the command, all Monitoring components are removed from the selected namespace. However, the following objects are not removed:

- `monitoring-operator`
- Some cluster entities, like ClusterRoles, ClusterRoleBindings, Security Context Constraints (SCC), and Pod Security Policies (PSP)
#### Remove PlatformMonitoring CR and Operator
The simplest way to remove the CR and operator is by using the following helm delete command.
To find the release name, use the following command.
The command removes all the Kubernetes/OpenShift components associated with the chart and deletes the release.
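The commands referenced above were lost in formatting. Assuming a Helm 3 client and the `monitoring-operator` release name (both assumptions), they might look like:

```shell
# Find the release name
helm list --namespace <namespace>

# Remove the release (Helm 2 would use `helm delete --purge` instead)
helm uninstall monitoring-operator --namespace <namespace>
```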
#### Remove all Created CRDs
Warning: Note that this step removes CRDs for monitoring, and deleting these CRDs causes the deletion of all resources of their type. It means that all resources like ServiceMonitor and GrafanaDashboard are removed from the applications.
CRDs created by this chart are not removed by default and should be manually cleaned up.
- For Kubernetes

  ```shell
  kubectl delete crd grafanas.integreatly.org
  kubectl delete crd grafanadashboards.integreatly.org
  kubectl delete crd grafanadatasources.integreatly.org
  kubectl delete crd grafananotificationchannels.integreatly.org
  kubectl delete crd alertmanagers.monitoring.coreos.com
  kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
  kubectl delete crd podmonitors.monitoring.coreos.com
  kubectl delete crd probes.monitoring.coreos.com
  kubectl delete crd prometheuses.monitoring.coreos.com
  kubectl delete crd prometheusrules.monitoring.coreos.com
  kubectl delete crd servicemonitors.monitoring.coreos.com
  kubectl delete crd thanosrulers.monitoring.coreos.com
  kubectl delete crd customscalemetricrules.monitoring.qubership.org
  kubectl delete crd platformmonitorings.monitoring.qubership.org
  kubectl delete crd prometheusadapters.monitoring.qubership.org
  ```

- For OpenShift

  ```shell
  oc delete crd grafanas.integreatly.org
  oc delete crd grafanadashboards.integreatly.org
  oc delete crd grafanadatasources.integreatly.org
  oc delete crd grafananotificationchannels.integreatly.org
  oc delete crd alertmanagers.monitoring.coreos.com
  oc delete crd alertmanagerconfigs.monitoring.coreos.com
  oc delete crd podmonitors.monitoring.coreos.com
  oc delete crd probes.monitoring.coreos.com
  oc delete crd prometheuses.monitoring.coreos.com
  oc delete crd prometheusrules.monitoring.coreos.com
  oc delete crd servicemonitors.monitoring.coreos.com
  oc delete crd thanosrulers.monitoring.coreos.com
  oc delete crd customscalemetricrules.monitoring.qubership.org
  oc delete crd platformmonitorings.monitoring.qubership.org
  oc delete crd prometheusadapters.monitoring.qubership.org
  ```
## Deploy/Skip/Remove Base Components
The monitoring-operator controls (deploys/updates/removes) some base components that are included in the Monitoring deployment. The list of components it controls:

- `prometheus-operator` - controls deployment and settings
- `prometheus` - controls only the settings in the Prometheus CR; the deployment is controlled by `prometheus-operator`
- `alertmanager` - controls only the settings in the AlertManager CR; the deployment is controlled by `prometheus-operator`
- `grafana-operator` - controls deployment and settings
- `grafana` - controls only the settings in the Grafana, GrafanaDashboard, and GrafanaDatasource CRs; the deployment is controlled by `grafana-operator`
- `kube-state-metrics` - controls deployment and settings
- `node-exporter` - controls deployment and settings
All these components can be deployed or removed at runtime after the initial deploy. For these purposes, the `PlatformMonitoring` CR contains special parameters.

Some of these components' sections have a special parameter `.<section>.install: [true|false]`. For example:
```yaml
alertManager:
  install: true
grafana:
  install: true
prometheus:
  install: true
kubeStateMetrics:
  install: true
nodeExporter:
  install: true
```
Note: By default, all `.<section>.install` parameters are set to `true`.
Using these parameters, you can skip the deployment of components, remove them after deploy, or deploy them if they were skipped. For example, to skip the Grafana deployment, set `grafana.install` to `false`.
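Based on the `install` parameter described above, a `PlatformMonitoring` fragment that skips Grafana might look like:

```yaml
grafana:
  install: false
```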
For more information about the available deploy parameters, refer to the Installation Guide.
## Pause Reconciliation for Components Controlled by monitoring-operator
As described in the paragraph above, the monitoring-operator controls these components and can create, update, and remove them. This also means that you can't simply change any settings manually: the monitoring-operator watches all objects managed by it and restores the expected settings after you change them.

But in some cases it may be necessary to change settings in a Deployment directly, for example when the CR doesn't allow specifying a necessary parameter.

For this case, almost all component sections in the `PlatformMonitoring` CR have a special parameter `.<section>.paused: [true|false]`.
For example:
```yaml
alertManager:
  paused: false
grafana:
  paused: false
  operator:
    paused: false
prometheus:
  paused: false
  operator:
    paused: false
kubeStateMetrics:
  paused: false
nodeExporter:
  paused: false
```
Note: By default, all `.<section>.paused` parameters are set to `false`.
Warning: We do not recommend using this parameter except in cases of real necessity or during development.
So you can pause the reconciliation process for some components and change their deployments as you want. For example, to change the Node-Exporter DaemonSet manually, pause its section in the CR and then make the changes in the DaemonSet.
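Based on the `paused` parameter described above, a `PlatformMonitoring` fragment that pauses reconciliation for Node-Exporter might look like:

```yaml
nodeExporter:
  paused: true
```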
For more information about the available deploy parameters, refer to the Installation Guide.
## Export Prometheus Data
The information about the various ways to export data from Prometheus is described in the following sections.
### Export using Snapshots
The snapshot endpoint creates a snapshot of all current data into `snapshots/<datetime>-<rand>` under the TSDB's data directory and returns the directory name as a response. It can optionally skip snapshotting the data that is only present in the head block and has not yet been compacted to disk.

The following URL query parameter is available:

- `skip_head=<bool>`: Skip data present in the head block. Optional.
```shell
$ curl -XPOST http://localhost:9090/api/v1/admin/tsdb/snapshot
{
  "status": "success",
  "data": {
    "name": "20171210T211224Z-2be650b6d019eb54"
  }
}
```
The snapshot now exists at `<data-dir>/snapshots/20171210T211224Z-2be650b6d019eb54`.
### Copy PV Content
The Prometheus data can be copied as the content of its PV. To do so, just copy the PV data to any external storage. For example:
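A minimal sketch using `kubectl cp` (the pod name is a placeholder, and the `/prometheus` mount path is an assumption based on the default `--storage.tsdb.path` mentioned below):

```shell
# Copy the Prometheus TSDB data out of a running pod to local storage
kubectl cp <namespace>/<prometheus-pod>:/prometheus ./prometheus-backup
```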
## Import Prometheus Data
The information about the various ways to import data into Prometheus is described in the following sections.
### Import from Snapshots
To use the data from a snapshot, just copy the snapshot to `--storage.tsdb.path=<dir>` and run Prometheus. By default, it is `--storage.tsdb.path=/prometheus`.
### Use Previously Copied PV Content
To use such data, just copy it to `--storage.tsdb.path=<dir>` and run Prometheus. By default, it is `--storage.tsdb.path=/prometheus`.