Monitoring wire-server using Prometheus and Grafana¶
Introduction¶
The following instructions detail the installation of a monitoring system consisting of a Prometheus instance and corresponding Alert Manager in addition to a Grafana instance for viewing dashboards related to cluster and wire-services health.
Prerequisites¶
You need to have wire-server installed, see either of
How to install Prometheus and Grafana on Kubernetes using Helm¶
Note: the following makes use of overrides for helm charts. You may wish to read :ref:`understand-helm-overrides` first.
Create an override file:
mkdir -p wire-server-metrics
curl -sSL https://raw.githubusercontent.com/wireapp/wire-server-deploy/master/values/wire-server-metrics/demo-values.example.yaml > wire-server-metrics/values.yaml
And edit this file by editing/uncommenting as needed with respect to the next sections.
The monitoring system requires disk space if you wish to be resilient to pod failure. This disk space is given to pods by using a so-called “Storage Class”. You have three options:
If you deploy on a kubernetes cluster hosted on AWS you may install the
aws-storage
helm chart which provides configurations of Storage Classes for AWS’s elastic block storage (EBS). For this, install the aws storage classes withhelm upgrade --install aws-storage wire/aws-storage --wait
.
If you’re not using AWS, but you sill want to have persistent metrics, see Using Custom Storage Classes.
If you don’t want persistence at all, see Monitoring without persistent disk.
Once you have a storage class configured (or put the override configuration to not use persistence), next we can install the monitoring suite itself.
There are a few known issues surrounding the prometheus-operator
helm chart.
You will likely have to install the Custom Resource Definitions manually
before installing the wire-server-metrics
chart:
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/alertmanager.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/prometheus.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/prometheusrule.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/servicemonitor.crd.yaml
Now we can install the metrics chart, run the following:
helm upgrade --install wire-server-metrics wire/wire-server-metrics --wait -f wire-server-metrics/values.yaml
See the Prometheus Operator README for more information and troubleshooting help.
Adding Dashboards¶
Grafana dashboard configurations are included as JSON inside the
charts/wire-server-metrics/dashboards
directory. You may import
these via Grafana’s web UI. See Accessing
grafana.
Monitoring in a separate namespace¶
It is advisable to separate your monitoring services from your
application services. To accomplish this you may deploy
wire-server-metrics
into a separate namespace from wire-server
.
Simply provide a different namespace to the helm upgrade --install
calls with --namespace your-desired-namespace
.
The wire-server-metrics chart will monitor all wire services across all namespaces.
Accessing grafana¶
Forward a port from your localhost to the grafana service running in your cluster:
kubectl port-forward service/<release-name>-grafana 3000:80 -n <namespace>
Now you can access grafana at http://localhost:3000
The username and password are stored in the grafana
secret of your
namespace
By default this is:
username:
admin
password:
admin
Accessing prometheus¶
Forward a port from your localhost to the prometheus service running in your cluster:
kubectl port-forward service/<release-name>-prometheus 9090:9090 -n <namespace>
Now you can access prometheus at http://localhost:9090
Customization¶
Monitoring without persistent disk¶
If you wish to deploy monitoring without any persistent disk (not
recommended) you may add the following overrides to your values.yaml
file.
# This configuration switches to use memory instead of disk for metrics services
# NOTE: If the pods are killed you WILL lose all your metrics history
prometheus-operator:
grafana:
persistence:
enabled: false
prometheusSpec:
storageSpec: null
alertmanager:
alertmanagerSpec:
storage: null
Using Custom Storage Classes¶
If you’re using a provider other than AWS please reference the Kubernetes documentation on storage classes for configuring a storage class for your kubernetes cluster.
If you wish to use a different storage class (for instance if you don’t
run on AWS) you may add the following overrides to your values.yaml
file.
prometheus-operator:
grafana:
persistence:
storageClassName: "<my-storage-class>"
prometheusSpec:
storageSpec:
volumeClaimTemplate:
spec:
storageClassName: "<my-storage-class>"
alertmanager:
alertmanagerSpec:
storage:
volumeClaimTemplate:
spec:
storageClassName: "<my-storage-class>"
Troubleshooting¶
“validation failed”¶
If you receive the following error:
Error: validation failed: [unable to recognize "": no matches for kind "Alertmanager" in version
"monitoring.coreos.com/v1", unable to recognize "": no matches for kind "Prometheus" in version
"monitoring.coreos.com/v1", unable to recognize "": no matches for kind "PrometheusRule" in version
Please run the script to install Custom Resource Definitions which is detailed in the installation instructions above.
“object is being deleted”¶
When upgrading you may see the following error:
Error: object is being deleted: customresourcedefinitions.apiextensions.k8s.io "prometheusrules.monitoring.coreos.com" already exists
Helm sometimes has trouble cleaning up or defining Custom Resource Definitions. Try manually deleting the resource definitions and trying your helm install again:
kubectl delete customresourcedefinitions \
alertmanagers.monitoring.coreos.com \
prometheuses.monitoring.coreos.com \
servicemonitors.monitoring.coreos.com \
prometheusrules.monitoring.coreos.com