Now that our service is deployed in the K8s cluster, we want to deploy a monitoring service that automatically watches the resources in the cluster. This blog is divided into two parts. The first part uses Prometheus to monitor the services in our K8s cluster: when the monitoring service finds a problem with a node or pod, it sends an alert as soon as possible so that our deployer can fix the issue quickly. The second part is a set of simple shell scripts that test the cluster's network.
For the K8s cluster, the monitoring is split into three parts: Node, Namespace, and Pod.
I. Node Monitoring
On the node side, we mainly monitor memory, CPU, disk, and inode usage, sending an alert when usage is too high. We also need to cover the NodeNotReady situation.
1. NodeMemorySpaceFillingUp
Monitoring the node’s memory usage, sending an alert when usage > 80%.
alert: NodeMemorySpaceFillingUp
expr: ((1 - (node_memory_MemAvailable_bytes{job="node-exporter"} / node_memory_MemTotal_bytes{job="node-exporter"}) * on(instance) group_left(nodename) (node_uname_info) > 0.8) * 100)
for: 5m
labels:
  cluster: critical
  type: node
annotations:
  description: Memory usage on `{{$labels.nodename}}`({{ $labels.instance }}) up to {{ printf "%.2f" $value }}%.
  summary: Node memory will be exhausted.
2. NodeCpuUtilisationHigh
Monitoring the node’s CPU usage, sending an alert when usage > 80%.
A typical expression derives utilisation from the node-exporter metric node_cpu_seconds_total (the share of time spent in non-idle modes):

alert: NodeCpuUtilisationHigh
expr: ((1 - avg by(instance) (irate(node_cpu_seconds_total{job="node-exporter",mode="idle"}[5m]))) * on(instance) group_left(nodename) (node_uname_info) > 0.8) * 100
for: 5m
labels:
  cluster: critical
  type: node
annotations:
  description: CPU utilisation on `{{$labels.nodename}}`({{ $labels.instance }}) up to {{ printf "%.2f" $value }}%.
  summary: Node CPU utilisation high.
3. NodeFilesystemAlmostOutOfSpace
Monitoring the node's disk usage, sending an alert when less than 10% of the space is left.
alert: NodeFilesystemAlmostOutOfSpace
expr: ((node_filesystem_avail_bytes{fstype!="",job="node-exporter"} / node_filesystem_size_bytes{fstype!="",job="node-exporter"} * 100 < 10 and node_filesystem_readonly{fstype!="",job="node-exporter"} == 0) * on(instance) group_left(nodename) (node_uname_info))
for: 5m
labels:
  cluster: critical
  type: node
annotations:
  description: Filesystem on `{{ $labels.device }}` at `{{$labels.nodename}}`({{ $labels.instance }}) has only {{ printf "%.2f" $value }}% available space left.
  summary: Node filesystem has less than 10% space left.
4. NodeFilesystemAlmostOutOfFiles
Monitoring the node's inode usage, sending an alert when less than 10% of the inodes are left.
alert: NodeFilesystemAlmostOutOfFiles
expr: ((node_filesystem_files_free{fstype!="",job="node-exporter"} / node_filesystem_files{fstype!="",job="node-exporter"} * 100 < 10 and node_filesystem_readonly{fstype!="",job="node-exporter"} == 0) * on(instance) group_left(nodename) (node_uname_info))
for: 5m
labels:
  cluster: critical
  type: node
annotations:
  description: Filesystem on `{{ $labels.device }}` at `{{$labels.nodename}}`({{ $labels.instance }}) has only {{ printf "%.2f" $value }}% available inodes left.
  summary: Node filesystem has less than 10% inodes left.
5. KubeNodeNotReady
Monitoring the node's state, sending an alert when a node is not ready.
alert: KubeNodeNotReady
expr: (kube_node_status_condition{condition="Ready",job="kube-state-metrics",status="true"} == 0)
for: 5m
labels:
  cluster: critical
  type: node
annotations:
  description: "{{ $labels.node }} has been unready for more than 15 minutes."
  summary: Node is not ready.
6. KubeNodePodsTooMuch
Monitoring the number of pods on each node. The maximum number of pods per node is 110, so we send an alert when usage > 80%.
alert: KubeNodePodsTooMuch
expr: (sum by(node) (kube_pod_info) * 100 / 110 > 80)
for: 5m
labels:
  cluster: critical
  type: node
annotations:
  description: Pods usage on `{{$labels.node}}` up to {{ printf "%.2f" $value }}%.
  summary: Node pods too much.
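All of the rules above live in an ordinary Prometheus rule group. A minimal sketch of what the surrounding file might look like, assuming rules are loaded via rule_files in prometheus.yml (the file and group names here are made up; with the prometheus-operator the same rules would go into a PrometheusRule resource instead):

# node-alerts.rules.yaml (hypothetical file name)
groups:
  - name: node.alerts
    rules:
      - alert: KubeNodeNotReady
        expr: kube_node_status_condition{condition="Ready",job="kube-state-metrics",status="true"} == 0
        for: 5m
        labels:
          cluster: critical
          type: node
        annotations:
          description: "{{ $labels.node }} has been unready for more than 15 minutes."
          summary: Node is not ready.
      # ...the other node alerts follow the same pattern.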
II. Namespace Monitoring
For each namespace there are three numbers to track: the limit, the request, and the actual usage. Usage comes from the following queries:

cpu: sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate) by (namespace)

memory: sum(node_namespace_pod_container:container_memory_working_set_bytes) by (namespace)
We can define memory and CPU limits for a namespace. I have not yet figured out how to read those configured values from Prometheus, so this needs more attention; a possible approach is sketched below. We need to increase the requests when request/limit > 80%.
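I have not verified this on our cluster yet, but kube-state-metrics exposes the configured requests and limits as metrics, so queries along these lines should return the per-namespace values (metric and label names assume kube-state-metrics v2):

# Configured CPU limits and requests per namespace (kube-state-metrics v2)
sum(kube_pod_container_resource_limits{resource="cpu"}) by (namespace)
sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace)

# Same idea for memory (values in bytes)
sum(kube_pod_container_resource_limits{resource="memory"}) by (namespace)
sum(kube_pod_container_resource_requests{resource="memory"}) by (namespace)

The namespace_cpu:kube_pod_container_resource_limits:sum recording rule used in the alerts below is essentially the first of these queries.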
1. NamespaceCpuUtilisationHigh
Monitoring the namespace’s CPU usage, sending an alert when usage > 90%.
alert: NamespaceCpuUtilisationHigh
expr: (sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate) by (namespace) / sum(namespace_cpu:kube_pod_container_resource_limits:sum) by (namespace) * 100 > 90)
for: 5m
labels:
  cluster: critical
  type: namespace
annotations:
  description: CPU utilisation on `{{$labels.namespace}}` up to {{ printf "%.2f" $value }}%.
  summary: Namespace CPU utilisation high.
2. NamespaceCpuUtilisationLow
Monitoring the namespace’s CPU usage, sending an alert when usage < 10%.
alert: NamespaceCpuUtilisationLow
expr: (sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate) by (namespace) / sum(namespace_cpu:kube_pod_container_resource_limits:sum) by (namespace) * 100 < 10)
for: 5m
labels:
  cluster: critical
  type: namespace
annotations:
  description: CPU utilisation on `{{$labels.namespace}}` as low as {{ printf "%.2f" $value }}%.
  summary: Namespace CPU underutilization.
3. NamespaceMemorySpaceFillingUp
Monitoring the namespace’s memory usage, sending an alert when usage > 90%.
alert: NamespaceMemorySpaceFillingUp
expr: (sum(node_namespace_pod_container:container_memory_working_set_bytes) by (namespace) / sum(namespace_memory:kube_pod_container_resource_limits:sum) by (namespace) * 100 > 90)
for: 5m
labels:
  cluster: critical
  type: namespace
annotations:
  description: Memory usage on `{{$labels.namespace}}` up to {{ printf "%.2f" $value }}%.
  summary: Namespace memory will be exhausted.
4. NamespaceMemorySpaceLow
Monitoring the namespace’s memory usage, sending an alert when usage < 10%.
alert: NamespaceMemorySpaceLow
expr: (sum(node_namespace_pod_container:container_memory_working_set_bytes) by (namespace) / sum(namespace_memory:kube_pod_container_resource_limits:sum) by (namespace) * 100 < 10)
for: 5m
labels:
  cluster: critical
  type: namespace
annotations:
  description: Memory usage on `{{$labels.namespace}}` as low as {{ printf "%.2f" $value }}%.
  summary: Under-utilized namespace memory.
5. KubePodNotReady
Monitoring the pods' states, sending an alert if a pod stays in a not-ready state for more than 15 minutes.
alert: KubePodNotReady
expr: (sum by(namespace, pod) (max by(namespace, pod) (kube_pod_status_phase{job="kube-state-metrics",namespace=~".*",phase=~"Pending|Unknown"}) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!="Job"}))) > 0)
for: 5m
labels:
  cluster: critical
  type: namespace
annotations:
  description: Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready state for longer than 15 minutes.
  summary: Pod has been in a non-ready state for more than 15 minutes.
6. KubeContainerWaiting
Monitoring the pods' containers, sending an alert if a container stays in the waiting state for more than 15 minutes.
alert: KubeContainerWaiting
expr: (sum by(namespace, pod, container) (kube_pod_container_status_waiting_reason{job="kube-state-metrics",namespace=~".*"}) > 0)
for: 5m
labels:
  cluster: critical
  type: namespace
annotations:
  description: Pod {{ $labels.namespace }}/{{ $labels.pod }} container {{$labels.container}} has been in waiting state for longer than 15 minutes.
  summary: Pod container waiting longer than 15 minutes.
7. PodRestart
This monitor sends an alert if any pod in the kube-system namespace restarts.
Pods in the kube-system namespace are cluster-level components, such as our log collection tools, CoreDNS, and so on. So if any pod in this namespace restarts, we need to pay close attention, because the cluster may have an issue.
alert: PodRestart
expr: (floor(increase(kube_pod_container_status_restarts_total{namespace="kube-system"}[1m])) > 0)
for: 5m
labels:
  cluster: critical
  type: namespace
annotations:
  description: Pod {{ $labels.namespace }}/{{ $labels.pod }} restarted {{ $value }} times in the last 1 minute.
  summary: Pod restarted in the last 1 minute.
8. PrometheusOom
Prometheus itself can go down, so we add a monitor for Prometheus: if its memory usage reaches 90% of its limit, something may be wrong, so we send an alert.
alert: PrometheusOom
expr: (container_memory_working_set_bytes{container="prometheus"} / container_spec_memory_limit_bytes{container="prometheus"} * 100 > 90)
for: 5m
labels:
  cluster: critical
  type: namespace
annotations:
  description: Memory usage on `Prometheus` up to {{ printf "%.2f" $value }}%.
  summary: Prometheus will be OOM.
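Before loading any of these rule files into Prometheus, it is worth validating them. Assuming the promtool binary that ships with Prometheus is available (the file name below is just a placeholder):

# Validate rule file syntax before (re)loading Prometheus
promtool check rules node-alerts.rules.yaml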
III. Network Monitoring
We have run into cases where something low-level on a machine goes wrong: the host's network is normal, but every pod on that machine has no network. To cover this case we added some simple monitoring to keep the service stable; I will cover the details of this monitoring in the next blog.