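I. cAdvisor overview
Monitoring Pod metrics requires cAdvisor, which was open-sourced by Google. cAdvisor collects information about every container running on a machine, and it also provides a basic query UI and an HTTP interface, which makes it easy for other components such as Prometheus to scrape the data. It monitors node resources and containers in real time and collects performance data, including CPU usage, memory usage, network throughput, and filesystem usage.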
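II. Deploying cAdvisor as a DaemonSet
1. Prepare the manifest
Manifest reference: https://github.com/google/cadvisor/tree/master/deploy/kubernetes/base
The upstream manifests use kustomize; I have left that out here. The configuration file is as follows: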
apiVersion: v1
kind: Namespace
metadata:
  name: cadvisor  # custom namespace; change as needed
---
apiVersion: apps/v1  # for Kubernetes versions before 1.9.0 use apps/v1beta2
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: cadvisor
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
  selector:
    matchLabels:
      name: cadvisor
  template:
    metadata:
      labels:
        name: cadvisor
    spec:
      tolerations:  # tolerate the master's NoSchedule taint; check the actual taints with kubectl describe
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane  # your taint may differ from mine; verify it
      hostNetwork: true
      containers:
      - name: cadvisor
        image: gcr.io/cadvisor/cadvisor:v0.39.3  # cannot be pulled from mainland China by default; you need to work around this yourself
        resources:
          requests:
            memory: 400Mi
            cpu: 400m
          limits:
            memory: 2000Mi
            cpu: 800m
        securityContext:
          privileged: true  # privileged mode is required
        volumeMounts:  # the readOnly mount options have been removed
        - name: rootfs
          mountPath: /rootfs
        - name: var-run
          mountPath: /var/run
        - name: sys
          mountPath: /sys
        - name: docker
          mountPath: /var/lib/docker
        ports:
        - name: http
          containerPort: 8080
          hostPort: 8080  # defaults to the container port if omitted; adjust to your situation
          protocol: TCP
      volumes:
      - name: rootfs
        hostPath:
          path: /
      - name: var-run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /var/lib/containerd/  # because my container runtime is containerd; for docker, change this to /var/lib/docker
2. Apply the manifest
kubectl apply -f daemonset.yaml
kubectl get pods -n cadvisor
NAME             READY   STATUS    RESTARTS   AGE
cadvisor-5d2wq   1/1     Running   0          5m
cadvisor-lgb2b   1/1     Running   0          5m
cadvisor-wsvh7   1/1     Running   0          5m
netstat -tnlp|grep 8080  # must match the hostPort in the manifest
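On a cluster with many nodes it can be easier to list only the pods that are not healthy. The following is a minimal sketch, assuming the default `kubectl get pods` table layout in which STATUS is the third column; the function name `pods_not_running` is made up for this example.

```shell
# Print the names of pods whose STATUS column is not "Running".
# Reads `kubectl get pods` table output on stdin; the header row is skipped.
# Assumes STATUS is the third whitespace-separated column.
pods_not_running() {
  awk 'NR > 1 && NF >= 3 && $3 != "Running" { print $1 }'
}
```

Typical usage would be `kubectl get pods -n cadvisor | pods_not_running`; empty output means every cAdvisor pod reports Running.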
3. Verify through the web UI
Open port 8080 on any cluster node
View the metrics endpoint
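The metrics endpoint can also be checked from the command line. The sketch below assumes the endpoint returns the standard Prometheus text format and extracts the distinct container metric names from it; the helper name `container_metric_names` is an illustration, not part of cAdvisor.

```shell
# Read Prometheus text-format metrics on stdin and print the distinct
# metric names that start with "container_", one per line, sorted.
# Comment lines (# HELP / # TYPE) are skipped because they start with "#".
container_metric_names() {
  grep '^container_' | sed 's/[{ ].*//' | sort -u
}
```

For example, `curl -s http://<node-ip>:8080/metrics | container_metric_names`, where `<node-ip>` is a placeholder for one of your nodes and 8080 is the hostPort from the manifest.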
III. Common cAdvisor metrics and examples
Common examples
(1) Container CPU usage
sum(irate(container_cpu_usage_seconds_total{image!=""}[1m])) without (cpu)
(2) Container memory usage (unit: bytes)
container_memory_usage_bytes{image!=""}
(3) Container network receive rate (unit: bytes/second)
sum(rate(container_network_receive_bytes_total{image!=""}[1m])) without(interface)
(4) Container network transmit rate (unit: bytes/second)
sum(rate(container_network_transmit_bytes_total{image!=""}[1m])) without(interface)
(5) Container filesystem read rate (unit: bytes/second)
sum(rate(container_fs_reads_bytes_total{image!=""}[1m])) without (device)
(6) Container filesystem write rate (unit: bytes/second)
sum(rate(container_fs_writes_bytes_total{image!=""}[1m])) without (device)
(7) Container network receive rate over a 1-minute window, grouped by container name (name=~".+")
sum(rate(container_network_receive_bytes_total{name=~".+"}[1m])) by (name)
(8) Container network transmit rate over a 1-minute window, grouped by container name (name=~".+")
sum(rate(container_network_transmit_bytes_total{name=~".+"}[1m])) by (name)
(9) System CPU time used by all containers combined (rate over 1 minute)
sum(rate(container_cpu_system_seconds_total[1m]))
(10) System CPU time used by each container (rate over 1 minute)
sum(irate(container_cpu_system_seconds_total{image!=""}[1m])) without (cpu)
(11) CPU usage of each container (percent)
sum(rate(container_cpu_usage_seconds_total{name=~".+"}[1m])) by (name) * 100
(12) Total CPU usage across all containers (percent)
sum(sum(rate(container_cpu_usage_seconds_total{name=~".+"}[1m])) by (name) * 100)
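Any of these expressions can be tried outside the web UI through the Prometheus HTTP API. The following is a small sketch, assuming Prometheus listens on http://127.0.0.1:9090 and already scrapes cAdvisor; `prom_query` and the `PROM_URL` variable are names chosen for this example.

```shell
# Run an instant PromQL query through the Prometheus HTTP API.
# $1       = the PromQL expression
# PROM_URL = base URL of Prometheus (hypothetical variable for this sketch;
#            defaults to http://127.0.0.1:9090)
# -G sends the data as URL query parameters; --data-urlencode escapes the
# braces, quotes and brackets that PromQL expressions contain.
prom_query() {
  curl -sG "${PROM_URL:-http://127.0.0.1:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For example, `prom_query 'container_memory_usage_bytes{image!=""}'` prints the JSON response, which can be piped into a JSON formatter if one is installed.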
IV. Configuring Prometheus to scrape cAdvisor
1. Configure Prometheus
vim /usr/local/prometheus/prometheus.yml  # append a job at the end of the file
  - job_name: "cadvisor"
    static_configs:  # change to your cluster's node IPs and the cAdvisor port
      - targets: ["192.168.100.131:8080","192.168.100.132:8080","192.168.100.133:8080"]
curl -X POST http://127.0.0.1:9090/-/reload  # if hot reload is not enabled, restart Prometheus instead
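Typing the node list by hand gets tedious on larger clusters. The sketch below builds the targets entry from a list of node IPs; the function name `cadvisor_targets` is an illustration, and the default port 8080 is taken from the hostPort in the manifest.

```shell
# Read node IPs on stdin (one per line) and print a static_configs
# targets entry. $1 = cAdvisor port (defaults to 8080).
# Blank input lines are ignored.
cadvisor_targets() {
  awk -v port="${1:-8080}" '
    NF { list = list (list == "" ? "" : ",") "\"" $1 ":" port "\"" }
    END { print "- targets: [" list "]" }
  '
}
```

The input can be any source of IPs, for instance `printf '192.168.100.131\n192.168.100.132\n' | cadvisor_targets`. One possible way to get the IPs from the cluster, which I have not verified against every kubectl version, is `kubectl get nodes -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'`.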
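2. Verify the cAdvisor data in Prometheus
V. Configuring Grafana: monitoring Pods with a template
1. Create a new dashboard
2. Import the corresponding template; the template ID used here is 14282
3. View the dashboard data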