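Preface
This post is part of my technical notes. In it we set up Grafana + Prometheus + cAdvisor to monitor containers, import a popular dashboard, and use it to watch container performance metrics.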
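Dashboard preview
This part shows data collected by node-exporter. I did not install node-exporter, and it is outside the scope of this post, so this part of the dashboard is empty.
This part shows the container performance metrics.
The container metrics section has a lot more in it.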
Prepare the configuration files
docker-compose.yaml
version: "3"
services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - TZ=Asia/Shanghai
    ports:
      - "3000:3000"
    volumes:
      - ./grafana-data:/var/lib/grafana
    networks:
      custom-bridge:
    restart: unless-stopped
    logging:
      options:
        max-size: "10m"
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    networks:
      custom-bridge:
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./prometheus_data:/prometheus
    ports:
      - "19090:9090"
    logging:
      options:
        max-size: "10m"
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    restart: unless-stopped
    networks:
      custom-bridge:
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run/:ro
      - /sys:/sys:ro
      - /var/lib/docker:/var/lib/docker:ro
    ports:
      # cAdvisor listens on 8080 inside the container (prometheus.yml scrapes cadvisor:8080),
      # so the host port 9090 has to map to container port 8080, not 9090.
      - "9090:8080"
    logging:
      options:
        max-size: "10m"

networks:
  custom-bridge:
    external: true
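Because the compose file declares custom-bridge with external: true, Compose does not create that network and expects it to exist already. The following is a minimal sketch of creating it once beforehand, assuming Docker's default bridge driver is what you want. The helper name ensure_network is made up for illustration and is not part of Docker.

```shell
#!/bin/sh
# ensure_network NAME: create a Docker network only if it does not exist yet.
# ensure_network is a hypothetical helper name, not a Docker command.
ensure_network() {
  name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    # inspect succeeded, so the network is already there
    echo "network $name already exists"
  else
    # inspect failed, so create the network with the default driver
    docker network create "$name"
  fi
}
```

Calling ensure_network custom-bridge before docker-compose up -d should avoid the "network declared as external, but could not be found" style of error.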
Pulling the cAdvisor image may fail because of network problems. If that happens, see this article: configuring a network proxy for the Docker daemon
prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: cadvisor
    scrape_interval: 5s
    static_configs:
      - targets:
          - cadvisor:8080
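Indentation mistakes in prometheus.yml are easy to make, so it can help to validate the file before starting the stack. The sketch below runs promtool check config from the same prom/prometheus image. The helper name check_prometheus_config is hypothetical, and the argument is assumed to be an absolute path because Docker bind mounts need one.

```shell
#!/bin/sh
# check_prometheus_config ABSOLUTE_PATH: validate a Prometheus config file
# with the promtool binary shipped in the prom/prometheus image.
# check_prometheus_config is a hypothetical helper name.
check_prometheus_config() {
  docker run --rm \
    -v "$1:/etc/prometheus/prometheus.yml:ro" \
    --entrypoint promtool \
    prom/prometheus:latest \
    check config /etc/prometheus/prometheus.yml
}
```

From the directory that holds the file, this would be called as check_prometheus_config "$PWD/prometheus.yml".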
Create the data directories and set their permissions
mkdir grafana-data
mkdir prometheus_data
chmod 777 grafana-data
chmod 777 prometheus_data
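chmod 777 works, but it makes the directories writable by every user on the host. A narrower alternative is to give each directory to the UID its container runs as. The sketch below assumes the defaults of the official images, which I believe are UID 472 for grafana/grafana and UID 65534 (nobody) for prom/prometheus. Those UIDs are an assumption, not something this article verifies, so check them against your image versions. The helper name prepare_dir is made up.

```shell
#!/bin/sh
# prepare_dir DIR UID: create DIR and make it writable by the container user.
# prepare_dir is a hypothetical helper; the UIDs passed to it are assumptions
# about the official images.
prepare_dir() {
  dir="$1"
  uid="$2"
  mkdir -p "$dir"
  if [ "$(id -u)" -eq 0 ] && chown "$uid" "$dir" 2>/dev/null; then
    # Running as root and chown worked: owner-only write access is enough.
    chmod 755 "$dir"
  else
    # Not root (or chown failed): fall back to the permissive mode used above.
    chmod 777 "$dir"
  fi
}
```

With the assumed UIDs, the calls would be prepare_dir grafana-data 472 and prepare_dir prometheus_data 65534.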
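Start the stack and configure the data source in Grafana
Run docker-compose up -d to start everything. Once the containers are up, open the Grafana web UI at http://pet.anarckk.me:3000/ . The default username and password are admin/admin.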
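Right after docker-compose up -d the services may need a few seconds before they answer. A small polling helper like the one below can confirm they are up before you open the browser. wait_for_url is a hypothetical name, and the URLs in the usage note assume you run it on the Docker host itself with the host ports from docker-compose.yaml.

```shell
#!/bin/sh
# wait_for_url URL [TRIES]: poll URL once per second until it answers
# with a successful HTTP status, or give up after TRIES attempts (default 30).
# wait_for_url is a hypothetical helper that relies only on curl.
wait_for_url() {
  url="$1"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: drop the body
    if curl -fs -o /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timeout: $url" >&2
  return 1
}
```

For example, wait_for_url http://localhost:3000/api/health checks Grafana and wait_for_url http://localhost:19090/-/ready checks Prometheus.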
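Click Add new connection
Search for Prometheus and select it
Change the connection URL. With the compose file above, Grafana and Prometheus share the custom-bridge network, so Grafana can reach Prometheus at http://prometheus:9090
Finally, test and save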
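The same data source can also be registered without clicking through the UI, through Grafana's HTTP API (POST /api/datasources). This is only a sketch: the helper names are hypothetical, and it assumes the admin password is still the default and that Grafana reaches Prometheus by its container name.

```shell
#!/bin/sh
# datasource_payload NAME URL: print the JSON body for POST /api/datasources.
# Hypothetical helper; NAME and URL must not contain double quotes.
datasource_payload() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1" "$2"
}

# add_datasource GRAFANA_URL USER:PASSWORD NAME PROMETHEUS_URL
# Sends the payload to Grafana using HTTP basic auth.
add_datasource() {
  curl -fsS -X POST "$1/api/datasources" \
    -u "$2" \
    -H 'Content-Type: application/json' \
    -d "$(datasource_payload "$3" "$4")"
}
```

A call for this setup might look like add_datasource http://localhost:3000 admin:admin Prometheus http://prometheus:9090 when run on the Docker host.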
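Import a popular dashboard
First, create a dashboard
Then pick a popular dashboard. I use https://grafana.com/grafana/dashboards/16314-docker-container-os-node-node-exporter-cadvisor/ , whose dashboard ID is 16314
Choose Import dashboard
Paste the ID, then click Load
Select the Prometheus data source, then click Import
That is everything. At this point you should see the dashboard shown in the preview above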
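Prometheus can also query individual metrics on its own
Open the Prometheus web UI: http://pet.anarckk.me:19090/graph
# Download (receive) rate of a container, in bytes per second
rate(container_network_receive_bytes_total{name="alist"}[10s])
# Upload (transmit) rate of a container, in bytes per second
rate(container_network_transmit_bytes_total{name="alist"}[10s])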
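The same expressions can be run from a shell through Prometheus's HTTP API (GET /api/v1/query), which is handy for scripts. A minimal sketch follows. prom_query is a hypothetical helper name, and the base URL in the usage note assumes you are on the Docker host with the 19090 host port from docker-compose.yaml.

```shell
#!/bin/sh
# prom_query BASE_URL EXPR: run an instant query against Prometheus's
# HTTP API and print the raw JSON response.
# prom_query is a hypothetical helper name.
prom_query() {
  # -G turns the --data-urlencode pair into a URL query string, so the
  # braces, quotes and brackets in the PromQL expression get escaped.
  curl -fsS -G "$1/api/v1/query" --data-urlencode "query=$2"
}
```

For example: prom_query http://localhost:19090 'rate(container_network_receive_bytes_total{name="alist"}[10s])'. Note that rate() needs at least two samples inside the range, so with the 5s cAdvisor scrape interval a 10s window is close to the minimum. If the result comes back empty, a wider window such as [1m] is worth trying.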