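Notes: 1. Installing via yum or from source is not recommended, because the packages are not compatible with this system architecture. Deploying with docker is recommended, since it sidesteps the compatibility problem.
2. Preparation: set up port mapping from the public IP on port 9090 (the port used for Grafana in this setup; Grafana itself listens on 3000 by default) to port 9090 on the internal server where Grafana is deployed, so that Grafana can be reached from the public network.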
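I. Goal: collect the load of every node: CPU, disk, and memory usage.
II. Architecture:
Machines: 172.23.73.10-15. Host 10 is the monitoring server, and all of 10-15 are monitored machines.
Services per server:
172.23.73.10: prometheus, grafana, node-exporter.
172.23.73.11-15: node-exporter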
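Since the same steps are repeated on six hosts, a small helper that lists them can be handy. The following is only a sketch: the function name list_nodes is made up for illustration, and the subnet and the 10-15 range are taken from the plan above.

```shell
#!/bin/sh
# Print the node IPs from the architecture plan, one per line.
# list_nodes is a hypothetical helper; the subnet defaults to 172.23.73.
list_nodes() {
    subnet="${1:-172.23.73}"
    i=10
    while [ "$i" -le 15 ]; do
        echo "${subnet}.${i}"
        i=$((i + 1))
    done
}

# Example (assumes root SSH access to every node):
#   for ip in $(list_nodes); do ssh "root@$ip" uname -m; done
```

Looping over the output keeps later steps, such as copying the offline packages, consistent across all nodes.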
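III. Install docker offline on all 6 machines (the tutorial below can be used as a reference; adapt it to your actual environment; installing docker-compose is not required):
1. Prepare a server account (root is used throughout)
(Since this was my first time installing it, I used root wherever possible to avoid permission problems along the way)
2. Check the operating system version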
[root@ArmServer docker]# uname -a
Linux ArmServer.XHKJ 4.19.90-24.4.v2101.ky10.aarch64 #1 SMP Mon May 24 14:45:37 CST 2021 aarch64 aarch64 aarch64 GNU/Linux
[root@ArmServer docker]# cat /proc/version
Linux version 4.19.90-24.4.v2101.ky10.aarch64 (KYLINSOFT@localhost.localdomain) (gcc version 7.3.0 (GCC)) #1 SMP Mon May 24 14:45:37 CST 2021
3. Check the operating system architecture
[root@ArmServer docker]# uname -m
aarch64
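II. Install docker
1. Download the docker offline package
Download URL: https://download.docker.com/linux/static/stable/
Choose the directory that matches your system architecture: aarch64
(The docker version I am currently using is docker-20.10.7.tgz)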
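To avoid picking the wrong directory by hand, the download URL can be derived from the architecture. This is a minimal sketch: docker_static_url is a hypothetical helper, and it assumes the package sits at <base>/<arch>/docker-<version>.tgz under the download URL above, so check the directory listing before relying on it.

```shell
#!/bin/sh
# Build the download URL for a static docker package.
# Assumption: files are laid out as <base>/<arch>/docker-<version>.tgz.
docker_static_url() {
    arch="${1:-$(uname -m)}"
    version="${2:-20.10.7}"
    echo "https://download.docker.com/linux/static/stable/${arch}/docker-${version}.tgz"
}

# Example, run on a machine that has internet access:
#   curl -fLO "$(docker_static_url aarch64 20.10.7)"
```

The downloaded archive then has to be copied to the offline servers, for example with scp.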
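2. Download the docker-compose offline package
2.1 Download URL: https://github.com/docker/compose/releases
Choose the offline package that matches your system architecture
(The version I am currently using is v2.17.2)
2.2 The GitHub link is sometimes unreachable, so I set up my own download link (free) for anyone who needs it: https://download.csdn.net/download/qq_34725844/88633719
3. Prepare the docker.service systemd unit file
(Copy the content below and save it as a file named docker.service)
docker.service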
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
4. Install the docker and docker compose offline packages
4.1 Install docker
# Extract docker into the current directory
tar -xvf docker-20.10.7.tgz
# Copy the docker binaries into /usr/bin
cp -p docker/* /usr/bin
# Copy docker-compose into /usr/local/bin/ and rename it to docker-compose
cp docker-compose-linux-aarch64 /usr/local/bin/docker-compose
# Make docker-compose executable
chmod +x /usr/local/bin/docker-compose
# Copy docker.service into /etc/systemd/system/
cp docker.service /etc/systemd/system/
# Set permissions on docker.service (a unit file should not be executable)
chmod 644 /etc/systemd/system/docker.service
# Reload the systemd configuration
systemctl daemon-reload
# Start docker
systemctl start docker
# Enable docker at boot
systemctl enable docker.service
4.2 Verify the installation
4.2.1 Check the docker version
[root@ArmServer bin]# docker -v
Docker version 20.10.7, build f0df350
4.2.2 Check the docker-compose version
[root@ArmServer bin]# docker-compose -v
Docker Compose version v2.17.2
1. Edit /etc/docker/daemon.json
{
"insecure-registries":["https://cmaharbor.zlx.com"],
"log-driver": "json-file",
"log-opts": {"max-size":"100m", "max-file":"1"},
"registry-mirrors": ["https://docker-proxy.741001.xyz","https://registry.docker-cn.com"]
}
sudo systemctl daemon-reload
sudo systemctl restart docker
3. Deploy the three services with docker (host 10 runs all three; the other hosts only need node-exporter)
1. Pull the Prometheus image
docker pull prom/prometheus
2. Start the container
docker run -d -p 9091:9090 \
  -v /data/prometheus:/etc/prometheus \
  -v /data/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml \
  --name prometheus prom/prometheus
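-d runs the container in the background.
-p 9091:9090 maps host port 9091 to container port 9090. Port 9090 was reported as already in use on my host, so I changed the host side to 9091.
-v /data/prometheus:/etc/prometheus mounts the configuration directory. Create /data/prometheus before running the command.
-v /data/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml mounts the configuration file, which makes it easy to edit prometheus.yml later.
--name prometheus names the container prometheus.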
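One pitfall with the -v mounts above: when a host path does not exist, docker creates it as a directory, so a missing prometheus.yml ends up as a directory named prometheus.yml. A pre-flight check along these lines can catch that before docker run. It is only a sketch; prepare_prom_dir and its return codes are invented for illustration.

```shell
#!/bin/sh
# Pre-flight check for the host directory mounted into the Prometheus container.
# Returns 0 when ready, 1 on a hard error, 2 when prometheus.yml is still missing.
prepare_prom_dir() {
    dir="${1:-/data/prometheus}"
    mkdir -p "$dir" || return 1
    if [ -d "$dir/prometheus.yml" ]; then
        echo "error: $dir/prometheus.yml is a directory, remove it first" >&2
        return 1
    fi
    if [ ! -f "$dir/prometheus.yml" ]; then
        echo "warning: $dir/prometheus.yml is missing, create it before docker run" >&2
        return 2
    fi
    return 0
}

# Example:
#   prepare_prom_dir /data/prometheus && echo "ready for docker run"
```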
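3. Open the Prometheus web UI
Finally, open the Prometheus web UI in a browser. Enter "http://localhost:9091" (the host port mapped above; replace localhost with the server's IP when browsing from another machine) and you will see the Prometheus console.
With the steps above, Prometheus has been installed and started with Docker. You can now start configuring Prometheus to monitor your applications.
4. To enter the Prometheus container, run: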
docker exec -it b2df256f2e10 /bin/sh
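b2df256f2e10 is my container ID; replace it with your own (the container name prometheus also works).
5. Check the prometheus.yml file under /data/prometheus on the host
If it does not exist, create a prometheus.yml yourself
Content of prometheus.yml: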
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
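YAML is sensitive to indentation, so it can be worth validating the file before starting or reloading Prometheus. The sketch below runs promtool from the prom/prometheus image against the host directory. It assumes promtool is available at /bin/promtool inside the image; check_prom_config and the DOCKER override variable are names made up here.

```shell
#!/bin/sh
# Validate prometheus.yml with promtool from the prom/prometheus image.
# DOCKER can be overridden (for example DOCKER=podman); both names are illustrative.
check_prom_config() {
    dir="${1:-/data/prometheus}"
    "${DOCKER:-docker}" run --rm \
        -v "$dir":/etc/prometheus \
        --entrypoint /bin/promtool \
        prom/prometheus check config /etc/prometheus/prometheus.yml
}

# Example:
#   check_prom_config /data/prometheus
```

A non-zero exit status means promtool rejected the file, and its output names the offending line.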
II. Install node-exporter
1. For installing node-exporter from the image, see the CSDN blog post "Docker部署Prometheus+Grafana+node-exporter_docker安装普罗米修斯-CSDN博客"
2. Pull the image:
docker pull prom/node-exporter
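The pull failed.
The registry mirrors configured earlier may no longer be available, so add a new mirror to "registry-mirrors" in /etc/docker/daemon.json:
"https://docker.m.daocloud.io"
docker must be restarted after adding the mirror:
sudo systemctl restart docker
Pulling the image again then succeeded.
3. Start the container
docker run --name exporter -p 9102:9100 -d prom/node-exporter
-p 9102:9100 maps host port 9102 to the exporter's port 9100. Port 9100 was already in use on my machine, so I changed the host port to 9102; if you have no conflict, you can still use -p 9100:9100.
Access from a browser: http://ip:9102/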
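The exporter can also be checked from the command line by fetching /metrics and picking out one value. This is a minimal sketch: metric_value is a hypothetical helper that reads the text exposition format on stdin, and it only handles metrics without labels, such as node_load1.

```shell
#!/bin/sh
# Print the value of one label-less metric from node-exporter text output on stdin.
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example (replace <your-ip> with the host running the exporter):
#   curl -s "http://<your-ip>:9102/metrics" | metric_value node_load1
```

An empty result suggests either that the exporter is unreachable or that the metric name does not exist on that host.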
III. Configure node-exporter in prometheus
1. Edit the prometheus.yml file
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "node-exporter"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["<your-ip>:9102"]
In other words, add the following to the original prometheus.yml, under scrape_configs:
  - job_name: "node-exporter"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["<your-ip>:9102"]
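The snippet above lists a single target, while the plan covers six hosts. A job entry with all of them could be generated as sketched below. gen_node_job is a hypothetical helper; it assumes the 172.23.73.10-15 range from the architecture section and the same exporter port on every host (9102 here, or 9100 if you kept the default mapping).

```shell
#!/bin/sh
# Print a node-exporter scrape job covering 172.23.73.10-15.
# Assumption: every host exposes the exporter on the same port.
gen_node_job() {
    port="${1:-9102}"
    echo '  - job_name: "node-exporter"'
    echo '    static_configs:'
    echo '      - targets:'
    i=10
    while [ "$i" -le 15 ]; do
        echo "          - \"172.23.73.${i}:${port}\""
        i=$((i + 1))
    done
}

# Example: print the block, then paste it under scrape_configs in prometheus.yml
#   gen_node_job 9102
```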
2. Reload prometheus.yml in Prometheus
Run:
sudo docker exec -it prometheus /bin/sh -c 'kill -HUP $(pidof prometheus)'
This command makes Prometheus reload prometheus.yml without restarting the container.
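After the reload, the Prometheus HTTP API can confirm that the new targets are being scraped. This is a rough sketch: count_up_targets is a made-up helper that counts "health":"up" entries in the JSON returned by /api/v1/targets, and it assumes the compact JSON formatting the API normally produces and the host port 9091 used earlier.

```shell
#!/bin/sh
# Count targets reported as healthy in /api/v1/targets JSON read from stdin.
# Assumption: compact JSON, so each healthy target contains "health":"up".
count_up_targets() {
    grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Example:
#   curl -s http://localhost:9091/api/v1/targets | count_up_targets
```

With the Prometheus job and one node-exporter target configured, a result of 2 would be expected once both have been scraped successfully.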
V. Configure the Prometheus data source in Grafana, import a dashboard template, and check the result.
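For the three goals in section I (CPU, memory, and disk usage), panel queries could look like the expressions below. They are a sketch rather than the queries of any particular dashboard template: they assume the standard node-exporter metric names, the "/" mountpoint, and Prometheus on host port 9091, and prom_query and PROM_URL are names invented here.

```shell
#!/bin/sh
# Example PromQL expressions, each giving a percentage per instance.
CPU_USAGE='100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100'
MEM_USAGE='(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100'
# Note: filesystem metrics reflect what the exporter container can see.
DISK_USAGE='(1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100'

# Run one expression against the Prometheus HTTP API and print the raw JSON.
prom_query() {
    curl -s -G "${PROM_URL:-http://localhost:9091}/api/v1/query" \
        --data-urlencode "query=$1"
}

# Example:
#   prom_query "$CPU_USAGE"
```

The same expressions can be pasted into a Grafana panel once the Prometheus data source has been added.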