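1. Loki cluster overview
1.1 Description
Loki is a lightweight aggregation system built specifically for logs. It indexes only metadata (labels) rather than the log content and pairs that with object storage (such as S3), which gives low-cost, high-throughput log storage and querying. It is particularly well suited to cloud-native environments (such as Kubernetes) and integrates seamlessly with the Prometheus/Grafana ecosystem.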
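1.2 Division of responsibilities
- Promtail (client):
  - Role: runs on the log source (for example, the nodes that host Kubernetes Pods) and is responsible for:
    - Reading log files (for example, container logs).
    - Attaching labels to the logs (such as namespace, pod, job).
    - Pushing the log data to Loki.
  - Characteristics: stateless and stores no data; it only collects, filters and forwards logs.
- Loki (server):
  - Role: receives the log data from Promtail and is responsible for:
    - Storing log content: the raw log content is usually kept in cheap object storage (AWS S3, MinIO, GCS and so on).
    - Storing the index: a lightweight index is built from the labels (the index can live in BoltDB, Cassandra and so on).
    - Providing the log query interface (integrated with Grafana).
  - Characteristics: stateful and responsible for data persistence; the core storage logic lives on the Loki server.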
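To make the Promtail-to-Loki hand-off above concrete, here is a minimal sketch of what a push looks like on the wire: a set of labels plus timestamped lines, sent to Loki's /loki/api/v1/push endpoint. The function names and the LOKI_URL variable are my own illustrative choices, and the payload builder does no JSON escaping, so treat it as a sketch rather than a Promtail replacement.

```shell
#!/usr/bin/env bash
# Build the JSON body for Loki's push endpoint (POST /loki/api/v1/push).
# Arguments: <timestamp in nanoseconds> <log line> <label=value>...
# No JSON escaping is done, so keep the sample line free of quotes.
build_push_payload() {
  local ts="$1" line="$2"
  shift 2
  local labels="" kv
  for kv in "$@"; do
    # Turn label=value into "label":"value", comma-separated.
    labels+="${labels:+,}\"${kv%%=*}\":\"${kv#*=}\""
  done
  printf '{"streams":[{"stream":{%s},"values":[["%s","%s"]]}]}\n' \
    "$labels" "$ts" "$line"
}

# Send a single line to Loki, stamped with the current time.
# LOKI_URL is an assumption; point it at wherever Loki is reachable.
push_line() {
  local url="${LOKI_URL:-http://localhost:3100}"
  build_push_payload "$(date +%s)000000000" "$1" "${@:2}" |
    curl -sS -H 'Content-Type: application/json' \
      -X POST --data-binary @- "$url/loki/api/v1/push"
}
```

For example, `push_line "hello from bash" job=demo pod=web-0` would create (or append to) the stream labelled job=demo, pod=web-0. Only those labels are indexed; the line itself is stored as-is.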
2. Deployment environment
- Deploy a single Grafana instance, a single Loki instance, and Promtail as a DaemonSet with one Pod per node.
IP | Node | OS | k8s version | Loki stack version (grafana, loki, promtail) | Docker version |
172.16.4.85 | master1 | centos7.8 | 1.23.17 | promtail:2.9.4 | 20.10.9 |
172.16.4.86 | node1 | centos7.8 | 1.23.17 | promtail:2.9.4 | 20.10.9 |
172.16.4.87 | node2 | centos7.8 | 1.23.17 | promtail:2.9.4 | 20.10.9 |
172.16.4.89 | node3 | centos7.8 | 1.23.17 | loki:2.9.4, promtail:2.9.4 | 20.10.9 |
172.16.4.90 | node4 | centos7.8 | 1.23.17 | grafana:latest, promtail:2.9.4 | 20.10.9 |
3. NFS deployment
- Install NFS on CentOS 7
yum install -y nfs-utils
- Create the NFS shared directories (grafana, loki, promtail)
mkdir -p /nfs_share/k8s/grafana/pv1 /nfs_share/k8s/loki/pv1 /nfs_share/k8s/promtail/pv1
chmod 777 /nfs_share/k8s/grafana/pv1 /nfs_share/k8s/loki/pv1 /nfs_share/k8s/promtail/pv1
- Edit the NFS configuration file
[root@localhost loki]# cat /etc/exports
/nfs_share/k8s/grafana/pv1 *(rw,sync,no_subtree_check,no_root_squash)
/nfs_share/k8s/loki/pv1 *(rw,sync,no_subtree_check,no_root_squash)
/nfs_share/k8s/promtail/pv1 *(rw,sync,no_subtree_check,no_root_squash)
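- Start the NFS service
# Start the NFS service
systemctl start nfs-server
# Enable the NFS service at boot
systemctl enable nfs-server
- Reload the exports and print them
[root@localhost es]# exportfs -r
[root@localhost es]# exportfs -v
/nfs_share/k8s/loki/pv1<world>(sync,wdelay,hide,no_subtree_check,anonuid=10001,anongid=10001,sec=sys,rw,secure,no_root_squash,all_squash)
/nfs_share/k8s/promtail/pv1<world>(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
/nfs_share/k8s/grafana/pv1<world>(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
- Note: in this output the loki export also carries anonuid=10001, anongid=10001 and all_squash, which the /etc/exports listing above does not show. Presumably those options were added to the loki entry later so that writes map to the UID the Loki container runs as.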
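Since the three export entries differ only in the component name, they can be generated instead of typed by hand. This is a small sketch under that assumption; NFS_ROOT, EXPORT_OPTS and render_exports are names I made up for illustration, and the options are simply the ones from the /etc/exports listing above.

```shell
#!/usr/bin/env bash
# Print one /etc/exports entry per component, using the options shown above.
# NFS_ROOT and EXPORT_OPTS are illustrative variable names, not NFS settings.
NFS_ROOT="/nfs_share/k8s"
EXPORT_OPTS="rw,sync,no_subtree_check,no_root_squash"

render_exports() {
  local component
  for component in "$@"; do
    printf '%s/%s/pv1 *(%s)\n' "$NFS_ROOT" "$component" "$EXPORT_OPTS"
  done
}

# Print the entries for review before copying them into /etc/exports.
render_exports grafana loki promtail
```

After `exportfs -r`, running `showmount -e 172.16.4.60` from one of the Kubernetes nodes (with nfs-utils installed) should list the same three paths, which confirms the nodes can see the exports before any PV is created.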
4. Create the namespace
apiVersion: v1
kind: Namespace
metadata:
  name: loki
kubectl apply -f loki-ns.yaml
5. Loki deployment
5.1 Loki PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: loki-pv
spec:
  capacity:
    storage: 15Gi  # adjust to your needs
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany  # allow read/write from multiple nodes
  persistentVolumeReclaimPolicy: Retain  # keep the data (recommended for production)
  storageClassName: nfs  # storage class name (must match the PVC)
  nfs:
    server: 172.16.4.60
    path: /nfs_share/k8s/loki/pv1
kubectl apply -f loki-pv.yaml
5.2 Loki PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: loki-pvc
  namespace: loki
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs  # must match the PV's storageClassName
  resources:
    requests:
      storage: 15Gi  # must be <= the PV capacity
kubectl apply -f loki-pvc.yaml
5.3 Loki ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-config
  namespace: loki
data:
  loki.yaml: |
    auth_enabled: false
    server:
      http_listen_port: 3100
      grpc_listen_port: 9095
    common:
      path_prefix: /data/loki
      storage:
        filesystem:
          chunks_directory: /data/loki/chunks
          rules_directory: /data/loki/rules
      replication_factor: 1
    ingester:
      max_transfer_retries: 0  # must be 0
      lifecycler:
        ring:
          kvstore:
            store: inmemory
          replication_factor: 1
      wal:
        enabled: true  # explicitly enable the WAL
        dir: /data/loki/wal  # WAL directory
    limits_config:
      ingestion_rate_mb: 50  # raised global ingestion rate limit
      ingestion_burst_size_mb: 100  # raised global burst limit
      per_stream_rate_limit: 50MB  # per-stream rate limit (added explicitly)
      per_stream_rate_limit_burst: 100MB  # per-stream burst limit (added explicitly)
      max_streams_per_user: 100000
      max_line_size: 10485760
      retention_period: 720h
      reject_old_samples: true
      reject_old_samples_max_age: 168h
    schema_config:
      configs:
        - from: 2024-01-01
          store: boltdb-shipper
          object_store: filesystem
          schema: v11
          index:
            prefix: index_
            period: 24h
    storage_config:
      boltdb_shipper:
        active_index_directory: /data/loki/index
        cache_location: /data/loki/boltdb-cache
        shared_store: filesystem
    compactor:
      working_directory: /data/loki/compactor
      shared_store: filesystem
      compaction_interval: 10m
      retention_enabled: true
    query_range:
      max_retries: 3
      cache_results: true
      results_cache:
        cache:
          enable_fifocache: true
          fifocache:
            max_size_bytes: 512MB
kubectl apply -f loki-cm.yaml
5.4 Loki Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: loki
  namespace: loki
spec:
  replicas: 1
  selector:
    matchLabels:
      app: loki
  template:
    metadata:
      labels:
        app: loki
    spec:
      containers:
        - name: loki
          image: 172.16.4.17:8090/tools/grafana/loki:2.9.4
          args:
            - -config.file=/etc/loki/loki.yaml  # path to the config file
          ports:
            - containerPort: 3100
          volumeMounts:
            - name: config
              mountPath: /etc/loki  # mount the ConfigMap
            - name: storage
              mountPath: /data/loki  # mount the persistent data directory
          resources:
            limits:
              memory: 4Gi
              cpu: "2"
            requests:
              memory: 2Gi
              cpu: "1"
      volumes:
        - name: config
          configMap:
            name: loki-config  # references the ConfigMap
        - name: storage
          persistentVolumeClaim:
            claimName: loki-pvc  # references the PVC
kubectl apply -f loki.yaml
5.5 Loki Service
apiVersion: v1
kind: Service
metadata:
  name: loki
  namespace: loki
spec:
  ports:
    - port: 3100
      targetPort: 3100
  selector:
    app: loki
  type: ClusterIP
kubectl apply -f loki-svc.yaml
6. Promtail deployment
6.1 Promtail PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: promtail-pv
spec:
  capacity:
    storage: 15Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: promtail-nfs  # must match the PVC
  nfs:
    server: 172.16.4.60
    path: /nfs_share/k8s/promtail/pv1
kubectl apply -f pr-pv.yaml
6.2 Promtail PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: promtail-pvc
  namespace: loki
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: promtail-nfs
  resources:
    requests:
      storage: 15Gi
kubectl apply -f pr-pvc.yaml
6.3 Promtail RBAC
apiVersion: v1
kind: ServiceAccount
metadata:
  name: promtail
  namespace: loki
  labels:
    app: promtail
    component: log-collector
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: promtail
  labels:
    app: promtail
    component: log-collector
rules:
  - apiGroups: [""]
    resources:
      - nodes        # basic node information
      - nodes/proxy  # access to the Kubelet API (use with care)
      - pods         # Pod discovery
      - pods/log     # log reading (core permission)
      - services     # service discovery
      - endpoints    # endpoint monitoring
      - namespaces   # namespace metadata
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: promtail
  labels:
    app: promtail
    component: log-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: promtail
subjects:
  - kind: ServiceAccount
    name: promtail
    namespace: loki
kubectl apply -f pr-rbac.yaml
6.4 Promtail ConfigMap
- This ConfigMap is critical: if the scrape rules are wrong, no logs get matched. It took me a long time to get this right.
apiVersion: v1
kind: ConfigMap
metadata:
  name: promtail-config
  namespace: loki
  labels:
    app: promtail
data:
  promtail.yaml: |
    # ================= Global configuration =================
    server:
      http_listen_port: 3101  # matches the health-check port in the DaemonSet
      grpc_listen_port: 0
      log_level: info  # info is recommended for production
    client:
      backoff_config:
        max_period: 5m
        max_retries: 10
        min_period: 500ms
      batchsize: 1048576
      batchwait: 1s
      external_labels: {}
      timeout: 10s
      url: http://loki.loki.svc.cluster.local:3100/loki/api/v1/push  # keep in sync with the DaemonSet argument
    positions:
      filename: /var/lib/promtail-positions/positions.yaml  # matches the PVC mount path
    # ================= Scrape rules =================
    scrape_configs:
      # ========== Docker container logs ==========
      - job_name: docker-containers
        pipeline_stages:
          - docker: {}  # parse the Docker log format
        static_configs:
          - targets: [localhost]
            labels:
              job: docker
              __path__: /data/docker_storage/containers/*/*.log  # matches the custom Docker data path
              host: ${HOSTNAME}  # environment variable injected by the DaemonSet
      # ========== Kubernetes Pod logs (main job) ==========
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
        pipeline_stages:
          - cri: {}  # CRI parser, for better containerd support
        relabel_configs:
          # Drop system namespaces (and Promtail's own namespace)
          - action: drop
            regex: 'kube-system|kube-public|loki'
            source_labels: [__meta_kubernetes_namespace]
          # Build the log file path
          - action: replace
            source_labels: [__meta_kubernetes_pod_uid, __meta_kubernetes_pod_container_name]
            separator: /
            target_label: __path__
            replacement: /var/log/pods/*$1/*.log
          # Standard label mapping
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - action: replace
            source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
          - action: replace
            source_labels: [__meta_kubernetes_pod_name]
            target_label: pod
          - action: replace
            source_labels: [__meta_kubernetes_pod_container_name]
            target_label: container
          - action: replace
            source_labels: [__meta_kubernetes_pod_node_name]  # with role: pod the node name is exposed under this label
            target_label: node
          # Pick up business labels automatically
          - action: replace
            source_labels: [__meta_kubernetes_pod_label_app]
            target_label: app
            replacement: ${1}
            regex: (.+)
          - action: replace
            source_labels: [__meta_kubernetes_pod_label_release]
            target_label: release
            replacement: ${1}
            regex: (.+)
      # ========== Controller logs (trimmed down) ==========
      - job_name: kubernetes-controllers
        kubernetes_sd_configs:
          - role: pod
        pipeline_stages:
          - cri: {}
        relabel_configs:
          - action: drop
            regex: 'kube-system|kube-public|loki'
            source_labels: [__meta_kubernetes_namespace]
          - action: keep
            regex: '[0-9a-z-.]+-[0-9a-f]{8,10}'
            source_labels: [__meta_kubernetes_pod_controller_name]
          - action: replace
            regex: '([0-9a-z-.]+)-[0-9a-f]{8,10}'
            source_labels: [__meta_kubernetes_pod_controller_name]
            target_label: controller
          - action: replace
            source_labels: [__meta_kubernetes_pod_node_name]
            target_label: node
kubectl apply -f pr-cm.yaml
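The rule that trips people up most in this ConfigMap is the one that builds __path__, so here is a rough shell imitation of it. The relabel step joins the Pod UID and container name with the separator "/", captures the result as $1 (the default regex matches the whole string), and drops it into the replacement. The function name and the optional root argument are my own additions for illustration.

```shell
#!/usr/bin/env bash
# Imitate the __path__ relabel rule from the kubernetes-pods job:
#   source_labels: [pod_uid, container_name], separator: /
#   replacement:   /var/log/pods/*$1/*.log
# Arguments: <pod uid> <container name> [log root, defaults to /var/log/pods]
build_log_glob() {
  local pod_uid="$1" container="$2" root="${3:-/var/log/pods}"
  local joined="${pod_uid}/${container}"  # what $1 holds after the join
  printf '%s/*%s/*.log\n' "$root" "$joined"
}

# Example with a made-up UID and container name.
build_log_glob 1234-abcd app
```

The leading `*` matters: the kubelet names each directory `<namespace>_<pod-name>_<pod-uid>`, so the glob has to absorb the namespace and Pod name in front of the UID. If `ls` with the printed pattern finds nothing on a node, Promtail will not find anything either.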
6.5 Promtail DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: promtail
  namespace: loki
  labels:
    app: promtail  # common label
spec:
  selector:
    matchLabels:
      app: promtail
  updateStrategy:  # update strategy
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: promtail
    spec:
      serviceAccountName: promtail
      # Security context (still root, but with restricted privileges)
      securityContext:
        runAsUser: 0
        runAsGroup: 0
        fsGroup: 0
      containers:
        - name: promtail
          image: 172.16.4.17:8090/tools/grafana/promtail:2.9.4
          imagePullPolicy: IfNotPresent  # image pull policy
          args:
            - -config.file=/etc/promtail/promtail.yaml
            # Loki address (important: adjust to your environment)
            - -client.url=http://loki.loki.svc.cluster.local:3100/loki/api/v1/push
          env:
            - name: HOSTNAME  # node name (important)
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          ports:
            - containerPort: 3101  # metrics / health-check port
              name: http-metrics
              protocol: TCP
          volumeMounts:
            - name: config
              mountPath: /etc/promtail
            - name: docker-logs
              mountPath: /data/docker_storage/containers
              readOnly: true
            - name: pods-logs
              mountPath: /var/log/pods
              readOnly: true
            - name: positions
              mountPath: /var/lib/promtail-positions
          securityContext:  # container security context (important)
            readOnlyRootFilesystem: true  # hardening
            privileged: false  # no privileged mode
          readinessProbe:  # health check (important)
            httpGet:
              path: /ready
              port: http-metrics
            initialDelaySeconds: 10
            timeoutSeconds: 1
      tolerations:  # tolerations (important)
        - operator: Exists  # allow scheduling on every node, including masters
      volumes:
        - name: config
          configMap:
            name: promtail-config
        - name: docker-logs
          hostPath:
            path: /data/docker_storage/containers
            type: Directory
        - name: pods-logs
          hostPath:
            path: /var/log/pods
            type: Directory
        - name: positions
          persistentVolumeClaim:
            claimName: promtail-pvc
kubectl apply -f pr-dm.yaml
7. Grafana deployment
7.1 Grafana PV and PVC
# grafana-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana-pv
spec:
  capacity:
    storage: 10Gi  # adjust to your needs
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: grafana-nfs
  nfs:
    server: 172.16.4.60
    path: /nfs_share/k8s/grafana/pv1  # your NFS path
---
# grafana-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: loki  # same namespace as Grafana
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: grafana-nfs
  resources:
    requests:
      storage: 10Gi
7.2 Grafana Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: loki
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: 172.16.4.17:8090/tools/grafana/grafana:latest
          ports:
            - containerPort: 3000
          volumeMounts:
            - name: storage
              mountPath: /var/lib/grafana  # Grafana data directory
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: grafana-pvc  # references the PVC
7.3 Grafana Service
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: loki  # must be the same namespace as the Grafana Deployment
spec:
  type: NodePort  # key setting
  ports:
    - port: 3000        # Service port (in-cluster access)
      targetPort: 3000  # container port (same as the Grafana container port)
      nodePort: 30030   # node port (range 30000-32767)
  selector:
    app: grafana  # must match the Pod labels of the Grafana Deployment
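Once Grafana is reachable on the NodePort, it still needs Loki registered as a data source. That can be done in the UI, or roughly as sketched below through Grafana's HTTP API (POST /api/datasources). The function names, the data source name "Loki" and the GRAFANA_AUTH variable are my own choices; admin:admin is Grafana's initial login and is only an assumption about this install.

```shell
#!/usr/bin/env bash
# JSON body for Grafana's POST /api/datasources, registering Loki.
# Argument: [Loki URL], defaulting to the in-cluster Service address.
build_loki_datasource() {
  local loki_url="${1:-http://loki.loki.svc.cluster.local:3100}"
  printf '{"name":"Loki","type":"loki","access":"proxy","url":"%s"}\n' "$loki_url"
}

# Register the data source through the NodePort Service (30030).
# GRAFANA_AUTH defaults to admin:admin, which is an assumption; change it
# if the admin password has already been reset.
add_loki_datasource() {
  local node_ip="$1"
  build_loki_datasource |
    curl -sS -u "${GRAFANA_AUTH:-admin:admin}" \
      -H 'Content-Type: application/json' \
      -X POST --data-binary @- "http://${node_ip}:30030/api/datasources"
}
```

For example, `add_loki_datasource 172.16.4.90` targets node4 from the environment table. With "access":"proxy" the Grafana server itself calls Loki, which is why the in-cluster Service address works even though the browser cannot resolve it.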
8. Verify the loki, promtail and grafana deployment
[root@master1 loki]# kubectl get pv | egrep "loki|promtail|grafana"
grafana-pv 10Gi RWX Retain Bound loki/grafana-pvc grafana-nfs 11d
loki-pv 15Gi RWX Retain Bound loki/loki-pvc nfs 11d
promtail-pv 15Gi RWX Retain Bound loki/promtail-pvc promtail-nfs 2d16h
[root@master1 loki]# kubectl get pvc -n loki
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
grafana-pvc Bound grafana-pv 10Gi RWX grafana-nfs 11d
loki-pvc Bound loki-pv 15Gi RWX nfs 11d
promtail-pvc Bound promtail-pv 15Gi RWX promtail-nfs 2d16h
[root@master1 loki]# kubectl get cm -n loki
NAME DATA AGE
kube-root-ca.crt 1 18d
loki-config 1 11d
promtail-config 1 2d6h
[root@master1 loki]# kubectl get daemonset -n loki
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
promtail 6 6 6 6 6 <none> 2d6h
[root@master1 loki]# kubectl get deployment -n loki
NAME READY UP-TO-DATE AVAILABLE AGE
grafana 1/1 1 1 11d
loki 1/1 1 1 2d5h
[root@master1 loki]# kubectl get pods -n loki -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
grafana-688d87bd79-k4lw6 1/1 Running 0 11d 10.244.3.114 node4 <none> <none>
loki-69d658dcdf-dnwdk 1/1 Running 0 2d5h 10.244.135.20 node3 <none> <none>
promtail-bkxpr 1/1 Running 0 2d6h 10.244.3.75 node4 <none> <none>
promtail-d96m9 1/1 Running 0 2d6h 10.244.166.133 node1 <none> <none>
promtail-h4lfr 1/1 Running 0 2d6h 10.244.137.67 master1 <none> <none>
promtail-rff8m 1/1 Running 0 2d6h 10.244.104.34 node2 <none> <none>
promtail-szmkr 1/1 Running 0 2d6h 10.244.180.2 master2 <none> <none>
promtail-w25qb 1/1 Running 0 2d6h 10.244.135.1 node3 <none> <none>
[root@master1 loki]# kubectl get serviceaccount -n loki
NAME SECRETS AGE
default 1 18d
promtail 1 2d6h
[root@master1 loki]# kubectl get clusterrole -n loki | grep promtail
promtail 2025-03-22T01:50:30Z
[root@master1 loki]# kubectl get clusterrolebinding -n loki | grep promtail
promtail ClusterRole/promtail 2d6h
[root@master1 loki]#
9. Verify the Loki logging system
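Besides browsing logs in Grafana's Explore view, Loki can be checked directly over its HTTP API. The sketch below assumes `kubectl -n loki port-forward svc/loki 3100:3100` is running in another terminal, so LOKI_URL points at localhost; the function names are illustrative, and the labels used in the example (namespace, app) are the ones the Promtail relabel rules attach.

```shell
#!/usr/bin/env bash
# LOKI_URL is an assumption: it presumes a port-forward to the loki Service.
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

# Turn label=value pairs into a LogQL stream selector.
logql_selector() {
  local out="" kv
  for kv in "$@"; do
    out+="${out:+,}${kv%%=*}=\"${kv#*=}\""
  done
  printf '{%s}\n' "$out"
}

# Loki's readiness endpoint.
loki_ready() {
  curl -sS "$LOKI_URL/ready"
}

# List the label names Loki has indexed so far.
loki_labels() {
  curl -sS "$LOKI_URL/loki/api/v1/labels"
}

# Fetch up to 20 recent lines for the given labels
# (query_range covers the last hour when no start/end is given).
loki_recent() {
  curl -sS -G "$LOKI_URL/loki/api/v1/query_range" \
    --data-urlencode "query=$(logql_selector "$@")" \
    --data-urlencode "limit=20"
}
```

A reasonable order is `loki_ready`, then `loki_labels` (namespace, pod, container and node should show up once Promtail is shipping), then something like `loki_recent namespace=default`. Keep in mind that the Promtail config drops the kube-system, kube-public and loki namespaces, so queries against those return nothing by design.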
10. References
https://blog.csdn.net/tianmingqing0806/article/details/126766308
With that, the Loki logging system is fully deployed!