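Architecture overview:

Prometheus is the de facto monitoring standard in the cloud-native ecosystem, so it is natural to deploy the Prometheus service directly inside the Kubernetes cluster it monitors.

There are many ways to deploy prometheus-server: through the KubeSphere integration, a Helm chart, plain YAML manifests, or an all-in-one install. This example uses plain YAML manifests.

One thing to decide before deploying is where prometheus-server's time-series database stores its data. This example uses a local directory on the host (a hostPath volume), mounted from /data.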

The Kubernetes cluster is as follows (v1.23.16, three master nodes and one worker node, deployed with KubeKey):

[root@node4 yaml]# k get no -owide
NAME    STATUS   ROLES                  AGE   VERSION    INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION           CONTAINER-RUNTIME
node1   Ready    control-plane,master   10d   v1.23.16   192.168.123.11   <none>        CentOS Linux 7 (Core)   3.10.0-1062.el7.x86_64   docker://20.10.8
node2   Ready    control-plane,master   10d   v1.23.16   192.168.123.12   <none>        CentOS Linux 7 (Core)   3.10.0-1062.el7.x86_64   docker://20.10.8
node3   Ready    control-plane,master   10d   v1.23.16   192.168.123.13   <none>        CentOS Linux 7 (Core)   3.10.0-1062.el7.x86_64   docker://20.10.8
node4   Ready    worker                 10d   v1.23.16   192.168.123.14   <none>        CentOS Linux 7 (Core)   3.10.0-1062.el7.x86_64   docker://20.10.8

prometheus-server version (v2.2.1):

[root@node4 yaml]# k get deployments.apps -n monitor-sa -owide
NAME                READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                   SELECTOR
prometheus-server   2/2     2            2           9d    prometheus   prom/prometheus:v2.2.1   app=prometheus,component=server

Grafana version (9.4.3, installed from RPM):

[root@node4 yaml]# rpm -qa |grep grafana
grafana-enterprise-9.4.3-1.x86_64

node-exporter version (v0.16.0, managed by a DaemonSet):

[root@node4 yaml]# k get ds -n monitor-sa -owide
NAME            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE   CONTAINERS      IMAGES                       SELECTOR
node-exporter   4         4         4       4            4           <none>          10d   node-exporter   prom/node-exporter:v0.16.0   name=node-exporter

Pod status after a successful deployment:

[root@node4 yaml]# k get po -n monitor-sa
NAME                                READY   STATUS    RESTARTS      AGE
node-exporter-6ttbl                 1/1     Running   1 (77m ago)   10d
node-exporter-7ls5t                 1/1     Running   1 (76m ago)   10d
node-exporter-r287q                 1/1     Running   3 (77m ago)   10d
node-exporter-z85dm                 1/1     Running   1 (77m ago)   10d
prometheus-server-fb59774d6-bgmn7   1/1     Running   0             62m
prometheus-server-fb59774d6-wrq27   1/1     Running   0             62m

The rest of this article walks through deploying Prometheus inside Kubernetes.
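The manifests in this article reference a namespace called monitor-sa and, in the prometheus-server Deployment, a ServiceAccount called monitor, but the article does not show how either was created. A minimal sketch of one way to create them, written in the same heredoc style, might look like this. The ClusterRole and ClusterRoleBinding names and the rule set are assumptions, modeled on the read access that Kubernetes service discovery and the scrape jobs shown later typically need; they are not the author's actual setup.

```shell
# Sketch only: every name except monitor-sa and monitor is hypothetical.
cat > monitor-prereq.yaml <<'EOF'
---
apiVersion: v1
kind: Namespace
metadata:
  name: monitor-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: monitor
  namespace: monitor-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitor            # hypothetical name
rules:
- apiGroups: [""]
  resources: ["nodes", "nodes/proxy", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: monitor            # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: monitor
subjects:
- kind: ServiceAccount
  name: monitor
  namespace: monitor-sa
EOF

# Quick sanity check: count the top-level objects written to the file
doc_count=$(grep -c '^kind:' monitor-prereq.yaml)
echo "$doc_count"
```

The file would be applied with k apply -f monitor-prereq.yaml before any of the other manifests.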

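I. Deploying node-exporter

node-exporter does the data collection, so it is worth deciding how data is collected, which data to collect, and which to drop. The exporter only exposes metrics; it does not push them to Prometheus, which scrapes them itself, so node-exporter needs no storage of its own. Even so, if node-exporter collects everything, it clearly adds load on Prometheus.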

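The relevant setting in this example is the following (filesystem metrics for the listed mount points are not collected). The value carries no inner double quotes, because they would become part of the regular expression and prevent it from matching anything:

- --collector.filesystem.ignored-mount-points
- '^/(sys|proc|dev|host|etc)($|/)'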
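A minimal sketch of what this pattern excludes, using grep -E as a stand-in for the Go regular expressions that node-exporter uses (the two should agree for a pattern this simple). The sample mount points are hypothetical. The sketch also checks the variant in which literal double quotes are left inside the YAML single quotes: the quotes then become part of the regex and nothing matches.

```shell
# Pattern as node-exporter should receive it
pattern='^/(sys|proc|dev|host|etc)($|/)'
# Same pattern with literal double quotes left inside the value
quoted_pattern='"^/(sys|proc|dev|host|etc)($|/)"'

# Hypothetical mount points, for illustration only
mounts='/
/data
/sys/fs/cgroup
/proc
/dev/shm
/etc/hosts
/home'

ignored=$(printf '%s\n' "$mounts" | grep -E "$pattern")
kept=$(printf '%s\n' "$mounts" | grep -Ev "$pattern")
# grep -c exits non-zero when the count is 0, hence the "|| true"
quoted_hits=$(printf '%s\n' "$mounts" | grep -Ec "$quoted_pattern" || true)

printf 'ignored:\n%s\n' "$ignored"
printf 'kept:\n%s\n' "$kept"
printf 'matches with stray quotes: %s\n' "$quoted_hits"
```

In this sample, /sys/fs/cgroup, /proc, /dev/shm and /etc/hosts are ignored, while /, /data and /home are kept.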

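To tune what is collected, individual collectors can be switched on or off with the flags below. The list follows the upstream node_exporter README for newer releases, so some of these flags may not exist in v0.16.0; check node_exporter --help in the image you actually run.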

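--collector.arp                Enable the arp collector (default: enabled).
--collector.bcache             Enable the bcache collector (default: enabled).
--collector.bonding            Enable the bonding collector (default: enabled).
--collector.btrfs              Enable the btrfs collector (default: enabled).
--collector.buddyinfo          Enable the buddyinfo collector (default: disabled).
--collector.conntrack          Enable the conntrack collector (default: enabled).
--collector.cpu                Enable the cpu collector (default: enabled).
--collector.cpufreq            Enable the cpufreq collector (default: enabled).
--collector.diskstats          Enable the diskstats collector (default: enabled).
--collector.drbd               Enable the drbd collector (default: disabled).
--collector.edac               Enable the edac collector (default: enabled).
--collector.entropy            Enable the entropy collector (default: enabled).
--collector.ethtool            Enable the ethtool collector (default: disabled).
--collector.fibrechannel       Enable the fibrechannel collector (default: enabled).
--collector.filefd             Enable the filefd collector (default: enabled).
--collector.filesystem         Enable the filesystem collector (default: enabled).
--collector.hwmon              Enable the hwmon collector (default: enabled).
--collector.infiniband         Enable the infiniband collector (default: enabled).
--collector.interrupts         Enable the interrupts collector (default: disabled).
--collector.ipvs               Enable the ipvs collector (default: enabled).
--collector.ksmd               Enable the ksmd collector (default: disabled).
--collector.loadavg            Enable the loadavg collector (default: enabled).
--collector.logind             Enable the logind collector (default: disabled).
--collector.mdadm              Enable the mdadm collector (default: enabled).
--collector.meminfo            Enable the meminfo collector (default: enabled).
--collector.meminfo_numa       Enable the meminfo_numa collector (default: disabled).
--collector.mountstats         Enable the mountstats collector (default: disabled).
--collector.netclass           Enable the netclass collector (default: enabled).
--collector.netdev             Enable the netdev collector (default: enabled).
--collector.netstat            Enable the netstat collector (default: enabled).
--collector.network_route      Enable the network_route collector (default: disabled).
--collector.nfs                Enable the nfs collector (default: enabled).
--collector.nfsd               Enable the nfsd collector (default: enabled).
--collector.ntp                Enable the ntp collector (default: disabled).
--collector.nvme               Enable the nvme collector (default: enabled).
--collector.perf               Enable the perf collector (default: disabled).
--collector.powersupplyclass   Enable the powersupplyclass collector (default: enabled).
--collector.pressure           Enable the pressure collector (default: enabled).
--collector.processes          Enable the processes collector (default: disabled).
--collector.qdisc              Enable the qdisc collector (default: disabled).
--collector.rapl               Enable the rapl collector (default: enabled).
--collector.runit              Enable the runit collector (default: disabled).
--collector.schedstat          Enable the schedstat collector (default: enabled).
--collector.sockstat           Enable the sockstat collector (default: enabled).
--collector.softnet            Enable the softnet collector (default: enabled).
--collector.stat               Enable the stat collector (default: enabled).
--collector.supervisord        Enable the supervisord collector (default: disabled).
--collector.systemd            Enable the systemd collector (default: disabled).
--collector.tapestats          Enable the tapestats collector (default: enabled).
--collector.tcpstat            Enable the tcpstat collector (default: disabled).
--collector.textfile           Enable the textfile collector (default: enabled).
--collector.thermal_zone       Enable the thermal_zone collector (default: enabled).
--collector.time               Enable the time collector (default: enabled).
--collector.timex              Enable the timex collector (default: enabled).
--collector.udp_queues         Enable the udp_queues collector (default: enabled).
--collector.uname              Enable the uname collector (default: enabled).
--collector.vmstat             Enable the vmstat collector (default: enabled).
--collector.wifi               Enable the wifi collector (default: disabled).
--collector.xfs                Enable the xfs collector (default: enabled).
--collector.zfs                Enable the zfs collector (default: enabled).
--collector.zoneinfo           Enable the zoneinfo collector (default: disabled).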
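Collectors that are enabled by default can be turned off with the matching --no-collector.<name> flag. A small sketch of building such an argument list follows; the choice of collectors to disable is purely hypothetical and should be checked against the node_exporter version in use.

```shell
# Hypothetical choice of default-enabled collectors to switch off
disable='nfs nfsd zfs infiniband'

args=''
for c in $disable; do
  args="$args --no-collector.$c"
done
# Drop the leading space left by the loop
args=${args# }

printf '%s\n' "$args"
```

Each resulting flag would become one entry in the args list of the node-exporter container.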

Example (this is the flag name used by newer releases; in v0.16.0 the equivalent flag is --collector.filesystem.ignored-mount-points, as used above):

--collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)

List:

| Collector  | Scope        | Include Flag                         | Exclude Flag                                |
|------------|--------------|--------------------------------------|---------------------------------------------|
| arp        | device       | --collector.arp.device-include       | --collector.arp.device-exclude              |
| cpu        | bugs         | --collector.cpu.info.bugs-include    | N/A                                         |
| cpu        | flags        | --collector.cpu.info.flags-include   | N/A                                         |
| diskstats  | device       | --collector.diskstats.device-include | --collector.diskstats.device-exclude        |
| ethtool    | device       | --collector.ethtool.device-include   | --collector.ethtool.device-exclude          |
| ethtool    | metrics      | --collector.ethtool.metrics-include  | N/A                                         |
| filesystem | fs-types     | N/A                                  | --collector.filesystem.fs-types-exclude     |
| filesystem | mount-points | N/A                                  | --collector.filesystem.mount-points-exclude |
| hwmon      | chip         | --collector.hwmon.chip-include       | --collector.hwmon.chip-exclude              |
| netdev     | device       | --collector.netdev.device-include    | --collector.netdev.device-exclude           |
| qdisk      | device       | --collector.qdisk.device-include     | --collector.qdisk.device-exclude            |
| sysctl     | all          | --collector.sysctl.include           | N/A                                         |
| systemd    | unit         | --collector.systemd.unit-include     | --collector.systemd.unit-exclude            |

Enabled by default

| Name | Description | OS |
|------|-------------|----|
| arp | Exposes ARP statistics from /proc/net/arp. | Linux |
| bcache | Exposes bcache statistics from /sys/fs/bcache/. | Linux |
| bonding | Exposes the number of configured and active slaves of Linux bonding interfaces. | Linux |
| btrfs | Exposes btrfs statistics | Linux |
| boottime | Exposes system boot time derived from the kern.boottime sysctl. | Darwin, Dragonfly, FreeBSD, NetBSD, OpenBSD, Solaris |
| conntrack | Shows conntrack statistics (does nothing if no /proc/sys/net/netfilter/ present). | Linux |
| cpu | Exposes CPU statistics | Darwin, Dragonfly, FreeBSD, Linux, Solaris, OpenBSD |
| cpufreq | Exposes CPU frequency statistics | Linux, Solaris |
| diskstats | Exposes disk I/O statistics. | Darwin, Linux, OpenBSD |
| dmi | Expose Desktop Management Interface (DMI) info from /sys/class/dmi/id/ | Linux |
| edac | Exposes error detection and correction statistics. | Linux |
| entropy | Exposes available entropy. | Linux |
| exec | Exposes execution statistics. | Dragonfly, FreeBSD |
| fibrechannel | Exposes fibre channel information and statistics from /sys/class/fc_host/. | Linux |
| filefd | Exposes file descriptor statistics from /proc/sys/fs/file-nr. | Linux |
| filesystem | Exposes filesystem statistics, such as disk space used. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
| hwmon | Expose hardware monitoring and sensor data from /sys/class/hwmon/. | Linux |
| infiniband | Exposes network statistics specific to InfiniBand and Intel OmniPath configurations. | Linux |
| ipvs | Exposes IPVS status from /proc/net/ip_vs and stats from /proc/net/ip_vs_stats. | Linux |
| loadavg | Exposes load average. | Darwin, Dragonfly, FreeBSD, Linux, NetBSD, OpenBSD, Solaris |
| mdadm | Exposes statistics about devices in /proc/mdstat (does nothing if no /proc/mdstat present). | Linux |
| meminfo | Exposes memory statistics. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
| netclass | Exposes network interface info from /sys/class/net/ | Linux |
| netdev | Exposes network interface statistics such as bytes transferred. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
| netisr | Exposes netisr statistics | FreeBSD |
| netstat | Exposes network statistics from /proc/net/netstat. This is the same information as netstat -s. | Linux |
| nfs | Exposes NFS client statistics from /proc/net/rpc/nfs. This is the same information as nfsstat -c. | Linux |
| nfsd | Exposes NFS kernel server statistics from /proc/net/rpc/nfsd. This is the same information as nfsstat -s. | Linux |
| nvme | Exposes NVMe info from /sys/class/nvme/ | Linux |
| os | Expose OS release info from /etc/os-release or /usr/lib/os-release | any |
| powersupplyclass | Exposes Power Supply statistics from /sys/class/power_supply | Linux |
| pressure | Exposes pressure stall statistics from /proc/pressure/. | Linux (kernel 4.20+ and/or CONFIG_PSI) |
| rapl | Exposes various statistics from /sys/class/powercap. | Linux |
| schedstat | Exposes task scheduler statistics from /proc/schedstat. | Linux |
| selinux | Exposes SELinux statistics. | Linux |
| sockstat | Exposes various statistics from /proc/net/sockstat. | Linux |
| softnet | Exposes statistics from /proc/net/softnet_stat. | Linux |
| stat | Exposes various statistics from /proc/stat. This includes boot time, forks and interrupts. | Linux |
| tapestats | Exposes statistics from /sys/class/scsi_tape. | Linux |
| textfile | Exposes statistics read from local disk. The --collector.textfile.directory flag must be set. | any |
| thermal | Exposes thermal statistics like pmset -g therm. | Darwin |
| thermal_zone | Exposes thermal zone & cooling device statistics from /sys/class/thermal. | Linux |
| time | Exposes the current system time. | any |
| timex | Exposes selected adjtimex(2) system call stats. | Linux |
| udp_queues | Exposes UDP total lengths of the rx_queue and tx_queue from /proc/net/udp and /proc/net/udp6. | Linux |
| uname | Exposes system information as provided by the uname system call. | Darwin, FreeBSD, Linux, OpenBSD |
| vmstat | Exposes statistics from /proc/vmstat. | Linux |
| xfs | Exposes XFS runtime statistics. | Linux (kernel 4.4+) |
| zfs | Exposes ZFS performance statistics. | FreeBSD, Linux, Solaris |

Deployment manifest for node-exporter:

cat > node-export.yaml <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitor-sa
  labels:
    name: node-exporter
spec:
  selector:
    matchLabels:
      name: node-exporter
  template:
    metadata:
      labels:
        name: node-exporter
    spec:
      # Share the host's PID, IPC and network namespaces so host-level metrics are visible
      hostPID: true
      hostIPC: true
      hostNetwork: true
      containers:
      - name: node-exporter
        image: prom/node-exporter:v0.16.0
        ports:
        - containerPort: 9100
        resources:
          requests:
            cpu: 0.15
        securityContext:
          privileged: true
        args:
        - --path.procfs
        - /host/proc
        - --path.sysfs
        - /host/sys
        - --collector.filesystem.ignored-mount-points
        # No inner double quotes here: they would become part of the regex
        - '^/(sys|proc|dev|host|etc)($|/)'
        volumeMounts:
        - name: dev
          mountPath: /host/dev
        - name: proc
          mountPath: /host/proc
        - name: sys
          mountPath: /host/sys
        - name: rootfs
          mountPath: /rootfs
      # Also schedule onto the master nodes
      tolerations:
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: dev
        hostPath:
          path: /dev
      - name: sys
        hostPath:
          path: /sys
      - name: rootfs
        hostPath:
          path: /
EOF
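When a manifest is written through a shell here-document, it matters whether the delimiter is quoted. With an unquoted delimiter the shell expands anything that looks like a variable, such as the ${1} that appears in the Prometheus relabel rules later in this article, before the text reaches the file. A minimal sketch of the difference (the function names and the SHELL-ARG value are hypothetical):

```shell
# Unquoted delimiter: ${1} is replaced by the function's first argument
write_unquoted() {
cat <<EOF
replacement: '${1}:9100'
EOF
}

# Quoted delimiter: the body is emitted verbatim
write_quoted() {
cat <<'EOF'
replacement: '${1}:9100'
EOF
}

unquoted=$(write_unquoted 'SHELL-ARG')
quoted=$(write_quoted 'SHELL-ARG')

printf '%s\n' "$unquoted"
printf '%s\n' "$quoted"
```

Only the quoted form preserves ${1} for Prometheus to interpret, which is why <<'EOF' is the safer choice for these files.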

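II. Deploying the kube-state-metrics collector

kube-state-metrics is the collector dedicated to the state of Kubernetes resources such as Pods, Deployments, DaemonSets, and StatefulSets. Like node-exporter, it only exposes the data; prometheus-server scrapes it.

For example, the service's log shows that it failed to list some resources because its ServiceAccount lacks the permissions. This is not a concern: as with node-exporter, some data simply does not need to be collected: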

E1202 13:10:33.591335 1 reflector.go:156] pkg/mod/k8s.io/client-go@v0.0.0-20191109102209-3c0d1af94be5/tools/cache/reflector.go:108: Failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "secrets" in API group "" at the cluster scope

E1202 13:10:33.592118 1 reflector.go:156] pkg/mod/k8s.io/client-go@v0.0.0-20191109102209-3c0d1af94be5/tools/cache/reflector.go:108: Failed to list *v1beta1.MutatingWebhookConfiguration: mutatingwebhookconfigurations.admissionregistration.k8s.io is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "mutatingwebhookconfigurations" in API group "admissionregistration.k8s.io" at the cluster scope

E1202 13:10:33.593079 1 reflector.go:156] pkg/mod/k8s.io/client-go@v0.0.0-20191109102209-3c0d1af94be5/tools/cache/reflector.go:108: Failed to list *v1.Namespace: networkpolicies.networking.k8s.io is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "networkpolicies" in API group "networking.k8s.io" at the cluster scope

E1202 13:10:33.597030 1 reflector.go:156] pkg/mod/k8s.io/client-go@v0.0.0-20191109102209-3c0d1af94be5/tools/cache/reflector.go:108: Failed to list *v1.ReplicaSet: replicasets.apps is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "replicasets" in API group "apps" at the cluster scope

E1202 13:10:33.599890 1 reflector.go:156] pkg/mod/k8s.io/client-go@v0.0.0-20191109102209-3c0d1af94be5/tools/cache/reflector.go:108: Failed to list *v1beta1.ValidatingWebhookConfiguration: validatingwebhookconfigurations.admissionregistration.k8s.io is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "validatingwebhookconfigurations" in API group "admissionregistration.k8s.io" at the cluster scope

E1202 13:10:34.580372 1 reflector.go:156] pkg/mod/k8s.io/client-go@v0.0.0-20191109102209-3c0d1af94be5/tools/cache/reflector.go:108: Failed to list *v1.StorageClass: storageclasses.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "storageclasses" in API group "storage.k8s.io" at the cluster scope

E1202 13:10:34.580373 1 reflector.go:156] pkg/mod/k8s.io/client-go@v0.0.0-20191109102209-3c0d1af94be5/tools/cache/reflector.go:108: Failed to list *v1.ConfigMap: configmaps is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "configmaps" in API group "" at the cluster scope

E1202 13:10:34.586583 1 reflector.go:156] pkg/mod/k8s.io/client-go@v0.0.0-20191109102209-3c0d1af94be5/tools/cache/reflector.go:108: Failed to list *v1beta1.PodDisruptionBudget: poddisruptionbudgets.policy is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "poddisruptionbudgets" in API group "policy" at the cluster scope

E1202 13:10:34.586669 1 reflector.go:156] pkg/mod/k8s.io/client-go@v0.0.0-20191109102209-3c0d1af94be5/tools/cache/reflector.go:108: Failed to list *v1.Deployment: deployments.apps is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "deployments" in API group "apps" at the cluster scope

E1202 13:10:34.587055 1 reflector.go:156] pkg/mod/k8s.io/client-go@v0.0.0-20191109102209-3c0d1af94be5/tools/cache/reflector.go:108: Failed to list *v1beta1.VolumeAttachment: volumeattachments.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope
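To see at a glance which resources the ServiceAccount cannot list, the denials can be summarized from the log. The following is a rough sketch: the function name is hypothetical, and the two sample lines are abridged from the log above (timestamp and source-file prefix removed).

```shell
forbidden_resources() {
  # Print "<resource> <api-group>" for each RBAC denial read from stdin;
  # the empty (core) API group is shown as "core"
  sed -n 's/.*cannot list resource "\([^"]*\)" in API group "\([^"]*\)".*/\1 \2/p' |
    awk '{ print $1, ($2 == "" ? "core" : $2) }' | sort -u
}

# Abridged sample lines taken from the log above
sample='Failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "secrets" in API group "" at the cluster scope
Failed to list *v1.StorageClass: storageclasses.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:kube-state-metrics" cannot list resource "storageclasses" in API group "storage.k8s.io" at the cluster scope'

missing=$(printf '%s\n' "$sample" | forbidden_resources)
printf '%s\n' "$missing"
```

Against a live cluster, the same function could read from k logs -n kube-system deploy/kube-state-metrics, and each reported resource can be added to the ClusterRole if its metrics are wanted.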

RBAC for kube-state-metrics:

The permission to list ConfigMaps, which was missing in the log above, has already been added here.

cat > kube-state-metrics-rbac.yaml <<'EOF'
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "services", "resourcequotas", "replicationcontrollers", "limitranges", "persistentvolumeclaims", "persistentvolumes", "namespaces", "endpoints"]
  verbs: ["list", "watch"]
- apiGroups: ["extensions"]
  resources: ["daemonsets", "deployments", "replicasets"]
  verbs: ["list", "watch"]
- apiGroups: ["apps"]
  resources: ["statefulsets","daemonsets","replicasets","deployments"]
  verbs: ["list", "watch"]
- apiGroups: ["batch"]
  resources: ["cronjobs", "jobs"]
  verbs: ["list", "watch"]
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["list", "watch"]
# Added to fix the "configmaps is forbidden" and "secrets is forbidden" errors
- apiGroups: [""]
  resources: ["configmaps","secrets"]
  verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system
EOF

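Service for kube-state-metrics:

Note the annotation prometheus.io/scrape: 'true', which marks the Service as something Prometheus is allowed to scrape. The annotation is only a convention: it takes effect when a scrape job in prometheus.yml selects targets by it.

cat > kube-state-metrics-svc.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  name: kube-state-metrics
  namespace: kube-system
  labels:
    app: kube-state-metrics
spec:
  ports:
  - name: kube-state-metrics
    port: 8080
    protocol: TCP
  selector:
    app: kube-state-metrics
EOF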
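The prometheus.yml shown later in this article defines jobs for nodes, cAdvisor and the API server only, so nothing in it reads the prometheus.io/scrape annotation. One way to make the annotation effective is a service-endpoints job along the following lines. This is a sketch adapted from the example Kubernetes configuration distributed with Prometheus, not part of the author's setup; the file name, the job name and the two target label names are assumptions.

```shell
# Sketch only: a scrape job fragment to place under scrape_configs in prometheus.yml
cat > service-endpoints-job.yaml <<'EOF'
- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # Keep only endpoints whose Service carries prometheus.io/scrape: 'true'
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  # Copy Service labels onto the scraped series
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    target_label: kubernetes_name
EOF

# Quick sanity check on the generated fragment
keep_rules=$(grep -c 'action: keep' service-endpoints-job.yaml)
echo "$keep_rules"
```

The fragment would need to be indented to match the other jobs inside the ConfigMap, and the ServiceAccount that Prometheus runs as must be allowed to list endpoints.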

Deployment for kube-state-metrics:

cat > kube-state-metrics-deploy.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
      - name: kube-state-metrics
        # image: gcr.io/google_containers/kube-state-metrics-amd64:v1.3.1
        image: quay.io/coreos/kube-state-metrics:v1.9.0
        ports:
        - containerPort: 8080
EOF

III. Deploying prometheus-server

1. prometheus-cfg (the ConfigMap holding prometheus.yml):

cat > prometheus-cfg.yaml <<'EOF'
---
kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus-config
  namespace: monitor-sa
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 10s
      evaluation_interval: 1m
    scrape_configs:
    # node-exporter on every node: rewrite the kubelet port 10250 to 9100
    - job_name: 'kubernetes-node'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    # cAdvisor metrics, fetched through the API server proxy
    - job_name: 'kubernetes-node-cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    # The API server itself: keep only the default/kubernetes https endpoint
    - job_name: 'kubernetes-apiserver'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
EOF
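The two replace rules in this configuration can be traced by hand. The following sketch mimics them with sed and printf for one node from this cluster; it only illustrates the string rewriting and is not how Prometheus evaluates relabeling internally. The function names are hypothetical.

```shell
# kubernetes-node job: '(.*):10250' -> '${1}:9100'.
# Prometheus anchors relabel regexes to the whole value, hence the explicit ^ and $.
node_exporter_address() {
  printf '%s\n' "$1" | sed -E 's/^(.*):10250$/\1:9100/'
}

# kubernetes-node-cadvisor job: the node name is inserted into the proxy path
cadvisor_path() {
  printf '/api/v1/nodes/%s/proxy/metrics/cadvisor\n' "$1"
}

addr=$(node_exporter_address 192.168.123.14:10250)
path=$(cadvisor_path node4)

echo "$addr"
echo "https://kubernetes.default.svc:443$path"
```

For node4 the node-exporter target becomes 192.168.123.14:9100, while the cAdvisor metrics are requested from the API server at the proxy path for that node.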

2. prometheus-svc:

cat > prometheus-svc.yaml <<'EOF'
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitor-sa
  labels:
    app: prometheus
spec:
  type: NodePort
  ports:
  - port: 9090
    targetPort: 9090
    protocol: TCP
  selector:
    app: prometheus
    component: server
EOF

3. prometheus-deploy:

cat > prometheus-deploy.yaml <<'EOF'
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitor-sa
  labels:
    app: prometheus
spec:
  # Note: both replicas are pinned to node4 (nodeName below) and mount the same hostPath /data
  replicas: 2
  selector:
    matchLabels:
      app: prometheus
      component: server
    #matchExpressions:
    #- {key: app, operator: In, values: [prometheus]}
    #- {key: component, operator: In, values: [server]}
  template:
    metadata:
      labels:
        app: prometheus
        component: server
      annotations:
        prometheus.io/scrape: 'false'
    spec:
      nodeName: node4
      serviceAccountName: monitor
      containers:
      - name: prometheus
        image: prom/prometheus:v2.2.1
        imagePullPolicy: IfNotPresent
        command:
        - prometheus
        - --config.file=/etc/prometheus/prometheus.yml
        - --storage.tsdb.path=/prometheus
        # Keep data for 720 hours (30 days)
        - --storage.tsdb.retention=720h
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/prometheus/prometheus.yml
          name: prometheus-config
          subPath: prometheus.yml
        - mountPath: /prometheus/
          name: prometheus-storage-volume
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-config
          items:
          - key: prometheus.yml
            path: prometheus.yml
            mode: 0644
      # type: Directory requires /data to already exist on node4
      - name: prometheus-storage-volume
        hostPath:
          path: /data
          type: Directory
EOF
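Because the time-series data lives in /data on the host, it helps to estimate how much space the 720h retention may need. The Prometheus storage documentation gives the rule of thumb needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample, at roughly 1 to 2 bytes per sample. A rough sketch of that arithmetic follows; the ingestion rate is a hypothetical figure, not a measurement from this cluster.

```shell
retention_hours=720                       # from --storage.tsdb.retention=720h
retention_days=$((retention_hours / 24))
retention_seconds=$((retention_hours * 3600))

samples_per_second=1000                   # hypothetical ingestion rate
bytes_per_sample=2                        # upper end of the 1-2 bytes rule of thumb

needed_bytes=$((retention_seconds * samples_per_second * bytes_per_sample))
needed_mib=$((needed_bytes / 1024 / 1024))

echo "$retention_days days, about $needed_mib MiB"
```

With these assumed numbers the estimate comes to a little under 5 GiB; the real ingestion rate can be read from the Prometheus metric prometheus_tsdb_head_samples_appended_total once the server is running.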

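After all of the manifests above have been applied, check the Service for prometheus-server:

[root@node4 yaml]# k get svc -n monitor-sa
NAME         TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
prometheus   NodePort   10.96.0.120   <none>        9090:32661/TCP   10d

Using this NodePort (32661 here) and the IP of any node, open a browser to reach the Prometheus web UI.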
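As a small illustration, the address to open can be derived from the PORT(S) column of the output above. The sketch below uses the values from this cluster (node4's IP and the 9090:32661/TCP mapping); the variable names are hypothetical.

```shell
# PORT(S) value as printed by "k get svc -n monitor-sa"
ports='9090:32661/TCP'
# Any node IP works for a NodePort Service; node4 is used here
node_ip=192.168.123.14

# The number after the colon is the NodePort exposed on every node
node_port=$(printf '%s\n' "$ports" | sed -E 's#^[0-9]+:([0-9]+)/TCP$#\1#')
url="http://$node_ip:$node_port"

echo "$url"
```

Opening $url/targets in the browser should list the scrape jobs defined in prometheus.yml.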

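At this point, prometheus-server is installed in the Kubernetes cluster.

Grafana needs nothing beyond a default RPM installation; the only notable step is adding Prometheus as its data source.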
