寫在前面
- 博文偏實戰,內容涉及:
- 集群核心指標(Core Metrics)監控工具Metrics Server的簡介
- Metrics Server的安裝Demo
- 集群自定義指標(Custom Metrics)監控平台簡介:Prometheus、Grafana、NodeExporter
- 通過helm(kube-prometheus-stack)安裝監控平台的Demo
帶著凡世的夢想,將污穢的靈魂依偎在純潔的天邊。它們是所有流浪、追尋、渴望與鄉愁的永恆象徵。 ——赫爾曼·黑塞《彼得.卡門青》
Kubernetes集群性能監控管理
Kubernetes平台搭建好後,了解Kubernetes平台及在此平台上部署的應用的運行狀況,以及處理系統主要告誓及性能瓶頸,這些都依賴監控管理系統。
Kubernetes的早期版本依靠Heapster來實現完整的性能數據採集和監控功能, Kubernetes從1.8版本開始,性能數據開始以Metrics APl的方式提供標準化接口,並且從1.10版本開始將Heapster替換為MetricsServer。
在Kubernetes新的監控體系中:Metrics Server用於提供核心指標(Core Metrics) ,包括Node, Pod的CPU和內存使用指標。對其他自定義指標(Custom Metrics)的監控則由Prometheus等組件來完成。
監控節點狀態,我們使用Docker的話可以通過docker stats.
┌──[root@vms81.liruilongs.GitHub.io]-[~]
└─$docker stats
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
781c898eea19 k8s_kube-scheduler_kube-scheduler-vms81.liruilongs.github.io_kube-system_5bd71ffab3a1f1d18cb589aa74fe082b_18 0.15% 23.22MiB / 3.843GiB 0.59% 0B / 0B 0B / 0B 7
acac8b21bb57 k8s_kube-controller-manager_kube-controller-manager-vms81.liruilongs.github.io_kube-system_93d9ae7b5a4ccec4429381d493b5d475_18 1.18% 59.16MiB / 3.843GiB 1.50% 0B / 0B 0B / 0B 6
fe97754d3dab k8s_calico-node_calico-node-skzjp_kube-system_a211c8be-3ee1-44a0-a4ce-3573922b65b2_14 4.89% 94.25MiB / 3.843GiB 2.39% 0B / 0B 0B / 4.1kB 40
那使用k8s的話,我們可以通過Metrics Server監控Pod和Node的CPU和內存資源使用數據
Metrics Server:集群性能監控平台
Metrics Server在部署完成後,將通過Kubernetes核心API Server 的/apis/metrics.k8s.io/v1beta1路徑提供Pod和Node的監控數據。
安裝Metrics Server
Metrics Server原始碼和部署配置可以在GitHub代碼庫
curl -Ls https://api.github.com/repos/kubernetes-sigs/metrics-server/tarball/v0.3.6 -o metrics-server-v0.3.6.tar.gz
相關鏡像
docker pull mirrorgooglecontainers/metrics-server-amd64:v0.3.6
鏡像小夥伴可以下載一下,這裡我已經下載好了,直接上傳導入鏡像
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible all -m copy -a "src=./metrics-img.tar dest=/root/metrics-img.tar"
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible all -m shell -a "systemctl restart docker "
192.168.26.82 | CHANGED | rc=0 >>
192.168.26.83 | CHANGED | rc=0 >>
192.168.26.81 | CHANGED | rc=0 >>
通過docker命令導入鏡像
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible all -m shell -a "docker load -i /root/metrics-img.tar"
192.168.26.83 | CHANGED | rc=0 >>
Loaded image: k8s.gcr.io/metrics-server-amd64:v0.3.6
192.168.26.81 | CHANGED | rc=0 >>
Loaded image: k8s.gcr.io/metrics-server-amd64:v0.3.6
192.168.26.82 | CHANGED | rc=0 >>
Loaded image: k8s.gcr.io/metrics-server-amd64:v0.3.6
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$
修改metrics-server-deployment.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$mv kubernetes-sigs-metrics-server-d1f4f6f/ metrics
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$cd metrics/
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics]
└─$ls
cmd deploy hack OWNERS README.md version
code-of-conduct.md Gopkg.lock LICENSE OWNERS_ALIASES SECURITY_CONTACTS
CONTRIBUTING.md Gopkg.toml Makefile pkg vendor
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics]
└─$cd deploy/1.8+/
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$ls
aggregated-metrics-reader.yaml metrics-apiservice.yaml resource-reader.yaml
auth-delegator.yaml metrics-server-deployment.yaml
auth-reader.yaml metrics-server-service.yaml
這裡修改一些鏡像獲取策略,因為Githup上的鏡像拉去不下來,或者拉去比較麻煩,所以我們提前上傳好
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$vim metrics-server-deployment.yaml
31 - name: metrics-server
32 image: k8s.gcr.io/metrics-server-amd64:v0.3.6
33 #imagePullPolicy: Always
34 imagePullPolicy: IfNotPresent
35 command:
36 - /metrics-server
37 - --metric-resolution=30s
38 - --kubelet-insecure-tls
39 - --kubelet-preferred-address-types=InternalIP
40 volumeMounts:
運行資源文件,創建相關資源對象
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$kubectl apply -f .
查看pod列表,metrics-server創建成功
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-78d6f96c7b-79xx4 1/1 Running 2 3h15m
calico-node-ntm7v 1/1 Running 1 12h
calico-node-skzjp 1/1 Running 4 12h
calico-node-v7pj5 1/1 Running 1 12h
coredns-545d6fc579-9h2z4 1/1 Running 2 3h15m
coredns-545d6fc579-xgn8x 1/1 Running 2 3h16m
etcd-vms81.liruilongs.github.io 1/1 Running 1 13h
kube-apiserver-vms81.liruilongs.github.io 1/1 Running 2 13h
kube-controller-manager-vms81.liruilongs.github.io 1/1 Running 4 13h
kube-proxy-rbhgf 1/1 Running 1 13h
kube-proxy-vm2sf 1/1 Running 1 13h
kube-proxy-zzbh9 1/1 Running 1 13h
kube-scheduler-vms81.liruilongs.github.io 1/1 Running 5 13h
metrics-server-bcfb98c76-gttkh 1/1 Running 0 70m
通過kubectl top nodes命令測試,
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$kubectl top nodes
W1007 14:23:06.102605 102831 top_node.go:119] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
vms81.liruilongs.github.io 555m 27% 2025Mi 52%
vms82.liruilongs.github.io 204m 10% 595Mi 15%
vms83.liruilongs.github.io 214m 10% 553Mi 14%
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$
Prometheus+Grafana+NodeExporter:集群監控平台
在各個計算節點上部署NodeExporter採集CPU、內存、磁碟及IO信息,並將這些信息傳輸給監控節點上的Prometheus伺服器進行存儲分析,通過Grafana進行可視化監控,
Prometheus
Prometheus是一款開源的監控解決方案,由SoundCloud公司開發的開源監控系統,是繼Kubernetes之後CNCF第2個孵化成功的項目,在容器和微服務領域得到了廣泛應用,能在監控Kubernetes平台的同時監控部署在此平台中的應用,它提供了一系列工具集及多維度監控指標。Prometheus依賴Grafana實現數據可視化。
Prometheus的主要特點如下:
- 使用指標名稱及鍵值對標識的多維度數據模型。
- 採用靈活的查詢語言PromQL。
- 不依賴分布式存儲,為自治的單節點服務。
- 使用HTTP完成對監控數據的拉取。
- 支持通過網關推送時序數據。
- 支持多種圖形和Dashboard的展示,例如Grafana。
Prometheus生態系統由各種組件組成,用於功能的擴充:
組件 |
描述 |
Prometheus Server |
負責監控數據採集和時序數據存儲,並提供數據查詢功能。 |
客戶端SDK |
對接Prometheus的開發工具包。 |
Push Gateway |
推送數據的網關組件。 |
第三方Exporter |
各種外部指標收集系統,其數據可以被Prometheus採集 |
AlertManager |
告警管理器。 |
其他輔助支持工具 |
– |
Prometheus的核心組件Prometheus Server的主要功能包括:
從Kubernetes Master獲取需要監控的資源或服務信息;從各種Exporter抓取(Pull)指標數據,然後將指標數據保存在時序資料庫(TSDB)中;向其他系統提供HTTP API進行查詢;提供基於PromQL語言的數據查詢;可以將告警數據推送(Push)給AlertManager,等等。
Prometheus的系統架構:
NodeExporter
NodeExporter主要用來採集伺服器CPU、內存、磁碟、IO等信息,是機器數據的通用採集方案。只要在宿主機上安裝NodeExporter和cAdisor容器,通過Prometheus進行抓取即可。它同Zabbix的功能相似.
Grafana
Grafana是一個Dashboard工具,用Go和JS開發,它是一個時間序列資料庫的界面展示層,通過SQL命令查詢出Metrics並將結果展示出來。它能自定義多種儀錶盤,可以輕鬆實現覆蓋多個Docker的宿主機監控信息的展現。
搭建Prometheus+Grafana+NodeExporter平台
這裡我們通過helm的方式搭建,簡單方便快捷,運行之後,相關的鏡像都會創建成功.下面是創建成功的鏡像列表。
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
alertmanager-liruilong-kube-prometheus-alertmanager-0 2/2 Running 0 61m
liruilong-grafana-5955564c75-zpbjq 3/3 Running 0 62m
liruilong-kube-prometheus-operator-5cb699b469-fbkw5 1/1 Running 0 62m
liruilong-kube-state-metrics-5dcf758c47-bbwt4 1/1 Running 7 (32m ago) 62m
liruilong-prometheus-node-exporter-rfsc5 1/1 Running 0 62m
liruilong-prometheus-node-exporter-vm7s9 1/1 Running 0 62m
liruilong-prometheus-node-exporter-z9j8b 1/1 Running 0 62m
prometheus-liruilong-kube-prometheus-prometheus-0 2/2 Running 0 61m
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$
環境版本
我的K8s集群版本
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$kubectl get nodes
NAME STATUS ROLES AGE VERSION
vms81.liruilongs.github.io Ready control-plane,master 34d v1.22.2
vms82.liruilongs.github.io Ready <none> 34d v1.22.2
vms83.liruilongs.github.io Ready <none> 34d v1.22.2
hrlm版本
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$helm version
version.BuildInfo{Version:"v3.2.1", GitCommit:"fe51cd1e31e6a202cba7dead9552a6d418ded79a", GitTreeState:"clean", GoVersion:"go1.13.10"}
prometheus-operator(舊名字)安裝出現的問題
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$helm search repo prometheus-operator
NAME CHART VERSION APP VERSION DESCRIPTION
ali/prometheus-operator 8.7.0 0.35.0 Provides easy monitoring definitions for Kubern...
azure/prometheus-operator 9.3.2 0.38.1 DEPRECATED Provides easy monitoring definitions...
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$helm install liruilong ali/prometheus-operator
Error: failed to install CRD crds/crd-alertmanager.yaml: unable to recognize "": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$helm pull ali/prometheus-operator
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$
解決辦法:新版本安裝
直接下載kube-prometheus-stack(新)的chart包,通過命令安裝:
https://github.com/prometheus-community/helm-charts/releases/download/kube-prometheus-stack-30.0.1/kube-prometheus-stack-30.0.1.tgz
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$ls
index.yaml kube-prometheus-stack-30.0.1.tgz liruilonghelm liruilonghelm-0.1.0.tgz mysql mysql-1.6.4.tgz
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
解壓chart包kube-prometheus-stack-30.0.1.tgz
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$tar -zxf kube-prometheus-stack-30.0.1.tgz
創建新的命名空間
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$cd kube-prometheus-stack/
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl create ns monitoring
namespace/monitoring created
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl config set-context $(kubectl config current-context) --namespace=monitoring
Context "kubernetes-admin@kubernetes" modified.
進入文件夾,直接通過helm install liruilong .安裝
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$ls
Chart.lock charts Chart.yaml CONTRIBUTING.md crds README.md templates values.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$helm install liruilong .
kube-prometheus-admission-create對應Pod的相關鏡像下載不下來問題
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
liruilong-kube-prometheus-admission-create--1-bn7x2 0/1 ImagePullBackOff 0 33s
查看pod詳細信息,發現是谷歌的一個鏡像國內無法下載
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$kubectl describe pod liruilong-kube-prometheus-admission-create--1-bn7x2
Name: liruilong-kube-prometheus-admission-create--1-bn7x2
Namespace: monitoring
Priority: 0
Node: vms83.liruilongs.github.io/192.168.26.83
Start Time: Sun, 16 Jan 2022 02:43:07 +0800
Labels: app=kube-prometheus-stack-admission-create
app.kubernetes.io/instance=liruilong
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/part-of=kube-prometheus-stack
app.kubernetes.io/version=30.0.1
chart=kube-prometheus-stack-30.0.1
controller-uid=2ce48cd2-a118-4e23-a27f-0228ef6c45e7
heritage=Helm
job-name=liruilong-kube-prometheus-admission-create
release=liruilong
Annotations: cni.projectcalico.org/podIP: 10.244.70.8/32
cni.projectcalico.org/podIPs: 10.244.70.8/32
Status: Pending
IP: 10.244.70.8
IPs:
IP: 10.244.70.8
Controlled By: Job/liruilong-kube-prometheus-admission-create
Containers:
create:
Container ID:
Image: k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068
Image ID:
Port: <none>
Host Port:
。。。。。。。。。。。。。。。。。。。。。。。。。。。
在dokcer倉庫里找了一個類似的,通過 kubectl edit修改
image: k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0 替換為 : docker.io/liangjw/kube-webhook-certgen:v1.1.1
或者也可以修改配置文件從新install(記得要把sha注釋掉)
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$ls
index.yaml kube-prometheus-stack kube-prometheus-stack-30.0.1.tgz liruilonghelm liruilonghelm-0.1.0.tgz mysql mysql-1.6.4.tgz
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$cd kube-prometheus-stack/
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$ls
Chart.lock charts Chart.yaml CONTRIBUTING.md crds README.md templates values.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$cat values.yaml | grep -A 3 -B 2 kube-webhook-certgen
enabled: true
image:
repository: docker.io/liangjw/kube-webhook-certgen
tag: v1.1.1
#sha: "f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068"
pullPolicy: IfNotPresent
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$helm del liruilong;helm install liruilong .
之後其他的相關pod正常創建中
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
liruilong-grafana-5955564c75-zpbjq 0/3 ContainerCreating 0 27s
liruilong-kube-prometheus-operator-5cb699b469-fbkw5 0/1 ContainerCreating 0 27s
liruilong-kube-state-metrics-5dcf758c47-bbwt4 0/1 ContainerCreating 0 27s
liruilong-prometheus-node-exporter-rfsc5 0/1 ContainerCreating 0 28s
liruilong-prometheus-node-exporter-vm7s9 0/1 ContainerCreating 0 28s
liruilong-prometheus-node-exporter-z9j8b 0/1 ContainerCreating 0 28s
kube-state-metrics這個pod的鏡像也沒有拉取下來。應該也是相同的原因
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
alertmanager-liruilong-kube-prometheus-alertmanager-0 2/2 Running 0 3m35s
liruilong-grafana-5955564c75-zpbjq 3/3 Running 0 4m46s
liruilong-kube-prometheus-operator-5cb699b469-fbkw5 1/1 Running 0 4m46s
liruilong-kube-state-metrics-5dcf758c47-bbwt4 0/1 ImagePullBackOff 0 4m46s
liruilong-prometheus-node-exporter-rfsc5 1/1 Running 0 4m47s
liruilong-prometheus-node-exporter-vm7s9 1/1 Running 0 4m47s
liruilong-prometheus-node-exporter-z9j8b 1/1 Running 0 4m47s
prometheus-liruilong-kube-prometheus-prometheus-0 2/2 Running 0 3m34s
同樣 k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.3.0 這個鏡像沒辦法拉取
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl describe pod liruilong-kube-state-metrics-5dcf758c47-bbwt4
Name: liruilong-kube-state-metrics-5dcf758c47-bbwt4
Namespace: monitoring
Priority: 0
Node: vms82.liruilongs.github.io/192.168.26.82
Start Time: Sun, 16 Jan 2022 02:59:53 +0800
Labels: app.kubernetes.io/component=metrics
app.kubernetes.io/instance=liruilong
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=kube-state-metrics
app.kubernetes.io/part-of=kube-state-metrics
app.kubernetes.io/version=2.3.0
helm.sh/chart=kube-state-metrics-4.3.0
pod-template-hash=5dcf758c47
release=liruilong
Annotations: cni.projectcalico.org/podIP: 10.244.171.153/32
cni.projectcalico.org/podIPs: 10.244.171.153/32
Status: Pending
IP: 10.244.171.153
IPs:
IP: 10.244.171.153
Controlled By: ReplicaSet/liruilong-kube-state-metrics-5dcf758c47
Containers:
kube-state-metrics:
Container ID:
Image: k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.3.0
Image ID:
Port: 8080/TCP
。。。。。。。。。。。。。。。。。。。。。。
同樣的,我們通過docker倉庫找一下相同的,然後通過kubectl edit pod修改一下
k8s.gcr.io/kube-state-metrics/kube-state-metrics 替換為: docker.io/dyrnq/kube-state-metrics:v2.3.0
可以先在節點機上拉取一下
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ ansible node -m shell -a "docker pull dyrnq/kube-state-metrics:v2.3.0"
192.168.26.82 | CHANGED | rc=0 >>
v2.3.0: Pulling from dyrnq/kube-state-metrics
e8614d09b7be: Pulling fs layer
53ccb90bafd7: Pulling fs layer
e8614d09b7be: Verifying Checksum
e8614d09b7be: Download complete
e8614d09b7be: Pull complete
53ccb90bafd7: Verifying Checksum
53ccb90bafd7: Download complete
53ccb90bafd7: Pull complete
Digest: sha256:c9137505edaef138cc23479c73e46e9a3ef7ec6225b64789a03609c973b99030
Status: Downloaded newer image for dyrnq/kube-state-metrics:v2.3.0
docker.io/dyrnq/kube-state-metrics:v2.3.0
192.168.26.83 | CHANGED | rc=0 >>
v2.3.0: Pulling from dyrnq/kube-state-metrics
e8614d09b7be: Pulling fs layer
53ccb90bafd7: Pulling fs layer
e8614d09b7be: Verifying Checksum
e8614d09b7be: Download complete
e8614d09b7be: Pull complete
53ccb90bafd7: Verifying Checksum
53ccb90bafd7: Download complete
53ccb90bafd7: Pull complete
Digest: sha256:c9137505edaef138cc23479c73e46e9a3ef7ec6225b64789a03609c973b99030
Status: Downloaded newer image for dyrnq/kube-state-metrics:v2.3.0
docker.io/dyrnq/kube-state-metrics:v2.3.0
修改完之後,會發現所有的pod都創建成功
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
alertmanager-liruilong-kube-prometheus-alertmanager-0 2/2 Running 0 61m
liruilong-grafana-5955564c75-zpbjq 3/3 Running 0 62m
liruilong-kube-prometheus-operator-5cb699b469-fbkw5 1/1 Running 0 62m
liruilong-kube-state-metrics-5dcf758c47-bbwt4 1/1 Running 7 (32m ago) 62m
liruilong-prometheus-node-exporter-rfsc5 1/1 Running 0 62m
liruilong-prometheus-node-exporter-vm7s9 1/1 Running 0 62m
liruilong-prometheus-node-exporter-z9j8b 1/1 Running 0 62m
prometheus-liruilong-kube-prometheus-prometheus-0 2/2 Running 0 61m
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$
然後我們需要修改liruilong-grafana SVC的類型為NodePort,這樣,物理機就可以訪問了
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 33m
liruilong-grafana ClusterIP 10.99.220.121 <none> 80/TCP 34m
liruilong-kube-prometheus-alertmanager ClusterIP 10.97.193.228 <none> 9093/TCP 34m
liruilong-kube-prometheus-operator ClusterIP 10.101.106.93 <none> 443/TCP 34m
liruilong-kube-prometheus-prometheus ClusterIP 10.105.176.19 <none> 9090/TCP 34m
liruilong-kube-state-metrics ClusterIP 10.98.94.55 <none> 8080/TCP 34m
liruilong-prometheus-node-exporter ClusterIP 10.110.216.215 <none> 9100/TCP 34m
prometheus-operated ClusterIP None <none> 9090/TCP 33m
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl edit svc liruilong-grafana
service/liruilong-grafana edited
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 35m
liruilong-grafana NodePort 10.99.220.121 <none> 80:30443/TCP 36m
liruilong-kube-prometheus-alertmanager ClusterIP 10.97.193.228 <none> 9093/TCP 36m
liruilong-kube-prometheus-operator ClusterIP 10.101.106.93 <none> 443/TCP 36m
liruilong-kube-prometheus-prometheus ClusterIP 10.105.176.19 <none> 9090/TCP 36m
liruilong-kube-state-metrics ClusterIP 10.98.94.55 <none> 8080/TCP 36m
liruilong-prometheus-node-exporter ClusterIP 10.110.216.215 <none> 9100/TCP 36m
prometheus-operated ClusterIP None <none> 9090/TCP 35m
物理機訪問 |
通過secrets解密獲取用戶名密碼
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl get secrets | grep grafana
liruilong-grafana Opaque 3 38m
liruilong-grafana-test-token-q8z8j kubernetes.io/service-account-token 3 38m
liruilong-grafana-token-j94p8 kubernetes.io/service-account-token 3 38m
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl get secrets liruilong-grafana -o yaml
apiVersion: v1
data:
admin-password: cHJvbS1vcGVyYXRvcg==
admin-user: YWRtaW4=
ldap-toml: ""
kind: Secret
metadata:
annotations:
meta.helm.sh/release-name: liruilong
meta.helm.sh/release-namespace: monitoring
creationTimestamp: "2022-01-15T18:59:40Z"
labels:
app.kubernetes.io/instance: liruilong
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: grafana
app.kubernetes.io/version: 8.3.3
helm.sh/chart: grafana-6.20.5
name: liruilong-grafana
namespace: monitoring
resourceVersion: "1105663"
uid: c03ff5f3-deb5-458c-8583-787f41034469
type: Opaque
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl get secrets liruilong-grafana -o jsonpath='{.data.admin-user}}'| base64 -d
adminbase64: 輸入無效
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl get secrets liruilong-grafana -o jsonpath='{.data.admin-password}}'| base64 -d
prom-operatorbase64: 輸入無效
得到用戶名密碼:admin/prom-operator
正常登錄,查看監控信息 |