Metrics Server
What is the background of Metrics Server?
Resource Metrics API is an effort to provide a first-class Kubernetes API (stable, versioned, discoverable, available through apiserver and with client support) that serves resource usage metrics for pods and nodes.
What problems does the existing architecture have?
The API has already been implemented in Heapster, but users and Kubernetes components can only access it through the master proxy mechanism and have to decode it on their own. Heapster serves the API using the Go http library, which lacks a number of features that the Kubernetes API server offers, such as authentication/authorization and client generation.
How does Metrics Server collect and process node and Pod metrics?
It will be a cluster level component which periodically scrapes metrics from all Kubernetes nodes served by Kubelet through Summary API.
Then metrics will be aggregated, stored in memory (see Scalability limitations) and served in Metrics API format.
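The scrape, store, and serve cycle described above can be sketched roughly as follows. This is an illustration only, not the metrics-server implementation: the class and field names are hypothetical, `fake_summary` stands in for a real call to the kubelet Summary API, and the sketch assumes that only the most recent sample per node is retained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Sample:
    """One usage sample for a node (field names are illustrative)."""
    timestamp: str
    cpu_millicores: int
    memory_kib: int


class LatestOnlyStore:
    """Keeps only the newest sample per node, all in process memory."""

    def __init__(self) -> None:
        self._latest: Dict[str, Sample] = {}

    def scrape(self, nodes: Iterable[str], fetch: Callable[[str], Sample]) -> None:
        # One scrape cycle: ask every node for its current usage and
        # overwrite whatever was stored before. No history is kept.
        for node in nodes:
            self._latest[node] = fetch(node)

    def get(self, node: str) -> Sample:
        return self._latest[node]


def fake_summary(node: str) -> Sample:
    # Stand-in for a kubelet Summary API call (hypothetical values).
    return Sample("2018-10-09T07:12:00Z", 164, 1346644)


store = LatestOnlyStore()
store.scrape(["node001", "node002"], fake_summary)
print(store.get("node001"))
```

Because each cycle overwrites the previous sample, a restart of the process loses everything, which is why a store like this cannot answer questions about past usage.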
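Which metrics does Metrics Server collect?
The documentation does not say. Judging from the command output below, my guess is that it collects only CPU and memory usage. The collection interval is not documented either, although the API responses below report a window of 1m0s.
Get metrics with kubectl top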
# kubectl top node
NAME      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node001   152m         15%    1302Mi          74%
node002   44m          4%     521Mi           29%
node003   60m          6%     666Mi           38%
# kubectl top pod
NAME                     CPU(cores)   MEMORY(bytes)
client-bbf58867f-5qmcc   0m           0Mi
myapp-8b4d65cd6-gglt9    0m           2Mi
Get metrics with curl (through a local kubectl proxy listening on port 8080)
# curl http://localhost:8080/apis/metrics.k8s.io/v1beta1/nodes
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
  },
  "items": [
    {
      "metadata": {
        "name": "node001",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node001",
        "creationTimestamp": "2018-10-09T07:12:35Z"
      },
      "timestamp": "2018-10-09T07:12:00Z",
      "window": "1m0s",
      "usage": {
        "cpu": "164m",
        "memory": "1346644Ki"
      }
    },
    {
      "metadata": {
        "name": "node002",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node002",
        "creationTimestamp": "2018-10-09T07:12:35Z"
      },
      "timestamp": "2018-10-09T07:12:00Z",
      "window": "1m0s",
      "usage": {
        "cpu": "46m",
        "memory": "526960Ki"
      }
    },
    {
      "metadata": {
        "name": "node003",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node003",
        "creationTimestamp": "2018-10-09T07:12:35Z"
      },
      "timestamp": "2018-10-09T07:12:00Z",
      "window": "1m0s",
      "usage": {
        "cpu": "60m",
        "memory": "677312Ki"
      }
    }
  ]
}
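The usage values in this response are Kubernetes quantity strings ("164m" is 164 millicores, "1346644Ki" is kibibytes). A minimal sketch of turning them into numbers, similar in spirit to what kubectl top prints, might look like this. It handles only the suffixes that appear in the output above; other forms that the API can return are not covered, and the helper names are made up for this example.

```python
import json

# Binary suffixes used by the memory quantities in this output.
_BINARY = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def cpu_to_millicores(q: str) -> int:
    """'164m' -> 164, '2' -> 2000. Only the forms seen above are handled."""
    if q.endswith("m"):
        return int(q[:-1])
    return int(float(q) * 1000)


def memory_to_bytes(q: str) -> int:
    """'1346644Ki' -> bytes. A bare number is taken as bytes."""
    for suffix, factor in _BINARY.items():
        if q.endswith(suffix):
            return int(q[:-2]) * factor
    return int(q)


# Trimmed copy of the NodeMetricsList shown above.
sample = json.loads("""
{"kind": "NodeMetricsList", "items": [
  {"metadata": {"name": "node001"}, "usage": {"cpu": "164m", "memory": "1346644Ki"}},
  {"metadata": {"name": "node002"}, "usage": {"cpu": "46m", "memory": "526960Ki"}}
]}
""")

for item in sample["items"]:
    cpu = cpu_to_millicores(item["usage"]["cpu"])
    mem_mi = memory_to_bytes(item["usage"]["memory"]) // 1024**2
    print(item["metadata"]["name"], f"{cpu}m", f"{mem_mi}Mi")
```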
Get metrics for a specific Pod
# curl http://localhost:8080/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/myapp-8b4d65cd6-gglt9
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "myapp-8b4d65cd6-gglt9",
    "namespace": "default",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/myapp-8b4d65cd6-gglt9",
    "creationTimestamp": "2018-10-09T07:23:44Z"
  },
  "timestamp": "2018-10-09T07:23:00Z",
  "window": "1m0s",
  "containers": [
    {
      "name": "myapp",
      "usage": {
        "cpu": "0",
        "memory": "2096Ki"
      }
    }
  ]
}
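The request paths used in the curl commands follow a regular pattern: node metrics are cluster-scoped, while Pod metrics sit under a namespace. The helper functions below are hypothetical and only illustrate how those paths are assembled. The per-namespace list form (`/namespaces/<ns>/pods`) does not appear in these notes and is assumed from the usual Kubernetes URL layout.

```python
from typing import Optional

API_ROOT = "/apis/metrics.k8s.io/v1beta1"


def node_metrics_path(name: Optional[str] = None) -> str:
    """Path for all nodes, or for one node when a name is given."""
    return f"{API_ROOT}/nodes" + (f"/{name}" if name else "")


def pod_metrics_path(namespace: Optional[str] = None,
                     name: Optional[str] = None) -> str:
    """Path for all Pods, the Pods of one namespace, or a single Pod."""
    if namespace is None:
        if name is not None:
            raise ValueError("a Pod name requires a namespace")
        return f"{API_ROOT}/pods"
    base = f"{API_ROOT}/namespaces/{namespace}/pods"
    return base + (f"/{name}" if name else "")


print(node_metrics_path("node001"))
print(pod_metrics_path("default", "myapp-8b4d65cd6-gglt9"))
```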
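How does Metrics Server store metrics? For how long? How is history kept?
Then metrics will be aggregated, stored in memory (see Scalability limitations) and served in Metrics API format.
The quoted design text describes only an in-memory store; it mentions no persistent storage, retention period, or history.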
How do you deploy Metrics Server?
Download the manifest files
https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/metrics-server
# for i in auth-delegator.yaml auth-reader.yaml metrics-apiservice.yaml metrics-server-deployment.yaml metrics-server-service.yaml resource-reader.yaml; do wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/metrics-server/$i; done
Create the resources
# kubectl apply -f .
Check that the API groups now include the metrics group
# kubectl api-versions | grep metric
metrics.k8s.io/v1beta1
Call the API server with curl to fetch node and Pod data (at this point the items lists are still empty)
# curl http://localhost:8080/apis/metrics.k8s.io/v1beta1
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "nodes",
      "singularName": "",
      "namespaced": false,
      "kind": "NodeMetrics",
      "verbs": [
        "get",
        "list"
      ]
    },
    {
      "name": "pods",
      "singularName": "",
      "namespaced": true,
      "kind": "PodMetrics",
      "verbs": [
        "get",
        "list"
      ]
    }
  ]
}
# curl http://localhost:8080/apis/metrics.k8s.io/v1beta1/nodes
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
  },
  "items": []
}
# curl http://localhost:8080/apis/metrics.k8s.io/v1beta1/pods
{
  "kind": "PodMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/pods"
  },
  "items": []
}
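The discovery response above is what tells clients which resources the metrics group serves and with which verbs. As a rough sketch, a script could check that both `nodes` and `pods` are advertised before going further. The function names are invented for this example, and "ready" here means only that the resources are registered, not that any samples exist yet.

```python
import json

# Trimmed copy of the discovery document returned by
# /apis/metrics.k8s.io/v1beta1 above.
discovery = json.loads("""
{
  "kind": "APIResourceList",
  "groupVersion": "metrics.k8s.io/v1beta1",
  "resources": [
    {"name": "nodes", "namespaced": false, "kind": "NodeMetrics", "verbs": ["get", "list"]},
    {"name": "pods", "namespaced": true, "kind": "PodMetrics", "verbs": ["get", "list"]}
  ]
}
""")


def served_resources(doc: dict) -> dict:
    """Map resource name -> (kind, namespaced, verbs)."""
    return {
        r["name"]: (r["kind"], r["namespaced"], tuple(r["verbs"]))
        for r in doc["resources"]
    }


def metrics_api_registered(doc: dict) -> bool:
    # Only checks that both expected resources are advertised; it says
    # nothing about whether any samples have been collected.
    return {"nodes", "pods"} <= set(served_resources(doc))


print(served_resources(discovery))
print(metrics_api_registered(discovery))
```

Both resources expose only `get` and `list`, which matches the read-only nature of the data: clients can query usage but cannot write it.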
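Modify the Deployment manifest
I am on commit 13d59fd, where the two default images are metrics-server-amd64:v0.3.1 and k8s.gcr.io/addon-resizer:1.8.3. I ran into many problems with these versions and could not find solutions, so I downgraded metrics-server to 0.2.1 and addon-resizer to 1.8.2. Only after also setting the parameters shown below did it start successfully.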
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metrics-server-config
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  NannyConfiguration: |-
    apiVersion: nannyconfig/v1alpha1
    kind: NannyConfiguration
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server-v0.2.1
  namespace: kube-system
  labels:
    k8s-app: metrics-server
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    version: v0.2.1
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
      version: v0.2.1
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
        version: v0.2.1
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
        seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.2.1
        # Pulls from k8s.gcr.io time out behind the firewall, so set the policy
        # explicitly and skip the pull when the image already exists locally
        imagePullPolicy: IfNotPresent
        command:
        # Startup arguments for metrics-server.
        # By default it reaches the kubelet over HTTP on the read-only port 10255,
        # so switch to HTTPS and set the correct kubelet port (10250).
        # The certificates are self-signed by Kubernetes, so certificate
        # verification is skipped (insecure=true).
        - /metrics-server
        - --source=kubernetes.summary_api:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true
        ports:
        - containerPort: 443
          name: https
          protocol: TCP
      - name: metrics-server-nanny
        image: k8s.gcr.io/addon-resizer:1.8.2
        # Same reason as above: skip the pull when the image exists locally
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            cpu: 100m
            memory: 300Mi
          requests:
            cpu: 5m
            memory: 50Mi
        env:
          - name: MY_POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: MY_POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
        volumeMounts:
        - name: metrics-server-config-volume
          mountPath: /etc/config
        command:
          - /pod_nanny
          - --config-dir=/etc/config
          - --cpu=40m
          - --extra-cpu=0.5m
          - --memory=40Mi
          - --extra-memory=4Mi
          - --threshold=5
          - --deployment=metrics-server-v0.2.1
          - --container=metrics-server
          - --poll-period=300000
          - --estimator=exponential
      volumes:
        - name: metrics-server-config-volume
          configMap:
            name: metrics-server-config
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
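The `--source` flag in the manifest packs the source type, the API server address, and the kubelet options into one string. The sketch below simply pulls that string apart with Python's standard URL parser to make the three kubelet-related options explicit. It is only a reading aid: `parse_source` is a hypothetical helper, and metrics-server parses the flag in its own way.

```python
from urllib.parse import parse_qs, urlsplit

FLAG = ("--source=kubernetes.summary_api:https://kubernetes.default"
        "?kubeletHttps=true&kubeletPort=10250&insecure=true")


def parse_source(flag: str):
    """Split a --source flag into (source type, API server URL, options)."""
    value = flag.split("=", 1)[1]      # drop the leading "--source="
    kind, url = value.split(":", 1)    # "kubernetes.summary_api" : "https://..."
    parts = urlsplit(url)
    # Each option appears once, so keep the first value of every key.
    options = {k: v[0] for k, v in parse_qs(parts.query).items()}
    return kind, f"{parts.scheme}://{parts.netloc}", options


kind, api_server, options = parse_source(FLAG)
print(kind)
print(api_server)
print(options)
```

Read this way, the flag says: use the Summary API source, find nodes through the in-cluster API server address, talk to each kubelet over HTTPS on port 10250, and skip certificate verification.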
Apply the manifest to create the resources
# kubectl apply -f metrics-server-deployment.yaml
Check the Pod status
# kubectl get pods -n kube-system
# kubectl describe pods metrics-server-v0.2.1-777b86c5c9-lrh9c -n kube-system
Check the Pod logs; output like the following indicates a normal start
# kubectl logs metrics-server-v0.2.1-777b86c5c9-lrh9c -n kube-system metrics-server
I1008 01:49:33.578105 1 heapster.go:71] /metrics-server --source=kubernetes.summary_api:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true
I1008 01:49:33.578151 1 heapster.go:72] Metrics Server version v0.2.1
I1008 01:49:33.578343 1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version
I1008 01:49:33.578353 1 configs.go:62] Using kubelet port 10250
I1008 01:49:33.578641 1 heapster.go:128] Starting with Metric Sink
I1008 01:49:33.727913 1 serving.go:308] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
I1008 01:49:34.017904 1 heapster.go:101] Starting Heapster API server...
[restful] 2018/10/08 01:49:34 log.go:33: [restful/swagger] listing is available at https:///swaggerapi
[restful] 2018/10/08 01:49:34 log.go:33: [restful/swagger] https:///swaggerui/ is mapped to folder /swagger-ui/
I1008 01:49:34.018917 1 serve.go:85] Serving securely on 0.0.0.0:443
If the logs show no obvious errors, confirm further by checking that data is actually being collected
Terminal A # kubectl proxy --port 8080
Terminal B # curl http://localhost:8080/apis/metrics.k8s.io/v1beta1/nodes
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
  },
  "items": [
    {
      "metadata": {
        "name": "node001",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node001",
        "creationTimestamp": "2018-10-08T02:12:44Z"
      },
      "timestamp": "2018-10-08T02:12:00Z",
      "window": "1m0s",
      "usage": {
        "cpu": "138m",
        "memory": "1224640Ki"
      }
    },
    {
      "metadata": {
        "name": "node002",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node002",
        "creationTimestamp": "2018-10-08T02:12:44Z"
      },
      "timestamp": "2018-10-08T02:12:00Z",
      "window": "1m0s",
      "usage": {
        "cpu": "35m",
        "memory": "408520Ki"
      }
    },
    {
      "metadata": {
        "name": "node003",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node003",
        "creationTimestamp": "2018-10-08T02:12:44Z"
      },
      "timestamp": "2018-10-08T02:12:00Z",
      "window": "1m0s",
      "usage": {
        "cpu": "48m",
        "memory": "523404Ki"
      }
    }
  ]
}
Alternatively, use kubectl top to get the resource usage data
# kubectl top nodes
NAME      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node001   135m         13%    1205Mi          69%
node002   36m          3%     401Mi           23%
node003   46m          4%     518Mi           29%
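The CPU% column can be read as usage divided by the CPU the node makes available to Pods. The sketch below reproduces the three percentages above under two assumptions that these notes do not state: each node offers 1000m (one core) of allocatable CPU, a value inferred from the numbers themselves, and the percentage is truncated rather than rounded.

```python
def percent_of(usage_milli: int, allocatable_milli: int) -> int:
    """Usage as a whole-number percentage of allocatable, truncated."""
    return usage_milli * 100 // allocatable_milli


# Assumed allocatable CPU per node (not shown anywhere in the output).
ALLOCATABLE_MILLI = 1000

# CPU usage in millicores, taken from the kubectl top nodes output above.
usage = {"node001": 135, "node002": 36, "node003": 46}

for node, milli in usage.items():
    print(node, f"{milli}m", f"{percent_of(milli, ALLOCATABLE_MILLI)}%")
```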
That completes the metrics-server deployment. It only provides resource metrics for nodes and Pods, though; custom metrics require Prometheus.
The catch is that Kubernetes cannot consume the Prometheus data format directly, so an adapter is needed to convert the data: k8s-prometheus-adapter.
References
- Metrics Server design: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/metrics-server.md
- Monitoring architecture: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/monitoring_architecture.md
- addon-resizer: https://github.com/kubernetes/autoscaler/tree/master/addon-resizer