[kube-prometheus] cAdvisor metrics are unavailable with Kubeadm default deploy at v1.7.3+ #633
Comments
I'm not super familiar with kubeadm, but the change in 1.7.3 was unrelated to the cAdvisor metrics endpoint exposed on port 4194. In terms of this issue, I think this was actually a firewall issue: the worker that runs the Prometheus pod doesn't have network access to the kubelet.
@brancz I've dug into the actual kubelet configuration and you're right: the default kubeadm configuration disables the cAdvisor port (with the `--cadvisor-port=0` flag). Actually, I've just found the original thread on the Prometheus repository, which already covers this explanation. Personally, I'm not very willing to change the kubeadm settings and redeploy the cluster just to enable that port while there is already another endpoint working (and changing a manifest is enough to use it). But that's just me 😄
Me too, with kube-aws. It seems like scraping the /metrics/cadvisor endpoint will eventually be preferred? We'll maintain a fork until this is resolved (see commit pingback).
We'll probably move to the /metrics/cadvisor endpoint as well.
Is it meant to be adding a kubelet/4194 target by default? We have k8s 1.7.7 built with bootkube and, as the original poster described, we get kubelet metrics but no cAdvisor metrics; checking Prometheus' targets, it's scraping :10250 as normal.

I don't really mind whether it gets the cAdvisor metrics from 4194/metrics or 10250/metrics/cadvisor, but how do we get it to do either? It's clear from the Kubernetes issues that they don't consider breaking these metrics out into a separate endpoint a big deal, so they're unlikely to put them back into 10250/metrics. For people hand-rolling their Prometheus config it's not too difficult, but when the operator is doing it, we lack the control to add things in (one day, Prometheus will get the hang of multiple config sources). So, what have people who use prometheus-operator been doing to get those metrics back?
Can you explain more what you think is not working today? I'm saying this because I think all combinations are possible today, but possibly not well enough documented. If one wants to use the 4194/metrics endpoint, that's already reflected in the kube-prometheus manifests. If one wants to use the 10250/metrics/cadvisor metrics, one has to modify the ServiceMonitor endpoint provided in kube-prometheus and set the explicit metrics path. Does that clear things up?
@ghostflame To use the 10250/metrics/cadvisor endpoint, change

```yaml
- port: cadvisor
  interval: 30s
  honorLabels: true
```

to

```yaml
- path: /metrics/cadvisor
  port: http-metrics
  interval: 30s
  honorLabels: true
```

and redeploy. Hope this helps...
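For context, the effect of that second endpoint entry is roughly the following generated Prometheus scrape configuration. This is a heavily simplified sketch: the operator also adds service-discovery relabelling that is omitted here, and the job name is made up for illustration.

```yaml
# Hypothetical, simplified view of what the operator generates from the
# ServiceMonitor endpoint above; not the exact output.
- job_name: kubelet-cadvisor          # illustrative name only
  metrics_path: /metrics/cadvisor     # from `path:` in the ServiceMonitor
  honor_labels: true                  # from `honorLabels:`
  scrape_interval: 30s                # from `interval:`
  scheme: http                        # the http-metrics port is the kubelet read-only port
  kubernetes_sd_configs:
    - role: endpoints                 # the operator discovers the kubelet endpoints
```

In other words, the `path:` field is what redirects scraping from the default `/metrics` to `/metrics/cadvisor` on the same kubelet port.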
In my case, I just let the kubelet listen on 4194 again and the cAdvisor metrics worked. `--cadvisor-port=0` disables cAdvisor from listening to 0.0.0.0:4194 by default. cAdvisor will still be run inside of the kubelet and its API can be accessed at https://{node-ip}:10250/stats/. If you want to enable cAdvisor to listen on a wide-open port again, remove that flag from the kubelet arguments and restart the kubelet.
Actually, the CIS Kubernetes Benchmark requires `--cadvisor-port=0`, so re-enabling that port isn't an option in hardened clusters.
@lorenzo-biava I tried your suggestion (in either variant), but I did not get it to work: Alertmanager still reports k8skubeletdown. My setup is k8s v1.9.2 and the master branch of prometheus-operator as of today.
To resolve the issue in my case above, just apply the steps from https://github.com/coreos/prometheus-operator/blob/master/contrib/kube-prometheus/docs/kube-prometheus-on-kubeadm.md on all nodes.
@lzbgt thanks
@lzbgt Sorry I've missed this; I've not been working with Kubernetes and Prometheus Operator lately, but it's definitely good to know that there's a dedicated doc for kubeadm.
Closing, as sample configurations are available, as well as specific documentation for certain platforms.
Thanks @lorenzo-biava. This works in an RKE cluster.
```yaml
- job_name: 'kubernetes-nodes-cadvisor'

  # Default to scraping over https. If required, just disable this or change to
  # `http`.
  scheme: https
  scrape_interval: 60s
  scrape_timeout: 30s

  # This TLS & bearer token file config is used to connect to the actual scrape
  # endpoints for cluster components. This is separate to discovery auth
  # configuration because discovery & scraping are two separate concerns in
  # Prometheus. The discovery auth config is automatic if Prometheus runs inside
  # the cluster. Otherwise, more config options have to be provided within the
  # <kubernetes_sd_config>.
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    # If your node certificates are self-signed or use a different CA to the
    # master CA, then disable certificate verification below. Note that
    # certificate verification is an integral part of a secure infrastructure
    # so this should only be disabled in a controlled environment. You can
    # disable certificate verification by uncommenting the line below.
    # insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

  kubernetes_sd_configs:
    - role: node

  # This configuration will work only on kubelet 1.7.3+,
  # as the scrape endpoints for cAdvisor have changed.
  # If you are using an older version you need to change the replacement to
  # replacement: /api/v1/nodes/$1:4194/proxy/metrics
  # More info here: prometheus-operator/prometheus-operator#633
  relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - target_label: __address__
      replacement: kubernetes.default.svc:443
    - source_labels: [__meta_kubernetes_node_name]
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
```

Signed-off-by: taylord0ng <hibase123@gmail.com>
What did you do?
Successfully installed kube-prometheus in a Kubeadm cluster v1.7.5.
What did you expect to see?
The cAdvisor endpoints in the Prometheus kubelet job working correctly.
What did you see instead? Under which circumstances?
Several metrics are gathered correctly, but not the cAdvisor ones in the kubelet job.
Environment
Kubeadm at v1.7.5.
kube-prometheus/manifests/prometheus/prometheus-k8s-service-monitor-kubelet.yaml
Basically, as the cAdvisor metrics have been moved, that configuration is not working anymore.
The official Prometheus Kubernetes configuration example has already been updated with this change.
A similar configuration should be applied to the prometheus-k8s-service-monitor-kubelet.yaml manifest too, e.g.
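A minimal sketch of the kind of endpoint change meant, assuming the port names used elsewhere in this thread (`http-metrics`) and with illustrative metadata rather than the exact kube-prometheus manifest:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubelet          # illustrative; see prometheus-k8s-service-monitor-kubelet.yaml
  namespace: monitoring
  labels:
    k8s-app: kubelet
spec:
  jobLabel: k8s-app
  selector:
    matchLabels:
      k8s-app: kubelet
  namespaceSelector:
    matchNames:
      - kube-system
  endpoints:
    # Regular kubelet metrics, unchanged.
    - port: http-metrics
      interval: 30s
    # cAdvisor metrics, now served by the kubelet itself under
    # /metrics/cadvisor instead of on the dedicated cAdvisor port (4194).
    - path: /metrics/cadvisor
      port: http-metrics
      interval: 30s
      honorLabels: true
```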
It worked in my environment, but of course it might not be backward-compatible.
Perhaps changing it in the kube-prometheus Helm chart (I'm assuming they work the same way, but I haven't tested it yet) and adding a configuration property to the chart to switch the behavior might be a better option.
However, I couldn't manage to find a way to express the cAdvisor metrics link through the API server proxy, as done in the official Prometheus example (i.e. `https://kubernetes.default.svc:443/api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor` instead of `http://<NODE_IP>:10255/metrics/cadvisor`, which is what this configuration change results in). Since it would also be useful to access the metrics endpoint of the `kube-scheduler` pod (which is only reachable through the master proxy in my setup), I was wondering whether it is possible to build such an endpoint.
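For reference, the official Prometheus example builds that proxy URL purely through relabelling; below is a trimmed sketch of just that part (the full job definition appears in the configuration pasted earlier in this thread):

```yaml
relabel_configs:
  # Send every scrape to the API server address...
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  # ...and rewrite the metrics path so the API server proxies the request
  # to the node's kubelet: /api/v1/nodes/<node>/proxy/metrics/cadvisor
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
```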