
Prometheus metrics missing k8s_deployment_name attribute for short period after agent restart #37056

indu-subbaraj opened this issue Jan 7, 2025 · 4 comments · May be fixed by #37088
Labels: bug, processor/k8sattributes, waiting-for-code-owners


Component(s)

processor/k8sattributes

What happened?

Description

We have otel deployed as a DaemonSet in our cluster and have noticed that after a restart of the otel agent pods, Prometheus metrics are missing the k8s_deployment_name attribute even though it is extracted as part of the k8sattributes processor. After about 5 minutes the problem resolves and the label is present on metrics. Notably, we have not observed this issue with the k8s_daemonset_name attribute. Looking at the logs, we found that the k8s.deployment.name attribute is missing from the resource while the other extracted metadata is present (see attached logs).

Our config sets the wait_for_metadata flag to true. The k8sattributes processor config we are currently using:

  k8sattributes:
    wait_for_metadata: true
    auth_type: "serviceAccount"
    passthrough: false
    filter:
      node_from_env_var: KUBE_NODE_NAME
    extract:
      metadata:
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.deployment.name
        - k8s.deployment.uid
        - k8s.namespace.name
      labels:
        # Extracts the value of a pod label and inserts it as a resource attribute
        - tag_name: service_name
          key: tags.datadoghq.com/service
          from: pod
        - tag_name: version
          key: tags.datadoghq.com/version
          from: pod
        - tag_name: label_k8s_bluecore_com_team
          key: k8s.bluecore.com/team
          from: pod
    pod_association:
      # below association takes a look at the datapoint's k8s.pod.uid resource attribute and tries to match it with
      # the pod having the same attribute.
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid

Steps to Reproduce

Deploy otel with a Prometheus receiver and a pipeline that uses the k8sattributes processor with the config above. Use the debug exporter to output to stdout. Note that for about 5 minutes after a restart of the otel pod, k8s_deployment_name is missing from the metrics; after that, the resource attribute appears.

Expected Result

After the otel pods restart, the k8s_deployment_name label should be on all exported metrics.

Actual Result

For about 5 minutes after the otel pods restart, exported Prometheus metrics are missing the k8s_deployment_name label despite it being extracted as part of the k8sattributes processor.

Collector version

v0.116.1

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

receivers:
  prometheus/bc:
    config:
      scrape_configs:
      - job_name: bc-prom
        scrape_interval: 60s
        kubernetes_sd_configs:
        - role: pod
          selectors:
          - role: pod
            # only scrape data from pods running on the same node as collector
            field: "spec.nodeName=${env:KUBE_NODE_NAME}"
        relabel_configs:
        # scrape pods annotated with "prometheus.io/scrape: true"
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          regex: "true"
          action: keep
        # read the port from "prometheus.io/port: <port>" annotation and update scraping address accordingly
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: ([^:]+)(?::\d+)?;(\d+)
          # escaped $1:$2
          replacement: $$1:$$2
        # do not scrape init containers; the above does not catch this case because most init containers do not have ports defined
        - source_labels: [ __meta_kubernetes_pod_container_init ]
          regex: "false"
          action: keep
        metric_relabel_configs:
        - source_labels: [__name__]
          regex: "(bc_.*|controller_runtime_.*|rest_client_requests_total|velero_.*|node_.*|certmanager_.*|kubelet_.*)"
          action: keep
processors:
  resourcedetection:
    detectors: [env, gcp]
    timeout: 5s
    gcp:
      resource_attributes:
        cloud.provider:
          enabled: true
        cloud.platform:
          enabled: true
        cloud.account.id:
          enabled: true
        cloud.region:
          enabled: true
        cloud.availability_zone:
          enabled: true
        k8s.cluster.name:
          enabled: true
        host.id:
          enabled: false
        host.name:
          enabled: false
  k8sattributes:
    wait_for_metadata: true
    auth_type: "serviceAccount"
    passthrough: false
    filter:
      node_from_env_var: KUBE_NODE_NAME
    extract:
      metadata:
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.deployment.name
        - k8s.deployment.uid
        - k8s.namespace.name
      labels:
        # Extracts the value of a pod label and inserts it as a resource attribute
        - tag_name: service_name
          key: tags.datadoghq.com/service
          from: pod
        - tag_name: version
          key: tags.datadoghq.com/version
          from: pod
        - tag_name: label_k8s_bluecore_com_team
          key: k8s.bluecore.com/team
          from: pod
    pod_association:
      # below association takes a look at the datapoint's k8s.pod.uid resource attribute and tries to match it with
      # the pod having the same attribute.
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
  batch:
    send_batch_max_size: 1000
    send_batch_size: 100
    timeout: 10s
exporters:
  debug:
    verbosity: detailed
    sampling_initial: 5
    sampling_thereafter: 200
service:
  pipelines:
    metrics/bc:
      receivers: [prometheus/bc]
      processors:
        - batch
        - resourcedetection
        - k8sattributes
      exporters:
        - debug

Log output

Immediately after otel pods restart, note that k8s.deployment.name is missing as a resource attribute but the other extracted metadata is present:

Resource attributes:
     -> service.name: Str(bc-prom)
     -> net.host.name: Str(<ip>)
     -> server.address: Str(<ip>)
     -> service.instance.id: Str(<ip>:9102)
     -> net.host.port: Str(9102)
     -> http.scheme: Str(http)
     -> server.port: Str(9102)
     -> url.scheme: Str(http)
     -> cluster_name: Str(<cluster_name>)
     -> k8s.pod.name: Str(txnl-api-5fc5f496c5-r4grd)
     -> k8s.pod.uid: Str(03a84821-8629-4345-b265-c5018856468d)
     -> k8s.container.name: Str(txnl-api)
     -> k8s.namespace.name: Str(txnl-api)
     -> pod: Str(txnl-api-5fc5f496c5-r4grd)
     -> cloud.provider: Str(gcp)
     -> cloud.account.id: Str(<cluster_name>)
     -> cloud.platform: Str(gcp_kubernetes_engine)
     -> cloud.region: Str(<region>)
     -> label_k8s_bluecore_com_team: Str(<team>)
     -> service_name: Str(txnl-api)
     -> version: Str(7c79a41)

Also notice that the k8s.daemonset.name attribute IS present immediately after otel pods restart:

Resource attributes:
     -> service.name: Str(bc-prom)
     -> net.host.name: Str(<ip>)
     -> server.address: Str(<ip>)
     -> service.instance.id: Str(<ip>:9100)
     -> net.host.port: Str(9100)
     -> http.scheme: Str(http)
     -> server.port: Str(9100)
     -> url.scheme: Str(http)
     -> k8s.namespace.name: Str(otel-system)
     -> pod: Str(prometheus-node-exporter-c9xsw)
     -> k8s.pod.name: Str(prometheus-node-exporter-c9xsw)
     -> k8s.pod.uid: Str(37b69e3e-6589-4aae-806e-a0009822bf25)
     -> k8s.container.name: Str(node-exporter)
     -> k8s.daemonset.name: Str(prometheus-node-exporter)
     -> cloud.provider: Str(gcp)
     -> cloud.account.id: Str(<cluster_name>)
     -> cloud.platform: Str(gcp_kubernetes_engine)
     -> cloud.region: Str(<region>)
     -> cluster_name: Str(<cluster_name>)

After about 5 minutes, with no further changes, the resource DOES contain the k8s.deployment.name attribute:

Resource attributes:
     -> service.name: Str(bc-prom)
     -> net.host.name: Str(<ip>)
     -> server.address: Str(<ip>)
     -> service.instance.id: Str(<ip>:9102)
     -> net.host.port: Str(9102)
     -> http.scheme: Str(http)
     -> server.port: Str(9102)
     -> url.scheme: Str(http)
     -> k8s.pod.name: Str(txnl-api-5fc5f496c5-8khhd)
     -> k8s.pod.uid: Str(90eda72b-4dc7-40e9-bf4a-f9827b298315)
     -> k8s.container.name: Str(txnl-api)
     -> k8s.namespace.name: Str(txnl-api)
     -> cluster_name: Str(<cluster_name>)
     -> pod: Str(txnl-api-5fc5f496c5-8khhd)
     -> cloud.provider: Str(gcp)
     -> cloud.account.id: Str(<cluster_name>)
     -> cloud.platform: Str(gcp_kubernetes_engine)
     -> cloud.region: Str(<region>)
     -> k8s.deployment.name: Str(txnl-api)
     -> k8s.deployment.uid: Str(f5d96967-5e4a-4728-8f12-ae7ad969b42e)
     -> service_name: Str(txnl-api)
     -> version: Str(7c79a41)
     -> label_k8s_bluecore_com_team: Str(<team>)


Additional context

_No response_
@indu-subbaraj added the bug and needs triage labels on Jan 7, 2025
@github-actions bot added the processor/k8sattributes label on Jan 7, 2025

github-actions bot commented Jan 7, 2025

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.


bacherfl commented Jan 7, 2025

(Triage): the issue is well explained and contains all the information needed to reproduce it, so I'm removing the needs triage label and adding waiting-for-code-owners.

I just looked into this - I believe this could be due to the fact that, as opposed to e.g. a DaemonSet, the deployment name for a given pod is retrieved indirectly via the related ReplicaSet:

if c.Rules.DeploymentName {
	if replicaset, ok := c.getReplicaSet(string(ref.UID)); ok {
		if replicaset.Deployment.Name != "" {
			tags[conventions.AttributeK8SDeploymentName] = replicaset.Deployment.Name
		}
	}
}

This is done at the time the pod is added or updated via the k8sattributesprocessor's Informer. My theory is that in case of a restart of the agent, during the initial sync of the informer where it retrieves all existing resources (i.e. pods, deployments, replicasets, etc.), the processor might be informed about a pod before it has the information about the related replicaset, and therefore the related deployment name is unavailable.
This is probably less likely to happen for any new pods that come up after the agent has started, as the replicaset will always be created before the related pod, and therefore the agent will already have the information about the replicaset for new pods.

One solution may be to check, whenever a replicaset is received by the informer, for any pods that reference this replicaset in their owner references and update the pod attributes accordingly, in case they do not have this information available yet.
If the code owners agree that my theory makes sense and that this could be a valid approach, I'd like to look into providing a fix for that.
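
To make the idea concrete, here is a minimal, self-contained Go sketch (an illustration only, not the processor's actual code; all type and function names are made up) that models the suspected race during the initial sync and the proposed backfill when the ReplicaSet event arrives:

package main

import "fmt"

type replicaSet struct {
	uid            string
	deploymentName string
}

type pod struct {
	name           string // pod name
	ownerRSUID     string // owner reference to its ReplicaSet
	deploymentName string // resolved attribute, empty until known
}

type store struct {
	pods        map[string]*pod        // keyed by pod name
	replicaSets map[string]*replicaSet // keyed by ReplicaSet UID
}

// onAddPod mirrors today's behavior: the deployment name can only be resolved
// if the owning ReplicaSet is already present in the cache.
func (s *store) onAddPod(p *pod) {
	s.pods[p.name] = p
	if rs, ok := s.replicaSets[p.ownerRSUID]; ok {
		p.deploymentName = rs.deploymentName
	}
}

// onAddReplicaSet is the proposed addition: when a ReplicaSet arrives, backfill
// any already-cached pods that reference it and still lack a deployment name.
func (s *store) onAddReplicaSet(rs *replicaSet) {
	s.replicaSets[rs.uid] = rs
	for _, p := range s.pods {
		if p.ownerRSUID == rs.uid && p.deploymentName == "" {
			p.deploymentName = rs.deploymentName
		}
	}
}

func main() {
	s := &store{pods: map[string]*pod{}, replicaSets: map[string]*replicaSet{}}

	// After an agent restart, the informer's initial sync may deliver the pod first...
	s.onAddPod(&pod{name: "txnl-api-5fc5f496c5-r4grd", ownerRSUID: "rs-123"})
	fmt.Printf("before RS event: %q\n", s.pods["txnl-api-5fc5f496c5-r4grd"].deploymentName) // ""

	// ...and the ReplicaSet only later; backfilling on the RS event repairs the attribute.
	s.onAddReplicaSet(&replicaSet{uid: "rs-123", deploymentName: "txnl-api"})
	fmt.Printf("after RS event:  %q\n", s.pods["txnl-api-5fc5f496c5-r4grd"].deploymentName) // "txnl-api"
}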


ChrsMark commented Jan 8, 2025

Thanks for investigating this @bacherfl! I wonder if the wait_for_metadata setting added by #32622 can actually help here.

In general, I think it's more efficient to wait for the sync instead of going back and looping over the whole Pods cache each time a ReplicaSet is added/updated: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/37088/files#diff-da945276e27b1e0a3eed6a7e4a97c5b1bb9d90da43bd8ef9fce8b533584d4292R1100-R1101
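
For reference, a rough client-go sketch of that ordering (an illustration only, not the k8sattributesprocessor's actual code; the handler body and the assumption that the ReplicaSet's first owner reference is the Deployment are mine): start the informers, wait for the caches to sync, and only then react to pod events, so a pod lookup can never race the ReplicaSet cache.

package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
	rsInformer := factory.Apps().V1().ReplicaSets().Informer()
	rsLister := factory.Apps().V1().ReplicaSets().Lister()
	podInformer := factory.Core().V1().Pods().Informer()

	stopCh := make(chan struct{})
	factory.Start(stopCh)

	// Block until both caches are warm; after this point a pod's deployment
	// lookup no longer races the ReplicaSet cache during the initial sync.
	if !cache.WaitForCacheSync(stopCh, rsInformer.HasSynced, podInformer.HasSynced) {
		panic("informer caches never synced")
	}

	// The handler is registered only after the sync, so every pod it sees
	// (including the replayed initial list) can resolve its owning ReplicaSet.
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			pod := obj.(*corev1.Pod)
			for _, owner := range pod.OwnerReferences {
				if owner.Kind != "ReplicaSet" {
					continue
				}
				rs, err := rsLister.ReplicaSets(pod.Namespace).Get(owner.Name)
				if err != nil || len(rs.OwnerReferences) == 0 {
					continue
				}
				// Assumption for illustration: the ReplicaSet's first owner reference is the Deployment.
				fmt.Printf("pod %s/%s -> deployment %s\n", pod.Namespace, pod.Name, rs.OwnerReferences[0].Name)
			}
		},
	})

	<-stopCh // run until stopped (illustrative)
}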


bacherfl commented Jan 8, 2025

> Thanks for investigating this @bacherfl! I wonder if the wait_for_metadata setting added by #32622 can actually help here.
>
> In general, I think it's more efficient to wait for the sync instead of going back and looping over the whole Pods cache each time a ReplicaSet is added/updated: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/37088/files#diff-da945276e27b1e0a3eed6a7e4a97c5b1bb9d90da43bd8ef9fce8b533584d4292R1100-R1101

Thanks @ChrsMark! I agree that iterating over the pods is not ideal. I have posted a suggestion for a potential alternative solution in a comment on the PR; for now I will revert it back to draft and will keep you updated as soon as I have tried out the other approach.
