
There should be a single way that pods and other resources are identified in k8s component logs wherever possible #23338

Closed
pmorie opened this issue Mar 22, 2016 · 17 comments
Labels
area/logging area/test-infra lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience.

Comments

@pmorie
Member

pmorie commented Mar 22, 2016

Currently there is no single way that pods are identified in log messages, which can make it difficult to find all the messages relevant to a single pod in a log file. I think, wherever possible, that we should identify pods with the following information, formatted consistently:

  1. Pod name
  2. Pod namespace
  3. Pod UID

It's possible that not all call sites will have all of this information; in that case, we should stick to the format, and just log empty fields. I started thinking about this in the context of the kubelet but it applies to anything that logs messages identifying pods.

@kubernetes/sig-node @kubernetes/rh-cluster-infra

@yujuhong
Contributor

I've been trying to push this with a helper function that prints podName_podNamespace(podUID) for any api.Pod object, but not all call sites have been converted. In some cases there may not be pod-level information (e.g., only container names/IDs), so I'm not sure whether printing empty fields would help.
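
For illustration, a minimal sketch of what such a helper might look like (the package and function names here are hypothetical, not necessarily what is in the tree):

// Sketch only: hypothetical helper that renders a pod as "name_namespace(uid)" for log messages.
package format

import (
    "fmt"

    "k8s.io/kubernetes/pkg/api"
)

// Pod returns a consistent identifier for the given pod, suitable for log messages.
func Pod(pod *api.Pod) string {
    if pod == nil {
        return "<nil pod>"
    }
    return fmt.Sprintf("%s_%s(%s)", pod.Name, pod.Namespace, pod.UID)
}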

@maisem maisem added team/cluster sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Mar 22, 2016
@timothysc
Member

A dream of mine in every distributed system would be complete object logging for total life-cycle traceability (admission-scheduling-execution). Maybe it's just a dream.

@derekwaynecarr
Member

OMG yes, this would be extremely useful. This goes for kubecontainer.Pod as well.

@yujuhong
Contributor

I agree in general, and that's also why I added the helper function to begin with. However, supporting this everywhere is a bit problematic because not every function has access to the entire api.Pod object, or even to the pod UID. E.g., I think it's acceptable that the dockertools package logs a single line about "deleting a container" without writing down the UID; of course, there should be a corresponding higher-level message about deleting containers in a pod logged somewhere else. I'd like to think that we have a consistent way to log objects whenever possible, but not every log message can easily be associated back with a high-level object. For information relevant to tracking the API object, we should (perhaps) rely on Events.

@timstclair

Can we generalize this to standardize logging across the whole project? What if we abstracted away from glog to create our own higher-level logging package? I am envisioning an API that would enable something like:

log.Msg("could not read pod info"). // log message
    ForObject(pod).                 // associate with API object (log identity, best effort)
    WithError(err).                 // include an error
    Error()                         // write the log (error level)

or

log.Msg("syncing pod").  // log message
    FullObject(pod).     // pretty-print full object spec (deep print)
    From(podWorker).     // identify call owner
    Label("first time", isFirst)  // attach extra information
    Debug()              // log at debug level

The goal would be to identify the common pieces of information that are logged and provide them in a consistent, human & machine readable format.
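
As a rough sketch, the surface of such a package might look something like this (all identifiers below are hypothetical, not a concrete proposal):

// Sketch only: hypothetical API surface for a higher-level logging package.
package log

// Entry accumulates structured context and is written by one of the level methods.
type Entry interface {
    ForObject(obj interface{}) Entry           // attach object identity (name/namespace/UID), best effort
    FullObject(obj interface{}) Entry          // attach a deep-printed copy of the object
    WithError(err error) Entry                 // attach an error
    From(owner interface{}) Entry              // identify the calling component
    Label(key string, value interface{}) Entry // attach arbitrary key/value metadata

    Error() // write the entry at error level
    Debug() // write the entry at debug level
}

// Msg would start a new Entry carrying the given message; implementation elided here.
func Msg(msg string) Entry { panic("sketch only") }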

WDYT?

@timothysc
Member

I like wrapping logging, but I'm not sold on the format above vs.

klog.Infof
klog.Debugf
klog.Warningf
klog.Errorf

If you wanted some object formatter interface above, that may make sense too.
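
For example, that formatter layer could be as small as the following (hypothetical sketch, not an existing API):

// Sketch only: a shared formatter that Infof-style call sites could use for object identity.
type ObjectFormatter interface {
    // Format returns a consistent identifier for obj, e.g. "name_namespace(uid)" for pods.
    Format(obj interface{}) string
}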

/cc @jayunit100

@timstclair

Something to add: a problem with solving this only in a logging library is that we often want to attach this data to error objects, and don't necessarily log it directly. We could introduce a custom error type with fields for all the same metadata (object reference, full object, another error, labels, etc.), and then make the logger dissect the object and format it consistently.
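
A minimal sketch of what such an error type could look like (the field names are hypothetical):

// Sketch only: a hypothetical error type carrying the same metadata a structured logger would format.
type ObjectError struct {
    Msg    string                 // human-readable message
    Object interface{}            // reference to the related API object (identity only)
    Full   interface{}            // optional deep copy of the object for pretty-printing
    Cause  error                  // wrapped underlying error
    Labels map[string]interface{} // extra key/value context
}

// Error satisfies the error interface; the logger, not the call site, decides how to render the metadata.
func (e *ObjectError) Error() string {
    return e.Msg
}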

@timothysc - I think an advantage of my proposal over Infof-style logging is that the output could be more structured, and hence more machine readable, so we could build custom tools for searching & filtering the logs. E.g.: here are the logs from a kubelet, the api-server, and the master; combine them and show me all the entries relating to pod "heapster". Or better yet, just connect to my cluster and show me all the logs for XYZ.

kubectl debug pod heapster-v1.0.0-vuanf

@vishh
Contributor

vishh commented Apr 25, 2016

FYI: https://github.com/Sirupsen/logrus is an attempt to generate structured logs for Go.
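
For reference, structured pod fields with logrus look roughly like this (the field names and values here are only illustrative):

// Sketch only: structured logging of pod identity with logrus; values are placeholders.
package main

import log "github.com/Sirupsen/logrus"

func main() {
    log.WithFields(log.Fields{
        "pod":       "heapster-v1.0.0-vuanf",
        "namespace": "kube-system",
        "uid":       "11111111-2222-3333-4444-555555555555",
    }).Info("syncing pod")
}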

@timothysc
Member

timothysc commented Apr 25, 2016

@timstclair I don't think there is anything that precludes us from having a set of higher-level capabilities that are backed by the lower-level utilities.

Here is where I miss my macro-overloading and template pasting magic of C++.

@ncdc
Member

ncdc commented Apr 26, 2016

See also #6461 #17162 #17449

@timothysc
Member

After some time I kind of like https://github.com/Sirupsen/logrus ...

@bgrant0607 bgrant0607 changed the title There should be a single way that pods are identified in logs wherever possible There should be a single way that pods and other resources are identified in k8s component logs wherever possible May 9, 2016
@bgrant0607 bgrant0607 added area/test-infra sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience. labels May 9, 2016
@bgrant0607
Member

I am very much in favor of making our system component logs more useful/actionable, consistent, structured, etc.

I'd be in favor of treating errors as failures in tests.

I am also in favor of leveraging third-party libraries and tools.

I am not in favor of adding functionality to kubectl for digging into component logs. Our CLI and API surface areas are big enough already, and this really sounds like a job for Elasticsearch/Kibana.

If there are conditions users care about, they should be surfaced via the API, such as using events or conditions.

If there are occurrences that cluster admins may care about, they could be surfaced via events, node conditions, or exported monitoring metrics.

See also #3500 and #20672 re. just dumping API objects for debugging.

@timstclair

I agree with all these points, and thanks for the links re: dumping API objects. I agree that we should leverage existing libraries and tools, but I think we should also build Kubernetes-specific abstractions around them. Our Elasticsearch / Kibana / GCM solutions work for running clusters, but it would also be good to have a way of ingesting logs from a debug dump or a past Jenkins e2e run (e.g. import the logs from this storage bucket). Good points about events & conditions vs. logs; we should largely view logs as internal debugging tools.

@fejta-bot

Issues go stale after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 15, 2017
@yujuhong yujuhong removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 18, 2017
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 14, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 14, 2018
@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jul 14, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
