Skip to content

Development Conventions

Taylor Barrella edited this page Oct 15, 2020 · 9 revisions

We follow a number of conventions in our source base to help us create something cohesive and coherent, which makes the source base easier to create, maintain, and use.

Others docs you should look at:

Coding conventions

  • Follow the general guidance from Effective Go

  • Go Code Review Comments provides a good collection of common code review comments so you can get your code up to snuff before asking others to look at it.

  • Comment your code.

    • Follow the general guidance from Go's commenting conventions
    • If reviewers ask questions about why the code is the way it is, that's a sign that comments might be helpful.
  • Command-line flags should use dashes, not underscores

  • Naming

    • Please consider package name when selecting an interface name, and avoid redundancy. For example, use adapter.AspectConfig instead of adapter.AdapterConfig.

    • Must use lowercase for Go package names.

    • Must use camel case for Go types.

    • Use singular nouns for types unless they represent a collection. Use plural nouns for collections.

    • Please consider parent directory name when choosing a package name:

      • adapters/factMapper/tracker.go should say package factMapper not package factmapperadapter.

      • Unless there's a good reason, package names should match the name of the directory in which the .go file exists.

      • Importers can use a different name if they need to disambiguate.

Logging conventions

Istio code is expected to use the standard Istio logging package. When importing external package to Istio, if these packages send logs to the standard Go "log" package or to the global zap logger, the log data will automatically be captured by the Istio log package and merged into the primary log stream(s) produced for the component. If other logging packages are used, their output will uncoordinated and unrelated to the primary log output, which isn't great for users of the component.

The Istio logging package supports four logging levels:

  • Error - Always an error.
  • Warn - Something unexpected, but probably not an error.
  • Info - Useful info about something happening.
  • Debug - Extra stuff useful during development.

By default, errors, warnings, and informational data is emitted, while debug is only enabled when a component is started with particular options.

Log message composition

Good user facing errors/warnings are:

  • Few: Errors and Warnings should represent consequential conditions that the user cares about. Logs should only be emitted at error level for things that are intended for the user to see and act upon.
  • Clear and Precise: Clearly and concisely state what went wrong.
  • Unambiguous: When the error is emitted something really is wrong, not “something is only wrong if you see this error repeatedly, along with another error”. Ambiguities should preferably be resolved in the code (e.g. check if the error repeats, other condition is present etc.) If this is not possible, use a warning instead.
  • Actionable: Suggest next steps to the user.
  • Contextual: Give the user enough context to drill down further (e.g. YAML line number of error), and also find more detailed background information (e.g. error ID, URL).

This advice applies to all log statements, but is especially critical for any message a user is expected to see and act upon. Code reviewers should and approvers should pay attention that log messages and errors follow these guidelines.

In many cases, it's preferable to use the error dictionary because it is reviewed by working groups that have expertise in writing good logging messages. The error dictionary also improves style consistency and maintainability.

Structured logging

The log package supports structured logging through the WithLabels() function, available both at scope and global level. Structured logging is useful for situations where some persistent context is desired for a set of log statements, especially if log output is interleaved due to concurrency.

Example:

// assume some local scope has been defined already in the package e.g. "installer"
var (
        scope = log.RegisterScope("installer", "Messages related to installation and upgrade", 0)
)

func MyHandler(userID, contextID string) {
    // WithLabels returns a copy of scope so other "installer" logs are not affected.
    scope = scope.WithLabels("userID", userID, "contextID", contextID)
    scope.Debug("Something happened")
}

Output:

<timestamp> debug   installer   Something happened  userID=<some value> contextID=<some other value>

or JSON:

{
    ...
    "msg": "Something happened"
    "userID": "<some value>"
    "contextID": "<some other value>"
}

Note: WithLabels() is a direct replacement for zap.Field structured logging:

  // no longer supported
  scope.Debug("Hello", zap.String("foo", "bar") 
  
  // use WithLabels() instead
  scope.WithLabels("foo", "bar").Debug("Hello")))

Error dictionary

The log package output functions (Fatal*, Error*, Warn* etc.) now accept an optional first parameter, a *structured.Error type. Defining a structured.Error entry in the dictionary file is appropriate for most user facing errors. First, fill out all applicable fields in a structured.Error entry in the dictionary file:

var (
        OperatorFailedToGetObjectFromAPIServer = &structured.Error{
        MoreInfo: "Failed to get an object from the Kubernetes API server, because of a transient " +
                "API server error or the object no longer exists.",
        Impact: OperatorImpactFailedToGetObjectFromAPIServer,
        LikelyCause: formatCauses(LikelyCauseAPIServer) + " " + TransiencePermanentForInstall,
        Action: "If the error is because the object was deleted, it can be safely ignored. Otherwise, if the " +
                "error persists, " + ActionCheckBugList,
)

This file contains macros and definitions for many common scenarios to reduce repeating the same thing in slightly different ways. Rather than writing verbose errors in an inconsistent way, many errors can be composed with existing common text constants. The second step is to actually use the error in a logging statement:

    scope.Warnf(errdict.OperatorFailedToGetObjectFromAPIServer, "action X error: %v", err)

In this case, the logging is at Warn level because the error may be transient. The call will output the full error string in the logs:

Failed to get an object from the Kubernetes API server, because of a transient API server error  
or the object no longer exists. If the error is transient, the impact is low. If permanent, updates  
for the objects cannot be processed leading to an out of sync control plane. The likely cause is a  
problem with the Kubernetes API server. If the error occurred immediately after installation, it is  
likely permanent. If the error is because the object was deleted, it can be safely ignored. If this  
error persists, see https://istio.io/latest/about/bugs/ for possible solutions.

The logging handler can output the dictionary error entry in a variety of formats depending on configuration. A likely future format will be a URL with error code, pointing to a page with the above paragraph.

The *structured.Error implements both the error#Error and error#Unwrap interfaces and can be returned from any function while retaining full type semantics when it's actually logged. This is important because error context often originates deep in the stack and can be lost at the logging site when the error is logged. For example:

func MainFunction() {
    if err := calledFunction(); err != nil {
        var ie *structured.Error
        # logger must call As() to retrieve wrapped error
        errors.As(eWrap, &ie) 
        s.Errorf(ie, "calledFunction returned an error: %v", err)
    }
    ...
}

func calledFunction() error {
    ...
    // NOTE: %w verb must be used with annotations. See https://blog.golang.org/go1.13-errors.
    if err := deeperCalledFuction(); err != nil {
        return fmt.Errorf("deeperCalledFunction returned an error: %w", err)
    }
}

func deeperCalledFuction() error {
    ...
    return errdict.OperatorFailedToGetObjectFromAPIServer
}

Performance

The Istio logging package is built on top of the Zap logger and thus inherits its efficiency and flexibility. In any high performance paths, prefer to use the Error, Warn, Info, and Debug methods, as these are the most efficient. All other variations of these four calls end up triggering some memory allocations and have a considerably higher execution time.

If you need to do a fair bit of computation in order to produce data for logging, you should protect that code using the ErrorEnabled, WarnEnabled, InfoEnabled, and DebugEnabled methods.

Be careful not to introduce expensive computations as part of the evaluation of a log statement. For example:

log.Debug(fmt.Sprintf("%s %s %d", s1, s2, i1))

Even when debug-level logs are disabled, the fmt.Sprintf call will still execute. Instead, you should write:

if log.DebugEnabled() {
    log.Debug(fmt.Sprintf("%s %s %d", s1, s2, i1))
}

Note that Mixer adapters don't use the standard Istio logging package. Instead, they are expected to use the adapter logger interface.

Testing conventions

  • For Go code, unit tests are written using the standard Go testing package.

  • All new packages and most new significant functionality must come with unit tests with code coverage >98%

  • Table-driven tests are preferred for testing multiple scenarios/inputs

  • Significant features should come with integration and/or end-to-end tests

  • Tests must be robust. In particular, don't assume that async operations will complete promptly just because it's a test. Put proper synchronization in place or if not possible put in place some retry logic.

Avoiding flaky tests

A flaky test is a test that non-deterministically fails. Most of these are caused by timing sensitive tests. Avoiding code that makes assumptions like time.Sleep in favor of more robust solutions, like polling or pushing.

If you see a flaky test, file an issue - or even better, a PR fixing the issue. Flaky tests have a large impact on developer and CI productivity.

To reproduce test flakes locally, use the stress tool. For example, to test the TestFoo test:

$ go get -u golang.org/x/tools/cmd/stress
$ go test -c -race -o stress.test ./my/test/package
$ stress -p 256 ./stress.test -test.run=TestFoo

This will run the TestFoo test repeatedly with parallelism determined by the -p flag.

To see how frequently a test has flaked, use https://eng.istio.io/flakes.

Directory and file conventions

  • Avoid package sprawl. Find an appropriate subdirectory for new packages.

  • Avoid general utility packages. Packages called "util" are suspect. Instead, derive a name that describes your desired function.

  • Use lowercase for file and directory names.

  • Use underscores for Go file names, e.g. type DeviceAllocator must reside in device_allocator.go.

  • All directory names should be singular unless required by existing frameworks. This is to avoid mixed singular and plural names in the full paths. NOTE: Traditional Unix directory names are often singular, such as "/usr/bin".

  • Third-party code

    • Go code for normal third-party dependencies is managed by the Go Dep.

    • Third-party code must carry licenses. This includes modified third-party code and excerpts.

Superfluous Changes

When possible, avoid making changes that do not bring tangible benefit to Istio users or developers. For example, a change like this that reorders code with no benefit:

- a := 1
  b := 2
+ a := 1

These types of changes are not harmless.

  • They cost maintainer time to review. Even a small change like this can take 15 minutes to understand the full context
  • They cost money to run on our CI system.
  • They risk breaking code that currently works. Similar "harmless" changes have led to production outages in the past.

Troubleshooting

Dev Environment

Writing Code

Pull Requests

Testing

Performance

Releases

Misc

Central Istiod

Pilot

Telemetry

Clone this wiki locally