Processor & Aggregator Plugin Support #1726

Closed
@sparrc

Description

EDIT: the proposed Aggregator interface has changed, see #1726 (comment)

Proposal:

processor & aggregator plugins will be new types of plugins that sit in-between input and output plugins.

If there are processors or aggregators defined in the config, then all metrics will pass through them before being passed onto the output plugins.

processor plugins will generically support matching based on (with globbing):

  1. tag key/value
  2. measurement name
  3. field keys

aggregator plugins will generically support matching based on (with globbing):

  1. tag key/value
  2. measurement name
  3. field keys
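
As a rough illustration of the glob matching described above, here is a minimal sketch. The `Metric` and `Filter` types, their field names, and the use of the standard library's `path.Match` are all assumptions made for this example, not the proposed Telegraf API. Note that `path.Match` does not let `*` cross a `/`, so a real implementation may well prefer a different glob library.

```go
package main

import "path"

// Metric is a simplified, hypothetical stand-in for telegraf.Metric.
type Metric struct {
	Name   string
	Tags   map[string]string
	Fields map[string]interface{}
}

// Filter holds glob patterns. An empty pattern (or nil map) matches anything.
// The field names are assumptions for illustration only.
type Filter struct {
	NamePattern  string            // glob on the measurement name
	TagPatterns  map[string]string // tag key -> glob on the tag value
	FieldPattern string            // glob on field keys
}

// globMatch reports whether s matches pattern, treating a malformed
// pattern as a non-match.
func globMatch(pattern, s string) bool {
	ok, err := path.Match(pattern, s)
	return err == nil && ok
}

// Matches reports whether the metric satisfies every configured pattern.
func (f Filter) Matches(m Metric) bool {
	if f.NamePattern != "" && !globMatch(f.NamePattern, m.Name) {
		return false
	}
	// Every listed tag must be present and its value must match the glob.
	for key, pattern := range f.TagPatterns {
		value, ok := m.Tags[key]
		if !ok || !globMatch(pattern, value) {
			return false
		}
	}
	// At least one field key must match the field glob, if one is set.
	if f.FieldPattern != "" {
		matched := false
		for key := range m.Fields {
			if globMatch(f.FieldPattern, key) {
				matched = true
				break
			}
		}
		if !matched {
			return false
		}
	}
	return true
}
```

With this sketch, a filter such as `Filter{NamePattern: "cpu*", TagPatterns: map[string]string{"host": "web*"}}` would select only `cpu`-prefixed measurements reported by hosts whose name starts with `web`.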

An initial implementation has been written by @alimousazy in this PR: #1364, but I would like to consider here the structure & interface of processor plugins independent of the histogram/aggregator feature.

My proposal for the processor interface differs a bit from that PR. While that PR presents an interesting way of streaming metrics through multiple channels, it also raises an important question of how large to make each channel, which could greatly increase the total possible buffer size of Telegraf.

Channels are great for multiple processes to run concurrently and aggregate their work in one place, but this is not actually the workflow of a processor plugin. For each metric that comes from the input plugins, each processor will need to be applied, and after all processors are applied the metric(s) will be passed onto the aggregator plugin(s) & output plugin(s).

The original metric will therefore get sent directly to the output plugins, while the aggregator plugins are free to process the metric as they need, adding their metrics to their accumulator as they need.
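
The flow described above (every processor applied in turn, then the surviving metrics handed to both the aggregators and the outputs) can be sketched roughly as follows. This is only an illustration under simplified assumptions: `Metric`, `runPipeline`, `dropNamed`, and `counter` are hypothetical names, and the interfaces are trimmed down to their `Apply` methods.

```go
package main

// Metric is a simplified, hypothetical stand-in for telegraf.Metric.
type Metric struct {
	Name string
	Tags map[string]string
}

// Processor and Aggregator are reduced to the Apply methods from the proposal.
type Processor interface {
	Apply(in ...Metric) []Metric
}

type Aggregator interface {
	Apply(in Metric)
}

// runPipeline applies each processor in order with plain function calls
// (no channels), feeds every surviving metric to every aggregator, and
// returns the metrics that should go on to the output plugins.
func runPipeline(in []Metric, processors []Processor, aggregators []Aggregator) []Metric {
	metrics := in
	for _, p := range processors {
		metrics = p.Apply(metrics...)
	}
	for _, m := range metrics {
		for _, a := range aggregators {
			a.Apply(m)
		}
	}
	return metrics
}

// dropNamed is a toy processor that drops metrics with the given name.
type dropNamed string

func (d dropNamed) Apply(in ...Metric) []Metric {
	out := make([]Metric, 0, len(in))
	for _, m := range in {
		if m.Name != string(d) {
			out = append(out, m)
		}
	}
	return out
}

// counter is a toy aggregator that counts the metrics it has seen.
type counter struct{ n int }

func (c *counter) Apply(in Metric) { c.n++ }
```

Because each stage is an ordinary function call, no intermediate buffering is needed between processors, which is the point made above about avoiding per-stage channels.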

type Processor interface {
    // SampleConfig returns the default configuration of the Processor
    SampleConfig() string

    // Description returns a one-sentence description of the Processor
    Description() string

    // Apply the processor to the given metrics, returning the resulting
    // metrics (which may be fewer or more than were passed in)
    Apply(in ...telegraf.Metric) []telegraf.Metric
}

type Aggregator interface {
    // SampleConfig returns the default configuration of the Aggregator
    SampleConfig() string

    // Description returns a one-sentence description of the Aggregator
    Description() string

    // Apply the metric to the aggregator
    Apply(in telegraf.Metric)

    // Start starts the aggregator, which adds its metrics to acc
    Start(acc telegraf.Accumulator)

    // Stop stops the aggregator
    Stop()
}

Use case: [Why is this important (helps with prioritizing requests)]

some of the uses of these plugins:

  1. dropping metrics
  2. aggregating metrics
  3. adding & removing tags
  4. adding & removing fields
  5. modifying fields, measurement names, tags, etc.
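
Several of these uses can be combined in a single processor. The sketch below drops one measurement, renames another, and adds and removes tags. It is an illustration only: `Metric` is a simplified stand-in for `telegraf.Metric`, and `Rewriter` and its fields are hypothetical names, not part of the proposal.

```go
package main

// Metric is a simplified, hypothetical stand-in for telegraf.Metric.
type Metric struct {
	Name string
	Tags map[string]string
}

// Rewriter is a hypothetical processor that drops, renames, and retags metrics.
type Rewriter struct {
	DropName   string            // measurement to drop entirely
	Rename     map[string]string // old measurement name -> new name
	AddTags    map[string]string // tags to add (overwriting existing keys)
	RemoveTags []string          // tag keys to remove
}

func (r Rewriter) SampleConfig() string { return "  drop_name = \"debug\"\n" }
func (r Rewriter) Description() string  { return "Drop, rename, and retag metrics" }

// Apply returns a new slice. Dropped metrics are simply omitted, so the
// output may be shorter than the input.
func (r Rewriter) Apply(in ...Metric) []Metric {
	out := make([]Metric, 0, len(in))
	for _, m := range in {
		if m.Name == r.DropName {
			continue
		}
		if newName, ok := r.Rename[m.Name]; ok {
			m.Name = newName
		}
		// Copy the tags so the caller's map is never mutated.
		tags := make(map[string]string, len(m.Tags)+len(r.AddTags))
		for k, v := range m.Tags {
			tags[k] = v
		}
		for _, k := range r.RemoveTags {
			delete(tags, k)
		}
		for k, v := range r.AddTags {
			tags[k] = v
		}
		m.Tags = tags
		out = append(out, m)
	}
	return out
}
```

Copying the tag map costs one allocation per metric, which bears on the allocation question raised below. A real implementation might mutate in place instead if the pipeline guarantees exclusive ownership of each metric.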

Open Questions:

  1. Ordering: how do we deal with ordering of processors? Do we need to support an argument for users to manually order the plugins? Or can we rely on the configuration file to provide the order for us?
  2. Allocations: what effect are processor plugins going to have on allocations?
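
One possible answer to the ordering question, sketched here purely as an option for discussion, is to let users set an explicit order value and fall back to configuration-file order for ties. The `orderedProcessor` type and its `Order` field are hypothetical. The sketch relies on `sort.SliceStable`, which preserves the original order of equal elements.

```go
package main

import "sort"

// orderedProcessor pairs a processor's name with a hypothetical,
// user-supplied Order setting. Lower values run first.
type orderedProcessor struct {
	Name  string
	Order int
}

// sortProcessors orders processors by Order. Because the sort is stable,
// processors with the same Order keep the order in which they appeared
// in the configuration file.
func sortProcessors(ps []orderedProcessor) {
	sort.SliceStable(ps, func(i, j int) bool {
		return ps[i].Order < ps[j].Order
	})
}
```

If no processor sets an order, every value is equal and the slice stays in configuration-file order, so the explicit setting would be optional.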

Activity

added the "feature request" label (Requests for new plugin and for new features to existing plugins) on Sep 8, 2016
added this to the 1.1.0 milestone on Sep 8, 2016
added 5 commits that reference this issue on Sep 8, 2016
9149ebd
f718c26
1f02fa0
c7f5dbe
f3a3447
alimousazy (Contributor) commented on Sep 8, 2016

Hi,

I totally agree that channels will increase complexity in terms of memory usage and execution, but I have a question regarding the proposed design. Some of these filters may need a trigger other than metric arrival; for example, a histogram may need to flush its aggregated data every minute. How can we handle that? Another thing: the mapping between a filter's input and output metrics is not always one-to-one. For example, histogram or dropping filters may decide not to pass a metric on. In another case, when a filter such as the histogram flushes, it might return multiple metrics instead of one.

sparrc (Contributor, Author) commented on Sep 9, 2016

Some of these filters may need a trigger other than metric arrival; for example, a histogram may need to flush its aggregated data every minute. How can we handle that?

That's a good point, I think it might be necessary to define two types of plugins: filters and aggregators. Aggregators would behave sort of like a "service filter" where they have continuous access to an output channel.

I'll come up with a design overview for this soon.

Another thing: the mapping between a filter's input and output metrics is not always one-to-one. For example, histogram or dropping filters may decide not to pass a metric on. In another case, when a filter such as the histogram flushes, it might return multiple metrics instead of one.

Agreed, I have updated the Apply function to reflect this (accepting and returning lists).

sparrc (Contributor, Author) commented on Sep 9, 2016

I have updated the design to take into account the need for two different types of plugins: filters & aggregators.

alimousazy (Contributor) commented on Sep 9, 2016

While I do feel that this model will solve flushing metrics from active-state components, active-state filters are still filters, and their output should go through the other filters based on order. In the suggested design, active-state filters that flush metrics separately will bypass the other filters and push metrics directly to the output plugins. It occurred to me that the Apply pattern we are using looks similar to an unbuffered channel, if you ignore the cost of creating the channel in terms of memory and functionality, although I still feel that a channel is overkill for such functionality.

sparrc (Contributor, Author) commented on Sep 9, 2016

@alimousazy I don't quite understand what you're suggesting. Is it just eliminating the channel directly before the outputs? That channel shouldn't need to have a large buffer, as it will have a goroutine constantly reading off of it.
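
To illustrate why a channel with a dedicated reader does not need a large buffer, here is a minimal sketch. The `Metric` type and the `consume` helper are hypothetical and are not taken from the Telegraf code base.

```go
package main

import "sync"

// Metric is a simplified, hypothetical stand-in for telegraf.Metric.
type Metric struct{ Name string }

// consume starts a goroutine that reads from metrics until the channel is
// closed, passing each metric to write. The returned function blocks until
// the reader has finished.
func consume(metrics <-chan Metric, write func(Metric)) (wait func()) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for m := range metrics {
			write(m)
		}
	}()
	return wg.Wait
}
```

Even with an unbuffered channel (`make(chan Metric)`), senders only block for as long as a single `write` call takes, because the reader is always ready to receive the next metric. Memory use stays bounded regardless of how many metrics flow through.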

alimousazy (Contributor) commented on Sep 9, 2016

@sparrc what I meant is that, since the histogram will emit metrics every minute, these metrics should also pass through the other filters (the drop filter, etc.). Based on my understanding of the latest design you proposed, histogram metrics will go directly to the output plugins.

sparrc (Contributor, Author) commented on Sep 9, 2016

yes, correct, the metrics coming from the aggregators would have the same fields as the metrics they are aggregating


      Processor & Aggregator Plugin Support · Issue #1726 · influxdata/telegraf