-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow overriding telemetry providers #4970
Comments
I think having more flexible options here could be also helpful when implementing own telemetry reporting capability of OpAMP (or any other use-case where collector's own telemetry is sent out) |
@pmm-sumo if otel-go library is not capable of supporting that (via an extra exporter for example) then we failed to implement the library sdk. I don't think for your use-case this issue is relevant. |
@mx-psi I am failing to see what the proposed solution looks like. Let's start by first define what "capabilities"/"mean" to run the collector as library.
Do you want to use the same "instance of otel-go sdk" between the collector library and the other parts of the binary? If yes, why? Do we need to support this? We can easily build factories of factories and settings of settings, but let's define what we try to achieve. Your point about not changing global state like the grpc logger I get it, but interaction between collector library and the rest of the application is confusing. I also would be curios for example, the watching for config changes? How that interacts with your rest application? The restart logic, etc. What I am trying to say is that I am not sure what the contract should be, and where we draw the line here. |
Apologies for the delay in replying!
IMO, for this to be comfortable to use, we would need a function/struct/some public API that allows one to construct a fixed pipeline and handles the plumbing of telemetry, I agree that we don't need I think this discussion (fixed pipelines defined in Go code) may also be of interest to @Aneurysm9 if a bridge between Collector components and the Go SDK is a use case we want to eventually support. It may even be interesting to explicitly define what use cases we want to support, to drive design decisions in the future. |
@mx-psi I would first look into a prototype of using the "components" and manually connect them in a pipeline without exposing the "Service". Let's try a prototype/experiment. |
Opened a PoC on #5107, PTAL! |
I am a bit confused... I was looking also into the version where in your case you don't use anything in the service at all. |
Oh, okay, completely misread your comment, sorry. Alright, I will try this out too this week :) |
The telemetry provider PoC looks nice, but I also was thinking that bogdan was looking for building a pipeline without a service. That is the capability that would be most interesting to us in the Go SDK to enable re-use of the collector's exporters. |
Yeah, sorry, as I said, I read Bogdan's message completely wrong 😅 I will try out that too at some point this week |
We also have a use case for this feature: We run an internal distribution of the collector to add some internally-built pipeline components. We do use the service package from the collector (our top-level code is similar to otelcontribcol in opentelemetry-collector-contrib). We instrument our components with tracing, and we'd like to export these spans from the collector. It would also be nice to get the collector's internal spans, so our resulting traces aren't missing parents. (We actually had this working back when the collector-internal tracing used OpenCensus, although I doubt it was an intended use case.) @mx-psi's POC was enough for me to get this working again in our distribution wrapper without any further modifications to the collector itself. |
@sveiss I understand the problem and that the current proposal for @mx-psi solves that. But there is a conceptual disagreement in my mind, which is what is the "code part" that is responsible to create these providers. Right now we give users chance to configure them via the service Configuration, so it seems very unnatural to pass a "factory provider" that will ignore the user configuration. I think if you want to use our collector/service then your custom code should be one of the components of the collector (receiver/processor/exporter/extension) and if that is the case and you use the "provider" pass to the Create func than tracing will work. |
To be clear, a telemetry provider in #5107 can ignore user configuration, but it doesn't have to: the relevant method in the interface is That said, I think your point in general is valid (in my use case for example I would either ignore configuration or pass a fixed config always). I am still working on the 'pipelines without service' PoC, but I haven't quite figured out how to handle configuration, since there does not seem to be a very comfortable way to merge with the default config (best idea I have so far is do |
I think for configuration you do:
You never go to/from the yaml map. |
@mx-psi also a bit not sure if you should simply ignore the concept of default. And simply construct the component config. My motivation is because these defaults are probably too connected to the way we configure the collector as a service. |
That seems like a dangerous assumption.
I think not being able to use the YAML configuration would make re-use of collector exporters by the Go SDK overly complicated and too tightly coupled. It might work, but it would not be a good user experience. |
@Aneurysm9 Go SDK does not have a YAML config.. why do you need that? |
|
I added a comment because I think my use case is a little different to the one described by @mx-psi in the original issue. I'm trying to run a collector with extras (custom processors/extensions/internal configuration library). We're building our own top-level otelcol command only because we don't want to fork the collector repo, so we're trying to keep the wrapper as thin as possible. We want to delegate as much as we can to the Using the collector as a library in an existing application is different, and I can see why you'd want to avoid the collector service infrastructure in this case. I'm hoping the solution we end up with allows us to keep leveraging as much of the collector service code as possible.
Our custom code is already structured as collector components, and we can get spans exported from them easily. The PoC in #5107 lets us set up a TracerProvider and use it to export both our own spans and the collector's through the same pipeline. We couldn't do that previously, because the collector creates its own TracerProvider in |
I opened #5149 so we can discuss a concrete implementation instead of theoretical concerns. I explained in #5149 (comment) the problems I see with configuration. I am still not entirely sure if @bogdandrutu or @Aneurysm9 were thinking about something entirely different from this, if so, please provide feedback and we can try and iterate to something closer to what you had in mind. I would also suggest putting this issue on hold and splitting off the 'pipeline without service' discussion to a separate issue, to avoid further polluting the discussion here, if you think it makes sense (even if my solution looks more like #5149 in the end, it may be that @sveiss' still needs a telemetry provider). |
+1 am interested in this! @mx-psi any chance you're still interested in working on this? |
I opened #8111 as draft to move this forward. PTAL at the open questions there! |
@jaronoff97 @jmacd @bogdandrutu @ptodev I am making some (slow) progress with this, first by fixing some issues in the telemetry initialization (see #9384). My end goal would be that we have a factory for telemetry, just like we have for other components. This needs a big refactor, but what vendors would interact with is: // CreateSettings holds configuration for building Telemetry.
type CreateSettings struct {
BuildInfo component.BuildInfo
AsyncErrorChannel chan error
ZapOptions []zap.Option // maybe we can get rid of this
}
// Config for telemetry.
type Config any
// Factory is factory interface for telemetry.
// This interface cannot be directly implemented. Implementations must
// use the NewFactory to implement it.
type Factory interface {
CreateDefaultConfig() Config
CreateLogger(ctx context.Context, set Settings, cfg Config) (*zap.Logger, error)
CreateTracerProvider(ctx context.Context, set Settings, cfg Config) (trace.TracerProvider, error)
CreateMeterProvider(ctx context.Context, set Settings, cfg Config) (metric.MeterProvider, error)
CreateResource(ctx context.Context, set Settings, cfg Config) (*resource.Resource, error)
// More methods could be added here in the future if necessary
// Either a noop implementation would be provided if not available
// Or we would just error out and handle the noop-ness at the service initialization
unexportedFactoryFunc()
}
// FactoryOption apply changes to Factory.
type FactoryOption interface { /* sealed interface with no public methods */ }
type CreateLoggerFunc func(context.Context, Settings, Config) (*zap.Logger, error)
func WithLogger(c CreateLoggerFunc) FactoryOption { /* ... */ }
type CreateMeterProviderFunc func(context.Context, Settings, Config) (meter.MeterProvider, error)
func WithMeterProvider(c CreateMeterProviderFunc) FactoryOption { /* ... */ }
type CreateTracerProviderFunc func(context.Context, Settings, Config) (trace.TracerProvider, error)
func WithTracerProvider(c CreateTracerProviderFunc) FactoryOption { /* ... */ }
type CreateResourceFunc func(context.Context, Settings, Config) (*resource.Resource, error)
func WithResource(c CreateResourceFunc) FactoryOption { /* ... */ }
func NewFactory(createDefaultConfig component.CreateDefaultConfigFunc, options ...FactoryOption) Factory { /* ... */ } We could make this work through the builder. I think this addresses some of the concerns that have been brought up in the past:
Since this is a big chunk of work, I want to first know if:
|
This addresses my concerns! As a vendor, we are running custom OTel collectors inside our SaaS and have exactly the stated reasons for wanting more control over telemetry settings used by those collectors, so that they fit in with our overall organization's telemetry schema (and since we have existing Golang code to do so, already targeting the OTel-Go instrumentation APIs). Currently we have hacked this support together using an extension and by explicitly overriding Your proposals would give us a way to customize the telemetry produced in the collector completely, not just for individual components. This would also let us use and experiment with alternative OTel-Go SDKs inside an OpenTelemetry Collector, too. |
**Description:** Refactor meter provider initialization; removes `telemetryInitializer` and replaces it by a MeterProvider wrapper that owns the lifetime of the OpenCensus registry and the servers associated with the MeterProvider. **Link to tracking Issue:** Relates to #4970 (first refactor before trying out the factory pattern in an internal package)
**Description:** Moves logging messages about the meter provider outside of the meter provider initialization. While working on #4970 (comment), I realized there is an implicit dependency in the initialization order of the different telemetry components. I tried making this work and make the factory have a single `CreateTelemetrySettings`, but this results in an awkward API, so I am trying the alternative here: don't log during the meter provider initialization, but do so outside of it. **Link to tracking Issue:** Relates to #4970 --------- Co-authored-by: Alex Boten <223565+codeboten@users.noreply.github.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description Switches `service/telemetry` to a factory pattern. To avoid adding a lot of public API in one go: 1. the actual factory builder is in an internal package 2. I have not added the `CreateMeterProvider` method yet There are two goals with this: one is to make progress on #4970, the other is to allow initializing telemetry sooner: <!-- Issue number if applicable --> #### Link to tracking issue Updates #4970. <!--Describe what testing was performed and which tests were added.--> #### Testing Updates existing tests to use `NewFactory`
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description Switches `service/telemetry` to a factory pattern. To avoid adding a lot of public API in one go: 1. the actual factory builder is in an internal package 2. I have not added the `CreateMeterProvider` method yet There are two goals with this: one is to make progress on open-telemetry#4970, the other is to allow initializing telemetry sooner: <!-- Issue number if applicable --> #### Link to tracking issue Updates open-telemetry#4970. <!--Describe what testing was performed and which tests were added.--> #### Testing Updates existing tests to use `NewFactory`
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description Switches `service/telemetry` to a factory pattern. To avoid adding a lot of public API in one go: 1. the actual factory builder is in an internal package 2. I have not added the `CreateMeterProvider` method yet There are two goals with this: one is to make progress on open-telemetry#4970, the other is to allow initializing telemetry sooner: <!-- Issue number if applicable --> #### Link to tracking issue Updates open-telemetry#4970. <!--Describe what testing was performed and which tests were added.--> #### Testing Updates existing tests to use `NewFactory`
Removed this from the Collector v1 board based on #10643 |
Is your feature request related to a problem? Please describe.
When embedding the OpenTelemetry Collector on an existing application, said application may have an existing telemetry system for logging, metrics and/or tracing. With the current API, while some features of the telemetry may be configurable through the configuration provider (e.g. one can enable/disable the metrics telemetry), the telemetry can't be easily integrated into existing systems.
Currently, the logging features can be overriden since one can pass
LoggingOptions
,opentelemetry-collector/service/settings.go
Lines 64 to 65 in f980c9e
zap.Core
and have arbitrary logging, but the metrics and traces providers cannot be modified apart from the config exposed in the YAML.Describe the solution you'd like
A new
TelemetryFactory
can be passed toCollectorSettings
. This provider would have a type similar to:A default
TelemetryFactory
is provided for distributions that want to keep the current behavior or build upon the current behavior. This factory can take options such aszap.Logger
options.The
LoggingOptions
field is eventually removed, since it's superseded by the default factory.Describe alternatives you've considered
CollectorSettings
could instead have newMetricsOptions
andTracingOptions
of some kind, although this seems more restrictive.Additional context
I think this will allow creating the telemetry providers at the same point in the initialization as today, but care should be taken to ensure that this can be done as soon as possible.
The factory for telemetry settings should be extendable in the future for new types of telemetry that OpenTelemetry may add. This is effectively the same problem as the one we have with component factories, and we can apply the same solution here.
The text was updated successfully, but these errors were encountered: