
Use github.com/klauspost/compress for gzip and zlib #10782

Merged: 3 commits into prometheus:main on Feb 23, 2024

Conversation

@prymitive (Contributor) commented May 31, 2022

Metrics from a test instance:

[Screenshots: CPU and memory usage graphs from the test instance]

@prymitive requested a review from dgl as a code owner on May 31, 2022 09:51
@prymitive changed the title from "Gzip" to "Use github.com/klauspost/compress for gzip and zlib" on May 31, 2022
@prymitive requested a review from codesome as a code owner on May 31, 2022 10:11
@prymitive (Contributor, Author)

name                                 old time/op    new time/op    delta
TargetScraper_Gzip/metrics=1-8         65.4µs ± 5%    65.5µs ± 5%     ~     (p=0.871 n=50+50)
TargetScraper_Gzip/metrics=100-8        199µs ± 1%     193µs ± 2%   -3.23%  (p=0.000 n=42+42)
TargetScraper_Gzip/metrics=1000-8       891µs ±11%     876µs ±13%   -1.76%  (p=0.031 n=47+48)
TargetScraper_Gzip/metrics=10000-8     6.17ms ± 4%    6.02ms ± 2%   -2.51%  (p=0.000 n=46+39)
TargetScraper_Gzip/metrics=100000-8    61.5ms ± 3%    61.4ms ± 3%     ~     (p=0.470 n=47+49)

name                                 old alloc/op   new alloc/op   delta
TargetScraper_Gzip/metrics=1-8         6.61kB ± 0%    6.61kB ± 0%     ~     (p=0.917 n=50+50)
TargetScraper_Gzip/metrics=100-8       80.1kB ± 4%    79.8kB ± 4%     ~     (p=0.327 n=50+49)
TargetScraper_Gzip/metrics=1000-8       584kB ± 2%     584kB ± 2%     ~     (p=0.576 n=50+50)
TargetScraper_Gzip/metrics=10000-8     4.48MB ± 2%    4.48MB ± 1%     ~     (p=0.964 n=50+50)
TargetScraper_Gzip/metrics=100000-8    68.1MB ± 0%    68.1MB ± 0%   -0.04%  (p=0.019 n=50+49)

name                                 old allocs/op  new allocs/op  delta
TargetScraper_Gzip/metrics=1-8           75.0 ± 0%      75.0 ± 0%     ~     (all equal)
TargetScraper_Gzip/metrics=100-8         94.0 ± 0%      90.0 ± 0%   -4.26%  (p=0.000 n=50+50)
TargetScraper_Gzip/metrics=1000-8         112 ± 0%       107 ± 0%   -4.46%  (p=0.000 n=50+50)
TargetScraper_Gzip/metrics=10000-8        214 ± 1%       195 ± 1%   -8.68%  (p=0.000 n=46+49)
TargetScraper_Gzip/metrics=100000-8     1.41k ± 0%     0.96k ± 1%  -31.64%  (p=0.000 n=47+50)

@roidelapluie (Member)

What are the tradeoffs made by the library? Did they submit their changes upstream?

@prymitive (Contributor, Author)

> What are the tradeoffs made by the library? Did they submit their changes upstream?

I'm not aware of any trade-offs, but I do recall reading that the main benefits are on amd64.
I know that some changes were merged upstream, but that was a while ago; I've raised klauspost/compress#616 to clarify.
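For context, klauspost/compress mirrors the standard library's gzip/zlib API, so adopting it is essentially an import swap. Below is a minimal sketch of that idea (the gunzip helper is illustrative, not code from this PR):

package main

import (
	"bytes"
	"io"

	// Drop-in replacement for the standard library's "compress/gzip";
	// NewReader, NewWriter, Reset, etc. keep the same signatures.
	"github.com/klauspost/compress/gzip"
)

// gunzip decompresses a gzip payload. Swapping the import back to
// "compress/gzip" requires no other code changes.
func gunzip(compressed []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}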

@prymitive (Contributor, Author)

See klauspost/compress#617 for the answer.

@bboreham (Member) commented Jun 8, 2022

> Metrics from a test instance:

Can you explain (or point out where I missed the explanation) what these pictures show?

Assume I know nothing about your environment, your dashboards, what you changed, when you changed it, etc.

@prymitive (Contributor, Author)

I had 2.36.0 running on a biggish Prometheus server scraping ~21M time series.
Then I deployed 2.36.0 + this patch there; the time of deployment is the gap on both graphs.
As you can see on the CPU graph, the amount of CPU Prometheus uses dropped from 31 cores to 30 cores.
Memory usage doesn't show much of a difference, so the main gain is that drop in CPU usage.

Does that help?

@bboreham (Member) commented Jun 8, 2022

Thanks, that helps a bit.

I am puzzled why your charts don't show a repeating pattern every 2 hours, which is the standard head compaction interval.

Also puzzled: if your entire Prometheus benefitted by 3.3%, why did your benchmark, focused solely on this one improvement, show a smaller benefit?

@prymitive (Contributor, Author)

> Thanks, that helps a bit.
>
> I am puzzled why your charts don't show a repeating pattern every 2 hours, which is the standard head compaction interval.

We run Prometheus with a 30m min/max TSDB block range so we can run GC more often.
This was a testing instance and I think it also had #10709 applied.

> Also puzzled: if your entire Prometheus benefitted by 3.3%, why did your benchmark, focused solely on this one improvement, show a smaller benefit?

name                                 old time/op    new time/op    delta
TargetScraper_Gzip/metrics=1-8         65.4µs ± 5%    65.5µs ± 5%     ~     (p=0.871 n=50+50)
TargetScraper_Gzip/metrics=100-8        199µs ± 1%     193µs ± 2%   -3.23%  (p=0.000 n=42+42)
TargetScraper_Gzip/metrics=1000-8       891µs ±11%     876µs ±13%   -1.76%  (p=0.031 n=47+48)
TargetScraper_Gzip/metrics=10000-8     6.17ms ± 4%    6.02ms ± 2%   -2.51%  (p=0.000 n=46+39)
TargetScraper_Gzip/metrics=100000-8    61.5ms ± 3%    61.4ms ± 3%     ~     (p=0.470 n=47+49)

It shows 3.23% for a response with 100 metrics. As usual with benchmarking, how you run a test influences the outcome; that's why I deployed it to a test instance to see real-world results.

@bboreham (Member) commented Jun 9, 2022

I would like to propose a change to your benchmark, to do compression just once outside of the benchmark loop.

This way, the benchmark is more focused on decompression and gives a clearer signal:

name                                old time/op    new time/op    delta
TargetScraperGzip/metrics=1-8         32.5µs ± 4%    31.8µs ± 4%   -2.43%  (p=0.023 n=10+10)
TargetScraperGzip/metrics=100-8       50.0µs ± 2%    45.9µs ± 0%   -8.14%  (p=0.000 n=10+9)
TargetScraperGzip/metrics=1000-8       191µs ± 3%     165µs ± 2%  -13.70%  (p=0.000 n=10+10)
TargetScraperGzip/metrics=10000-8     1.68ms ± 1%    1.37ms ± 1%  -18.72%  (p=0.000 n=9+9)
TargetScraperGzip/metrics=100000-8    16.0ms ± 0%    13.4ms ± 1%  -16.32%  (p=0.000 n=9+8)

name                                old alloc/op   new alloc/op   delta
TargetScraperGzip/metrics=1-8         6.01kB ± 0%    6.03kB ± 0%   +0.30%  (p=0.000 n=9+10)
TargetScraperGzip/metrics=100-8       7.02kB ± 0%    6.86kB ± 0%   -2.25%  (p=0.000 n=10+10)
TargetScraperGzip/metrics=1000-8      8.97kB ± 0%    8.28kB ± 0%   -7.75%  (p=0.000 n=10+10)
TargetScraperGzip/metrics=10000-8     16.8kB ± 0%    15.1kB ± 1%  -10.20%  (p=0.000 n=9+9)
TargetScraperGzip/metrics=100000-8     920kB ± 0%     779kB ± 2%  -15.34%  (p=0.000 n=7+10)

name                                old allocs/op  new allocs/op  delta
TargetScraperGzip/metrics=1-8           71.0 ± 0%      72.0 ± 0%   +1.41%  (p=0.000 n=10+10)
TargetScraperGzip/metrics=100-8         81.0 ± 0%      76.0 ± 0%   -6.17%  (p=0.000 n=10+10)
TargetScraperGzip/metrics=1000-8        92.0 ± 0%      83.0 ± 0%   -9.78%  (p=0.000 n=10+10)
TargetScraperGzip/metrics=10000-8       93.0 ± 0%      91.0 ± 0%   -2.15%  (p=0.000 n=10+10)
TargetScraperGzip/metrics=100000-8       110 ± 0%       135 ± 0%  +22.73%  (p=0.000 n=10+10)

(That last +22% in allocated objects comes from decompression initialization; it is not material overall.)

// Package and import header added so the benchmark is self-contained;
// it belongs in Prometheus's scrape package, where targetScraper lives.
package scrape

import (
	"bytes"
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strconv"
	"testing"
	"time"

	"github.com/klauspost/compress/gzip" // the library this PR adopts; "compress/gzip" also compiles here
	config_util "github.com/prometheus/common/config"
	"github.com/prometheus/common/model"
	"github.com/stretchr/testify/require"

	"github.com/prometheus/prometheus/model/labels"
)

func BenchmarkTargetScraperGzip(b *testing.B) {
	scenarios := []struct {
		metricsCount int
		body         []byte
	}{
		{metricsCount: 1},
		{metricsCount: 100},
		{metricsCount: 1000},
		{metricsCount: 10000},
		{metricsCount: 100000},
	}

	// Pre-compress each scenario's payload once, outside the benchmark loop,
	// so the benchmark measures decompression in scrape() rather than compression.
	for i := 0; i < len(scenarios); i++ {
		var buf bytes.Buffer
		var name string
		gw := gzip.NewWriter(&buf)
		for j := 0; j < scenarios[i].metricsCount; j++ {
			name = fmt.Sprintf("go_memstats_alloc_bytes_total_%d", j)
			fmt.Fprintf(gw, "# HELP %s Total number of bytes allocated, even if freed.\n", name)
			fmt.Fprintf(gw, "# TYPE %s counter\n", name)
			fmt.Fprintf(gw, "%s %d\n", name, i*j)
		}
		gw.Close()
		scenarios[i].body = buf.Bytes()
	}

	// Serve the pre-compressed body that matches the requested "count" parameter.
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", `text/plain; version=0.0.4`)
		w.Header().Set("Content-Encoding", "gzip")
		for _, scenario := range scenarios {
			if strconv.Itoa(scenario.metricsCount) == r.URL.Query()["count"][0] {
				w.Write(scenario.body)
				return
			}
		}
		w.WriteHeader(http.StatusBadRequest)
	})

	server := httptest.NewServer(handler)
	defer server.Close()

	serverURL, err := url.Parse(server.URL)
	if err != nil {
		panic(err)
	}

	client, err := config_util.NewClientFromConfig(config_util.DefaultHTTPClientConfig, "test_job")
	if err != nil {
		panic(err)
	}

	for _, scenario := range scenarios {
		b.Run(fmt.Sprintf("metrics=%d", scenario.metricsCount), func(b *testing.B) {
			ts := &targetScraper{
				Target: &Target{
					labels: labels.FromStrings(
						model.SchemeLabel, serverURL.Scheme,
						model.AddressLabel, serverURL.Host,
					),
					params: url.Values{"count": []string{strconv.Itoa(scenario.metricsCount)}},
				},
				client:  client,
				timeout: time.Second,
			}
			b.ResetTimer()
			var buf bytes.Buffer
			for i := 0; i < b.N; i++ {
				buf.Reset()
				_, err = ts.scrape(context.Background(), &buf)
				require.NoError(b, err)
			}
		})
	}
}
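On the +22% allocations note above: the scrape path typically amortizes decompressor setup by reusing a single reader across scrapes via Reset, so the one-time initialization cost disappears in steady state. A rough sketch of that pattern (illustrative, not the exact code in scrape.go; assumes the same gzip import as above plus "io"):

// reusableGzipReader illustrates the Reset pattern: allocate the
// gzip.Reader once, then point it at each new response body so
// steady-state scrapes add no decompressor allocations.
type reusableGzipReader struct {
	zr *gzip.Reader
}

func (r *reusableGzipReader) decompress(body io.Reader, w io.Writer) error {
	if r.zr == nil {
		var err error
		if r.zr, err = gzip.NewReader(body); err != nil {
			return err
		}
	} else if err := r.zr.Reset(body); err != nil {
		return err
	}
	_, err := io.Copy(w, r.zr)
	return err
}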

@prymitive (Contributor, Author)

Good call, updated, thanks

@bboreham (Member)

Given you changed BenchmarkGzip, you should probably state the difference before/after in this PR.

The changes to discovery/xds/kuma_mads.pb.go (annotated DO NOT EDIT) and tsdb/fileutil/flock_js.go do not seem relevant to compression, and I didn't see any explanation in the description or commit message why they are necessary.

@prymitive (Contributor, Author)

> Given you changed BenchmarkGzip, you should probably state the difference before/after in this PR.
>
> The changes to discovery/xds/kuma_mads.pb.go (annotated DO NOT EDIT) and tsdb/fileutil/flock_js.go do not seem relevant to compression, and I didn't see any explanation in the description or commit message why they are necessary.

That was in a commit where I ran gofumpt.
Rebased, updated the commit message, and added a lint check to ensure new code also uses this lib.
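For reference, one way to express such a lint check is a depguard rule in golangci-lint's configuration; this is a sketch of the idea, not necessarily the exact rule added in this PR:

# .golangci.yml (sketch)
linters-settings:
  depguard:
    rules:
      main:
        deny:
          - pkg: "compress/gzip"
            desc: "Use github.com/klauspost/compress/gzip instead."
          - pkg: "compress/zlib"
            desc: "Use github.com/klauspost/compress/zlib instead."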

@bboreham (Member)

> I didn't see any explanation in the description or commit message why they are necessary.
>
> That was in a commit where I ran gofumpt.

OK, that's what you did; please add why you did it, making sure you cover why you edited a file marked DO NOT EDIT.

@prymitive (Contributor, Author)

Rebased on top of 2.39.0

@bboreham (Member)

Note: I attempted to fix the way Prombench indicates increased CPU usage when your Prometheus runs faster, in prometheus/test-infra#618.

@bboreham (Member)

/prombench main

@prombot (Contributor) commented Nov 29, 2023

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-10782 and main

After successful deployment, the benchmarking results can be viewed at:

Other Commands:
To stop benchmark: /prombench cancel
To restart benchmark: /prombench restart main

@prombot (Contributor) commented Dec 2, 2023

Benchmark tests are running for 3 days! If this is intended ignore this message otherwise you can cancel it by commenting: /prombench cancel

@bboreham (Member) commented Dec 3, 2023

/prombench cancel

@prombot (Contributor) commented Dec 3, 2023

Benchmark cancel is in progress.

@bboreham (Member) commented Dec 3, 2023

Prombench shows this is worse on CPU, memory and latency. Queries are executing at the same rate (IOW this is not the same problem as last time). I guess it's possible another rebase will improve things.

[Screenshots: Prombench graphs comparing CPU, memory, and latency]

@bwplotka (Member) commented Jan 9, 2024

(During the Prometheus bug-scrub meeting.)

Looks like this PR was superseded by #13330.
Closing this one.

@bwplotka closed this on Jan 9, 2024
@prymitive (Contributor, Author)

> Looks like this PR was superseded by #13330. Closing this one.

That PR is about using github.com/klauspost/compress in different places, so I don't think it supersedes this one.

@bboreham reopened this on Jan 10, 2024
@bboreham (Member) commented Feb 6, 2024

I'll rebase this one, unless someone objects soon.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
klauspost/compress is a high-quality drop-in replacement for common Go
compression libraries. Since Prometheus sends out a lot of HTTP requests
that often return compressed output, having improved compression
libraries helps to save CPU & memory resources.
On a test Prometheus server I was able to see a CPU reduction from 31 to
30 cores.

Benchmark results:

name                                old time/op    new time/op    delta
TargetScraperGzip/metrics=1-8         69.4µs ± 4%    69.2µs ± 3%     ~     (p=0.122 n=50+50)
TargetScraperGzip/metrics=100-8       84.3µs ± 2%    80.9µs ± 2%   -4.02%  (p=0.000 n=48+46)
TargetScraperGzip/metrics=1000-8       296µs ± 1%     274µs ±14%   -7.35%  (p=0.000 n=47+45)
TargetScraperGzip/metrics=10000-8     2.06ms ± 1%    1.66ms ± 2%  -19.34%  (p=0.000 n=47+45)
TargetScraperGzip/metrics=100000-8    20.9ms ± 2%    17.5ms ± 3%  -16.50%  (p=0.000 n=49+50)

name                                old alloc/op   new alloc/op   delta
TargetScraperGzip/metrics=1-8         6.06kB ± 0%    6.07kB ± 0%   +0.24%  (p=0.000 n=48+48)
TargetScraperGzip/metrics=100-8       7.04kB ± 0%    6.89kB ± 0%   -2.17%  (p=0.000 n=49+50)
TargetScraperGzip/metrics=1000-8      9.02kB ± 0%    8.35kB ± 1%   -7.49%  (p=0.000 n=50+50)
TargetScraperGzip/metrics=10000-8     18.1kB ± 1%    16.1kB ± 2%  -10.87%  (p=0.000 n=47+47)
TargetScraperGzip/metrics=100000-8    1.21MB ± 0%    1.01MB ± 2%  -16.69%  (p=0.000 n=36+50)

name                                old allocs/op  new allocs/op  delta
TargetScraperGzip/metrics=1-8           71.0 ± 0%      72.0 ± 0%   +1.41%  (p=0.000 n=50+50)
TargetScraperGzip/metrics=100-8         81.0 ± 0%      76.0 ± 0%   -6.17%  (p=0.000 n=50+50)
TargetScraperGzip/metrics=1000-8        92.0 ± 0%      83.0 ± 0%   -9.78%  (p=0.000 n=50+50)
TargetScraperGzip/metrics=10000-8       93.0 ± 0%      91.0 ± 0%   -2.15%  (p=0.000 n=50+50)
TargetScraperGzip/metrics=100000-8       111 ± 0%       135 ± 1%  +21.89%  (p=0.000 n=40+50)

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Since we want to use faster gzip/zlib libraries, we should add a lint check so that they are used everywhere.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
@prymitive (Contributor, Author)

Sorry, I was on leave.
Rebased this branch on top of the latest main.

@bboreham (Member)

/prombench main

@prombot (Contributor) commented Feb 23, 2024

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-10782 and main

After successful deployment, the benchmarking results can be viewed at:

Other Commands:
To stop benchmark: /prombench cancel
To restart benchmark: /prombench restart main

@bboreham (Member)

Showing slightly lower latency on query-range; memory and CPU are much the same.
I think we're good to go.

[Screenshot: Prombench query-range latency comparison]

@bboreham merged commit cc2c3f7 into prometheus:main on Feb 23, 2024
25 checks passed
@bboreham (Member)

/prombench cancel

@prombot (Contributor) commented Feb 23, 2024

Benchmark cancel is in progress.
