-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
benchmark: add latency/MTU/bandwidth into testcases #1304
Conversation
benchmark/benchmark.go
Outdated
@@ -137,8 +138,11 @@ type ServerInfo struct { | |||
|
|||
// StartServer starts a gRPC server serving a benchmark service according to info. | |||
// It returns its listen address and a function to stop the server. | |||
func StartServer(info ServerInfo, opts ...grpc.ServerOption) (string, func()) { | |||
func StartServer(nw *latency.Network, info ServerInfo, opts ...grpc.ServerOption) (string, func()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding one more parameter to the function, why not add Network
into struct ServeInfo
?
So you don't need to change all the caller of StartServer
benchmark/benchmark.go
Outdated
s := stats.AddStats(b, 38) | ||
nw := &latency.Network{Kbps: kbps, Latency: ltc, MTU: mtu} | ||
nw.Kbps = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of this?
This function should always set nw.Kbps
to the parameter kbps
passed in.
If we want to run benchmarks only with kbps
-1, we should make sure this function is called with kbps=-1
, instead of setting nw.Kbps
to -1 inside the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to delete when I was testing
benchmark/benchmark.go
Outdated
grpc.WithInsecure(), | ||
grpc.WithDialer( | ||
func(address string, timeout time.Duration) (net.Conn, error) { | ||
clientConn, err := nw.TimeoutDialer(net.DialTimeout)("tcp", address, timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to name the returned conn just conn
. clientConn
usually means a gRPC client.
benchmark/benchmark.go
Outdated
func(address string, timeout time.Duration) (net.Conn, error) { | ||
clientConn, err := nw.TimeoutDialer(net.DialTimeout)("tcp", address, timeout) | ||
if err != nil { | ||
b.Fatalf("Unexpected error dialing: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return nil, err
instead of fatal?
benchmark/benchmark.go
Outdated
grpc.WithInsecure(), | ||
grpc.WithDialer( | ||
func(address string, timeout time.Duration) (net.Conn, error) { | ||
clientConn, err := nw.TimeoutDialer(net.DialTimeout)("tcp", address, timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this dialer, too.
benchmark/server/main.go
Outdated
@@ -28,7 +29,8 @@ func main() { | |||
grpclog.Fatalf("Failed to serve: %v", err) | |||
} | |||
}() | |||
addr, stopper := benchmark.StartServer(benchmark.ServerInfo{Addr: ":0", Type: "protobuf"}) // listen on all interfaces | |||
nw := &latency.Network{} | |||
addr, stopper := benchmark.StartServer(nw, benchmark.ServerInfo{Addr: ":0", Type: "protobuf"}) // listen on all interfaces |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass a nil
network instead.
If you move nw into ServerInfo
as what I suggested in previous comment, this line doesn't need to be changed.
benchmark/worker/benchmark_server.go
Outdated
if config.PayloadConfig != nil { | ||
switch payload := config.PayloadConfig.Payload.(type) { | ||
case *testpb.PayloadConfig_BytebufParams: | ||
opts = append(opts, grpc.CustomCodec(byteBufCodec{})) | ||
addr, closeFunc = benchmark.StartServer(benchmark.ServerInfo{ | ||
addr, closeFunc = benchmark.StartServer(nw, benchmark.ServerInfo{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment in server/main.go.
benchmark/stats/stats.go
Outdated
@@ -85,7 +85,8 @@ func (stats *Stats) maybeUpdate() { | |||
stats.histogram = NewHistogram(HistogramOptions{ | |||
NumBuckets: numBuckets, | |||
// max-min(lower bound of last bucket) = (1 + growthFactor)^(numBuckets-2) * baseBucketSize. | |||
GrowthFactor: math.Pow(float64(stats.max-stats.min), 1/float64(numBuckets-2)) - 1, | |||
// GrowthFactor can't be 0 | |||
GrowthFactor: math.Pow(float64(stats.max-stats.min), 1/float64(numBuckets-2)) - 0.999, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing 1 to 0.999 seems arbitrary.
In what situation will this be 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When (stat.max-stat.min = 1). There is a parameter oneOverLogOnePlusGrowthFactor: 1 / math.Log(1+opts.GrowthFactor), which mean GrowthFactor can't be 0. I can also add a 0.001 to oneOverLogOnePlusGrowthFactor but it is the same.
The reason it happens is because the latency is divided by a int64 time unit. For example, If adding 40ms latency to the network, it may take 80.xxx ms for a round trip, so the final result only contains 80ms(stat.min) and 81ms(stat.max).
benchmark/benchmark.go
Outdated
grpc.WithInsecure(), | ||
grpc.WithDialer( | ||
func(address string, timeout time.Duration) (net.Conn, error) { | ||
connTimeout, err := nw.TimeoutDialer(net.DialTimeout)("tcp", address, timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can return directly here:
return nw.TimeoutDialer(net.DialTimeout)("tcp", address, timeout)
Also, nit, I would do this in remove some of the indentation
conn := NewClientConn(
target, grpc.WithInsecure(),
grpc.WithDialer(func(address string, timeout time.Duration) (net.Conn, error) {
return nw.TimeoutDialer(net.DialTimeout)("tcp", address, timeout)
}),
)
benchmark/benchmark.go
Outdated
grpc.WithInsecure(), | ||
grpc.WithDialer( | ||
func(address string, timeout time.Duration) (net.Conn, error) { | ||
connTimeout, err := nw.TimeoutDialer(net.DialTimeout)("tcp", address, timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as runUnary
return nw.TimeoutDialer(net.DialTimeout)("tcp", address, timeout)
benchmark/benchmark.go
Outdated
} | ||
|
||
// StartServer starts a gRPC server serving a benchmark service according to info. | ||
// It returns its listen address and a function to stop the server. | ||
func StartServer(info ServerInfo, opts ...grpc.ServerOption) (string, func()) { | ||
lis, err := net.Listen("tcp", info.Addr) | ||
nw := info.Network | ||
if nw != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check the error before this.
Add latency, MTU and bandwidth into benchmark cases.
make benchmark
only run all testcases matches "trace, MTU=infinite, latency=infinite" in order to reduce the run time to 5 minutes.To run exact one testcase, the command is
go test benchmark/benchmark17_test.go benchmark/benchmark.go -benchmem -bench=BenchmarkClient/Stream-Tracing-kbps_0-MTU_0-maxConcurrentCalls_512-reqSize_1024-respSize_1-latency_0s
Because the benchmark uses the simulated network implemented by latency.go which contains extra operation, the simulated no delay is slower than ideal no delay.