
Don't rebalance on too many consumers. Wait for topic to exist. #229

Closed

Conversation

abuchanan-nr (Contributor)

Fix for #200

I'll try to write tests for this. Putting this up since there's some discussion in #200 on the best approach.

@stevevls (Contributor) left a comment

Thanks for the PR! The logging changes look good, but I think the wait is in the wrong place.

@@ -827,6 +858,9 @@ func (r *Reader) handshake() error {
_ = conn.Close()
}()

// wait for topic to exist
Contributor:

As mentioned in #200, I don't think this is a good place to insert the wait as it will cause the rebalance to time out if the topic doesn't exist. I think that as you test it, you should be able to produce that condition fairly easily, but LMK if you find otherwise!

Contributor Author:

Hrm, I'm confused. This is being called before rebalance, and therefore before joinGroup and syncGroup. Also, joinGroup is the only place I see config.RebalanceTimeout being used. There's no rebalance/join to time out at this point, unless I'm missing something.

Contributor:

This is a little bit subtle. In a nutshell, there are three types of participants in consumer groups: the coordinator, the leader (which is a special consumer), and the consumers. The coordinator is a Kafka broker whose responsibility is to manage group membership and coordinate rebalances.

When a rebalance occurs:

  1. All the consumers phone in to the coordinator to join the group:
    response, err := conn.joinGroup(request)
  2. Via the join group response, one of the consumers discovers that it has been appointed the leader and will be responsible for generating assignments:
    if iAmLeader := response.MemberID == response.LeaderID; iAmLeader {
  3. Upon successful join, all consumers including the leader call sync group.
    response, err := conn.syncGroups(request)
    The leader will produce assignments and include them in the sync group request. The other consumers leave that part of the request empty.
    if memberAssignments != nil {
  4. The coordinator takes the leader's assignments and returns them to the other consumers via the sync group response.

So if you put in a blocking call before generating assignments, your leader won't ever send the sync group request with the assignments. If that happens, the coordinator will eventually time out the sync, evict the leader from the group, and then run another rebalance. It's not immediately obvious from reading the code that all this is happening, but you can read up on the protocol here: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal.
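The leader/follower split described in steps 2 and 3 can be sketched as follows. This is a hypothetical, heavily simplified mirror of the flow, not kafka-go's real API: the `joinResponse` type, `buildSyncRequest` function, and the fixed partition count are all invented for illustration.

```go
package main

import "fmt"

// joinResponse stands in for the join group response; the real kafka-go
// types differ.
type joinResponse struct {
	MemberID string
	LeaderID string
	Topics   []string // topics visible to the group
}

// buildSyncRequest mirrors steps 2-3: only the member that discovers it is
// the leader generates assignments; every other consumer leaves the
// assignments part of the sync group request empty.
func buildSyncRequest(resp joinResponse, members []string) map[string][]int {
	iAmLeader := resp.MemberID == resp.LeaderID
	if !iAmLeader {
		return nil // followers send an empty sync request
	}
	// Leader: round-robin each topic's partitions across the members.
	assignments := make(map[string][]int)
	const partitionsPerTopic = 3 // assumed for this sketch
	i := 0
	for range resp.Topics {
		for p := 0; p < partitionsPerTopic; p++ {
			m := members[i%len(members)]
			assignments[m] = append(assignments[m], p)
			i++
		}
	}
	return assignments
}

func main() {
	members := []string{"member-a", "member-b"}
	leader := buildSyncRequest(joinResponse{MemberID: "member-a", LeaderID: "member-a", Topics: []string{"events"}}, members)
	follower := buildSyncRequest(joinResponse{MemberID: "member-b", LeaderID: "member-a", Topics: []string{"events"}}, members)
	fmt.Println("leader assignments:", leader)
	fmt.Println("follower assignments:", follower)
}
```

The point of the sketch: anything that blocks inside the leader's path before `buildSyncRequest` returns delays the sync for the whole group, which is why the coordinator eventually evicts the leader and re-runs the rebalance.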

As with the case of more consumers than partitions, it's not a protocol level error to request a non-existent topic...it's a perfectly valid state. It's kind of odd to me that the CG protocol doesn't provide a way to flag that as an error, but that's not for us mere mortals to debate. 😉 Anyway, that's why I was recommending to return empty assignments and then leverage the topic watcher feature that's already built in to trigger the rebalance when the topic appears. Does that make sense?

Contributor:

@stevevls What about moving the wait for topic logic earlier?
Maybe inside of the run func?
Like here https://github.com/segmentio/kafka-go/blob/master/reader.go#L867

From my understanding there isn't any reason to move into the handshake logic if the topic doesn't exist. By moving it higher we can also reduce the number of times we can block and reduce the overall number of network calls made.

Contributor:

@ShaneSaww That's an interesting idea, though I wonder how it would handle the case where you have a large group of consumers all waiting for a topic to exist. The thing is, their wait times won't be synchronized, so you will have an initial flurry of rebalances as the consumers discover the existence of the topic and join the group.

Contrast that to where the group has already gone through the handshake. In that case, only a single rebalance would be triggered. Furthermore, having gone through handshake, all the consumers will be in sync and heartbeating, so the first member to detect the existence of the topic will force the rebalance.

What do you think?

Contributor:

@ShaneSaww @abuchanan-nr I started to get curious about how other libraries handle the case of unknown topics, so I looked at Sarama and librdkafka.

Sarama handles it like kafka-go currently does: if the topic doesn't exist, the CG leader throws an error and the rebalance goes into a failure loop. Not terribly desirable.

AFAICT, librdkafka ignores unknown topics at assignment time: https://github.com/edenhill/librdkafka/blob/5140cd9d965406a144de067b775bc9ca5255aea1/src/rdkafka_cgrp.c#L924. But it does succeed the rebalance. I would appreciate another set of eyes on that code b/c I'm not a C++ or librdkafka expert. 😄 Either way, I think we should mimic whatever librdkafka is doing.
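The librdkafka-style behavior described here amounts to filtering unknown topics out at assignment time and treating an empty result as success. This is a hypothetical sketch under that assumption; `assignKnownTopics` and its signature are invented for illustration and are not part of kafka-go or librdkafka.

```go
package main

import "fmt"

// assignKnownTopics drops subscribed topics that the broker metadata does
// not know about, instead of failing the rebalance. An empty assignment is
// a valid outcome: the group stays healthy and the consumers stay joined.
func assignKnownTopics(subscribed []string, known map[string][]int) map[string][]int {
	assignments := make(map[string][]int)
	for _, topic := range subscribed {
		partitions, ok := known[topic]
		if !ok {
			continue // unknown topic: skip it rather than error out
		}
		assignments[topic] = partitions
	}
	return assignments // possibly empty; the rebalance still succeeds
}

func main() {
	known := map[string][]int{"orders": {0, 1, 2}}
	fmt.Println(assignKnownTopics([]string{"orders", "not-created-yet"}, known))
}
```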

On another note, I think we've gotten into the weeds a bit with handling the unknown topic. One thing we can consider is merging in the fix for number consumers > number of partitions and then solve the unknown topic case in a separate PR. Up to you!

@ShaneSaww (Contributor), Mar 14, 2019:

@stevevls I can look into the java code as well.

"merging in the fix for number consumers > number of partitions and then solve the unknown topic case in a separate PR."

I like this idea.

Contributor Author:

Thanks for all the great discussion.

I think it's important to move forward here. There are some important cases mentioned in #200 that could be very common.

We can revisit optimizing rebalances for unknown topics in future work. I agree it's important, but it might take a while.

We can either use this PR as-is, or I could move the waitForTopic into run where it would have its own connection that is not associated with a group coordinator.

Thoughts?

Contributor:

@ShaneSaww Did you get a chance to see how the java client works? I'm still of the opinion that we should take the librdkafka approach where a non-existent topic simply results in empty assignments, but I think it's most important to behave like other official libraries in case there are places where folks need to inter-operate between languages.

Contributor:

Yeah, from what I can tell, the java client does something like:

  • start everything as normal
  • wait for a metadata refresh
  • if the metadata refresh shows a topic that you care about, rebalance
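The steps above boil down to diffing successive metadata refreshes against the subscription. A minimal sketch of that check, with invented names (`needsRebalance` is not a real kafka-go or Java-client function):

```go
package main

import "fmt"

// needsRebalance compares the previous and current metadata snapshots and
// reports whether a subscribed topic newly appeared, which is the condition
// that should trigger a rebalance.
func needsRebalance(subscribed []string, prevTopics, currTopics map[string]bool) bool {
	for _, t := range subscribed {
		if currTopics[t] && !prevTopics[t] {
			return true // a topic we care about just came into existence
		}
	}
	return false
}

func main() {
	subscribed := []string{"audit-log"}
	prev := map[string]bool{}
	curr := map[string]bool{"audit-log": true}
	fmt.Println(needsRebalance(subscribed, prev, curr))
}
```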

@@ -436,7 +439,9 @@ func (r *Reader) syncGroup(conn *Conn, memberAssignments GroupMemberAssignments)

 if len(assignments.Topics) == 0 {
 	generation, memberID := r.membership()
-	return nil, fmt.Errorf("received empty assignments for group, %v as member %s for generation %d", r.config.GroupID, memberID, generation)
+	r.withLogger(func(l *log.Logger) {
Contributor:

If I'm reading this correctly, this is the only change required to address #200.

Could we make this a ReaderOption AllowEmptyAssignments? That would allow the option of preserving existing behavior for the non-existent topic case while allowing us to move forward with allowing more consumers than topics.
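For context, the suggested opt-out could look roughly like the following. This is purely hypothetical: the option was never added to kafka-go (and is withdrawn later in this thread), kafka-go configures readers through a `ReaderConfig` struct rather than functional options, and every name here is invented for illustration.

```go
package main

import "fmt"

// ReaderConfig is a stand-in for illustration, not kafka-go's real struct.
type ReaderConfig struct {
	GroupID               string
	AllowEmptyAssignments bool
}

// ReaderOption is a functional option over the illustrative config.
type ReaderOption func(*ReaderConfig)

// AllowEmptyAssignments would let a consumer treat an empty assignment as
// valid (sit idle) instead of erroring, preserving old behavior by default.
func AllowEmptyAssignments(allow bool) ReaderOption {
	return func(c *ReaderConfig) { c.AllowEmptyAssignments = allow }
}

// NewConfig applies the options to a base config.
func NewConfig(groupID string, opts ...ReaderOption) ReaderConfig {
	c := ReaderConfig{GroupID: groupID}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := NewConfig("my-group", AllowEmptyAssignments(true))
	fmt.Println(c.AllowEmptyAssignments)
}
```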

Contributor Author:

There are a few downsides to consider there:

  • It would be public, and therefore hard to get rid of.
  • I'd argue it should default to true, to avoid a risky, subtle, and possibly common situation.
  • Too much config is a bad thing, IMO.

Contributor:

I prefer not to have a configuration option. Currently, the code is incorrect because there are valid use cases and configurations that should work but don't. The downside of allowing empty assignments is that perhaps someone accidentally spins up too many consumers and some number of them sit idle. Preventing things like that feels like it's outside the scope of this library.

Contributor:

Those are good arguments for not making it an option. Consider my suggestion withdrawn.

@stevevls (Contributor):

@abuchanan-nr I'm going to re-float the idea of splitting this into two PRs. 😄 What do you think about scoping this one down to just the problem of num consumers > num partitions?

@abuchanan-nr (Contributor Author):

I'm going to re-float the idea of splitting this into two PRs.

So this PR would remove the waitForTopic stuff?

@stevevls (Contributor):

Yeah...deferring the waitForTopic functionality to another PR is an option. Everyone seems in agreement on allowing the assignments to be empty, and that fixes one immediate problem. I think there's still some discussion to be had on the correct approach for waitForTopic, so splitting the PR would allow us to fix one issue while we make progress on the other.

@abuchanan-nr (Contributor Author):

Works for me.

The only thing holding me up is figuring out how to write a test for this. Only idea so far: sleep long enough for rebalances to occur, then check ReaderStats. But it would need to sleep for a relatively long time (~30 seconds).

@stevevls (Contributor):

Yeah...that's a tough one...in part because we're trying to prove a negative (the rebalance doesn't churn). I looked through the code to see if there was any internal state of the Reader that we might be able to poke at, but nothing jumped out. Another thing that came to mind, though, was that you could create a topic with a single partition, spin up a bunch of readers (say 10?), then publish messages at a fixed rate and ensure the same reader received all of them. Sounds complicated, though, and also time-dependent. The rebalance idea may be simpler!

@abuchanan-nr (Contributor Author):

Created #242 for the more minimal fix in order to avoid nuking any comments going in this thread.

Feel free to close this PR if you want.

@stevevls (Contributor), Apr 4, 2019:

Closing this, but I opened #251 to make sure we eventually follow up. Thanks!

stevevls closed this Apr 4, 2019