You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I found that often we want to be able to ensure that certain part of some topic was processed before starting doing joins with that topic.
For example we join stream A with table B. We know a point in time where topic B would be mostly read. It may not be completely recovered, because messages are constantly coming in. But we know that we need at least the portion before that point in time to be available for joins.
So the idea would be to wait until certain processing time of topic B before starting processing topic A.
I could assume that using processor’s stats something like this could be achieved, but I’m not sure.
I’ve tried sleeping in A’s callback until particular record of topic B is available, but found out that topic B is stalling when topic A sleeps.
The text was updated successfully, but these errors were encountered:
Currently what the processor does is the following: When it starts it queries the current HWM of all joined tables and save them. The processor then recovers its state and the joined tables up to their saved HWMs. Once that is done, the processor starts consuming the streams. In other words, the processor starts consuming the streams once the joined tables are recovered at least up to the HWM they had at the time the processor started.
If the topic being joined is not yet in the right state when the processor starts (and queries the HWM) then you may have problems. Goka does not support joins with time window or such. And hacking that with stats sounds like error prone.
Oh, if that's how it works, it is exactly what I need. But I was worrying if that's always the case and thought I need to do something else to ensure that. Great! Thanks!
I found that often we want to be able to ensure that certain part of some topic was processed before starting doing joins with that topic.
For example we join stream A with table B. We know a point in time where topic B would be mostly read. It may not be completely recovered, because messages are constantly coming in. But we know that we need at least the portion before that point in time to be available for joins.
So the idea would be to wait until certain processing time of topic B before starting processing topic A.
I could assume that using processor’s stats something like this could be achieved, but I’m not sure.
I’ve tried sleeping in A’s callback until particular record of topic B is available, but found out that topic B is stalling when topic A sleeps.
The text was updated successfully, but these errors were encountered: