Fix duplicate documents with Slack connector
Weves committed Aug 10, 2023
1 parent a03818e commit 54ee323
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions backend/danswer/connectors/slack/connector.py
@@ -168,11 +168,17 @@ def get_all_docs(
             client=client, channel=channel, oldest=oldest, latest=latest
         )

+        seen_thread_ts: set[str] = set()
         for message_batch in channel_message_batches:
             for message in message_batch:
                 filtered_thread: ThreadType | None = None
                 thread_ts = message.get("thread_ts")
                 if thread_ts:
+                    # skip threads we've already seen, since we've already processed all
+                    # messages in that thread
+                    if thread_ts in seen_thread_ts:
+                        continue
+                    seen_thread_ts.add(thread_ts)
                     thread = get_thread(
                         client=client, channel_id=channel["id"], thread_id=thread_ts
                     )
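
For readers who want to try the deduplication idea outside the connector, here is a minimal standalone sketch. It is not the code from `connector.py`: the helper name `iter_unique_threads`, the `fetch_thread` callback (standing in for `get_thread`), and the sample messages are all hypothetical, and it assumes that several messages in the channel history can carry the same `thread_ts`.

```python
from typing import Any, Callable, Iterable, Iterator

Message = dict[str, Any]


def iter_unique_threads(
    message_batches: Iterable[list[Message]],
    fetch_thread: Callable[[str], list[Message]],
) -> Iterator[list[Message]]:
    """Yield one group of messages per thread, fetching each thread only once.

    `fetch_thread` is a hypothetical stand-in for the connector's `get_thread`.
    """
    seen_thread_ts: set[str] = set()
    for batch in message_batches:
        for message in batch:
            thread_ts = message.get("thread_ts")
            if not thread_ts:
                # Not part of a thread: treat the message as its own group.
                yield [message]
                continue
            if thread_ts in seen_thread_ts:
                # The whole thread was already yielded when first encountered.
                continue
            seen_thread_ts.add(thread_ts)
            yield fetch_thread(thread_ts)


# Usage with made-up data: two messages share thread "1.0", one is standalone.
calls: list[str] = []


def fake_fetch(thread_ts: str) -> list[Message]:
    calls.append(thread_ts)
    return [{"ts": thread_ts, "thread_ts": thread_ts}]


batches = [
    [{"ts": "1.0", "thread_ts": "1.0"}, {"ts": "2.0"}],
    [{"ts": "1.5", "thread_ts": "1.0"}],
]
docs = list(iter_unique_threads(batches, fake_fetch))
```

Because the set lives outside the batch loop, a thread seen in one batch is also skipped in later batches, which mirrors where `seen_thread_ts` is declared in the diff above.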

