-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
add Module API callback for user directory search #12318
base: develop
Are you sure you want to change the base?
Conversation
Signed-off-by: Krisztian Szegi <k.git.hub@meszaros-szegi.cloud>
@@ -97,6 +108,23 @@ async def search_users( | |||
""" | |||
results = await self.store.search_user_dir(user_id, search_term, limit) | |||
|
|||
merged_results = results["results"] | |||
existing_user_ids = [user["user_id"] for user in merged_results] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merging the external user directory with synapse's works so that: if user_id match -> user match.
I don't know if it is realistic that the display name or avatar is different in the "backend", but this seems like the safe(r) thing to do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we'd want the external backend to have the last word here, so I think we can just call results["results"].update(external_users)
.
I'd like this to be merged, so #12247 could be closed. |
Is there no docs needed for |
Fair enough - I'll add the docs, and remove the commented out loop! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution! I've left a few points I'd like to see addressed before we can merge this. I'd also like to see some tests for this new callback - see this snippet for a few examples of tests that involve module callbacks. I'd also like to see some documentation about this, but from the previous comments on this PR it looks like you're already on it.
@@ -28,6 +28,9 @@ | |||
|
|||
logger = logging.getLogger(__name__) | |||
|
|||
# Types for callbacks to be registered via the module api | |||
ON_SEARH_USERS_CALLBACK = Callable[[str, str, int], Awaitable[Optional[SearchResult]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ON_SEARH_USERS_CALLBACK = Callable[[str, str, int], Awaitable[Optional[SearchResult]]] | |
ON_SEARCH_USERS_CALLBACK = Callable[[str, str, int], Awaitable[Optional[SearchResult]]] |
@@ -28,6 +28,9 @@ | |||
|
|||
logger = logging.getLogger(__name__) | |||
|
|||
# Types for callbacks to be registered via the module api | |||
ON_SEARH_USERS_CALLBACK = Callable[[str, str, int], Awaitable[Optional[SearchResult]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also like this callback's name to better reflect what it does. on_action
implies that the callback allows modules to react after action
has happened (which is also the logic other most on_[...]
callbacks follow, with an exception or two). In this case, it implies that this callback is called after a user directory search is complete.
In this case, maybe something like get_external_results
would cut it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, I didn't consider the sequential meaning on_[...]
would bring
get_external_results
feels a bit too general, and external potentially isn't the most correct either, as these users could be already mirrored to database.
SEARCH_USERS_CALLBACK
simply?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
search_users
feels to me like it can be a bit misleading as the callback is only for searching users that are on an separate platform (ie not Matrix users that are either local to the homeserver or share rooms with the server).
search_external_users
could be a good middle ground, wdyt?
@@ -97,6 +108,23 @@ async def search_users( | |||
""" | |||
results = await self.store.search_user_dir(user_id, search_term, limit) | |||
|
|||
merged_results = results["results"] | |||
existing_user_ids = [user["user_id"] for user in merged_results] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we'd want the external backend to have the last word here, so I think we can just call results["results"].update(external_users)
.
synapse/handlers/user_directory.py
Outdated
external_users = await callback(user_id, search_term, limit) | ||
if external_users is not None: | ||
merged_results += [ | ||
user | ||
for user in external_users["results"] | ||
if user["user_id"] not in existing_user_ids | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see some error handling around this so that a faulty module doesn't completely disable user search. See this snippet for inspiration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, correcting
] | ||
|
||
results["results"] = merged_results | ||
results["limited"] = len(merged_results) > limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes much sense to update limited
like this. We do this in search_user_dir
because we request limit + 1
elements from the database (but even that is a bit meh), but this doesn't really apply here. I think we should just leave results["limited"]
as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did only really understood the logic behind limited
after I committed.
SQL query does not report if the limit
was needed or not - so the +1 is a workaround so we can now if there are more users. And that is a bit "meh":
Maybe the workaround in synapse/storage/databases/main/user_directory
could also strip the the last found result if query returned limit + 1
, as: technically we maybe are falsely reporting limited when in fact, there is exactly limit+1
users.
OTOH, the correct merging would be maybe:
- if
limited
is false before merge:- pass
limit - database_result_size
to callback - callback should also return
limit - database_result_size + 1
if more users are found than requested - check the callback's returned result size, set the
limited
accordingly, and mergeall
orall - 1
- pass
- if
limited
is true before merge, don't merge at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, this would mean that if the database query yielded a limited result set we would only show results in the external backend if their user ID appears in the result set from the database. That's assuming we change the merge logic as suggested in #12318 (comment). Otherwise it means we don't show results (and presumably don't call the callback at all) unless the database doesn't manage to provide a limit
amount of results.
This I guess would make sense even though it's not a super obvious behaviour at first. Though it sounds to me like it would fill a rather niche case; in my experience it would be more logical to expect the other way around, with the custom external backend being the source of truth and falling back to Synapse's generic search to fill in the blanks. I think doing it this way would be more helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very sorry about the delay in my review, this had fallen off my radar somehow. I've added a few thoughts, though as mentioned before I'd also like to see some test for this. The callback will also need to be documented, see https://github.com/matrix-org/synapse/tree/master/docs/modules for reference.
@@ -28,6 +28,9 @@ | |||
|
|||
logger = logging.getLogger(__name__) | |||
|
|||
# Types for callbacks to be registered via the module api | |||
ON_SEARH_USERS_CALLBACK = Callable[[str, str, int], Awaitable[Optional[SearchResult]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
search_users
feels to me like it can be a bit misleading as the callback is only for searching users that are on an separate platform (ie not Matrix users that are either local to the homeserver or share rooms with the server).
search_external_users
could be a good middle ground, wdyt?
] | ||
|
||
results["results"] = merged_results | ||
results["limited"] = len(merged_results) > limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, this would mean that if the database query yielded a limited result set we would only show results in the external backend if their user ID appears in the result set from the database. That's assuming we change the merge logic as suggested in #12318 (comment). Otherwise it means we don't show results (and presumably don't call the callback at all) unless the database doesn't manage to provide a limit
amount of results.
This I guess would make sense even though it's not a super obvious behaviour at first. Though it sounds to me like it would fill a rather niche case; in my experience it would be more logical to expect the other way around, with the custom external backend being the source of truth and falling back to Synapse's generic search to fill in the blanks. I think doing it this way would be more helpful.
Cool, thank you for the PR 🥳 We also need this feature, thus your PR looks very promising. Keep up your good work! |
@Samonitari or @janpawellek -- do you wish to continue this PR? It looks like it needs some updates and the feedback addressed. |
Yes, but to be a bit more direct - and to not make any bloody estimation lies - sometime this year I'll do this. |
Signed-off-by: Krisztian Szegi k.git.hub@meszaros-szegi.cloud
Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.(run the linters)