Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

add Module API callback for user directory search #12318

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 26 additions & 1 deletion synapse/handlers/user_directory.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@
# limitations under the License.

import logging
from typing import TYPE_CHECKING, Any, Dict, List, Optional
from typing import TYPE_CHECKING, Any, Awaitable, Callable, Dict, List, Optional

import synapse.metrics
from synapse.api.constants import EventTypes, HistoryVisibility, JoinRules, Membership
Expand All @@ -28,6 +28,9 @@

logger = logging.getLogger(__name__)

# Types for callbacks to be registered via the module api
ON_SEARH_USERS_CALLBACK = Callable[[str, str, int], Awaitable[Optional[SearchResult]]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ON_SEARH_USERS_CALLBACK = Callable[[str, str, int], Awaitable[Optional[SearchResult]]]
ON_SEARCH_USERS_CALLBACK = Callable[[str, str, int], Awaitable[Optional[SearchResult]]]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also like this callback's name to better reflect what it does. on_action implies that the callback allows modules to react after action has happened (which is also the logic other most on_[...] callbacks follow, with an exception or two). In this case, it implies that this callback is called after a user directory search is complete.

In this case, maybe something like get_external_results would cut it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I didn't consider the sequential meaning on_[...] would bring
get_external_results feels a bit too general, and external potentially isn't the most correct either, as these users could be already mirrored to database.
SEARCH_USERS_CALLBACK simply?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

search_users feels to me like it can be a bit misleading as the callback is only for searching users that are on an separate platform (ie not Matrix users that are either local to the homeserver or share rooms with the server).
search_external_users could be a good middle ground, wdyt?



class UserDirectoryHandler(StateDeltasHandler):
"""Handles queries and updates for the user_directory.
Expand Down Expand Up @@ -76,6 +79,14 @@ def __init__(self, hs: "HomeServer"):
# we start populating the user directory
self.clock.call_later(0, self.notify_new_event)

self._on_search_users_callbacks: List[ON_SEARH_USERS_CALLBACK] = []

def register_user_directory_callbacks(
self, on_search_users: Optional[ON_SEARH_USERS_CALLBACK] = None
) -> None:
if on_search_users is not None:
self._on_search_users_callbacks.append(on_search_users)

async def search_users(
self, user_id: str, search_term: str, limit: int
) -> SearchResult:
Expand All @@ -97,6 +108,20 @@ async def search_users(
"""
results = await self.store.search_user_dir(user_id, search_term, limit)

merged_results = results["results"]
existing_user_ids = [user["user_id"] for user in merged_results]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merging the external user directory with synapse's works so that: if user_id match -> user match.
I don't know if it is realistic that the display name or avatar is different in the "backend", but this seems like the safe(r) thing to do

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we'd want the external backend to have the last word here, so I think we can just call results["results"].update(external_users).

for callback in self._on_search_users_callbacks:
external_users = await callback(user_id, search_term, limit)
if external_users is not None:
merged_results += [
Samonitari marked this conversation as resolved.
Show resolved Hide resolved
user
for user in external_users["results"]
if user["user_id"] not in existing_user_ids
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see some error handling around this so that a faulty module doesn't completely disable user search. See this snippet for inspiration.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, correcting


results["results"] = merged_results
results["limited"] = len(merged_results) > limit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it makes much sense to update limited like this. We do this in search_user_dir because we request limit + 1 elements from the database (but even that is a bit meh), but this doesn't really apply here. I think we should just leave results["limited"] as it is.

Copy link
Author

@Samonitari Samonitari Mar 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did only really understood the logic behind limited after I committed.
SQL query does not report if the limit was needed or not - so the +1 is a workaround so we can now if there are more users. And that is a bit "meh":
Maybe the workaround in synapse/storage/databases/main/user_directory could also strip the the last found result if query returned limit + 1, as: technically we maybe are falsely reporting limited when in fact, there is exactly limit+1 users.

OTOH, the correct merging would be maybe:

  • if limited is false before merge:
    • pass limit - database_result_size to callback
    • callback should also return limit - database_result_size + 1 if more users are found than requested
    • check the callback's returned result size, set the limited accordingly, and merge all or all - 1
  • if limited is true before merge, don't merge at all

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, this would mean that if the database query yielded a limited result set we would only show results in the external backend if their user ID appears in the result set from the database. That's assuming we change the merge logic as suggested in #12318 (comment). Otherwise it means we don't show results (and presumably don't call the callback at all) unless the database doesn't manage to provide a limit amount of results.

This I guess would make sense even though it's not a super obvious behaviour at first. Though it sounds to me like it would fill a rather niche case; in my experience it would be more logical to expect the other way around, with the custom external backend being the source of truth and falling back to Synapse's generic search to fill in the blanks. I think doing it this way would be more helpful.


# Remove any spammy users from the results.
non_spammy_users = []
for user in results["results"]:
Expand Down
10 changes: 10 additions & 0 deletions synapse/module_api/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,7 @@
ON_LOGGED_OUT_CALLBACK,
AuthHandler,
)
from synapse.handlers.user_directory import ON_SEARH_USERS_CALLBACK
from synapse.http.client import SimpleHttpClient
from synapse.http.server import (
DirectServeHtmlResource,
Expand Down Expand Up @@ -215,6 +216,7 @@ def __init__(self, hs: "HomeServer", auth_handler: AuthHandler) -> None:
self._third_party_event_rules = hs.get_third_party_event_rules()
self._password_auth_provider = hs.get_password_auth_provider()
self._presence_router = hs.get_presence_router()
self._user_directory_handler = hs.get_user_directory_handler()

#################################################################################
# The following methods should only be called during the module's initialisation.
Expand Down Expand Up @@ -354,6 +356,14 @@ def register_password_auth_provider_callbacks(
get_displayname_for_registration=get_displayname_for_registration,
)

def register_user_directory_callbacks(
self, *, on_search_users: Optional[ON_SEARH_USERS_CALLBACK] = None
) -> None:
"""Registers callbacks for user directory capabilities."""
return self._user_directory_handler.register_user_directory_callbacks(
on_search_users=on_search_users
)

def register_background_update_controller_callbacks(
self,
*,
Expand Down