[refactor] translation engines: common interface #3925

Bnyro · 2024-10-15T13:41:13Z

What does this PR do?

add a common answer template for translation engine results

Why is this change important?

see Common interface for translation engines #3576

How to test this PR locally?

use mozhi, lingva, LibreTranslate, Dictzone, MyMemory Translated or DeepL

Author's checklist

this is just a basic idea, more engines need to be converted if the approach seems okay
something similar should also be done to the weather engines

Related issues

closes #3576

searx/utils.py

Bnyro · 2024-10-17T14:17:53Z

As a side effect, this PR also improves some of the translation engines (e.g. LibreTranslate and Lingva) a lot. I should probably split these into dedicated engine module refactor commits in the future.

return42 · 2024-12-14T08:30:03Z

FYI: last force push was just a rebase on current master / before I go into the review.

This patch introduces the typing of the results. The why and how is described in the documentation, please generate the documentation .. $ make docs.clean docs.live and read the following articles in the "Developer documentation": - result types --> http://0.0.0.0:8000/dev/result_types/index.html The result types are available from the `searx.result_types` module. The following have been implemented so far: - base result type: `searx.result_type.Result` --> http://0.0.0.0:8000/dev/result_types/base_result.html - answer results --> http://0.0.0.0:8000/dev/result_types/answer.html including the type for translations (inspired by searxng#3925). For all other types (which still need to be set up in subsequent PRs), template documentation has been created for the transition period. The template documentation is the basis for the typing and is the first complete documentation of the results (needed for engine development). It is the "working paper" (the plan) with which further typifications can be implemented in subsequent PRs. - searxng#357 With the new (sub) types for `Answer`, the templates for the answers have also been revised, `Translation` are now displayed with collapsible entries (inspired by searxng#3925). !en-de dog The implementation for `Plugin` and `Answer` has been revised, see documentation: - Plugin: http://0.0.0.0:8000/dev/plugins/index.html - Answerer: http://0.0.0.0:8000/dev/answerers/index.html With `AnswerStorage` and `AnswerStorage` to manage those items (in follow up PRs, `ArticleStorage`, `InfoStorage` and .. will be implemented) The autocompletion had a bug where the results from `Answer` had not been shown in the past. To test activate autocompletion and try search terms for which we have answerers - statistics: type `min 1 2 3` .. in the completion list you should find an entry like `[de] min(1, 2, 3) = 1` - random: type `random uuid` .. in the completion list, the first item is a random UUID SearXNG extends e.g. the request and response types of flask and httpx, a module has been set up for type extensions: - Extended Types --> http://0.0.0.0:8000/dev/extended_types.html The unit tests have been completely revised. In the previous implementation, the runtime (the global variables such as `searx.settings`) was not initialized before each test, so the runtime environment with which a test ran was always determined by the tests that ran before it. This was also the reason why we sometimes had to observe non-deterministic errors in the tests in the past: - searxng#2988 is one example for the Runtime issues, with non-deterministic behavior .. - searxng#3650 - searxng#3654 - searxng#3642 (comment) - searxng#3746 (comment) We have already discussed typing based on e.g. `TypeDict` or `dataclass` in the past: - https://github.com/searxng/searxng/pull/1562/files - https://gist.github.com/dalf/972eb05e7a9bee161487132a7de244d2 - https://github.com/searxng/searxng/pull/1412/files - searxng#1356 In my opinion, TypeDict is unsuitable because the objects are still dictionaries and not instances of classes / the `dataclass` are classes but ... The `msgspec.Struct` combine the advantages of typing, runtime behaviour and also offer the option of (fast) serializing (incl. type check) the objects. Currently not possible but conceivable with `msgspec`: Outsourcing the engines into separate processes, what possibilities this opens up in the future is left to the imagination! Internally, we have already defined that it is desirable to decouple the development of the engines from the development of the SearXNG core / The serialization of the `Result` objects is a prerequisite for this. .. note:: The threads listed above were the template for this PR, even though the implementation here is based on msgspec. They should also be an inspiration for the following PRs of typification, as the models and implementations can provide a good direction. I tried to create several (thematically separated) commits, but gave up at some point ... there are too many things to tackle at once / The comprehensibility of the commits would not be improved by a thematic separation. On the contrary, we would have to make multiple changes at the same places and the goal of a change would be vaguely recognizable in the fog of the commits. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>

Typification of SearXNG ======================= This patch introduces the typing of the results. The why and how is described in the documentation, please generate the documentation .. $ make docs.clean docs.live and read the following articles in the "Developer documentation": - result types --> http://0.0.0.0:8000/dev/result_types/index.html The result types are available from the `searx.result_types` module. The following have been implemented so far: - base result type: `searx.result_type.Result` --> http://0.0.0.0:8000/dev/result_types/base_result.html - answer results --> http://0.0.0.0:8000/dev/result_types/answer.html including the type for translations (inspired by searxng#3925). For all other types (which still need to be set up in subsequent PRs), template documentation has been created for the transition period. Doc of the fields used in Templates =================================== The template documentation is the basis for the typing and is the first complete documentation of the results (needed for engine development). It is the "working paper" (the plan) with which further typifications can be implemented in subsequent PRs. - searxng#357 Answer Templates ================ With the new (sub) types for `Answer`, the templates for the answers have also been revised, `Translation` are now displayed with collapsible entries (inspired by searxng#3925). !en-de dog Plugins & Answerer ================== The implementation for `Plugin` and `Answer` has been revised, see documentation: - Plugin: http://0.0.0.0:8000/dev/plugins/index.html - Answerer: http://0.0.0.0:8000/dev/answerers/index.html With `AnswerStorage` and `AnswerStorage` to manage those items (in follow up PRs, `ArticleStorage`, `InfoStorage` and .. will be implemented) Autocomplete ============ The autocompletion had a bug where the results from `Answer` had not been shown in the past. To test activate autocompletion and try search terms for which we have answerers - statistics: type `min 1 2 3` .. in the completion list you should find an entry like `[de] min(1, 2, 3) = 1` - random: type `random uuid` .. in the completion list, the first item is a random UUID Extended Types ============== SearXNG extends e.g. the request and response types of flask and httpx, a module has been set up for type extensions: - Extended Types --> http://0.0.0.0:8000/dev/extended_types.html Unit-Tests ========== The unit tests have been completely revised. In the previous implementation, the runtime (the global variables such as `searx.settings`) was not initialized before each test, so the runtime environment with which a test ran was always determined by the tests that ran before it. This was also the reason why we sometimes had to observe non-deterministic errors in the tests in the past: - searxng#2988 is one example for the Runtime issues, with non-deterministic behavior .. - searxng#3650 - searxng#3654 - searxng#3642 (comment) - searxng#3746 (comment) Why msgspec.Struct ================== We have already discussed typing based on e.g. `TypeDict` or `dataclass` in the past: - https://github.com/searxng/searxng/pull/1562/files - https://gist.github.com/dalf/972eb05e7a9bee161487132a7de244d2 - https://github.com/searxng/searxng/pull/1412/files - searxng#1356 In my opinion, TypeDict is unsuitable because the objects are still dictionaries and not instances of classes / the `dataclass` are classes but ... The `msgspec.Struct` combine the advantages of typing, runtime behaviour and also offer the option of (fast) serializing (incl. type check) the objects. Currently not possible but conceivable with `msgspec`: Outsourcing the engines into separate processes, what possibilities this opens up in the future is left to the imagination! Internally, we have already defined that it is desirable to decouple the development of the engines from the development of the SearXNG core / The serialization of the `Result` objects is a prerequisite for this. HINT: The threads listed above were the template for this PR, even though the implementation here is based on msgspec. They should also be an inspiration for the following PRs of typification, as the models and implementations can provide a good direction. Why just one commit? ==================== I tried to create several (thematically separated) commits, but gave up at some point ... there are too many things to tackle at once / The comprehensibility of the commits would not be improved by a thematic separation. On the contrary, we would have to make multiple changes at the same places and the goal of a change would be vaguely recognizable in the fog of the commits. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>

return42 · 2025-01-14T11:08:12Z

FYI: I included the commit from this PR in #4183

Typification of SearXNG ======================= This patch introduces the typing of the results. The why and how is described in the documentation, please generate the documentation .. $ make docs.clean docs.live and read the following articles in the "Developer documentation": - result types --> http://0.0.0.0:8000/dev/result_types/index.html The result types are available from the `searx.result_types` module. The following have been implemented so far: - base result type: `searx.result_type.Result` --> http://0.0.0.0:8000/dev/result_types/base_result.html - answer results --> http://0.0.0.0:8000/dev/result_types/answer.html including the type for translations (inspired by searxng#3925). For all other types (which still need to be set up in subsequent PRs), template documentation has been created for the transition period. Doc of the fields used in Templates =================================== The template documentation is the basis for the typing and is the first complete documentation of the results (needed for engine development). It is the "working paper" (the plan) with which further typifications can be implemented in subsequent PRs. - searxng#357 Answer Templates ================ With the new (sub) types for `Answer`, the templates for the answers have also been revised, `Translation` are now displayed with collapsible entries (inspired by searxng#3925). !en-de dog Plugins & Answerer ================== The implementation for `Plugin` and `Answer` has been revised, see documentation: - Plugin: http://0.0.0.0:8000/dev/plugins/index.html - Answerer: http://0.0.0.0:8000/dev/answerers/index.html With `AnswerStorage` and `AnswerStorage` to manage those items (in follow up PRs, `ArticleStorage`, `InfoStorage` and .. will be implemented) Autocomplete ============ The autocompletion had a bug where the results from `Answer` had not been shown in the past. To test activate autocompletion and try search terms for which we have answerers - statistics: type `min 1 2 3` .. in the completion list you should find an entry like `[de] min(1, 2, 3) = 1` - random: type `random uuid` .. in the completion list, the first item is a random UUID Extended Types ============== SearXNG extends e.g. the request and response types of flask and httpx, a module has been set up for type extensions: - Extended Types --> http://0.0.0.0:8000/dev/extended_types.html Unit-Tests ========== The unit tests have been completely revised. In the previous implementation, the runtime (the global variables such as `searx.settings`) was not initialized before each test, so the runtime environment with which a test ran was always determined by the tests that ran before it. This was also the reason why we sometimes had to observe non-deterministic errors in the tests in the past: - searxng#2988 is one example for the Runtime issues, with non-deterministic behavior .. - searxng#3650 - searxng#3654 - searxng#3642 (comment) - searxng#3746 (comment) Why msgspec.Struct ================== We have already discussed typing based on e.g. `TypeDict` or `dataclass` in the past: - https://github.com/searxng/searxng/pull/1562/files - https://gist.github.com/dalf/972eb05e7a9bee161487132a7de244d2 - https://github.com/searxng/searxng/pull/1412/files - searxng#1356 In my opinion, TypeDict is unsuitable because the objects are still dictionaries and not instances of classes / the `dataclass` are classes but ... The `msgspec.Struct` combine the advantages of typing, runtime behaviour and also offer the option of (fast) serializing (incl. type check) the objects. Currently not possible but conceivable with `msgspec`: Outsourcing the engines into separate processes, what possibilities this opens up in the future is left to the imagination! Internally, we have already defined that it is desirable to decouple the development of the engines from the development of the SearXNG core / The serialization of the `Result` objects is a prerequisite for this. HINT: The threads listed above were the template for this PR, even though the implementation here is based on msgspec. They should also be an inspiration for the following PRs of typification, as the models and implementations can provide a good direction. Why just one commit? ==================== I tried to create several (thematically separated) commits, but gave up at some point ... there are too many things to tackle at once / The comprehensibility of the commits would not be improved by a thematic separation. On the contrary, we would have to make multiple changes at the same places and the goal of a change would be vaguely recognizable in the fog of the commits. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>

Bnyro force-pushed the translation-answer-template branch from e2b0fde to 751d4b2 Compare October 17, 2024 13:28

Bnyro commented Oct 17, 2024

View reviewed changes

searx/utils.py Outdated Show resolved Hide resolved

Bnyro force-pushed the translation-answer-template branch 3 times, most recently from 8339905 to c2a63e1 Compare October 17, 2024 14:16

Bnyro marked this pull request as ready for review October 17, 2024 14:16

Bnyro force-pushed the translation-answer-template branch 2 times, most recently from 55ae373 to 734a237 Compare October 17, 2024 15:15

Bnyro requested a review from return42 October 17, 2024 15:16

Bnyro force-pushed the translation-answer-template branch from 734a237 to 7edd75f Compare October 17, 2024 15:17

[refactor] translation engines: common interface

34b8def

return42 force-pushed the translation-answer-template branch from 7edd75f to 34b8def Compare December 14, 2024 08:29

return42 mentioned this pull request Jan 14, 2025

[refactor] typification of SearXNG (initial) / result items (part 1) #4183

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[refactor] translation engines: common interface #3925

[refactor] translation engines: common interface #3925

Bnyro commented Oct 15, 2024 •

edited

Loading

Bnyro commented Oct 17, 2024

return42 commented Dec 14, 2024

return42 commented Jan 14, 2025

[refactor] translation engines: common interface #3925

Are you sure you want to change the base?

[refactor] translation engines: common interface #3925

Conversation

Bnyro commented Oct 15, 2024 • edited Loading

What does this PR do?

Why is this change important?

How to test this PR locally?

Author's checklist

Related issues

Bnyro commented Oct 17, 2024

return42 commented Dec 14, 2024

return42 commented Jan 14, 2025

Bnyro commented Oct 15, 2024 •

edited

Loading