A simple document search project based on Rust and Elasticsearch.

breadrock1/doc-searcher

Targets: Linux, MacOS, Windows

Doc-Searcher

Doc-Searcher is a simple and flexible document search application, leveraging the capabilities of Rust and Elasticsearch (by default) to provide efficient and effective full-text search in documents. This project aims to offer a straightforward solution for indexing and searching through a large corpus of documents with the speed and accuracy provided by Elasticsearch.

The main goal is to implement a simple but powerful system for storing and indexing documents with search functionality (full-text and semantic). I decided to use Elasticsearch as the default search engine, but you can plug in Tantivy, Qdrant, or your own solution by implementing several async traits:

  • CacherService - API for interactions with the doc-notifier service;
  • EmbeddingsService - API for interactions with the doc-notifier service;
  • MetricsService - API for exposing metrics for monitoring;
  • StorageService - CRUD API for indexed folders and documents;
  • SearcherService - API for search functionality (full-text, vector, similar).
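
As a rough illustration of what plugging in a custom backend could look like, here is a minimal sketch of a searcher trait with a toy in-memory implementation. Only the trait name SearcherService comes from this README. The Document type, the search_fulltext method and its signature, the InMemorySearcher backend, and the block_on helper are all hypothetical and do not reflect the project's real definitions. The sketch also assumes native async fn in traits (Rust 1.75 or newer) so that it needs no external crates; the project itself may use a different mechanism.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical document model; the project's real type is not shown in this README.
#[derive(Debug, Clone, PartialEq)]
struct Document {
    name: String,
    content: String,
}

/// Assumed shape of a searcher trait. Only the name `SearcherService`
/// appears in the README; the method and its signature are illustrative.
trait SearcherService {
    async fn search_fulltext(&self, query: &str) -> Vec<Document>;
}

/// Toy backend that scans documents held in memory (case-insensitive substring match).
struct InMemorySearcher {
    docs: Vec<Document>,
}

impl SearcherService for InMemorySearcher {
    async fn search_fulltext(&self, query: &str) -> Vec<Document> {
        let needle = query.to_lowercase();
        self.docs
            .iter()
            .filter(|doc| doc.content.to_lowercase().contains(&needle))
            .cloned()
            .collect()
    }
}

/// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal std-only executor so the sketch runs without an async runtime.
/// A real service would use a proper runtime instead of this busy loop.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::hint::spin_loop();
    }
}
```

With this sketch, calling block_on(searcher.search_fulltext("rust")) on an InMemorySearcher returns every stored document whose content contains the query. A real backend would replace the in-memory scan with calls to the chosen search engine.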

Features

  • Full-Text Search: Quickly find documents by content using the chosen search engine;
  • Semantic Search: Fast semantic search through an external embeddings service;
  • Rust Performance: Benefit from the speed and safety of Rust;
  • REST API: Easy-to-use REST API for searching documents and managing indexing;
  • Docker Support: Easy deployment with Docker and docker-compose;
  • Caching Actor: Store data in a cache service such as Redis or a custom solution;
  • Remote Logging: Send error and warning messages or other metrics to a remote server;
  • Swagger: Swagger documentation for all available endpoints;
  • CORS Origins: Allow web pages on another domain to access the service's resources;
  • Parsing and Storing: Parse files and store them in the search engine locally.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

  • Rust
  • Docker & docker-compose
  • Cache (Redis)
  • Elasticsearch

Installation

  1. Clone the repository
  2. Run cargo install --path . to build the project
  3. Set up the .env file with the service credentials
  4. Run cargo run --bin doc-searcher-init to initialize the Elasticsearch schemas
  5. Run cargo run --bin doc-searcher-run to launch the service

Project Features

Optional features for parsing and storing documents locally from the current service (not stable):

  • enable-cacher - enable a cacher service such as Redis or another custom implementation;
  • enable-semantic - enable an LLM service for semantic search.
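
Assuming these are Cargo feature flags (the README does not say so explicitly), code behind them is typically gated at compile time. The sketch below shows one way to check which flags a build includes. The function name enabled_features is hypothetical and is not part of the project.

```rust
/// Returns the names of the optional features compiled into this build.
/// The feature names come from the README; the function itself is illustrative.
fn enabled_features() -> Vec<&'static str> {
    let mut features = Vec::new();
    // `cfg!` evaluates to a compile-time boolean for the given feature flag.
    if cfg!(feature = "enable-cacher") {
        features.push("enable-cacher");
    }
    if cfg!(feature = "enable-semantic") {
        features.push("enable-semantic");
    }
    features
}
```

Under that assumption, a flag would be switched on with Cargo's standard --features option, for example cargo run --features enable-cacher --bin doc-searcher-run. A build without any flags yields an empty list.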
