Refactor sparse vector handling in knowledge base and pgvector handler #10357

Merged
dusvyat merged 4 commits into main from additional-fixes-to-pgvectorhandler-for-sparse
Jan 13, 2025

Conversation

dusvyat
Contributor

@dusvyat dusvyat commented Jan 13, 2025

Description

Remove redundant sparse vector logic from knowledge base controller and streamline sparse vector checks in pgvector handler. Ensure `vector_size` is encapsulated correctly and simplify table creation for dense and sparse vectors.

Fixes #issue_number

Type of change

(Please delete options that are not relevant)

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ⚡ New feature (non-breaking change which adds functionality)
  • 📢 Breaking change (fix or feature that would cause existing functionality not to work as expected)
  • 📄 This change requires a documentation update

Verification Process

To ensure the changes are working as expected:

  • Test Location: Specify the URL or path for testing.
  • Verification Steps: Outline the steps or queries needed to validate the change. Include any data, configurations, or actions required to reproduce or see the new functionality.

Additional Media:

  • I have attached a brief loom video or screenshots showcasing the new functionality or change.

Checklist:

  • My code follows the style guidelines (PEP 8) of MindsDB.
  • I have appropriately commented on my code, especially in complex areas.
  • Necessary documentation updates are either made or tracked in issues.
  • Relevant unit and integration tests are updated or added.

Remove redundant sparse vector logic from knowledge base controller and streamline sparse vector checks in pgvector handler. Ensure `vector_size` is encapsulated correctly and simplify table creation for dense and sparse vectors.
Unified dense and sparse vector handling to streamline query generation. Added explicit type checks and conversions for vectors to support dicts and lists. Ensured appropriate distance operator is applied based on vector type.
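The type checks and operator selection described in this commit can be illustrated with a minimal sketch. This is not the code from the PR: the function name `to_pgvector_literal`, the two operator constants, and the choice of which operator goes with which vector type are all assumptions made for illustration. The literal formats follow pgvector's text representation, where a dense vector is written as `[v1,v2,...]` and a sparse vector as `{index:value,...}/dimensions` with 1-based indices.

```python
from typing import Dict, List, Optional, Tuple, Union

Vector = Union[Dict[int, float], List[float]]

# Assumed operator choice for illustration only; the PR does not state which
# pgvector operators the handler uses for each vector type.
DENSE_OPERATOR = "<=>"   # pgvector cosine distance
SPARSE_OPERATOR = "<#>"  # pgvector negative inner product


def to_pgvector_literal(
    vector: Vector, vector_size: Optional[int] = None
) -> Tuple[str, str]:
    """Return (literal, distance_operator) for a dense list or a sparse dict.

    Hypothetical helper: a dict is treated as a sparse vector and a list or
    tuple as a dense one.
    """
    if isinstance(vector, dict):
        # The sparse text form needs the total number of dimensions.
        if vector_size is None:
            raise ValueError("vector_size is required for sparse vectors")
        pairs = sorted((int(i), float(v)) for i, v in vector.items())
        items = ",".join(f"{i}:{v}" for i, v in pairs)
        return f"{{{items}}}/{vector_size}", SPARSE_OPERATOR
    if isinstance(vector, (list, tuple)):
        items = ",".join(str(float(v)) for v in vector)
        return f"[{items}]", DENSE_OPERATOR
    raise TypeError(f"unsupported vector type: {type(vector).__name__}")
```

A caller would interpolate the returned operator into the `ORDER BY` clause and pass the literal as a bound parameter, so a single query builder can serve both vector types.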
@dusvyat dusvyat marked this pull request as ready for review January 13, 2025 17:33
Refactor the handling of `vector_size` to improve clarity and correctness. Make `vector_size` mandatory for sparse vectors and optional for dense ones. Adjust how `size_spec` is constructed to account for the presence or absence of `vector_size`.
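A minimal sketch of the rule stated in this commit, assuming the handler builds the column type as a string: `vector_size` is required for sparse columns and optional for dense ones. The function name `build_vector_column_type` and the `is_sparse` flag are hypothetical; only `vector_size` and `size_spec` come from the commit message. `vector` and `sparsevec` are the pgvector column types.

```python
from typing import Optional


def build_vector_column_type(
    is_sparse: bool, vector_size: Optional[int] = None
) -> str:
    """Build a pgvector column type string (hypothetical helper)."""
    if is_sparse:
        # Sparse columns must declare their dimensionality.
        if vector_size is None:
            raise ValueError("vector_size is required for sparse vectors")
        return f"sparsevec({vector_size})"
    # Dense columns may omit the size, in which case size_spec is empty.
    size_spec = f"({vector_size})" if vector_size is not None else ""
    return f"vector{size_spec}"
```

The result would slot into a `CREATE TABLE` statement as the type of the embeddings column, which keeps table creation on one code path for both vector kinds.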
@dusvyat dusvyat merged commit e0af017 into main Jan 13, 2025
14 checks passed
@dusvyat dusvyat deleted the additional-fixes-to-pgvectorhandler-for-sparse branch January 13, 2025 17:52
@mindsdb mindsdb locked and limited conversation to collaborators Jan 13, 2025