docs: add retriever guide, address minor onboarding feedbacks & enhancement #1326

AyushExel · 2024-05-27T14:41:17Z

Addresses some of the onboarding feedback listed in the OSS Examples / benchmarks / Feature docs epic #1224:
Improve visibility of the pydantic integration and the embedding API. (Based on onboarding feedback: there are many ways to ingest data and define a schema, but it is unclear which one to use for a specific use case.)
Add a guide that takes users through testing and improving retriever performance using built-in utilities such as hybrid search and reranking.
Add some benchmarks for the above.
Add missing Cohere docs.

github-actions · 2024-05-27T14:41:40Z

ACTION NEEDED

Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

docs/src/guides/tuning_retrievers/2_reranking.md

AyushExel · 2024-06-05T12:49:07Z

The lancedb JNI test failure seems unrelated to this PR.

westonpace

Very cool, I think this is a great description

westonpace · 2024-06-07T15:31:44Z

docs/src/basic.md

@@ -180,6 +180,9 @@ table.

 !!! info "Under the hood, LanceDB reads in the Apache Arrow data and persists it to disk using the [Lance format](https://www.github.com/lancedb/lance)."

+!!! info "Automatic vectorization with Embedding API"
+    When working with embedding models, it is recommended to use LanceDB embedding API to automatically vectorize the data and queries in the background. See the [quickstart example](#using-embedding-api) or the embedding API [guide](./embeddings/)


Hmm, I don't know if vectorize is the right word to use here.

I'm used to the definition "use an algorithm that takes advantage of vectorized CPU capabilities (e.g. AVX2)" but I think you mean "convert into a vector". But maybe this is a second definition of "vectorize" that is common in ML descriptions?

Yeah, it's pretty common. I guess the usage difference is "vectorizing an operation or algorithm" vs. "vectorizing data". Example: https://neptune.ai/blog/vectorization-techniques-in-nlp-guide. But I can try something like "create vector embeddings of the data" to avoid confusing either audience.

Okay, changed it to "create vector representation of the data".
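
For readers following this thread, here is a minimal sketch of the "convert data into a vector" sense of the word, as opposed to the SIMD sense. The `embed` and `cosine` helpers are hypothetical and use only the standard library. They apply a toy hashing trick and are not LanceDB's embedding API or a real embedding model.

```python
import hashlib
import math


def embed(text: str, dim: int = 8) -> list[float]:
    """Toy 'vector representation' of text: hash each token into one of `dim` buckets."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Use a stable hash so the same token always lands in the same bucket.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    # Normalize to unit length so a dot product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


query = embed("hybrid search and reranking")
doc = embed("reranking improves hybrid search")
score = cosine(query, doc)  # higher means more token overlap in this toy model
```

A real embedding function would replace `embed` with a model call. The point is only that "vectorizing data" here means producing a fixed-length numeric vector per record, which can then be compared with a similarity measure.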

update

f4f31ec

github-actions bot added the documentation Improvements or additions to documentation label May 27, 2024

AyushExel changed the title ~~docs: Add retriever guide, address minor onboarding feedbacks & enhancement (May audit)~~ docs: Add retriever guide, address minor onboarding feedbacks & enhancement May 27, 2024

AyushExel changed the title ~~docs: Add retriever guide, address minor onboarding feedbacks & enhancement~~ docs: add retriever guide, address minor onboarding feedbacks & enhancement May 27, 2024

AyushExel mentioned this pull request May 29, 2024

feat: add support for new cohere models in cohere and bedrock embedding functions #1335

Merged

AyushExel changed the title ~~docs: add retriever guide, address minor onboarding feedbacks & enhancement~~ docs: add retriever guide, address minor onboarding feedbacks & enhancement May 29, 2024

AyushExel added 2 commits May 29, 2024 18:48

add cohere embeddings docs

52feef5

update

290803f

github-actions bot added the Python Python SDK label May 29, 2024

ruff

10c3bfd

AyushExel added a commit that referenced this pull request May 30, 2024

feat: add support for new cohere models in cohere and bedrock embeddi…

16eff25

…ng functions (#1335) Fixes #1329 Will update docs on #1326

AyushExel added 2 commits May 31, 2024 15:50

update

19826f8

udpate

65f0336

AyushExel requested review from westonpace, changhiskhan and wjones127 and removed request for westonpace and changhiskhan June 5, 2024 12:34

AyushExel commented Jun 5, 2024

docs/src/guides/tuning_retrievers/2_reranking.md

AyushExel requested review from changhiskhan and raghavdixit99 June 5, 2024 15:00

AyushExel assigned tanaymeh and unassigned tanaymeh Jun 5, 2024

AyushExel requested a review from tanaymeh June 5, 2024 15:00

westonpace approved these changes Jun 7, 2024

AyushExel and others added 2 commits June 8, 2024 05:59

Update docs/src/basic.md

0be52e1

Co-authored-by: Weston Pace <weston.pace@gmail.com>

Update docs/src/basic.md

5f63787

Co-authored-by: Weston Pace <weston.pace@gmail.com>

AyushExel and others added 7 commits June 8, 2024 06:02

Update docs/src/basic.md

9da3456

Co-authored-by: Weston Pace <weston.pace@gmail.com>

Update docs/src/basic.md

a04258f

Co-authored-by: Weston Pace <weston.pace@gmail.com>

Update docs/src/guides/tuning_retrievers/2_reranking.md

c96066b

Co-authored-by: Weston Pace <weston.pace@gmail.com>

Update docs/src/guides/tuning_retrievers/1_query_types.md

41e45d2

Co-authored-by: Weston Pace <weston.pace@gmail.com>

Update docs/src/guides/tuning_retrievers/1_query_types.md

ea7fc31

Co-authored-by: Weston Pace <weston.pace@gmail.com>

Update docs/src/guides/tuning_retrievers/1_query_types.md

9f883fb

Co-authored-by: Weston Pace <weston.pace@gmail.com>

update

e0794e4

AyushExel merged commit 76fc16c into main Jun 8, 2024
14 of 15 checks passed

AyushExel deleted the docs_may branch June 8, 2024 00:55
