forked from qdrant/qdrant
Byte storage integration into segment (qdrant#4049)

* byte storage with quantization raw scorer integration config and test are you happy fmt fn renamings cow refactor use quantization branch quantization update
* are you happy clippy
* don't use distance in quantized scorers
* fix build
* add fn quantization_preprocess
* apply preprocessing for only cosine float metric
* fix sparse vectors tests
* update openapi
* more complicated integration test
* update openapi comment
* mmap byte storages support
* fix async test
* move .unwrap closer to the actual check of the vector presence
* fmt
* remove distance similarity function
* avoid copying data while working with cow

---------

Co-authored-by: generall <andrey@vasnetsov.com>
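
The "avoid copying data while working with cow" item can be illustrated with a minimal sketch. The trait `Element` and the method `from_float_cow` below are hypothetical stand-ins, not the names used in the repository; the sketch only assumes that float storage can pass a `Cow` through untouched while byte storage has to allocate a converted buffer.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for the element trait touched by this commit.
pub trait Element: Copy {
    /// Convert a float vector into this element type, copying only if needed.
    fn from_float_cow<'a>(vector: Cow<'a, [f32]>) -> Cow<'a, [Self]>;
}

impl Element for f32 {
    fn from_float_cow<'a>(vector: Cow<'a, [f32]>) -> Cow<'a, [f32]> {
        // Same element type: hand the Cow back, borrowed data stays borrowed.
        vector
    }
}

impl Element for u8 {
    fn from_float_cow<'a>(vector: Cow<'a, [f32]>) -> Cow<'a, [u8]> {
        // Different element type: a new buffer is unavoidable.
        // `as u8` truncates toward zero and saturates at 0 and 255.
        Cow::Owned(vector.iter().map(|&x| x as u8).collect())
    }
}
```

Calling `<f32 as Element>::from_float_cow(Cow::Borrowed(slice))` returns a `Cow::Borrowed`, so no allocation happens on the float path; the `u8` path always returns `Cow::Owned`.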
Showing 47 changed files with 1,039 additions and 217 deletions.
@@ -1,41 +1,63 @@
 use std::borrow::Cow;

+use itertools::Itertools;
 use serde::{Deserialize, Serialize};

-use crate::common::operation_error::OperationResult;
-use crate::data_types::named_vectors::CowVector;
-use crate::data_types::vectors::{VectorElementType, VectorElementTypeByte, VectorRef};
+use crate::data_types::vectors::{VectorElementType, VectorElementTypeByte};
+use crate::types::QuantizationConfig;

 pub trait PrimitiveVectorElement:
     Copy + Clone + Default + Serialize + for<'a> Deserialize<'a>
 {
-    fn from_vector_ref(vector: VectorRef) -> OperationResult<Cow<[Self]>>;
+    fn slice_from_float_cow(vector: Cow<[VectorElementType]>) -> Cow<[Self]>;

-    fn vector_to_cow(vector: &[Self]) -> CowVector;
+    fn slice_to_float_cow(vector: Cow<[Self]>) -> Cow<[VectorElementType]>;
+
+    fn quantization_preprocess<'a>(
+        quantization_config: &QuantizationConfig,
+        vector: &'a [Self],
+    ) -> Cow<'a, [f32]>;
 }

 impl PrimitiveVectorElement for VectorElementType {
-    fn from_vector_ref(vector: VectorRef) -> OperationResult<Cow<[Self]>> {
-        let vector_ref: &[Self] = vector.try_into()?;
-        Ok(Cow::from(vector_ref))
+    fn slice_from_float_cow(vector: Cow<[VectorElementType]>) -> Cow<[Self]> {
+        vector
     }

-    fn vector_to_cow(vector: &[Self]) -> CowVector {
-        vector.into()
+    fn slice_to_float_cow(vector: Cow<[Self]>) -> Cow<[VectorElementType]> {
+        vector
     }
+
+    fn quantization_preprocess<'a>(
+        _quantization_config: &QuantizationConfig,
+        vector: &'a [Self],
+    ) -> Cow<'a, [f32]> {
+        Cow::Borrowed(vector)
+    }
 }

 impl PrimitiveVectorElement for VectorElementTypeByte {
-    fn from_vector_ref(vector: VectorRef) -> OperationResult<Cow<[Self]>> {
-        let vector_ref: &[VectorElementType] = vector.try_into()?;
-        let byte_vector = vector_ref.iter().map(|&x| x as u8).collect::<Vec<u8>>();
-        Ok(Cow::from(byte_vector))
+    fn slice_from_float_cow(vector: Cow<[VectorElementType]>) -> Cow<[Self]> {
+        Cow::Owned(vector.iter().map(|&x| x as u8).collect())
     }

-    fn vector_to_cow(vector: &[Self]) -> CowVector {
-        vector
-            .iter()
-            .map(|&x| x as VectorElementType)
-            .collect::<Vec<VectorElementType>>()
-            .into()
+    fn slice_to_float_cow(vector: Cow<[Self]>) -> Cow<[VectorElementType]> {
+        Cow::Owned(vector.iter().map(|&x| x as VectorElementType).collect_vec())
     }
+
+    fn quantization_preprocess<'a>(
+        quantization_config: &QuantizationConfig,
+        vector: &'a [Self],
+    ) -> Cow<'a, [f32]> {
+        if let QuantizationConfig::Binary(_) = quantization_config {
+            Cow::from(
+                vector
+                    .iter()
+                    .map(|&x| (x as VectorElementType) - 127.0)
+                    .collect_vec(),
+            )
+        } else {
+            Cow::from(vector.iter().map(|&x| x as VectorElementType).collect_vec())
+        }
+    }
 }
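
The byte-vector `quantization_preprocess` step can be tried in isolation with the sketch below. `QuantKind`, `preprocess_f32` and `preprocess_u8` are hypothetical, simplified stand-ins for `QuantizationConfig` and the trait method; only the `- 127.0` shift for the binary case is taken from the diff. The shift presumably exists because binary quantization looks at the sign of each component, and unsigned bytes would otherwise never be negative; that motivation is an assumption, not something the commit states.

```rust
use std::borrow::Cow;

/// Hypothetical, simplified stand-in for `QuantizationConfig`.
pub enum QuantKind {
    Scalar,
    Product,
    Binary,
}

/// Float vectors need no preprocessing, so the input is borrowed as-is.
pub fn preprocess_f32<'a>(_kind: &QuantKind, vector: &'a [f32]) -> Cow<'a, [f32]> {
    Cow::Borrowed(vector)
}

/// Byte vectors are widened to f32; for binary quantization they are also
/// shifted by 127 so that values spread around zero.
pub fn preprocess_u8<'a>(kind: &QuantKind, vector: &'a [u8]) -> Cow<'a, [f32]> {
    let offset = match kind {
        QuantKind::Binary => 127.0,
        QuantKind::Scalar | QuantKind::Product => 0.0,
    };
    Cow::Owned(vector.iter().map(|&x| f32::from(x) - offset).collect())
}
```

With this sketch, the bytes `[0, 127, 255]` become `[-127.0, 0.0, 128.0]` under `QuantKind::Binary` and `[0.0, 127.0, 255.0]` otherwise.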