Draft RFC for supporting nullable_number_type #4158

Status: Closed. Wants to merge 39 commits.
Changes from 1 commit (581e339).

Commits (39), all by wangzhen11aaa:
7da4751  Feb 15, 2022  Draft RFC
6b4c1a0  Feb 15, 2022  Fix bug in match arm
ad32db7  Feb 15, 2022  Get rid of warning
b6c3236  Feb 15, 2022  Get rid of warning
45fa6ac  Feb 17, 2022  use datatype_bytes_size to promote type
e07f654  Feb 17, 2022  make series.rs untouched
ed82093  Feb 17, 2022  Merge branch 'master' into f_support_nullable_number_type_in_group_hash
6d4f70f  Feb 18, 2022  The architecture of this task
ccc7184  Feb 18, 2022  fix logical bug in selecting the fix_hash_xx method
d5b7080  Feb 18, 2022  cargo fmt
c5c0c6f  Feb 18, 2022  make group_hash.rs in datavalues mod untouched
3b63562  Feb 18, 2022  hash works
5b2a254  Feb 18, 2022  add dereference to a reference variable
9e5c84b  Feb 18, 2022  cargo fmt
42b69e1  Feb 18, 2022  Remove explicit deref operator, use the default
021040b  Feb 19, 2022  Can split Null from 0
637defe  Feb 20, 2022  Group with one nullable column correctly
4f39389  Feb 20, 2022  merge with master
ff23347  Feb 20, 2022  make clippy happy
2735492  Feb 20, 2022  let fmt happy
030296d  Feb 20, 2022  let clippy happy
c881781  Feb 20, 2022  Try to solve weird error
21fbc6b  Feb 20, 2022  let clippy happy
96f23b8  Feb 20, 2022  split ptr add and write method
bc73c8e  Feb 20, 2022  move add expression out of ptr.add method
d7a2981  Feb 20, 2022  split function into small unsafe parts
4ae8e58  Feb 20, 2022  split function into small unsafe parts
6b1bd76  Feb 20, 2022  use ptr1 and one unsafe
e69d363  Feb 20, 2022  use ptr alone
2d2f65e  Feb 20, 2022  add new macro to please fixed_hash_with_nullable
154ac39  Feb 20, 2022  make clippy happy
673763a  Feb 20, 2022  make clippy happy
01650d9  Feb 20, 2022  make fmt happy
a51609e  Feb 20, 2022  make clippy calm down
952f912  Feb 20, 2022  make fmt calm down
9f5971c  Feb 20, 2022  make clippy calm down
5c86e89  Feb 20, 2022  make clippy calm down
bebe5be  Feb 21, 2022  Merge branch 'master' into f_support_nullable_number_type_in_group_hash
581e339  Feb 21, 2022  Support group by more than one key
Commit 581e339a03d7674beb09143c77f8063e7c464007
Support group by more than one key
Signed-off-by: wang.zhen <wangzhenaaa7@gmail.com>
Committed by wangzhen11aaa on Feb 21, 2022
common/datablocks/src/kernels/data_block_group_by.rs (6 changes: 3 additions & 3 deletions)

@@ -32,14 +32,14 @@ impl DataBlock {
     let mut group_key_len = 0;
     for col in column_names {
         let column = block.try_column_by_name(col)?;
-        let _typ = if column.is_nullable() {
+        let typ = if column.is_nullable() {

  [Member] use remove_nullable directly.
  [Contributor Author, replying] remove_nullable is c~.

             remove_nullable(&column.data_type())
         } else {
             column.data_type()
         };
-        if _typ.data_type_id().is_integer() {
+        if typ.data_type_id().is_integer() {

  [Member] can support is_numeric.

             // If column is nullable, we will add one byte to identify `null` value for every row.
-            group_key_len += _typ.data_type_id().numeric_byte_size()?;
+            group_key_len += typ.data_type_id().numeric_byte_size()?;
             if column.is_nullable() {
                 group_key_len += 1;
             }
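To make the sizing rule concrete, here is a self-contained sketch of the fixed-size group-key length computation this hunk implements, folding in the reviewer's is_numeric idea: every numeric key column contributes its value width, and every nullable key column contributes one extra null-flag byte per row. The types below are illustrative stand-ins, not Databend's DataTypeId API.

// Toy model: group_key_len = sum of value widths + 1 byte per nullable column.
#[derive(Clone, Copy)]
enum NumericType {
    UInt32,
    UInt64,
}

struct KeyColumn {
    typ: NumericType,
    nullable: bool,
}

fn numeric_byte_size(t: NumericType) -> usize {
    match t {
        NumericType::UInt32 => 4,
        NumericType::UInt64 => 8,
    }
}

fn group_key_len(columns: &[KeyColumn]) -> usize {
    columns
        .iter()
        .map(|c| numeric_byte_size(c.typ) + usize::from(c.nullable))
        .sum()
}

fn main() {
    // Nullable UInt64 + non-nullable UInt32: 8 + 1 + 4 = 13 bytes per row.
    let cols = [
        KeyColumn { typ: NumericType::UInt64, nullable: true },
        KeyColumn { typ: NumericType::UInt32, nullable: false },
    ];
    assert_eq!(group_key_len(&cols), 13);
}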
common/datablocks/src/kernels/data_block_group_by_hash.rs (41 changes: 19 additions & 22 deletions)

@@ -251,32 +251,31 @@ where T: PrimitiveType

     let mut res = Vec::with_capacity(group_fields.len());
     let mut offsize = 0;
-    let mut null_part_offset = 0;
-
-    init_nullable_offset_via_fields(&mut null_part_offset, group_fields)?;
+    let mut null_part_offset = init_nullable_offset_via_fields(group_fields)?;
     for f in group_fields.iter() {
         let data_type = f.data_type();
         let mut deserializer = data_type.create_deserializer(rows);
         let reader = vec8.as_slice();
         if !data_type.is_nullable() {
             deserializer.de_batch(&reader[offsize..], step, rows)?;
             res.push(deserializer.finish_to_column());
+            offsize += data_type.data_type_id().numeric_byte_size()?;
         } else {
+            let nullable_type_size = remove_nullable(data_type)
+                .data_type_id()
+                .numeric_byte_size()?;
             deserializer.de_batch_with_nullable(
                 &reader[offsize..],
                 step,
                 rows,
-                null_part_offset,
+                &mut null_part_offset,
+                nullable_type_size,
             )?;
             res.push(deserializer.finish_to_column());
             null_part_offset += 1;
-        }
-        if data_type.is_nullable() {
-            offsize += remove_nullable(data_type)
-                .data_type_id()
-                .numeric_byte_size()?;
-        } else {
-            offsize += data_type.data_type_id().numeric_byte_size()?;
+
+            offsize += nullable_type_size;
         }
     }
     Ok(res)

@@ -304,8 +303,7 @@ where

     let group_columns_has_nullable_one = check_group_columns_has_nullable(group_columns);
     if group_columns_has_nullable_one {
-        let mut null_part_offset = 0;
-        init_nullable_offset(&mut null_part_offset, group_columns)?;
+        let mut null_part_offset = init_nullable_offset(group_columns)?;
         while size > 0 {
             build_keys_with_nullable_column(
                 size,

@@ -384,30 +382,29 @@ where

 /// Init the nullable part's offset in bytes. It follows the values part.
 #[inline]
-fn init_nullable_offset(null_part_offset: &mut usize, group_keys: &[&ColumnRef]) -> Result<()> {
+fn init_nullable_offset(group_keys: &[&ColumnRef]) -> Result<usize> {

  [Member] group_keys.iter().map(|c| c.data_type().numeric_byte_size().unwrap()).sum()

+    let mut _null_part_offset = 0;
     for group_key_column in group_keys {
-        *null_part_offset += remove_nullable(&group_key_column.data_type())
+        _null_part_offset += remove_nullable(&group_key_column.data_type())
             .data_type_id()
             .numeric_byte_size()?;
     }
-    Ok(())
+    Ok(_null_part_offset)
 }

 /// Init the nullable part's offset in bytes. It follows the values part.
 #[inline]
-fn init_nullable_offset_via_fields(
-    null_part_offset: &mut usize,
-    group_fields: &[DataField],
-) -> Result<()> {
+fn init_nullable_offset_via_fields(group_fields: &[DataField]) -> Result<usize> {
+    let mut null_part_offset = 0;
     for f in group_fields {
         let f_typ = f.data_type();
         if f_typ.is_nullable() {
-            *null_part_offset += remove_nullable(f_typ).data_type_id().numeric_byte_size()?;
+            null_part_offset += remove_nullable(f_typ).data_type_id().numeric_byte_size()?;
         } else {
-            *null_part_offset += f_typ.data_type_id().numeric_byte_size()?;
+            null_part_offset += f_typ.data_type_id().numeric_byte_size()?;
         }
     }
-    Ok(())
+    Ok(null_part_offset)
 }

 #[inline]
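The reviewer's one-liner works because the standard library implements Sum for Result, so a fallible per-column size can be summed directly and the first Err short-circuits. A minimal, self-contained sketch of that style, using a hypothetical Column type in place of ColumnRef and a toy error type in place of the crate's Result:

#[derive(Debug)]
struct TypeError(String);

struct Column {
    type_name: &'static str,
}

impl Column {
    // Hypothetical stand-in for remove_nullable(...).data_type_id().numeric_byte_size().
    fn numeric_byte_size(&self) -> Result<usize, TypeError> {
        match self.type_name {
            "UInt32" => Ok(4),
            "UInt64" => Ok(8),
            other => Err(TypeError(format!("not numeric: {other}"))),
        }
    }
}

fn init_nullable_offset(group_keys: &[Column]) -> Result<usize, TypeError> {
    // Result implements Sum, so this propagates the first error instead of unwrapping.
    group_keys.iter().map(|c| c.numeric_byte_size()).sum()
}

fn main() -> Result<(), TypeError> {
    let keys = [Column { type_name: "UInt64" }, Column { type_name: "UInt32" }];
    assert_eq!(init_nullable_offset(&keys)?, 12);
    Ok(())
}

Note that the reviewer's snippet skips remove_nullable; the loop it replaces applies it first, so a faithful rewrite would keep that call inside the closure.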
common/datavalues/src/types/deserializations/mod.rs (3 changes: 2 additions & 1 deletion)

@@ -42,7 +42,8 @@ pub trait TypeDeserializer: Send + Sync {
         _reader: &[u8],
         _step: usize,
         _rows: usize,
-        _null_offset: usize,
+        _null_offset: &mut usize,
+        _type_size: usize,
     ) -> Result<()> {
         Err(ErrorCode::BadDataValueType(
             "de_batch_with_nullable operation only for nullable",
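The hunk above changes the signature of a default trait method that unconditionally errors; only the nullable deserializer overrides it with a real implementation. A self-contained sketch of that pattern, with simplified stand-in types rather than the crate's TypeDeserializer trait:

#[derive(Debug)]
struct ErrorCode(&'static str);

trait TypeDeserializer {
    // Default body rejects the call; non-nullable deserializers inherit it.
    fn de_batch_with_nullable(
        &mut self,
        _reader: &[u8],
        _step: usize,
        _rows: usize,
        _null_offset: &mut usize,
        _type_size: usize,
    ) -> Result<(), ErrorCode> {
        Err(ErrorCode("de_batch_with_nullable operation only for nullable"))
    }
}

struct UInt32Deserializer;
impl TypeDeserializer for UInt32Deserializer {} // keeps the erroring default

fn main() {
    let mut d = UInt32Deserializer;
    let mut null_offset = 4;
    let res = d.de_batch_with_nullable(&[], 0, 0, &mut null_offset, 4);
    println!("{res:?}"); // Err(ErrorCode("de_batch_with_nullable operation only for nullable"))
}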
common/datavalues/src/types/deserializations/nullable.rs (13 changes: 4 additions & 9 deletions)

@@ -52,20 +52,15 @@ impl TypeDeserializer for NullableDeserializer {
         reader: &[u8],
         step: usize,
         rows: usize,
-        null_offset: usize,
+        null_offset: &mut usize,
+        type_size: usize,
     ) -> Result<()> {
         for row in 0..rows {
             let mut reader = &reader[step * row..];
             self.inner.de(&mut reader)?;
-            // null_offset -= self.inner().to_physical_type;
-            // todo, We should get the length of missed bytes which caused
-            // by self.inner.de method.
-            if reader[null_offset - 1] & 1 == 1 {
-                self.bitmap.push(false);
-            } else {
-                self.bitmap.push(true);
-            }
+            self.bitmap.push(reader[*null_offset - type_size] & 1 != 1);

  [Member] !=1 ---> = 0

         }
+        *null_offset -= type_size;
         Ok(())
     }
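A self-contained toy of the decode step above, written with the reviewer's suggested `& 1 == 0` spelling. It assumes the single-nullable-column key layout, where each fixed-size key is the value bytes followed by one null-flag byte (1 marks NULL); the function and buffer here are illustrative, not the crate's NullableDeserializer.

fn decode_nullable_u32(keys: &[u8], step: usize) -> (Vec<u32>, Vec<bool>) {
    let mut values = Vec::new();
    let mut validity = Vec::new(); // true = valid (not NULL)
    for key in keys.chunks_exact(step) {
        // Value part: 4 little-endian bytes at the start of the key.
        values.push(u32::from_le_bytes(key[..4].try_into().unwrap()));
        // Null part: flag byte right after the value; 0 means "present".
        validity.push(key[4] & 1 == 0);
    }
    (values, validity)
}

fn main() {
    // Two 5-byte keys: (7, present) and (0, NULL) -- NULL stays distinct from 0.
    let keys = [7, 0, 0, 0, 0, 0, 0, 0, 0, 1];
    let (values, validity) = decode_nullable_u32(&keys, 5);
    assert_eq!(values, vec![7, 0]);
    assert_eq!(validity, vec![true, false]);
}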
tests/suites/0_stateless/03_dml/03_0018_group_by_nullable.result (13 changes: 13 additions & 0 deletions)

@@ -0,0 +1,13 @@
+NULL 2
+0 4
+1 1
+2 3
+NULL 0 2
+0 0 1
+0 1 1
+0 2 1
+0 3 1
+1 NULL 1
+1 0 1
+2 1 1
+2 2 1
tests/suites/0_stateless/03_dml/03_0018_group_by_nullable.sql (5 changes: 5 additions & 0 deletions)

@@ -0,0 +1,5 @@
+CREATE TABLE t(a UInt64, b UInt32) Engine = Fuse;
+INSERT INTO t(a,b) SELECT if (number % 3 = 1, null, number) as a, number + 3 as b FROM numbers(10);
+SELECT a%3 as c, count(1) as d from t GROUP BY c ORDER BY c, d;
+SELECT a%3 as c, a%4 as d, count(0) as f FROM t GROUP BY c,d ORDER BY c,d,f;
+DROP table t;