[GLUTEN-7028][CH][Part-11] Support write parquet files with bucket #8052

Merged
merged 2 commits into from
Dec 3, 2024

Conversation

lwz9103
Contributor

@lwz9103 lwz9103 commented Nov 26, 2024

What changes were proposed in this pull request?

Support write parquet files with bucket

(Fixes: #7028)

How was this patch tested?

unit tests

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen changed the title [GLUTEN-7028][CH] Support write parquet files with bucket [GLUTEN-7028][CH][Part-11] Support write parquet files with bucket Nov 26, 2024
@github-actions github-actions bot added the CORE works for Gluten Core label Dec 2, 2024

github-actions bot commented Dec 2, 2024

Run Gluten Clickhouse CI on x86

@github-actions github-actions bot removed the CORE works for Gluten Core label Dec 2, 2024

@@ -52,14 +52,15 @@ DB::ProcessorPtr make_sink(
     const DB::Block & input_header,
     const DB::Block & output_header,
     const std::string & base_path,
-    const FileNameGenerator & generator,
+    FileNameGenerator & generator,
Contributor

const

Contributor Author

resolved

const bool pattern;
const std::string filename_or_pattern;
// Align with org.apache.spark.sql.execution.FileNamePlaceHolder
const std::vector<std::string> placeholders = {"{id}", "{bucket}"};
Contributor

Make this a static variable.

Contributor Author

resolved
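
For readers following this thread, here is a minimal sketch of what the "static variable" suggestion could look like. It is not the code that was merged: the struct name `FileNameGeneratorSketch` and the helper `has_placeholder` are hypothetical, and the sketch assumes C++17 (for `std::string_view` and inline `static constexpr` members). Only the two placeholder strings come from the snippet under review.

```cpp
#include <array>
#include <string_view>

// Hypothetical stand-in for the class under review; not the PR's actual type.
struct FileNameGeneratorSketch
{
    // A static constexpr member is shared by every instance, so constructing
    // a generator no longer allocates a fresh std::vector of placeholder strings.
    static constexpr std::array<std::string_view, 2> placeholders = {"{id}", "{bucket}"};

    // Hypothetical helper: reports whether a file name contains any known placeholder.
    static constexpr bool has_placeholder(std::string_view name)
    {
        for (std::string_view placeholder : placeholders)
            if (name.find(placeholder) != std::string_view::npos)
                return true;
        return false;
    }
};
```

With a non-static `const std::vector<std::string>` member, every object carries its own heap-allocated copy of the same two strings; a static member avoids that per-instance cost.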


std::string generate() const
std::string pattern_format(std::string arg, std::string replacement) const
Contributor

@baibaichen baibaichen Dec 3, 2024

Take the parameters by const reference; otherwise the arguments are copied here.

Contributor Author

thanks, resolved
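
The review point above can be illustrated with a small sketch. The body of `pattern_format` is not shown in this conversation, so everything below is an assumption: the struct name `FileNamePatternSketch`, the parameter name `placeholder`, and the replace-every-occurrence behavior are hypothetical. What the sketch does show is the const-reference signature the reviewer asked for.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the generator under review; not the PR's actual class.
struct FileNamePatternSketch
{
    std::string filename_or_pattern;

    // Both parameters are const references, so binding an existing std::string
    // argument does not copy it. The original signature took them by value.
    std::string pattern_format(const std::string & placeholder, const std::string & replacement) const
    {
        assert(!placeholder.empty()); // an empty placeholder would match at every position

        std::string result = filename_or_pattern;
        std::size_t pos = 0;
        while ((pos = result.find(placeholder, pos)) != std::string::npos)
        {
            result.replace(pos, placeholder.size(), replacement);
            pos += replacement.size(); // continue after the inserted text
        }
        return result;
    }
};
```

Under these assumptions, a generator holding the pattern `part-{id}_{bucket}.parquet` would turn `{bucket}` into a bucket id while leaving `{id}` untouched for a later call.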


github-actions bot commented Dec 3, 2024

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen merged commit 2346584 into apache:main Dec 3, 2024
8 checks passed
baibaichen added a commit to Kyligence/gluten that referenced this pull request Dec 4, 2024
baibaichen added a commit that referenced this pull request Dec 4, 2024
* [GLUTEN-1632][CH]Daily Update Clickhouse Version (20241204)

* Fix Build due to ClickHouse/ClickHouse#72715

* Fix Build due to ClickHouse/ClickHouse#65691

* Fix Build due to ClickHouse/ClickHouse#72722

* Fix gtest due to #8052

* Fix benchmark due to ClickHouse/ClickHouse#72460

* Add SPARK_DIR_NAME for fixing unstable ut

---------

Co-authored-by: kyligence-git <gluten@kyligence.io>
Co-authored-by: Chang Chen <baibaichen@gmail.com>
Successfully merging this pull request may close these issues.

[CH] Fully Support writing parquet and mergetree in spark 3.5.x with delta protocol
2 participants