Distributed execution support of Hive with vineyard #1624

Merged
merged 20 commits into from
Dec 28, 2023


vegetableysm
Collaborator

@vegetableysm vegetableysm commented Nov 28, 2023

What do these changes do?

  • Support distributed execution for Hive with vineyard.
  • Build the Docker cluster with scripts.
  • Remove the HDFS-related parts.

Currently, the vineyard jar cannot run directly on Hive because of dependency conflicts. As a temporary workaround, you can revert the guava dependency used by vineyard to an older version.
This problem will be fixed in #1682.

@vegetableysm vegetableysm changed the title (WIP)Distributed execution support of Hive with vineyard [WIP]Distributed execution support of Hive with vineyard Nov 28, 2023
@vegetableysm
Collaborator Author

vegetableysm commented Dec 18, 2023

I tried to add location info to the splits. Tez is able to get the information I provided, but it fails to request a node resource.

Here is the log:
[image: tez]

The log shows that the YARN task scheduler service gets the node hostname "hadoop-yarn-nm-0" but fails to assign a container. When the Tez client asks for "/default-rack", the request matches the hadoop-yarn-nm-0 node (the "not match" in the figure is a mistake and should read "match"). Because all three nodes are currently in /default-rack, a request for "/default-rack" can be assigned to any of them at random, so there is no guarantee that the task lands on the node holding the corresponding data.
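
To make the host-level versus rack-level point concrete, here is a minimal toy model. It is not YARN's or Tez's scheduler code: the functions `candidate_nodes` and `assign` are hypothetical, and so are the hostnames other than hadoop-yarn-nm-0. It only assumes what is described above, namely three nodes that all sit in /default-rack.

```python
import random

# Toy model only: this is NOT YARN's or Tez's scheduler code.
# Hostnames other than hadoop-yarn-nm-0 are hypothetical.
RACK_OF = {
    "hadoop-yarn-nm-0": "/default-rack",
    "hadoop-yarn-nm-1": "/default-rack",
    "hadoop-yarn-nm-2": "/default-rack",
}


def candidate_nodes(requested, rack_of):
    """Return the nodes that satisfy a resource request.

    A request naming a host matches only that host; a request naming
    a rack matches every host in that rack.
    """
    if requested in rack_of:
        return [requested]
    return sorted(node for node, rack in rack_of.items() if rack == requested)


def assign(requested, rack_of, rng):
    """Pick one matching node at random, or None when nothing matches."""
    nodes = candidate_nodes(requested, rack_of)
    return rng.choice(nodes) if nodes else None


rng = random.Random(0)
# Host-level request: exactly one candidate, the node holding the data.
print(candidate_nodes("hadoop-yarn-nm-0", RACK_OF))
# Rack-level request: all three nodes qualify, so placement is arbitrary.
print(candidate_nodes("/default-rack", RACK_OF))
print(assign("/default-rack", RACK_OF, rng))
```

In this model a rack-level request only narrows the choice to the rack, which is why data locality is lost as soon as the request falls back from the hostname to "/default-rack".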

Then I tried creating the table on HDFS, and task allocation failed as well. Here is the log:
[image: tez2]

I have no idea why the allocation failed.

@vegetableysm vegetableysm force-pushed the hive-integration branch 2 times, most recently from 172e632 to b82b67c December 19, 2023 14:31
@vegetableysm vegetableysm changed the title [WIP]Distributed execution support of Hive with vineyard Distributed execution support of Hive with vineyard Dec 20, 2023
Member

@sighingnow sighingnow left a comment

Please remove the HDFS-related parts from the docker-compose configuration files.

@vegetableysm vegetableysm changed the title Distributed execution support of Hive with vineyard [WIP]Distributed execution support of Hive with vineyard Dec 22, 2023
@vegetableysm vegetableysm changed the title [WIP]Distributed execution support of Hive with vineyard Distributed execution support of Hive with vineyard Dec 25, 2023
@sighingnow
Member

Rebase to main please.

@vegetableysm
Collaborator Author

> Rebase to main please.

Done

Member

@sighingnow sighingnow left a comment

One more comment: can we merge the Dockerfiles under distributed/ into the existing docker/ directory?

Signed-off-by: vegetableysm <yuanshumin.ysm@alibaba-inc.com>
Support creating tables on vineyard with data distributed across different nodes.

Signed-off-by: vegetableysm <yuanshumin.ysm@alibaba-inc.com>
Fix a file permission bug.

Signed-off-by: vegetableysm <yuanshumin.ysm@alibaba-inc.com>
@sighingnow sighingnow merged commit 2955b60 into v6d-io:main Dec 28, 2023
4 of 6 checks passed
@sighingnow sighingnow deleted the hive-integration branch December 28, 2023 02:39