Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parallel page rank #4689

Merged
merged 2 commits into from
Jan 11, 2025
Merged

Add parallel page rank #4689

merged 2 commits into from
Jan 11, 2025

Conversation

andyfengHKU
Copy link
Contributor

@andyfengHKU andyfengHKU commented Jan 8, 2025

Description

Add parallel page rank implementation.

A quick benchmark on LDBC person knows person

1 thread: 4s
2 thread: 2.6s
4 thread: 5s

The overhead seems to due to the CAS operation because there are 0.5M nodes and 20M edges. The contention looks high. I'll need to further investigation into the scalability.

Contributor agreement

Copy link

github-actions bot commented Jan 8, 2025

Benchmark Result

Master commit hash: decc6d5823880dbb9ae7194d12d608decb7ec5a8
Branch commit hash: e7d5894949f9c2786cb48150ff962e1956da9cf2

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 666.53 651.46 15.07 (2.31%)
aggregation q28 11763.17 11887.29 -124.12 (-1.04%)
filter q14 135.03 127.29 7.75 (6.09%)
filter q15 139.14 133.73 5.41 (4.05%)
filter q16 311.90 310.34 1.56 (0.50%)
filter q17 465.33 454.80 10.53 (2.32%)
filter q18 2032.41 1966.81 65.59 (3.34%)
filter zonemap-node 97.91 89.62 8.29 (9.25%)
filter zonemap-node-lhs-cast 98.90 92.20 6.70 (7.27%)
filter zonemap-node-null 95.07 87.42 7.65 (8.75%)
filter zonemap-rel 5726.17 5727.51 -1.34 (-0.02%)
fixed_size_expr_evaluator q07 605.21 574.87 30.34 (5.28%)
fixed_size_expr_evaluator q08 817.49 804.41 13.08 (1.63%)
fixed_size_expr_evaluator q09 834.77 805.98 28.79 (3.57%)
fixed_size_expr_evaluator q10 257.33 239.68 17.65 (7.36%)
fixed_size_expr_evaluator q11 249.83 234.45 15.39 (6.56%)
fixed_size_expr_evaluator q12 247.82 227.59 20.23 (8.89%)
fixed_size_expr_evaluator q13 1469.36 1476.13 -6.77 (-0.46%)
fixed_size_seq_scan q23 134.13 119.40 14.73 (12.34%)
join q29 627.89 590.23 37.66 (6.38%)
join q30 10056.83 10180.83 -124.00 (-1.22%)
join q31 6.26 5.26 1.00 (19.09%)
join SelectiveTwoHopJoin 59.63 58.52 1.11 (1.89%)
ldbc_snb_ic q35 2595.20 2572.14 23.07 (0.90%)
ldbc_snb_ic q36 470.12 487.71 -17.59 (-3.61%)
ldbc_snb_is q32 4.01 3.45 0.55 (16.05%)
ldbc_snb_is q33 15.24 12.77 2.47 (19.36%)
ldbc_snb_is q34 1.40 1.24 0.16 (12.90%)
multi-rel multi-rel-large-scan 1326.59 1336.41 -9.82 (-0.73%)
multi-rel multi-rel-lookup 12.06 15.10 -3.04 (-20.12%)
multi-rel multi-rel-small-scan 94.48 83.67 10.81 (12.92%)
order_by q25 143.55 133.12 10.43 (7.84%)
order_by q26 468.90 457.26 11.64 (2.55%)
order_by q27 1502.13 1478.76 23.36 (1.58%)
recursive_join recursive-join-bidirection 302.68 297.17 5.50 (1.85%)
recursive_join recursive-join-dense 7424.15 7387.40 36.75 (0.50%)
recursive_join recursive-join-path 23800.13 23803.49 -3.36 (-0.01%)
recursive_join recursive-join-sparse 1084.81 1059.66 25.15 (2.37%)
recursive_join recursive-join-trail 7406.40 7350.57 55.83 (0.76%)
scan_after_filter q01 185.81 175.47 10.34 (5.89%)
scan_after_filter q02 172.62 158.15 14.47 (9.15%)
shortest_path_ldbc100 q37 78.23 101.08 -22.85 (-22.61%)
shortest_path_ldbc100 q38 227.83 371.39 -143.56 (-38.65%)
shortest_path_ldbc100 q39 51.73 59.72 -7.99 (-13.38%)
shortest_path_ldbc100 q40 315.16 367.64 -52.48 (-14.28%)
var_size_expr_evaluator q03 2089.02 2068.03 20.99 (1.02%)
var_size_expr_evaluator q04 2345.66 2241.98 103.68 (4.62%)
var_size_expr_evaluator q05 2646.16 2619.72 26.44 (1.01%)
var_size_expr_evaluator q06 1342.07 1318.61 23.46 (1.78%)
var_size_seq_scan q19 1482.92 1444.45 38.47 (2.66%)
var_size_seq_scan q20 2688.97 2687.44 1.54 (0.06%)
var_size_seq_scan q21 2291.77 2301.21 -9.44 (-0.41%)
var_size_seq_scan q22 128.98 127.52 1.45 (1.14%)

Copy link
Contributor

@ray6080 ray6080 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments below. Also can we get some benchmark numbers on running under different num of threads? I think that's good for us to keep track of our scalability.

src/function/gds/page_rank.cpp Show resolved Hide resolved
src/function/gds/page_rank.cpp Outdated Show resolved Hide resolved
src/function/gds/page_rank.cpp Outdated Show resolved Hide resolved
src/function/gds/page_rank.cpp Outdated Show resolved Hide resolved
src/function/gds/page_rank.cpp Outdated Show resolved Hide resolved
src/function/gds/page_rank.cpp Outdated Show resolved Hide resolved
src/function/gds/page_rank.cpp Outdated Show resolved Hide resolved
Copy link

Benchmark Result

Master commit hash: d5acb796e384d3de68e7be15f6ba5c0ef6bcdc64
Branch commit hash: cd67aa03cd787cec7055443dd9ffff30cd5ac0ee

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 649.50 655.32 -5.81 (-0.89%)
aggregation q28 11456.04 11494.18 -38.14 (-0.33%)
copy node-Comment 75966.27 N/A N/A
copy node-Forum 5452.21 N/A N/A
copy node-Organisation 1194.98 N/A N/A
copy node-Person 2217.42 N/A N/A
copy node-Place 1175.90 N/A N/A
copy node-Post 30815.98 N/A N/A
copy node-Tag 1236.25 N/A N/A
copy node-Tagclass 1159.05 N/A N/A
copy rel-comment-hasCreator 58563.59 N/A N/A
copy rel-comment-hasTag 91767.90 N/A N/A
copy rel-comment-isLocatedIn 70959.10 N/A N/A
copy rel-containerOf 14796.09 N/A N/A
copy rel-forum-hasTag 4184.00 N/A N/A
copy rel-hasInterest 3161.35 N/A N/A
copy rel-hasMember 119017.47 N/A N/A
copy rel-hasModerator 1562.18 N/A N/A
copy rel-hasType 256.99 N/A N/A
copy rel-isPartOf 224.64 N/A N/A
copy rel-isSubclassOf 290.32 N/A N/A
copy rel-knows 13653.86 N/A N/A
copy rel-likes-comment 169243.58 N/A N/A
copy rel-likes-post 67679.37 N/A N/A
copy rel-organisation-isLocatedIn 300.03 N/A N/A
copy rel-person-isLocatedIn 432.41 N/A N/A
copy rel-post-hasCreator 14693.79 N/A N/A
copy rel-post-hasTag 22578.40 N/A N/A
copy rel-post-isLocatedIn 17756.02 N/A N/A
copy rel-replyOf-comment 48836.41 N/A N/A
copy rel-replyOf-post 37709.88 N/A N/A
copy rel-studyAt 831.71 N/A N/A
copy rel-workAt 1573.48 N/A N/A
filter q14 125.63 138.10 -12.47 (-9.03%)
filter q15 133.97 139.28 -5.31 (-3.82%)
filter q16 304.08 317.88 -13.80 (-4.34%)
filter q17 446.92 454.64 -7.72 (-1.70%)
filter q18 1872.74 1954.54 -81.80 (-4.19%)
filter zonemap-node 89.14 99.50 -10.36 (-10.41%)
filter zonemap-node-lhs-cast 89.91 97.78 -7.86 (-8.04%)
filter zonemap-node-null 85.47 94.18 -8.71 (-9.25%)
filter zonemap-rel 5687.03 5740.84 -53.81 (-0.94%)
fixed_size_expr_evaluator q07 571.22 564.24 6.97 (1.24%)
fixed_size_expr_evaluator q08 802.29 810.79 -8.50 (-1.05%)
fixed_size_expr_evaluator q09 805.23 809.95 -4.71 (-0.58%)
fixed_size_expr_evaluator q10 238.18 244.92 -6.74 (-2.75%)
fixed_size_expr_evaluator q11 229.20 239.03 -9.83 (-4.11%)
fixed_size_expr_evaluator q12 226.87 234.93 -8.06 (-3.43%)
fixed_size_expr_evaluator q13 1460.53 1469.46 -8.93 (-0.61%)
fixed_size_seq_scan q23 113.23 116.99 -3.76 (-3.21%)
join q29 615.46 599.57 15.89 (2.65%)
join q30 11218.47 11008.46 210.01 (1.91%)
join q31 6.29 5.94 0.35 (5.88%)
join SelectiveTwoHopJoin 58.17 55.58 2.58 (4.65%)
ldbc_snb_ic q35 2484.20 2676.02 -191.82 (-7.17%)
ldbc_snb_ic q36 494.38 473.46 20.92 (4.42%)
ldbc_snb_is q32 5.08 5.63 -0.55 (-9.75%)
ldbc_snb_is q33 15.01 13.35 1.66 (12.42%)
ldbc_snb_is q34 1.29 1.20 0.09 (7.22%)
multi-rel multi-rel-large-scan 1412.09 1300.67 111.42 (8.57%)
multi-rel multi-rel-lookup 29.82 34.51 -4.69 (-13.58%)
multi-rel multi-rel-small-scan 97.27 83.97 13.29 (15.83%)
order_by q25 129.34 144.88 -15.53 (-10.72%)
order_by q26 457.16 482.19 -25.03 (-5.19%)
order_by q27 1499.02 1492.70 6.32 (0.42%)
recursive_join recursive-join-bidirection 279.69 283.60 -3.91 (-1.38%)
recursive_join recursive-join-dense 7398.30 5441.75 1956.55 (35.95%)
recursive_join recursive-join-path 23562.45 23804.62 -242.17 (-1.02%)
recursive_join recursive-join-sparse 1047.93 1064.18 -16.25 (-1.53%)
recursive_join recursive-join-trail 7322.21 6066.14 1256.07 (20.71%)
scan_after_filter q01 174.74 178.64 -3.89 (-2.18%)
scan_after_filter q02 159.90 164.87 -4.96 (-3.01%)
shortest_path_ldbc100 q37 87.52 101.96 -14.44 (-14.16%)
shortest_path_ldbc100 q38 365.83 339.09 26.74 (7.89%)
shortest_path_ldbc100 q39 64.01 66.28 -2.27 (-3.43%)
shortest_path_ldbc100 q40 453.03 355.23 97.80 (27.53%)
var_size_expr_evaluator q03 2066.20 2056.20 10.00 (0.49%)
var_size_expr_evaluator q04 2166.58 2237.42 -70.84 (-3.17%)
var_size_expr_evaluator q05 2526.47 2626.86 -100.38 (-3.82%)
var_size_expr_evaluator q06 1336.65 1354.91 -18.27 (-1.35%)
var_size_seq_scan q19 1442.22 1468.18 -25.96 (-1.77%)
var_size_seq_scan q20 2629.79 2604.30 25.49 (0.98%)
var_size_seq_scan q21 2313.50 2309.84 3.66 (0.16%)
var_size_seq_scan q22 125.81 132.66 -6.85 (-5.16%)

@andyfengHKU andyfengHKU merged commit 114d0aa into master Jan 11, 2025
@andyfengHKU andyfengHKU deleted the p-page-rank branch January 11, 2025 07:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants