[SPARK-35357][GRAPHX] Allow to turn off the normalization applied by static PageRank utilities #32485
…nk with a 'normalized' parameter to trigger or not the normalization
ok to test
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #138334 has finished for PR 32485 at commit
I think it's fine. cc @srowen FYI
Looks OK, only one tiny comment about 'since'
graphx/src/test/scala/org/apache/spark/graphx/lib/PageRankSuite.scala
Test build #138375 has finished for PR 32485 at commit
Thank you @Ayushsunny @HyukjinKwon @srowen for the review 🙏.
Kubernetes integration test starting
Kubernetes integration test status failure
Merged to master
What changes were proposed in this pull request?
Overload the methods PageRank.runWithOptions and PageRank.runWithOptionsWithPreviousPageRank (so as not to break any user-facing signature) with a normalized parameter that describes "whether or not to normalize the rank sum".
Why are the changes needed?
https://issues.apache.org/jira/browse/SPARK-35357
When a graph contains a non-negligible proportion of sinks, algorithms based on incremental updates of ranks can get a precision gain for free if they are allowed to manipulate non-normalized ranks.
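To make the role of sinks concrete, here is a minimal sketch in plain Python (not the GraphX implementation) of a static, non-personalized PageRank update on a small graph with one sink. The helper names pagerank_step and normalize_rank_sum, the toy graph, and the reset probability of 0.15 are assumptions made for illustration. It shows that a sink leaks rank mass, so the raw rank sum drifts below the vertex count, and that normalization is a rescaling that restores it.

```python
def pagerank_step(ranks, out_edges, reset_prob=0.15):
    """One static PageRank update: reset_prob + (1 - reset_prob) * incoming mass."""
    incoming = {v: 0.0 for v in ranks}
    for src, dsts in out_edges.items():
        for dst in dsts:
            # Each vertex splits its rank evenly among its out-neighbours.
            incoming[dst] += ranks[src] / len(dsts)
    return {v: reset_prob + (1.0 - reset_prob) * incoming[v] for v in ranks}


def normalize_rank_sum(ranks):
    """Rescale ranks so that they sum to the number of vertices."""
    scale = len(ranks) / sum(ranks.values())
    return {v: r * scale for v, r in ranks.items()}


# Hypothetical toy graph: vertex 4 has no outgoing edge, so it is a sink.
out_edges = {1: [2, 4], 2: [3], 3: [1], 4: []}

ranks = {v: 1.0 for v in out_edges}  # every vertex starts with rank 1.0
for _ in range(2):
    ranks = pagerank_step(ranks, out_edges)

raw_sum = sum(ranks.values())                             # mass was lost at the sink
normalized_sum = sum(normalize_rank_sum(ranks).values())  # rescaled back to 4
```

Because the update adds a constant reset term, rescaling the ranks between two runs changes what the next iterations compute, which is why an incremental algorithm may prefer to keep the raw ranks until the very end.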
Does this PR introduce any user-facing change?
No
How was this patch tested?
By adding a unit test that verifies that (even when dealing with a graph containing a sink) we end up with the same result in both of these scenarios:
a) Run PageRank.runWithOptions with normalization enabled.
b) Run PageRank.runWithOptions with normalization disabled to obtain preRankGraph1, then resume from preRankGraph1 and run 2 more iterations using PageRank.runWithOptionsWithPreviousPageRank with normalization disabled to obtain preRankGraph2, then resume from preRankGraph2 and run 2 more iterations using PageRank.runWithOptionsWithPreviousPageRank with normalization enabled.
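The equivalence this test checks can be sketched outside Spark. The following plain Python model is an illustration, not the GraphX code or the actual unit test: the function names, the toy graph, the reset probability, and the 6 = 2 + 2 + 2 split of iterations are all assumptions. It models a run as a fixed number of updates followed by an optional rescaling of the rank sum, and compares an uninterrupted normalized run against a resumed run that normalizes only in its last stage.

```python
def pagerank_step(ranks, out_edges, reset_prob):
    """One static PageRank update over a dict-based adjacency list."""
    incoming = {v: 0.0 for v in ranks}
    for src, dsts in out_edges.items():
        for dst in dsts:
            incoming[dst] += ranks[src] / len(dsts)
    return {v: reset_prob + (1.0 - reset_prob) * incoming[v] for v in ranks}


def normalize_rank_sum(ranks):
    """Rescale ranks so that they sum to the number of vertices."""
    scale = len(ranks) / sum(ranks.values())
    return {v: r * scale for v, r in ranks.items()}


def run(ranks, out_edges, num_iter, normalized, reset_prob=0.15):
    """Run num_iter updates from the given ranks, normalizing at the end if asked."""
    for _ in range(num_iter):
        ranks = pagerank_step(ranks, out_edges, reset_prob)
    return normalize_rank_sum(ranks) if normalized else ranks


def max_abs_diff(a, b):
    return max(abs(a[v] - b[v]) for v in a)


# Hypothetical toy graph with one sink (vertex 4).
out_edges = {1: [2, 4], 2: [3], 3: [1], 4: []}
initial = {v: 1.0 for v in out_edges}

# Scenario a): one uninterrupted run with normalization enabled.
scenario_a = run(initial, out_edges, 6, normalized=True)

# Scenario b): resume twice from raw ranks, normalize only in the last stage.
pre_rank_1 = run(initial, out_edges, 2, normalized=False)
pre_rank_2 = run(pre_rank_1, out_edges, 2, normalized=False)
scenario_b = run(pre_rank_2, out_edges, 2, normalized=True)

# Contrast: normalizing at every stage perturbs the resumed iterations.
stage_1 = run(initial, out_edges, 2, normalized=True)
stage_2 = run(stage_1, out_edges, 2, normalized=True)
always_normalized = run(stage_2, out_edges, 2, normalized=True)

gap_b = max_abs_diff(scenario_a, scenario_b)
gap_always = max_abs_diff(scenario_a, always_normalized)
```

In this model gap_b is zero up to floating-point error, while gap_always is not, which matches the motivation for letting callers turn normalization off for intermediate runs.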