Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for edge weights to GraphSAGE sampling #1667

Merged
merged 9 commits into from
Jun 12, 2020

Conversation

huonw
Copy link
Member

@huonw huonw commented Jun 10, 2020

This expands GraphSAGE (undirected and directed) to support weighted sampling, where edges with higher weights are taken proportionally more often.

For example, suppose there's there's 4 edges from node A:

source target weight
A B 0
A C 1
A D 2
A D 3

An unweighed walk starting at A will choose each of the edges with equal propability and so end up on B, C or D in proportion 1:1:2 (edge counts). A weighted walk will choose the edges proportional to the weights, so end up on the vertices in proportion 0:1:5 (sum of edge weight). (Worth specifically highlighting: a weighted walk will never chose the A-B edge because it has weight 0.)

See: #462

@huonw huonw force-pushed the feature/462-weighted-graphsage branch from 9f2e23f to c362597 Compare June 10, 2020 06:48
@huonw huonw changed the base branch from feature/numpy-random-state to develop June 11, 2020 04:17
@huonw huonw force-pushed the feature/462-weighted-graphsage branch from 1aa7f14 to 887b4ad Compare June 11, 2020 04:22
@codeclimate
Copy link

codeclimate bot commented Jun 11, 2020

Code Climate has analyzed commit 7b86b92 and detected 4 issues on this pull request.

Here's the issue category breakdown:

Category Count
Security 4

View more on Code Climate.

@huonw huonw requested a review from kjun9 June 11, 2020 04:26
@huonw huonw marked this pull request as ready for review June 11, 2020 04:26
tests/test_utils/graphs.py Show resolved Hide resolved
else:
neighbours = []

if len(neighbours) == 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should do something like:

if len([w for w in weights if w != 0]) == 0:

so that we still have -1s propagated if all neighbours have zero weight?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, that's a good point. That particular form is likely to be expensive because it's manually iterating the array, but it's probably worth doing something with https://docs.scipy.org/doc/numpy/reference/generated/numpy.count_nonzero.html or using the final value of the cumsum in naive_weighted_choice. I'll probably do it in the latter, because then it applies to weighted Node2Vec too (which currently doesn't handle this case).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, and fixed/tested in Node2Vec and CTDNE as appropriate as well.

@huonw huonw requested a review from kjun9 June 11, 2020 06:12
@@ -1117,6 +1128,8 @@ def _step(self, node, time, bias_type, np_rs):
if len(neighbours) > 0:
biases = self._temporal_biases(times, time, bias_type, is_forward=True)
chosen_neighbour_index = self._sample(len(neighbours), biases, np_rs)
assert chosen_neighbour_index is not None, "biases should never be all zero"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.

@huonw huonw merged commit 84336e1 into develop Jun 12, 2020
@huonw huonw deleted the feature/462-weighted-graphsage branch June 12, 2020 00:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants