Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better hashing of GraphId #3332

Merged
merged 3 commits into from
Sep 22, 2021
Merged

Better hashing of GraphId #3332

merged 3 commits into from
Sep 22, 2021

Conversation

oleg-liatte
Copy link
Member

@oleg-liatte oleg-liatte commented Sep 22, 2021

Issue

https://github.com/martinus/robin-hood-hashing

Depends on good Hashing. For a really bad hash the performance will not only degrade like in std::unordered_map, the map will simply fail with an std::overflow_error. In practice, when using the standard robin_hood::hash, I have never seen this happening.

static_cast<size_t>(k.value) is not good hashing. Especially in case of 32-bit size_t, when robin_hood::unordered_map fails pretty fast.

Tasklist

  • Add tests
  • Add #fixes with the issue number that this PR addresses
  • Update the docs with any new request parameters or changes to behavior described
  • Update the changelog
  • If you made changes to the lua files, update the taginfo too.

Requirements / Relations

Link any requirements here. Other pull requests this PR is based on?

@@ -19,6 +19,7 @@
* FIXED: Make combine_route_stats.py properly quote CSV output (best practice improvement) [#3328](https://github.com/valhalla/valhalla/pull/3328)
* FIXED: Merge edge segment records in map matching properly so that resulting edge indices in trace_attributes are valid [#3280](https://github.com/valhalla/valhalla/pull/3280)
* FIXED: Shape walking map matcher now sets correct edge candidates used in the match for origin and destination location [#3329](https://github.com/valhalla/valhalla/pull/3329)
* FIXED: Better hash function of GraphId [#3332](https://github.com/valhalla/valhalla/pull/3332)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this should be considered a fix. Probably more of an enhancement but no need to change this change log entry

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically in case of robin_hood::unordered_map this caused a bug (insertion to unordered_map threw an exception), that is why I considered this a bugfix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't remember where we used it with that container. Actually in most cases I expected we would send the value directly to the container and let the container worry about hashing it. In other words the container template type would be uint64_t

@oleg-liatte oleg-liatte merged commit aacd79a into master Sep 22, 2021
@mandeepsandhu
Copy link
Contributor

What problem was this trying to solve? Sorry I couldn't gather much from the description (was the current scheme causing too many collisions?)

Would love to have seen some tests demonstrating the fix.

@SiarheiFedartsou
Copy link
Member

@mandeepsandhu it works perfectly on 64-bit, but not on 32-bit, because size_t is uint32_t on 32-bit platforms.

@oleg-liatte
Copy link
Member Author

oleg-liatte commented Sep 23, 2021

What problem was this trying to solve? Sorry I couldn't gather much from the description (was the current scheme causing too many collisions?)

Would love to have seen some tests demonstrating the fix.

@mandeepsandhu GraphId::value layout is (from lower to higher):

  • 3 bits: hierarchy level
  • 22 bits: tile id
  • 21 bit: unique id within tile

So if we just strip it to 32 bits (the size of size_t on 32-bit platforms) we get 25 bits which are constant per tile and 7 bits from unique id, which is not much unique and yes, causing too many collisions.

@oleg-liatte oleg-liatte deleted the ol-better-graph-id-hash branch September 24, 2021 05:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants