Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use distance threshold in hierarchy limits for bidir-astar #3156

Merged
merged 8 commits into from
Jun 25, 2021

Conversation

genadz
Copy link
Contributor

@genadz genadz commented Jun 17, 2021

Issue

This PR partially solves the problem described here #3075 . It fixes cases when an important road with low FRC is near the destination/origin point but can be skipped due to hierarchy limits logic. Especially it may happen for cases when the destination/origin point is situated in a dense places (like city center). Intuition says that in order to be guaranteed to find important roads with low FRC we should always allow expansion within some radius (same logic as for unidirectional search).

Tasklist

  • Add tests
  • Add #fixes with the issue number that this PR addresses
  • Update the docs with any new request parameters or changes to behavior described
  • Update the changelog
  • If you made changes to the lua files, update the taginfo too.

Requirements / Relations

Link any requirements here. Other pull requests this PR is based on?

@genadz
Copy link
Contributor Author

genadz commented Jun 17, 2021

Unfortunately, this change causes performance degradation. There are timings:
master

all items: 13630
percentiles	10%	50%	90%	99%	100%
response_time	0.010s	0.052s	0.201s	1.484s	3.118s
mean = 0.1130

branch

all items: 13630
percentiles	10%	50%	90%	99%	100%
response_time	0.011s	0.068s	0.249s	1.520s	3.213s
mean = 0.1324

We can see that in average timings increased by ~15%. The good news that for long routes (more important metric) it's only ~10% increase.

Copy link
Member

@dnesbitt61 dnesbitt61 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the performance impact of computing 2 additional distances? The use of counts rather than distance was done for performance reasons and also to allow the hierarchies to work in both dense and sparse environments.

In dense environments one generally has to travel less distance to move up to the next hierarchy - counting upward hierarchy transitions ensures enough candidates are found. Similarly in rural environments one has to generally travel further to ensure enough transitions onto higher class roads are made.

@genadz genadz force-pushed the kgv_inc_hi_limits branch from 37208f7 to fbd8f24 Compare June 17, 2021 16:52
@@ -85,7 +85,7 @@ def make_request(post_body):
parser.add_argument('--output-dir', type=str, help='The directory in which to place the result of each request')
parser.add_argument('--concurrency', type=int, help='The number of processes to use to make requests', default=multiprocessing.cpu_count())
parser.add_argument('--format', type=str, help='Supports csv, json, raw and null output formats', default='csv')
parser.add_argument('--headers', type=str, help='Additional http headers to send with the requests. Follows the http header spec, eg. some-header-name: some-header-value', action='append', nargs='*')
parser.add_argument('--headers', type=str, help='Additional http headers to send with the requests. Follows the http header spec, eg. some-header-name: some-header-value', action='append', nargs='*', default=[])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this script failed if --headers parameter wasn't specified. adding default value fixes it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strange! i swore i tested that... thank you for cleaning it up!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mandeepsandhu this is where you saw that

// they can be lowered.
// Decrease distance thresholds only for arterial roads for now
hierarchy_limits_forward_[1].expansion_within_dist /= 5.f;
hierarchy_limits_reverse_[1].expansion_within_dist /= 5.f;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current default hierarchy limit for arterial roads is 100km; it's very costly for performance. This code decreases it to 20km making performance drop more or less reasonable

Copy link
Member

@kevinkreiser kevinkreiser Jun 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess we have to do a big round of RAD to see what kind of changes we get though. I could imagine many routes will change because of this especially in rural areas where now you have to find an upward transition within 20km.

with respect to the performance difference how many routes are run with a retry and relaxed limits vs master? i wonder if we would have more retries with this change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess we have to do a big round of RAD to see what kind of changes we get though. I could imagine many routes will change because of this especially in rural areas where now you have to find an upward transition within 20km.

agree about RAD. I'm working on it right now

with respect to the performance difference how many routes are run with a retry and relaxed limits vs master? i wonder if we would have more retries with this change

interesting note. I will check it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinkreiser about retries: it's approximately the same: 296 (master) vs 292 (branch)

@genadz
Copy link
Contributor Author

genadz commented Jun 17, 2021

What is the performance impact of computing 2 additional distances?

I didn't check it, but I think it's unnoticeable.

In dense environments one generally has to travel less distance to move up to the next hierarchy - counting upward hierarchy transitions ensures enough candidates are found. Similarly in rural environments one has to generally travel further to ensure enough transitions onto higher class roads are made.

I agree with that. But for some cases in dense environments we exhaust transition count limit too fast (look at the example in the issue #3075 , the local road we skipped is very close to the destination point but the limits were exhausted before we got it). So, my point here is the following:
near the origin or the destination point optimal route can use lower level and higher level roads (but in the middle of the route it should be only higher level roads). Current logic doesn't guarantee that for dense places all lower level roads near the origin/destination will be expanded, so I used distance threshold for that purpose.

@dnesbitt61
Copy link
Member

You probably don't need absolute distance - rather you perhaps could avoid sqrt calls for these 2 by using distance squared and configuring the hierarchy distances to be squared distance. Maybe do a performance comparison between using GetDistance and GetDistanceSquared (I think you need a new method in astarheuristic.h). I just remember through the years doing a lot of work to avoid sqrt calls as much as possible.

@dnesbitt61
Copy link
Member

dnesbitt61 commented Jun 17, 2021

Also, is there a way to avoid computing these distances for modes that do not use hierarchies (pedestrian, bicycle)?

I just did a comparison with bicycle routes and noted a 6.7% drop in performance with no change in the route and no change in the number of EdgeLabels added. So this PR drops bicycle routing performance significantly with no impact to the actual route computation. I think this is almost entirely due to the cost of the new GetDistance calls - there may be a slight cost to propogate into EdgeLabel but otherwise there is no change.

@genadz
Copy link
Contributor Author

genadz commented Jun 18, 2021

Also, is there a way to avoid computing these distances for modes that do not use hierarchies (pedestrian, bicycle)?

I just did a comparison with bicycle routes and noted a 6.7% drop in performance with no change in the route and no change in the number of EdgeLabels added. So this PR drops bicycle routing performance significantly with no impact to the actual route computation. I think this is almost entirely due to the cost of the new GetDistance calls - there may be a slight cost to propogate into EdgeLabel but otherwise there is no change.

how did you test performance for bicycles ? did you generate some random routes or use from test_requests folder ?

@dnesbitt61
Copy link
Member

I test a single, long bicycle route that I have tracked performance on for several years. I use valhalla_run_route with multi-run set to 100. This will then average the PathAlgorithm time over 100 successive calls (ignores the first to make sure all data is in cache). Obviosuly this doesn't cover a wide range of conditions (e.g. road densities, etc.) but it fully stresses just the PathAlgorithm - there is no I/O and CPU is 100% for the one thread computing the route. I also did this for auto routes while at Mapzen and Mapbox so that I could track performance of a known route over time - comparing code changes and data set changes.

The command is:

./valhalla_run_route --config ../../conf/planetv14.json --multi-run 100 -j '{"locations":[{"lat":39.64225,"lon":-76.10536,"type":"break"},{"lat":38.899677,"lon":-77.050916,"type":"break"}],"costing":"bicycle","costing_options":{"bicycle":{"bicycle_type":"Road","use_roads":"0.0","cycling_speed":"25.0","use_hills":"0.5"}}}' | grep GetBestPath

@genadz
Copy link
Contributor Author

genadz commented Jun 18, 2021

I test a single, long bicycle
The command is:

./valhalla_run_route --config ../../conf/planetv14.json --multi-run 100 -j '{"locations":[{"lat":39.64225,"lon":-76.10536,"type":"break"},{"lat":38.899677,"lon":-77.050916,"type":"break"}],"costing":"bicycle","costing_options":{"bicycle":{"bicycle_type":"Road","use_roads":"0.0","cycling_speed":"25.0","use_hills":"0.5"}}}' | grep GetBestPath

Thank you! will try the same.

hm, I'm getting not very stable results with a such test. variance is about 10%.

Instead of that I created requests.txt file with the bicycle request which is repeated 1000 times. Here results were more stable (variance less than 1%). So, I didn't catch the difference between cases when we use pure distance and squared.

@dnesbitt61
Copy link
Member

I always reboot my laptop (so all other apps are stopped) and make sure no browsers are open prior to running this benchmark. When I do that I get very good consistency between results (usually only vary by a couple ms for the request).

@genadz
Copy link
Contributor Author

genadz commented Jun 21, 2021

I always reboot my laptop (so all other apps are stopped) and make sure no browsers are open prior to running this benchmark. When I do that I get very good consistency between results (usually only vary by a couple ms for the request).

did the same. So, results became more stable, but less than using run_with_server.py script. any case, I got the following:
master (4 runs)

2021/06/21 12:26:27.002326 [INFO] PathAlgorithm GetBestPath took 706 ms
2021/06/21 12:27:34.603899 [INFO] PathAlgorithm GetBestPath average: 650 ms
2021/06/21 12:27:36.203482 [INFO] PathAlgorithm GetBestPath took 714 ms
2021/06/21 12:28:43.844173 [INFO] PathAlgorithm GetBestPath average: 650 ms
2021/06/21 12:28:45.446597 [INFO] PathAlgorithm GetBestPath took 714 ms
2021/06/21 12:29:54.289498 [INFO] PathAlgorithm GetBestPath average: 662 ms
2021/06/21 12:29:55.912541 [INFO] PathAlgorithm GetBestPath took 728 ms
2021/06/21 12:31:04.902405 [INFO] PathAlgorithm GetBestPath average: 663 ms

mean = 656.25, std = 6.26

branch

2021/06/21 12:32:56.320354 [INFO] PathAlgorithm GetBestPath took 700 ms
2021/06/21 12:34:03.447519 [INFO] PathAlgorithm GetBestPath average: 646 ms
2021/06/21 12:34:05.018854 [INFO] PathAlgorithm GetBestPath took 717 ms
2021/06/21 12:35:12.692285 [INFO] PathAlgorithm GetBestPath average: 651 ms
2021/06/21 12:35:14.274132 [INFO] PathAlgorithm GetBestPath took 717 ms
2021/06/21 12:36:22.092979 [INFO] PathAlgorithm GetBestPath average: 653 ms
2021/06/21 12:36:23.671287 [INFO] PathAlgorithm GetBestPath took 720 ms
2021/06/21 12:37:34.180110 [INFO] PathAlgorithm GetBestPath average: 679 ms

mean = 657.25, std = 12.81

It's almost the same, but I agree that we can avoid distance calculation for cases when hierarchy limits are disabled (bicycle and pedestrian).

@genadz
Copy link
Contributor Author

genadz commented Jun 23, 2021

@dnesbitt61 @kevinkreiser I've finished with RAD tests. I ran locally (with my local planet) ~14000 requests for auto costing model. I got 561 differences in ETA, where 386 were better than in master and 175 worse. It was kind of unexpected for me, actually I expected that almost all routes should be improved (by ETA). Then I selected top 30 routes with increased ETA and top 30 routes with decreased ETA and looked at each of them manually.

Most significant improvements in ETA were achieved on routes that include ferries or country crossing (I suspect because near ferries or country borders lie lower level edges that were covered by the branch and missed by master).
There were also several small differences near the origin or destination. This is because we're traversing more lower level edges near origin or destination points that guarantees local optimality near these points.

Each route with downgraded ETA can be explained in one of the following ways:

ETA is bigger but the route cost is smaller.

It turned out that for some routes cost were improved but ETA not. As far as our actual score is cost, we can treat such routes as improved.

Bidirectional astar stops iterating too early.

This is well known problem #2928. For some routes, when I temporary fixed cost threshold in the branch (or even enabled alternatives that extend cost threshold) I got the same routes as in master.

Inaccurate shortcut's costs.

Cost for some shortcuts may be slightly different from the sum of costs of superseded edges. Due to the changes in hierarchy limits logic some shortcuts may be still unavailable in branch (https://github.com/valhalla/valhalla/blob/master/src/thor/bidirectional_astar.cc#L166) because they lie too close to the origin/destination point but may be used in master. In some cases it may lead to different optimal routes.

Hierarchy limits + multiple starting edges.

We initialize forward/reverse search with different starting edges. For each edge we calculate initial cost (equal to squared distance from the origin/destination point to the edge). In some cases initial cost can be very big. It may lead to a very unbalanced search and a situation when we complete a route before exhaust hierarchy limits. So, we can miss local optimality near the origin/destination point. I think this is a separate issue that shouldn't be fixed in this PR.

@genadz
Copy link
Contributor Author

genadz commented Jun 24, 2021

Copy link
Member

@dnesbitt61 dnesbitt61 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me but I can only really test bicycle routing. No performance difference in my use case. I don't create data with hierarchies so that was expected/desired.

@genadz
Copy link
Contributor Author

genadz commented Jun 24, 2021

This looks fine to me but I can only really test bicycle routing. No performance difference in my use case. I don't create data with hierarchies so that was expected/desired.

thanks!

@kevinkreiser
Copy link
Member

Bidirectional astar stops iterating too early.

I know what you mean by this problem but are you saying that this happens more frequently with this branch? I am missing the bigger point about what you were seeing with respect to this problem in the branch. Can you maybe reiterate?

Hierarchy limits + multiple starting edges.

That one is a long standing problem. Its very hard to tune the penalty to make sense. Originally i had it set to be walking speed cost of straight line distance to the snap point but this was sometimes not enough penalty. That is why we have options to disable the penality (which is a bandaid but at least it gives the option).

@genadz
Copy link
Contributor Author

genadz commented Jun 24, 2021

Bidirectional astar stops iterating too early.

I know what you mean by this problem but are you saying that this happens more frequently with this branch? I am missing the bigger point about what you were seeing with respect to this problem in the branch. Can you maybe reiterate?

I don't say that for this branch it happens more frequently. I just analyzed downgraded routes and for some of them it was a reason. That's expected to me.

@genadz
Copy link
Contributor Author

genadz commented Jun 24, 2021

Hierarchy limits + multiple starting edges.

That one is a long standing problem. Its very hard to tune the penalty to make sense. Originally i had it set to be walking speed cost of straight line distance to the snap point but this was sometimes not enough penalty. That is why we have options to disable the penality (which is a bandaid but at least it gives the option).

As for me, it worth a separate issue. And, actually there are 2 problems here:

  1. unbalanced search;
  2. nonoptimal routes due to unbalanced search + hierarchy limits.

@genadz
Copy link
Contributor Author

genadz commented Jun 24, 2021

@kevinkreiser did I answer your question? Or some more clarifications are needed ?

@genadz genadz merged commit 7a9c864 into master Jun 25, 2021
@genadz genadz deleted the kgv_inc_hi_limits branch June 25, 2021 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants