Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Topology-preserving VW #143

Merged
merged 2 commits into from
Sep 4, 2017
Merged

Conversation

urschrei
Copy link
Member

@urschrei urschrei commented Aug 17, 2017

This is a first attempt at topology-preserving VW, based on https://www.jasondavies.com/simplify/.
Update: see subsequent comments for a refinement of the approach.
It's WIP for two reasons:

  • I need a longer test input which produces self-intersections for a given epsilon. This is surprisingly difficult to cook up.
  • I'm not sure the approach in this fork is correct:
    1. If an intersection is found, move the triangle (a) into a vec, and exit the loop early
    2. pop the next triangle (b) and start processing it
    3. peek_mut() at the next triangle (c) in the queue, and set its area to as area. This guarantees its removal
    4. push a back onto the heap. If we modified the correct c triangle, a will be subsequently skipped in the main loop, as one of its support points was removed
    5. if it wasn't the triangle we needed to remove, a will be re-detected as causing an intersection.

I've got an alternate implementation here. The difference:

  1. If an intersection is found, move the triangle (a) into a vec, and exit the loop early
  2. pop the next triangle (b) and start processing it
  3. peek at the next triangle (c) in the queue, and increase the epsilon to its area. This guarantees its removal
  4. push a back onto the heap. If the c triangle we remove was the correct one, a will be subsequently skipped in the main loop, as one of its support points was removed
  5. If it wasn't, a will be re-detected as causing an intersection, and the epsilon will be increased again.

The drawback of the second approach is that it permanently increases the user-defined epsilon, leading to the removal of more points overall than the first approach.
Also, both of these approaches could be wrong. More complex test inputs will really help here.

@urschrei urschrei requested a review from mbattifarano August 17, 2017 12:19
@urschrei urschrei force-pushed the vw_topology_preserve_decrease branch from 417ffbb to e327958 Compare August 17, 2017 15:54
@urschrei
Copy link
Member Author

urschrei commented Aug 17, 2017

Some progress (?)
I've used the Norwegian coastline example from the JS implementation. Running the VW algorithm on the mainland polygon (8854 points) with an epsilon of 0.0005 produces a self-intersection. Running the updated algorithm produces a valid geometry, with only 10 6 fewer points than the naïve algorithm (3271 vs 3277 points).

@urschrei
Copy link
Member Author

urschrei commented Aug 17, 2017

And now for the bad news:
cargo bench for the naïve version (reduction of norway_main) runs in around 6.5 ms / iteration
For the topo-preserving version, it's around 370 ms / iteration

@urschrei
Copy link
Member Author

I need to dig into this more, but it doesn't seem to be thrashing anywhere obvious; the naïve version runs the main loop 12.9k times, whereas the new version runs it 13.5k times (Only a 4.5% increase). This is somewhat complicated by not being able to run perf (bc macOS).

@mbattifarano
Copy link
Member

Cool! I can take a look at this over the weekend

@urschrei
Copy link
Member Author

urschrei commented Aug 18, 2017

I hooked up the same test to Instruments. 427 ms total with --release. Of which:
271 ms (64%) inserting into the R* tree
117 ms (27.5%) removing stale segments from the R* tree
Oof. Insertion is apparently O(log(n)) on average, but I'm struggling to reconcile that with the numbers. The JS implementation (using RBush) is beating this by 200ms.

The JS simplify function (though with the full coastline, which is a bit bigger), runs in 1750 ms on latest Chrome, with R* tree insertion taking around 380 ms, and removal of segments taking around 715 ms. That makes me feel a bit better.

@urschrei urschrei force-pushed the vw_topology_preserve_decrease branch 3 times, most recently from c2c72fc to 0dcb7c9 Compare August 18, 2017 15:47
@urschrei
Copy link
Member Author

OK, this is beginning to feel somewhat more coherent. I've refactored the intersection-removal logic, which has made it a lot simpler:
Whereas before, we were relying on a heuristic of "we need to remove a previous point in order to affect this point's geometry, so we modify the next entry in the heap and hope for the best. If it's the wrong point, the intersection persists until we've removed the right one". This worked OK in practice, but it was imprecise, and resulted in the removal of more points than necessary.
The new approach is conceptually the same, but more precise: the VScore struct has a new boolean field initialised to false. If a point's removal causes a self-intersection, the boolean is flipped, and the loop continues until it reaches the adjacent-triangle re-calculation logic.
At that point, the smaller of the adjacent point indices is identified, and its associated VScore area is set to negative epsilon before being pushed onto the heap, ensuring it's popped off again as soon as the loop resets.
This is producing a non-intersecting Norway polygon with the same number of vertices as the self-intersecting version (I'm not sure why yet, I was expecting n-1 vertices).

@urschrei
Copy link
Member Author

Progress:

  • Everything is faster. The 8k -> 3.2k simplification is running in around 300 ms for me. That's still slow, but it's a pretty large real-world example. For n in the 100's-of-points range, performance is perfectly acceptable IMO.
  • I spent a few days banging my head against a solution to a common problem: what to do if an additional point needs to be removed to prevent self-intersection, but removal wouldn't leave enough points to form a valid geometry (see illustration below). This is now handled by bailing out of the simplification if removal would cause an intersection, but we only have n + 2 points left, where n is the minimum number of points needed for a given geometry (2 for a LineString, 4 for a Polygon). The rationale is that if removal of the second point still didn't remove the intersection, we'd be out of points to remove, and be left with an invalid geometry. This strikes me as a pretty good compromise.

screenshot 2017-08-23 13 46 17
screenshot 2017-08-23 13 47 22

Removing the point denoted by the dot causes a self-intersection. Ordinarily, that could be fixed by removing the previous point too, but that's impossible because that wouldn't leave enough points to form a ring, so we keep the exterior as is, and end the simplification. This doesn't affect simplification of the inner ring(s), and there's a test for that.

@urschrei
Copy link
Member Author

OK! I think this death march is over for now. I'm pretty happy with the tests (although ideally I'd like some more weird adversarial examples) and the docs. Could some people have a look at it / try it out / try to break it / comment on the code? @frewsxcv @pka @mbattifarano @jdroenner?

I'm still not delighted about the Spade R* tree performance – someone has suggested using SQLite's R* functionality, but AFAIK it's not included in the standard SQLite distribution, and SQLite isn't available everywhere anyway, and I have no experience using it for spatial queries (Spatialite uses it, of course, so it certainly works). Other points in Spade's favour are the fact that we need an R or R* tree for a Polygon distance variant, so we're not pulling it in as a dependency just for this.

Copy link
Member

@frewsxcv frewsxcv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good after a first pass! i need to do a little more reading as i'm not too familiar with some of the logic here. if someone else in the community is, would love your feedback, otherwise i'll try to take a closer look into this in the coming days

{
fn partial_cmp(&self, other: &VScore<T>) -> Option<Ordering> {
Some(self.cmp(other))
}
}

impl<T> Eq for VScore<T> where T: Float {}
impl<T> Eq for VScore<T>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might be able to derive Eq but keep the explicit PartialEq impl

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler complains about this for some reason, which is (IIRC) why I had to implement the blank Eq.

}

// Geometries that can be simplified using the topology-preserving variant
#[derive(Debug)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[derive(Copy)]

// have min_points left, stop: since a self-intersection causes removal of the spatially previous
// point, THAT could lead to a further self-intersection without the possibility of removing
// more points, potentially leaving the geometry in an invalid state.
#[derive(Debug)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[derive(Copy)]

let interiors = match simplified.is_empty() {
true => vec![],
false => simplified.into_iter().map(LineString).collect(),
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems fine to just do let interiors = simplified.into_iter().map(LineString).collect(); here. an empty iterator will result in an empty Vec

let candidates = tree.lookup_in_rectangle(&BoundingRect::from_corners(&br, &tl));
candidates
.iter()
.map(|c| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can get rid of the map here i think

    .iter()
    .any(|c| {
        ...
    })

@urschrei urschrei force-pushed the vw_topology_preserve_decrease branch 2 times, most recently from 0003d01 to 2530e2f Compare September 1, 2017 12:46
@frewsxcv
Copy link
Member

frewsxcv commented Sep 4, 2017

I just read through this again one more time, and everything seems good enough to land. I'm still not very knowledgeable about the algorithm (yet), but I don't think it makes sense to hold up merging this. If I learn more about it in the future, I can always revisit the code.

I noticed the title of the PR has 'WIP'. Is this ready to go or do you still have more changes?

@urschrei
Copy link
Member Author

urschrei commented Sep 4, 2017

I don't think there's anything left for me to do – edited. I'm happy to label this as "experimental" in the docs for a while, maybe? I'm just going to squash my commits, and then it's ready to go, apart from the possible doc change.

@urschrei urschrei changed the title WIP: Topology-preserving VW Topology-preserving VW Sep 4, 2017
This is based on the technique outlined at
https://www.jasondavies.com/simplify/

It uses an R* tree to restrict the intersection search to only those
segments that fall within a given triangle's bounding box. If an
intersection is detected, the previous point (i.e. the triangle's
"left" member is also marked for removal by setting its area to
negative epsilon), removing the self-intersection.
@urschrei urschrei force-pushed the vw_topology_preserve_decrease branch from 2530e2f to ee0f105 Compare September 4, 2017 10:51
@frewsxcv
Copy link
Member

frewsxcv commented Sep 4, 2017

I'm happy to label this as "experimental" in the docs for a while, maybe?

up to you if you want to do this, but i don't think it's necessary. i don't see any reason to consider this much more experimental than other algorithms in rust-geo.

thanks for doing this! your comments you added to this PR along the way were very helpful!

bors r+

bors bot added a commit that referenced this pull request Sep 4, 2017
143: Topology-preserving VW r=frewsxcv a=urschrei

This is a first attempt at topology-preserving VW, based on https://www.jasondavies.com/simplify/.
**Update**: see subsequent comments for a refinement of the approach.
<s>It's WIP for two reasons:
- I need a longer test input which produces self-intersections for a given epsilon. This is surprisingly difficult to cook up.</s>
- I'm not sure the approach in this fork is correct:
    1. If an intersection is found, move the triangle (`a`) into a vec, and exit the loop early
    2. pop the next triangle (`b`) and start processing it
    3. `peek_mut()` at the _next_ triangle (`c`) in the queue, and set its area to `a`s area. This guarantees its removal
   4. push `a` back onto the heap. If we modified the correct `c` triangle, `a` will be subsequently skipped in the main loop, as one of its support points was removed
   5. if it wasn't the triangle we needed to remove, `a` will be re-detected as causing an intersection.

<s>I've got an alternate implementation [here](https://github.com/urschrei/rust-geo/blob/vw_topology_preserve/src/algorithm/simplifyvw.rs#L168). The difference:
1. If an intersection is found, move the triangle (`a`) into a vec, and exit the loop early
2. pop the next triangle (`b`) and start processing it
3. `peek` at the _next_ triangle (`c`) in the queue, **and increase the epsilon to its area. This guarantees its removal**
4. push `a` back onto the heap. If the `c` triangle we remove was the correct one, `a` will be subsequently skipped in the main loop, as one of its support points was removed
5. If it wasn't, `a` will be re-detected as causing an intersection, and the epsilon will be increased again.

The drawback of the second approach is that it permanently increases the user-defined epsilon, leading to the removal of more points overall than the first approach.
Also, both of these approaches could be wrong. More complex test inputs will _really_ help here.</s>
@bors
Copy link
Contributor

bors bot commented Sep 4, 2017

Build succeeded

@bors bors bot merged commit 6be1761 into georust:master Sep 4, 2017
@urschrei urschrei deleted the vw_topology_preserve_decrease branch September 4, 2017 12:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants