-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ShardBucketCollection #184
Conversation
if (Objects.nonNull(collection)) | ||
{ | ||
final long addCount = collection.stream().filter(this::add).count(); | ||
return addCount > 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very slick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
{ | ||
/** | ||
* Collection that will handle splitting multipolygons into buckets by overrideing the add | ||
* function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: overriding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff! Minor nits in the test classes.
caught = true; | ||
} | ||
|
||
Assert.assertTrue("Resolve shard default method is an unsupported operation", caught); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check out @Test(expected = UnsupportedOperationException.class)
annotation to allow for easily verifying expected exceptions. Might simplify this code a little.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for this. I use this patten in ProtoIntegerStringDictionaryAdapterTest
. Check that for an example in action.
final PolyLine[] allArray = polylineBuckets2.toArray(array1); | ||
Assert.assertArrayEquals(array2, allArray); | ||
|
||
// get code coverage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does get code coverage mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to use all the methods to get good coverage score for sonar.
|
||
/** | ||
* Construct the collection by allocation space for a collection for each of the shards and | ||
* assigning the index back. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: allocating
* @param shard | ||
* shard associated with the collection | ||
* @return true if the collection has been added to successful and changed as a result, false | ||
* otherwise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: successfully
public boolean contains(final Object object) | ||
{ | ||
final Optional<LocatedType> typedItem = this.castToLocatedType(object); | ||
if (typedItem.isPresent()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this pattern a lot actually. I will be stealing it ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
credit due to @matthieun
Looks good @jklamer . Left a few minor documentation nits. So help me to understand this a bit better. Is this sort of like a way to partition a collection of Based on the logic in the And then from there, it looks like you can ask for all collections that overlap some arbitrary bounds (based on the shard boundary with which that collection is associated), which it can compute efficiently due to having a spatial index. Am I getting it? What types of problems does this facilitate solving? (I am sure there are many, since you clearly put some serious effort into creating this.) |
@lucaspcram Yes exactly you are getting it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question about the Sharding v.s. RTree. If you think this is unnecessary, I will approve!
* Type of the Collection of LocatedType associated with shards | ||
* @author jklamer | ||
*/ | ||
public abstract class ShardBucketCollection<LocatedType extends Located & Serializable, CollectionType extends Collection<LocatedType> & Serializable> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the most complex declaration I have seen ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks much scarier than it is
this(maximumBounds, SlippyTile.allTiles(zoomLevel, maximumBounds)); | ||
} | ||
|
||
public ShardBucketCollection(final Rectangle maximumBounds, final Sharding sharding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sharding implementations are already behaving like a spatial index. I believe here you could gain some performance and simplification by using the sharding itself to query the intersecting bounds instead of a RTree, as the scope here is much smaller that what RTrees can handle...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, if I query the sharing directly I would also need a map of of shard -> index, so that would require two look-ups instead of just the one now, even if the second look up is constant time it is more steps. I would also need to store both sharding and the map of shard ->index, instead of just the RTree.
In terms of what RTree's can handle, this allows for initializations for slippy tiles up to the SlippyTile max zoom, which is 30. At zoom 19 there are 274.9 billion tiles. So even if the max bounds isn't the whole world the Rtree needs the ability to hold a lot even if my tests didn't work it that hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, if it works for you now, we can merge it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😄
this(maximumBounds, SlippyTile.allTiles(zoomLevel, maximumBounds)); | ||
} | ||
|
||
public ShardBucketCollection(final Rectangle maximumBounds, final Sharding sharding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, if it works for you now, we can merge it!
Description:
This is a new abstract collection implementation that allows for easy sorting of located items, like AtlasEntities or CheckFlags, into different buckets as defined by shards. This collection acts as wrapper for the underlying bucket collections. The bucket collection type is determined by the implementer. There are a number of control functions that need to be implemented to allow for simple and flexible use.
Rectangle and Counter changes were made to enable the testing and creation of this class.
Details:
The collection lazily constructs collections per bucket. It uses synchronized blocks to ensure only one collection is initialized per shard and no information is lost. This is to allow for multithreaded insertion/sorting, but it does not guarantee the wrapped collection is thread safe.
The bucket collections are stored in an array and the initializing shards are stored in an immutable R-tree with the index of its corresponding collection in the array. In the figure below I show the SlippyTile shards I used to initialize a number of test collections, each with the corresponding collection array index.
All collection functions interface with the collection index and/or collection array.
The collection does not support safe iterating while inserting.
Potential Impact:
This will change no current implementations but it will enable a number of different located sorting tasks by enabling quick and simple implementation.
Unit Test Approach:
I created the most basic test collection possible and tested all the main collection functionalities on that as well as some of the unique functionalities. I then created more test collections that take advantage of the flexibility of the control functions and tested that they worked as intended.
There are multiple tests checking for the same thing in different ways to gain code coverage.
Test Results:
As expected.
In doubt: Contributing Guidelines