Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added in FilteredIterable #231

Merged
merged 2 commits into from
Oct 1, 2018
Merged

Added in FilteredIterable #231

merged 2 commits into from
Oct 1, 2018

Conversation

adahn6
Copy link
Contributor

@adahn6 adahn6 commented Oct 1, 2018

Description:

This PR adds in a new Iterable subclass, FilteredITerable. Its goal is to allow for easy filtering of objects without a significant memory footprint, including on repeat iterations. To do this, it keeps a set of Identifiers for the objects in the Iterable. The identifier is defined as whatever the Identifier function returns for an object in the Iterable-- examples of this include the object itself (simple, but likely a heavy memory footprint), a data member (such as an AtlasEntity's identifier), or the result of a hash function. The iterator will skip over any element for which the Identifier function's return value is in the filtered set. Elements to be skipped on future iterations can be added to the FilteredIterator using the addToFilteredSet() method.

Example use cases are available in the unit tests for the class.

Potential Impact:

Nothing at the moment, given that it's not instantiated anywhere

Unit Test Approach:

There are two unit tests. The first makes a new Iterable from the numbers 0-9, then filters them in various ways to ensure filtering logic is correct. This check includes corner cases such as filtering twice, filtering first and last elements, and filtering all elements.

The second test uses a set of URLs in order to test a more complex identifier function, as well as integration with the StreamIterable class.

Test Results:

2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:45 - [https://github.com, http://github.com, https://test.com, http://test.com]
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:52 - [https://github.com, https://test.com]
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:60 - []
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:82 - [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:91 - [0, 1, 2, 3, 4, 7, 8, 9]
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:99 - [0, 1, 2, 3, 4, 7, 8, 9]
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:107 - [0, 1, 2, 3, 4, 7, 8, 9]
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:120 - [8]
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:127 - []
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:136 - [1, 2, 3, 4, 5, 6, 7, 8, 9]
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:145 - [0, 1, 2, 3, 4, 5, 6, 7, 8]
2018-10-01 11:10:43 INFO  [main] FilteredIterableTest:155 - []

Describe other (non-unit) test results here.


In doubt: Contributing Guidelines

* Adds in a new FilteredIterable that keeps a Set of elements to filter from iterations. Useful for when Iterable will be iterated over multiple times, but some elements should be skipped for efficiency.
Copy link
Contributor

@MikeGost MikeGost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very clean and easy to ready code, thanks @adahn6! Found a couple of nits and looks like checkstyle is complaining.

* @return True if an element was added to the filter set, false if it wasn't (likely in the
* case it was already present in the set)
*/
public boolean addToFilteredSet(final Type t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually want to avoid single character parameter names. Suggest "type" here.

final Iterable<Type> types, final Set<IdentifierType> filterSet,
final Function<Type, IdentifierType> identifier)
{

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extra line

@adahn6
Copy link
Contributor Author

adahn6 commented Oct 1, 2018

@MikeGost fixed!

Copy link
Contributor

@MikeGost MikeGost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, nice tests also!

Copy link
Collaborator

@matthieun matthieun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice testing!

@matthieun matthieun merged commit 6809924 into osmlab:dev Oct 1, 2018
@matthieun matthieun added this to the 5.1.14 milestone Oct 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants