Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added CountryToShardList Caching #202

Merged
merged 8 commits into from
Aug 24, 2018
Merged

Conversation

jwpgage
Copy link
Contributor

@jwpgage jwpgage commented Aug 22, 2018

Description:

This PR adds command CountryToShardListCacher (name open for suggestion), which is a command very similar to CountryToShardListing but instead of writing a list of CountryShards, writes to file a country followed by all shards that intersect that country (each country and shard list pair is one line). The PR also adds CountryToShardListCache, which reads the file produced by CountryToShardListCacher, and stores these country/shardList pairs in a HashMap.

UPDATE: After PR comments and offline discussion, this has been refactored to a single class CountryToShardListCache, which can be constructed from scratch using a sharding, country list, and boundary file, or can be constructed from file. The class offers a save(WritableResource output) function, which writes to file the MultiMap of country to intersecting shards. The class also offers a getShardsForCountry(String country), which returns an Optional<List<SlippyTile>> from the MultiMap.

Potential Impact:

This does not change any existing libraries, but is instead a new feature.

Unit Test Approach:

The unit test added tests creating a CountryToShardListCache object from a .txt file generated by CountryToShardListCacher, and then tests getting the shards that intersect DMA.

The unit tests test getting the shard list for a specific country, creating an instance from scratch, creating an instance from file, and ensures that these have the same output.

Test Results:

N/A


In doubt: Contributing Guidelines

Copy link
Contributor

@MikeGost MikeGost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean and easy to understand, thanks @jwpgage! Couple of minor nits along the way.

public final class CountryToShardListCache
{
private static final Logger logger = LoggerFactory.getLogger(CountryToShardListCache.class);
private final HashMap<String, List<SlippyTile>> countryToShards = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Consider using MultiMap here instead of HashMap.

final String shardList = countryAndShardList[1];
final List<SlippyTile> shards = new ArrayList<>(
Arrays.asList(shardList.split("\\s*,\\s*"))).stream()
.map(string -> SlippyTile.forName(string))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: SlippyTile::forName

{
final List<String> shardNames = shardSet.stream()
.map(shard -> shard.getName()).collect(Collectors.toList());
writer.writeLine(String.format("%s||%s", country, shardNames)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to pull out the || as a delimiter constant?

{
file.lines().forEach(line ->
{
final String[] countryAndShardList = line.split(Pattern.quote("||"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: If you pull out the delimiter constant in the Cacher, than you reference it here, so if it ever changes, this code won't break.

private static final Switch<CountryBoundaryMap> BOUNDARIES = new Switch<>("boundaries",
"The country boundaries.", value ->
{
final Time start = Time.now();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lot going in when parsing a switch, recommend pushing this to a private method.

public class CountryToShardListCacheTest
{
@Test
public void testGetShardNamesForCountry()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a quick test for a non-existent country to verify nothing breaks?

private static final Logger logger = LoggerFactory.getLogger(CountryToShardListCache.class);
private final HashMap<String, List<SlippyTile>> countryToShards = new HashMap<>();

public static CountryToShardListCache create(final Resource file)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throwing a random idea out here - would it be useful to have the CountryToShardListCache also take in a CountryBoundaryMap and Sharding? Then CountryToShardListCache can support read() and write() methods and can essentially become a resource. Not sure what the scope of this very useful functionality is though!

Another advantage is that it would also make the CountryToShardListCacher really only a thin layer. The logic in the CountryToShardListCacher can then be in one place and users won't have to duplicate that code to write CountryToShardList files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea, working on this now

Copy link
Collaborator

@matthieun matthieun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @jwpgage! A few suggestions. It would also be nice to have extra javadoc comments on the methods that are public facing.

});
}

public CountryToShardListCache(final Resource file)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the resource be called resource?

}
}

public Optional<List<SlippyTile>> getShardsForCountry(final String country)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I think returning an empty list is simpler than having an optional. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely simpler, I changed it

private static final String DELIMITER = "||";
private final MultiMap<String, SlippyTile> countryToShards = new MultiMap<>();

public CountryToShardListCache(final CountryBoundaryMap boundaries, final StringList countries,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding another constructor that uses all the countries in the boundarymap and dos not require a list of countries?

*
* @author james-gage
*/
public final class CountryToShardListCache
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about making this a Command that takes a path to a country boundary map file, a sharding tree file, and an optional list of countries, and saves it to a result file?

Copy link
Contributor

@MikeGost MikeGost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @jwpgage - very clean final product. Despite the intuitive naming, still requesting a +1 to Matthieu's suggestion of adding some java doc to public facing methods.

MikeGost
MikeGost previously approved these changes Aug 24, 2018
Copy link
Contributor

@MikeGost MikeGost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff!

Copy link
Collaborator

@matthieun matthieun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jwpgage, sorry I had missed those to logger error statements. I think they should be exceptions instead. Otherwise everything looks great! Once that changes it is ready for merge!

}
catch (final Exception e)
{
logger.error("Error while reading CountryToShardListCache file!", e);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throw a CoreException here that references the other one! I don't think we want a partially built file, with an error message buried in the logs..

}
catch (final Exception e)
{
logger.error("Error while writing CountryToShardListCache to file!", e);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here!

@matthieun matthieun merged commit 85caca6 into osmlab:dev Aug 24, 2018
@jwpgage jwpgage deleted the shardlistcache branch August 24, 2018 21:48
@matthieun matthieun added this to the 5.1.9 milestone Aug 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants