Import implementation from OSMesa project #60

Merged: 41 commits, Mar 22, 2019

Changes from 1 commit

Commits (41)
b927f93
Clean out project; Remove obsolete code and related documentation
jpolchlo Mar 1, 2019
72470ae
Update README
jpolchlo Mar 1, 2019
f4d6075
Freshen SBT project configs
jpolchlo Mar 1, 2019
a0ccffc
Remove chatty scalac warnings
jpolchlo Mar 1, 2019
acac58c
Import code from external project
jpolchlo Mar 1, 2019
fa2d035
Include SPI registry
jpolchlo Mar 1, 2019
98ce096
Remove extraneous version identifier
jpolchlo Mar 1, 2019
b7092c1
Remove outdated console setup
jpolchlo Mar 1, 2019
82ee939
Import test suite
jpolchlo Mar 1, 2019
f12c970
Bump version number
jpolchlo Mar 4, 2019
115ccb8
Bring in missing SAX parser from OSMesa (oops)
jpolchlo Mar 5, 2019
4f17b6b
Small config fixes
jpolchlo Mar 5, 2019
d47c826
Move files to right place in tree
jpolchlo Mar 5, 2019
da8643c
Make constructGeometries work for inputs from Change streams as well …
jpolchlo Mar 5, 2019
9b78222
Update README
jpolchlo Mar 5, 2019
b01b60f
Remove raster package
jpolchlo Mar 6, 2019
378b5be
Move Geocode out of ProcessOSM
jpolchlo Mar 13, 2019
c4f14f4
Address PR comments (remove caching facilities, improve conversion fu…
jpolchlo Mar 14, 2019
4bcf487
Fix test
jpolchlo Mar 14, 2019
b1569e8
Make tests pass
jpolchlo Mar 14, 2019
ca6a156
Adjust version number [skipci]
jpolchlo Mar 14, 2019
cfd541a
Unused
mojodna Mar 14, 2019
9558453
Ignore benchmark artifacts
mojodna Mar 14, 2019
0ad9c7a
Remove benchmarks referencing unused code
mojodna Mar 14, 2019
ec0e127
Remove unused imports
mojodna Mar 14, 2019
57ce394
geotrellis.spark.io.hadoop._ _is_ required
mojodna Mar 14, 2019
d161297
Upgrade dependencies
mojodna Mar 14, 2019
a51bdae
Style tweaks
mojodna Mar 14, 2019
b118b33
Additional docs for ProcessOSM entrypoints
mojodna Mar 14, 2019
1260e13
Make sure we are using compressed internal representations for member…
jpolchlo Mar 15, 2019
c2e1571
Reorganize library components; simplify and rename main user-facing i…
jpolchlo Mar 18, 2019
b2a0bbd
Update README to use new struture
jpolchlo Mar 18, 2019
23c6b83
Improve description of `toGeometry`'s output [skip ci]
jpolchlo Mar 18, 2019
e68d7e8
Fix slight README issue
jpolchlo Mar 18, 2019
1b7c43f
Fix tests for new structure
jpolchlo Mar 18, 2019
78820aa
Update docs to include section on internal package and compressed mem…
jpolchlo Mar 19, 2019
51f0827
Make tests work
jpolchlo Mar 19, 2019
bf2b5fc
Clean up
jpolchlo Mar 19, 2019
bb64732
Upgrade to `org.locationtech` organization for JTS (bumps GT and Geom…
jpolchlo Mar 19, 2019
230627d
Adjust copy method definitions for CoordinateSequence subclasses
jpolchlo Mar 22, 2019
12e2da2
Fix tests
jpolchlo Mar 22, 2019
Address PR comments (remove caching facilities, improve conversion functions)
jpolchlo committed Mar 22, 2019
commit c4f14f43a29fb5fa658c33cc2edb01cdb238f5c6
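
For orientation, a minimal usage sketch of the simplified entrypoints after this commit; the `ProcessOSM.reconstructWayGeometries` and `reconstructRelationGeometries` signatures come from the diff below, while the driver function and the input DataFrames (`ways`, `nodes`, `relations`) are hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import vectorpipe.ProcessOSM

// Hypothetical driver: with the Caching implicits removed, the reconstruction
// entrypoints take only DataFrames (signatures per the ProcessOSM.scala diff below).
def reconstructAll(ways: DataFrame, nodes: DataFrame, relations: DataFrame): DataFrame = {
  val wayGeoms = ProcessOSM.reconstructWayGeometries(ways, nodes)
  ProcessOSM.reconstructRelationGeometries(relations, wayGeoms)
}
```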
54 changes: 6 additions & 48 deletions src/main/scala/vectorpipe/ProcessOSM.scala
@@ -20,7 +20,7 @@ import com.vividsolutions.jts.{geom => jts}
import vectorpipe.functions.osm._
import vectorpipe.relations.MultiPolygons
import vectorpipe.relations.Routes
import vectorpipe.util.{Caching, Resource}
import vectorpipe.util.Resource
import spray.json._

object ProcessOSM {
@@ -66,20 +66,6 @@ object ProcessOSM {

lazy val VersionedElementEncoder: Encoder[Row] = RowEncoder(VersionedElementSchema)

/* support UDF for constructGeometries */
private case class StrMember(`type`: String, ref: Long, role: String)
private val convertMembers = org.apache.spark.sql.functions.udf { member: Seq[Row] =>
if (member == null)
null
else {
member.map { row: Row =>
StrMember(vectorpipe.model.Member.stringFromByte(row.getAs[Byte]("type")),
row.getAs[Long]("ref"),
row.getAs[String]("role"))
}
}
}

/**
* Snapshot pre-processed elements.
*
@@ -232,25 +218,7 @@
import input.sparkSession.implicits._
val st_pointToGeom = org.apache.spark.sql.functions.udf { pt: jts.Point => pt.asInstanceOf[jts.Geometry] }

// The following type conversion needed in case input comes from Change source
val elements =
Try(input.schema
.find(_.name == "members")
.get
.dataType
.asInstanceOf[ArrayType]
.elementType
.asInstanceOf[StructType]
.find(_.name == "type")
.get) match {
case Failure(_) => throw new IllegalArgumentException(s"Could not get type of 'members' in schema ${input.schema}")
case Success(field) =>
if (field.dataType == ByteType)
input.withColumn("members", convertMembers('members))
else
input
}

val elements = elaborateMemberTypes(input)
val nodes = ProcessOSM.preprocessNodes(elements)

val nodeGeoms = ProcessOSM.constructPointGeometries(nodes)
@@ -308,8 +276,7 @@
* @param _nodesToWays Optional lookup table.
* @return Way geometries.
*/
def reconstructWayGeometries(_ways: DataFrame, _nodes: DataFrame, _nodesToWays: Option[DataFrame] = None)(implicit
cache: Caching = Caching.none, cachePartitions: Option[Int] = None): DataFrame = {
def reconstructWayGeometries(_ways: DataFrame, _nodes: DataFrame, _nodesToWays: Option[DataFrame] = None): DataFrame = {
implicit val ss: SparkSession = _ways.sparkSession
import ss.implicits._
ss.withJTS
@@ -504,20 +471,14 @@
* @param geoms DataFrame containing way geometries to use in reconstruction.
* @return Relations geometries.
*/
def reconstructRelationGeometries(_relations: DataFrame, geoms: DataFrame)(implicit cache: Caching = Caching.none,
cachePartitions: Option[Int] = None)
: DataFrame = {
def reconstructRelationGeometries(_relations: DataFrame, geoms: DataFrame): DataFrame = {
val relations = preprocessRelations(_relations)

reconstructMultiPolygonRelationGeometries(relations, geoms)
.union(reconstructRouteRelationGeometries(relations, geoms))
}

def reconstructMultiPolygonRelationGeometries(_relations: DataFrame, geoms: DataFrame)(implicit cache: Caching =
Caching.none,
cachePartitions: Option[Int]
= None)
: DataFrame = {
def reconstructMultiPolygonRelationGeometries(_relations: DataFrame, geoms: DataFrame): DataFrame = {
implicit val ss: SparkSession = _relations.sparkSession
import ss.implicits._
ss.withJTS
@@ -579,10 +540,7 @@
'minorVersion)
}

def reconstructRouteRelationGeometries(_relations: DataFrame, geoms: DataFrame)(implicit cache: Caching = Caching
.none,
cachePartitions: Option[Int] = None)
: DataFrame = {
def reconstructRouteRelationGeometries(_relations: DataFrame, geoms: DataFrame): DataFrame = {
implicit val ss: SparkSession = _relations.sparkSession
import ss.implicits._
ss.withJTS
40 changes: 35 additions & 5 deletions src/main/scala/vectorpipe/functions/osm/package.scala
@@ -1,11 +1,14 @@
package vectorpipe.functions

import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import org.apache.spark.sql.{Column, Row}
import vectorpipe.ProcessOSM._
import vectorpipe.model.Member

import scala.util.{Try, Success, Failure}
import scala.util.matching.Regex

package object osm {
@@ -169,11 +172,7 @@ package object osm {

private val _compressMemberTypes = (members: Seq[Row]) =>
members.map { row =>
val t = row.getAs[String]("type") match {
case "node" => NodeType
case "way" => WayType
case "relation" => RelationType
}
val t = Member.typeFromString(row.getAs[String]("type"))
val ref = row.getAs[Long]("ref")
val role = row.getAs[String]("role")

@@ -182,6 +181,37 @@

lazy val compressMemberTypes: UserDefinedFunction = udf(_compressMemberTypes, MemberSchema)

private case class StrMember(`type`: String, ref: Long, role: String)

private val convertMembers = org.apache.spark.sql.functions.udf { member: Seq[Row] =>
if (member == null)
null
else {
member.map { row: Row =>
StrMember(vectorpipe.model.Member.stringFromByte(row.getAs[Byte]("type")),
row.getAs[Long]("ref"),
row.getAs[String]("role"))
}
}
}

def elaborateMemberTypes(input: DataFrame): DataFrame = {
// The following type conversion needed in case input comes from Change source
Try(input.schema("members")
.dataType
.asInstanceOf[ArrayType]
.elementType
.asInstanceOf[StructType]
.apply("type")) match {
case Failure(_) => throw new IllegalArgumentException(s"Could not get type of 'members' in schema ${input.schema}")
case Success(field) =>
if (field.dataType == ByteType)
input.withColumn("members", convertMembers(col("members")))
else
input
}
}

// matches letters or emoji (no numbers or punctuation)
private val ContentMatcher: Regex = """[\p{L}\uD83C-\uDBFF\uDC00-\uDFFF]""".r
private val TrailingPunctuationMatcher: Regex = """[:]$""".r
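
A hedged sketch of calling the relocated helper; `elaborateMemberTypes` is defined in the diff above, and the `changes` DataFrame (read from a Change source elsewhere) is assumed.

```scala
import org.apache.spark.sql.DataFrame
import vectorpipe.functions.osm.elaborateMemberTypes

// Hypothetical caller: if `members.type` arrives byte-encoded (as from a Change
// source), rewrite it back to string member types; DataFrames whose member type
// is already a string pass through unchanged, per the ByteType check above.
def normalizeMembers(changes: DataFrame): DataFrame =
  elaborateMemberTypes(changes)
```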
7 changes: 1 addition & 6 deletions src/main/scala/vectorpipe/model/AugmentedDiff.scala
@@ -31,12 +31,7 @@ object AugmentedDiff {
def apply(sequence: Int,
prev: Option[Feature[GTGeometry, ElementWithSequence]],
curr: Feature[GTGeometry, ElementWithSequence]): AugmentedDiff = {
val `type` = curr.data.`type` match {
case "node" => ProcessOSM.NodeType
case "way" => ProcessOSM.WayType
case "relation" => ProcessOSM.RelationType
}

val `type` = Member.typeFromString(curr.data.`type`)
val minorVersion = prev.map(_.data.version).getOrElse(Int.MinValue) == curr.data.version

AugmentedDiff(
1 change: 1 addition & 0 deletions src/main/scala/vectorpipe/model/Member.scala
@@ -11,6 +11,7 @@ object Member {
case "node" => NodeType
case "way" => WayType
case "relation" => RelationType
case _ => null.asInstanceOf[Byte]
}

def stringFromByte(b: Byte): String = b match {
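
For illustration, a hedged round trip through the `Member` helpers used throughout this commit; only `typeFromString` and `stringFromByte` appear in the diffs, the surrounding snippet is assumed.

```scala
import vectorpipe.model.Member

// Convert a member type string to its compressed byte form and back.
// Strings other than "node"/"way"/"relation" now hit the new default case
// rather than failing the match.
val t: Byte = Member.typeFromString("relation")
val back: String = Member.stringFromByte(t) // "relation"
```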
68 changes: 0 additions & 68 deletions src/main/scala/vectorpipe/util/Caching.scala

This file was deleted.

@@ -114,14 +114,8 @@ class MultiPolygonRelationReconstructionSpec extends PropSpec with TableDrivenPr
case ((changeset, id, version, minorVersion, updated, validUntil), rows) =>
val members = rows.toVector
// TODO store Bytes as the type in fixtures
val types = members.map(_.getAs[String]("type") match {
case "node" => NodeType
case "way" => WayType
case "relation" => RelationType
case _ => null.asInstanceOf[Byte]
})
val types = members.map(Member.typeFromString(_.getAs[String]("type")))
val roles = members.map(_.getAs[String]("role"))
//val geoms = members.map(_.getAs[jts.Geometry]("geom"))
val geoms = members.map(_.getAs[jts.Geometry]("geometry"))
val mp = build(id, version, updated, types, roles, geoms).orNull
