[SPARK-44832][CONNECT] Make transitive dependencies work properly for Scala Client

### What changes were proposed in this pull request?
This PR cleans up the Maven build for the Spark Connect Client and Spark Connect Common. The most important change is that we move `sql-api` from a `provided` to a `compile` dependency. The net effect is that when a user takes a dependency on the client, all of its required (transitive) dependencies are added automatically.

Please note that this does not address concerns around creating an überjar and shading. That is for a different day :)
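
For illustration, here is the effect from a consumer's point of view, in sbt syntax (a minimal sketch; the version below is a placeholder, not something pinned by this PR). Maven `provided` dependencies are not propagated transitively, while `compile` dependencies are:

```scala
// Before: sql-api was `provided`, so consumers had to declare it themselves.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-connect-client-jvm" % "3.5.0",
  "org.apache.spark" %% "spark-sql-api"            % "3.5.0" // manual workaround
)

// After: sql-api is a `compile` dependency of the client, so a single
// coordinate pulls in everything the client needs transitively.
libraryDependencies += "org.apache.spark" %% "spark-connect-client-jvm" % "3.5.0"
```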

### Why are the changes needed?
When you take a dependency on the Connect Scala client, you currently have to add the `sql-api` module as a dependency yourself. This is rather poor UX.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually: running Maven and checking the resulting dependency tree (e.g. `mvn dependency:tree`), …

Closes apache#42518 from hvanhovell/SPARK-44832.

Authored-by: Herman van Hovell <herman@databricks.com>
Signed-off-by: Herman van Hovell <herman@databricks.com>
hvanhovell committed Aug 28, 2023
1 parent c8a2092 commit 50d9a56
Showing 4 changed files with 28 additions and 61 deletions.
48 changes: 0 additions & 48 deletions connector/connect/client/jvm/pom.xml
@@ -39,75 +39,28 @@
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-connect-common_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <exclusions>
        <exclusion>
          <groupId>com.google.guava</groupId>
          <artifactId>guava</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql-api_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sketch_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>com.google.protobuf</groupId>
      <artifactId>protobuf-java</artifactId>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>${connect.guava.version}</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>failureaccess</artifactId>
      <version>${guava.failureaccess.version}</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-codec-http2</artifactId>
      <version>${netty.version}</version>
    </dependency>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-handler-proxy</artifactId>
      <version>${netty.version}</version>
    </dependency>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-transport-native-unix-common</artifactId>
      <version>${netty.version}</version>
    </dependency>
    <dependency>
      <groupId>com.lihaoyi</groupId>
      <artifactId>ammonite_${scala.version}</artifactId>
      <version>${ammonite.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-connect-common_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
      <exclusions>
        <exclusion>
          <groupId>com.google.guava</groupId>
          <artifactId>guava</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.scalacheck</groupId>
      <artifactId>scalacheck_${scala.binary.version}</artifactId>
@@ -148,7 +101,6 @@
              <include>org.codehaus.mojo:*</include>
              <include>org.checkerframework:*</include>
              <include>org.apache.spark:spark-connect-common_${scala.binary.version}</include>
              <include>org.apache.spark:spark-common-utils_${scala.binary.version}</include>
            </includes>
          </artifactSet>
          <relocations>
33 changes: 27 additions & 6 deletions connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/CheckConnectJvmClientCompatibility.scala
@@ -24,6 +24,7 @@ import java.util.regex.Pattern
import com.typesafe.tools.mima.core._
import com.typesafe.tools.mima.lib.MiMaLib

import org.apache.spark.SparkBuildInfo.spark_version
import org.apache.spark.sql.test.IntegrationTestUtils._

/**
@@ -46,18 +47,38 @@ object CheckConnectJvmClientCompatibility {
    sys.env("SPARK_HOME")
  }

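  // Resolve the spark-sql jar from the build output; fail fast if it has not been built.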
  private val sqlJar = {
    val path = Paths.get(
      sparkHome,
      "sql",
      "core",
      "target",
      "scala-" + scalaVersion,
      "spark-sql_" + scalaVersion + "-" + spark_version + ".jar")
    assert(Files.exists(path), s"$path does not exist")
    path.toFile
  }

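  // Resolve the plain (non-assembly) client jar; dev/connect-jvm-client-mima-check now
  // builds it via connect-client-jvm/package instead of connect-client-jvm/assembly.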
  private val clientJar = {
    val path = Paths.get(
      sparkHome,
      "connector",
      "connect",
      "client",
      "jvm",
      "target",
      "scala-" + scalaVersion,
      "spark-connect-client-jvm_" + scalaVersion + "-" + spark_version + ".jar")
    assert(Files.exists(path), s"$path does not exist")
    path.toFile
  }

  def main(args: Array[String]): Unit = {
    var resultWriter: Writer = null
    try {
      resultWriter = Files.newBufferedWriter(
        Paths.get(s"$sparkHome/.connect-mima-check-result"),
        StandardCharsets.UTF_8)
      val clientJar: File =
        findJar(
          "connector/connect/client/jvm",
          "spark-connect-client-jvm-assembly",
          "spark-connect-client-jvm")
      val sqlJar: File = findJar("sql/core", "spark-sql", "spark-sql")
      val problemsWithSqlModule = checkMiMaCompatibilityWithSqlModule(clientJar, sqlJar)
      appendMimaCheckErrorMessageIfNeeded(
        resultWriter,
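For context, a minimal sketch of how a checker like this can drive MiMa once both jars are resolved — assuming the MiMa 1.x `MiMaLib` API; this is an illustration, not code from this commit:

```scala
import java.io.File

import com.typesafe.tools.mima.core.Problem
import com.typesafe.tools.mima.lib.MiMaLib

// Collect binary-compatibility problems between the sql module (old) and the
// connect client (new). The real checker also filters the reported problems
// against an allow-list of expected differences before failing the check.
def collectProblems(sqlJar: File, clientJar: File): List[Problem] = {
  val mima = new MiMaLib(Seq(sqlJar, clientJar))
  mima.collectProblems(sqlJar, clientJar, List.empty)
}
```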
6 changes: 0 additions & 6 deletions connector/connect/common/pom.xml
@@ -38,7 +38,6 @@
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql-api_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.scala-lang</groupId>
@@ -47,7 +46,6 @@
    <dependency>
      <groupId>com.google.protobuf</groupId>
      <artifactId>protobuf-java</artifactId>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
@@ -85,25 +83,21 @@
      <groupId>io.netty</groupId>
      <artifactId>netty-codec-http2</artifactId>
      <version>${netty.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-handler-proxy</artifactId>
      <version>${netty.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-transport-native-unix-common</artifactId>
      <version>${netty.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency> <!-- necessary for Java 9+ -->
      <groupId>org.apache.tomcat</groupId>
      <artifactId>annotations-api</artifactId>
      <version>${tomcat.annotations.api.version}</version>
      <scope>provided</scope>
    </dependency>
    <!--
      This spark-tags test-dep is needed even though it isn't used in this module,
2 changes: 1 addition & 1 deletion dev/connect-jvm-client-mima-check
@@ -35,7 +35,7 @@ fi
rm -f .connect-mima-check-result

echo "Build required modules..."
build/sbt "sql/package;connect-client-jvm/assembly;connect-client-jvm/Test/package;avro/package;protobuf/assembly"
build/sbt "sql/package;connect-client-jvm/package;connect-client-jvm/Test/package;avro/package;protobuf/assembly"

CONNECT_TEST_CLASSPATH="$(build/sbt -DcopyDependencies=false "export connect-client-jvm/Test/fullClasspath" | grep jar | tail -n1)"
SQL_CLASSPATH="$(build/sbt -DcopyDependencies=false "export sql/fullClasspath" | grep jar | tail -n1)"
