Add support for S3 storage #146
Unfortunately, S3 is not supported right now, but we might add S3 support in the future.
S3 storage should be supported in ... Please let me know if it works for you.
Does Cobrix support the gs:// file system?
From the filesystem support perspective, ...
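Judging from the stack trace below, Cobrix resolves copybook and data paths through the Hadoop FileSystem API, so in principle any Hadoop-compatible filesystem (s3a://, gs://, etc.) can work once the matching connector is on the classpath and configured. A minimal sketch, assuming the hadoop-aws (S3A) connector is available; the bucket name, paths, and credentials are placeholders:

import org.apache.spark.sql.SparkSession

// Assumes hadoop-aws and a matching AWS SDK are on the classpath,
// e.g. submitted with --packages org.apache.hadoop:hadoop-aws:<hadoop-version>.
val spark = SparkSession.builder()
  .appName("Spark-Cobol-S3A")
  .getOrCreate()

val hadoopConf = spark.sparkContext.hadoopConfiguration

// Standard Hadoop S3A settings; the credentials here are placeholders.
hadoopConf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
hadoopConf.set("fs.s3a.access.key", "<AWS_ACCESS_KEY_ID>")
hadoopConf.set("fs.s3a.secret.key", "<AWS_SECRET_ACCESS_KEY>")

// With the connector configured, s3a:// URIs resolve through the same
// Hadoop FileSystem API that the Cobrix path validation uses.
val df = spark.read
  .format("za.co.absa.cobrix.spark.cobol.source")
  .option("copybooks", "s3a://my-bucket/copybooks/example.cbl")  // hypothetical paths
  .load("s3a://my-bucket/data/EXAMPLE_FILE")

df.printSchema()

The same approach should apply to gs:// via the GCS connector, since it also plugs into the Hadoop FileSystem API.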
Does Cobrix support S3 file systems?
I am getting a "java.lang.IllegalArgumentException: Wrong FS" error when loading the copybook and data file from an AWS S3 bucket.
Code:
import org.apache.spark.sql.SparkSession
import za.co.absa.cobrix.spark.cobol.source

val spark = SparkSession.builder().appName("Spark-Cobol").getOrCreate()
import spark.implicits._

val df = spark.read
  .format("za.co.absa.cobrix.spark.cobol.source")
  .option("copybooks", "s3://xxxx/tesfile.cbl")
  .load("s3://xxxx/sourcedata/DATAFILE0100")

df.printSchema()
df.show()
Error:
java.lang.IllegalArgumentException: Wrong FS: s3://xxxx/tesfile.cbl, expected: hdfs://ip-xxx-xx-xx-85.ec2.internal:8020
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:653)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:194)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1430)
at za.co.absa.cobrix.spark.cobol.source.parameters.CobolParametersValidator$.za$co$absa$cobrix$spark$cobol$source$parameters$CobolParametersValidator$$validatePath$1(CobolParametersValidator.scala:71)
at za.co.absa.cobrix.spark.cobol.source.parameters.CobolParametersValidator$$anonfun$validateOrThrow$2.apply(CobolParametersValidator.scala:94)
at za.co.absa.cobrix.spark.cobol.source.parameters.CobolParametersValidator$$anonfun$validateOrThrow$2.apply(CobolParametersValidator.scala:93)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at za.co.absa.cobrix.spark.cobol.source.parameters.CobolParametersValidator$.validateOrThrow(CobolParametersValidator.scala:93)
at za.co.absa.cobrix.spark.cobol.source.DefaultSource.createRelation(DefaultSource.scala:52)
at za.co.absa.cobrix.spark.cobol.source.DefaultSource.createRelation(DefaultSource.scala:48)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:307)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:156)
... 160 elided
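The "Wrong FS" error above comes from the path check resolving the s3:// copybook URI against the cluster's default filesystem (HDFS), as seen in CobolParametersValidator calling FileSystem.exists. One possible workaround, assuming the Cobrix version in use supports the copybook_contents option and an S3 connector such as S3A is configured, is to open the copybook through the filesystem that matches its URI and pass the text directly (paths below are hypothetical):

import java.net.URI
import java.nio.charset.StandardCharsets
import org.apache.commons.io.IOUtils
import org.apache.hadoop.fs.{FileSystem, Path}

// Resolve the FileSystem from the copybook URI itself (S3A here) instead of
// relying on the cluster's default FS, then read the copybook text.
val copybookPath = "s3a://my-bucket/copybooks/example.cbl"  // hypothetical path
val hadoopConf = spark.sparkContext.hadoopConfiguration
val fs = FileSystem.get(new URI(copybookPath), hadoopConf)
val in = fs.open(new Path(copybookPath))
val copybookContents =
  try IOUtils.toString(in, StandardCharsets.UTF_8)
  finally in.close()

// Assumes this Cobrix version accepts copybook text via "copybook_contents".
val df = spark.read
  .format("za.co.absa.cobrix.spark.cobol.source")
  .option("copybook_contents", copybookContents)
  .load("s3a://my-bucket/data/EXAMPLE_FILE")

The data path itself still has to be readable by Spark, so the S3A configuration from the earlier sketch would still be needed.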