`->schema` and `create-dataframe` should support fields of struct array #333
Closed
Description
- I have read through the quick start and installation sections of the README.
Info
Geni Version: 0.0.38
Problem / Steps to reproduce
user=> (require '[zero-one.geni.core.dataset-creation :as g] :reload)
nil
user=> (g/->schema {:coords [{:x :int :y :int}]})
Execution error (IllegalArgumentException) at org.apache.spark.sql.types.DataTypes/createArrayType (DataTypes.java:114).
elementType should not be null.
Expected results
user=> (g/->schema {:coords [{:x :int :y :int}]})
#object[org.apache.spark.sql.types.StructType 0x5cb6297e "StructType(StructField(coords,ArrayType(StructType(StructField(x,IntegerType,true), StructField(y,IntegerType,true)),true),true))"]
Proposed solution
At the moment, `array-type` supports only the simple `val-type`s listed in `data-type->spark-type`, e.g. `:bool` and `:string`.
We can extend `array-type` to support any Spark SQL `DataType`, in the same fashion as we already do in `struct-field`.
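A minimal sketch of the idea, not Geni's actual implementation: resolve the element type of an array recursively, so that a nested map becomes a `StructType` before it is passed to `DataTypes/createArrayType`. The names `simple-types` and `->spark-type` are hypothetical stand-ins (the former for a small subset of `data-type->spark-type`), and the single-element-vector convention for arrays is taken from the example above.

```clojure
(ns sketch.schema
  (:import (org.apache.spark.sql.types DataType DataTypes)))

;; Hypothetical stand-in for a subset of Geni's data-type->spark-type lookup.
(def simple-types
  {:bool   DataTypes/BooleanType
   :int    DataTypes/IntegerType
   :long   DataTypes/LongType
   :double DataTypes/DoubleType
   :string DataTypes/StringType})

(defn ->spark-type
  "Recursively turns a schema spec into a Spark SQL DataType.
   Accepts a DataType, a simple keyword, a map (struct), or a
   one-element vector (array of the contained spec)."
  [spec]
  (cond
    ;; Already a Spark type: pass it through unchanged.
    (instance? DataType spec)
    spec

    ;; Simple keyword such as :int or :string.
    (keyword? spec)
    (or (simple-types spec)
        (throw (ex-info "Unknown type keyword" {:spec spec})))

    ;; Map: build a struct, resolving each field's type recursively.
    (map? spec)
    (DataTypes/createStructType
      ^java.util.List
      (mapv (fn [[k v]]
              (DataTypes/createStructField (name k) (->spark-type v) true))
            spec))

    ;; One-element vector: array whose element type is resolved recursively,
    ;; so the element may itself be a struct or another array.
    (and (vector? spec) (= 1 (count spec)))
    (DataTypes/createArrayType ^DataType (->spark-type (first spec)))

    :else
    (throw (ex-info "Unsupported schema spec" {:spec spec}))))
```

With a resolver along these lines, `(->spark-type {:coords [{:x :int :y :int}]})` never hands a null element type to `createArrayType`, because the inner map is converted to a struct first. Field order follows the map's iteration order, which is only guaranteed to match insertion order for small literal maps.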