-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build: Run Iceberg with JDK 17 #7391
Build: Run Iceberg with JDK 17 #7391
Conversation
private static final Pattern TIMESTAMPTZ = | ||
Pattern.compile( | ||
"\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}(:\\d{2}(.\\d{1,6})?)?([-+]\\d{2}:\\d{2}|Z)"); | ||
"\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}(:\\d{2}(.\\d{1,10})?)?([-+]\\d{2}:\\d{2}|Z)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when running with java 17, i get 2023-04-20T19:17:17.748170628Z
when using OffsetTime class, it only happens with me in ubuntu and not in my developement mac.
As we need to support both the env's hence adjusted the regex to account for the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 10 instead of 9?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK, counting mistake :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would TIME
also be affected by this issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it would be 9 places even for java 8, looks like we were not hitting this case earlier
public static final DateTimeFormatter ISO_LOCAL_TIME;
static {
ISO_LOCAL_TIME = new DateTimeFormatterBuilder()
.appendValue(HOUR_OF_DAY, 2)
.appendLiteral(':')
.appendValue(MINUTE_OF_HOUR, 2)
.optionalStart()
.appendLiteral(':')
.appendValue(SECOND_OF_MINUTE, 2)
.optionalStart()
.appendFraction(NANO_OF_SECOND, 0, 9, true)
.toFormatter(ResolverStyle.STRICT, null);
}
fixed the hard coded test in time as well as updated the regex
Have added java 17 CI for all engine's as well, as part of commit 1bc99b8, except Flink couldn't find minimum flink version since when java 17 is supported, found this issue https://issues.apache.org/jira/browse/FLINK-15736, but it seems un-resolved (cc @stevenzwu, who might have more context here) This has now added new CI's, we can reduce the waiting time by the idea, i proposed idea a while back as part of reducing the load on main iceberg repo here (#5153 (comment)) :
We can do the same for iceberg, i.e run ci on our fork and iceberg just finds the run and attaches the workflow to our pr in iceberg repo. As per spark folks :
Happy to create a pr to do same for iceberg. |
....4/spark/src/test/java/org/apache/iceberg/spark/source/TestMetadataTableReadableMetrics.java
Outdated
Show resolved
Hide resolved
....3/spark/src/test/java/org/apache/iceberg/spark/source/TestMetadataTableReadableMetrics.java
Outdated
Show resolved
Hide resolved
1bc99b8
to
2c9a60e
Compare
791541f
to
b5d247d
Compare
} else if (JavaVersion.current() == JavaVersion.VERSION_17) { | ||
project.ext.jdkVersion = '17' | ||
project.ext.extraJvmArgs = ["-XX:+IgnoreUnrecognizedVMOptions", | ||
"--add-opens", "java.base/java.io=ALL-UNNAMED", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to open all these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I collected these flag by debugging individual failures of ut's from test projects (spark / arrow / aws (kryo)) this is a superset i would say we may define only the relevant one's per package, but since there was no harm in passing additional flags hence passed all the flags in all the projects. please let me know your thoughts if the above makes sense, happy to add required flags per projects
b5d247d
to
d2faa94
Compare
d2faa94
to
9bcbc9d
Compare
....4/spark/src/test/java/org/apache/iceberg/spark/source/TestMetadataTableReadableMetrics.java
Outdated
Show resolved
Hide resolved
"--add-opens", "java.base/sun.nio.ch=ALL-UNNAMED", | ||
"--add-opens", "java.base/sun.nio.cs=ALL-UNNAMED", | ||
"--add-opens", "java.base/sun.security.action=ALL-UNNAMED", | ||
"--add-opens", "java.base/sun.util.calendar=ALL-UNNAMED"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could provide a list of classes to open and generate the --add-opens
and =ALL-UNNAMED
boilerplates. But that might be over-engineering, and I am fine as is. Curious what other people think.
4ef73e0
to
3d2fae1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work! Looks good to me, except for some open questions waiting for other comments from other people.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks mostly good to me (I've also double-checked whether we can specify all JVM args in gradle.properties
via org.gradle.jvmargs
but those aren't properly passed down).
I think it would be good to also double-check the difference in the sizes that required the test changes
+1 here is the analysis for the same, basically JDK 8 was reporting column size for |
@@ -47,10 +47,10 @@ public class ExpressionUtil { | |||
private static final Pattern DATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}"); | |||
private static final Pattern TIME = Pattern.compile("\\d{2}:\\d{2}(:\\d{2}(.\\d{1,6})?)?"); | |||
private static final Pattern TIMESTAMP = | |||
Pattern.compile("\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}(:\\d{2}(.\\d{1,6})?)?"); | |||
Pattern.compile("\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}(:\\d{2}(.\\d{1,9})?)?"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry forgot to ask, why is this changed from 6 to 9? Does JDK17 change default precision of timestamp or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes with my mac on using JDK 17 existing regex would pass, but with ubuntu with JDK 17, was getting 2023-04-20T19:17:17.748170628Z
more details here: #7391 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
waiting for CI to complete
Thanks for the work and everyone's review! Looks like everything is passing, and all comments are addressed. I will go ahead and merge it, we can fix anything remaining in follow-up PRs. |
* Updated README.md #7391 It's just a simple change in my thinking. * Reduce writing I like that it's shorter. Co-authored-by: Fokko Driesprong <fokko@apache.org> --------- Co-authored-by: Fokko Driesprong <fokko@apache.org>
* Updated README.md apache#7391 It's just a simple change in my thinking. * Reduce writing I like that it's shorter. Co-authored-by: Fokko Driesprong <fokko@apache.org> --------- Co-authored-by: Fokko Driesprong <fokko@apache.org>
About the change
This change adds a CI to run iceberg with JDK 17, considering now most of the engines are adding support for jdk 17 for ex : trino, spark.
Fixes: #7290
cc @jackye1995