-
Notifications
You must be signed in to change notification settings - Fork 494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
6070 Update Apache Commons Lang3 #7801
Conversation
…6070 Also switch from bulk imports of commons.lang3.* to single class style. Fixing for other parts like java.util, too.
IQSS#6070 Apache Commons Lang3 3.12.0 moved `StringEscapeUtils` to Apache Commons Text. Also, escapeHtml() and escapeXml() from Apache Commons Lang (v2) have been renamed to escapeHtml4() and escapeXml10() with v3.0
Thanks @poikilotherm for flagging the failing test. We've been dealing with some production issues on dataverse.harvard.edu this week, but we'll take a look at this as soon as we can. |
FWIW: In general, I think these changes should be fine - just updating to a later version. (We actually had a few lang3 calls already (it must be a dependency of something already in the pom)). For the failing code - it just looks like it is trying to check that the field is either null, empty, or a number potentially starting with a + or -. The code does expect that the string is at least two chars long and should probably be fixed by splitting the cases up a bit further - a one char string has to be numeric and if it is two or more chars, the existing check would work (and not give an empty value to isNumeric()). Others may know more, but it looks straight-forward enough that I think you could just make the change and let people review the result. |
Aye, the transitive dependency has been used. See also my extensive review from last year.
Jim I know that the code is expected to do that. Yet it seems to be more complicated, which is why I consider there is a bug hidding either in the test code or within the app code. Test The other tests also are failing because of detecting as continuous instead of discrete. Removing all but L1 (header) plus L7 from Using this simple reproducer with just a single CSV line to be parsed, but now using Which is leading me to: who want's to go for a 🐛 hunt? Thanks @djbrooke for sending in more troops 🚁 👍 |
I believe @qqmyers has been right all along - it was simply the single digit numbers (like the "2" and "0", in the lines 3 and 4 of the CSV file, respectively) that were breaking the integer test. The code was checking the first character - which is also the last character of the string, in this case - against
with
So I'm going to check that in. (this change makes the test pass; I'll give it another thought, whether there are advantages to sticking with |
I don't really get why Jenkins failed to build the branch again, after I checked in the fix for the CSV ingest test. (The branch is building fine for me locally). |
@landreev Travis failed as well:
|
@poikilotherm Could you please sync the branch up with develop? |
One remaining thing pops up in my mind: as formerly there had been v2 and v3 usage in parallel, should we enforce a We could let @reviewdog bark in PR reviews based on these rules. https://github.com/marketplace/actions/run-java-checkstyle |
Great, thank you. I may check in a better permanent fix for the CSV ingest (trying to decide on that now), and will move the PR after that. |
... to be safe. (IQSS#6070)
Also, yes, it would be a good idea to add a check for transitive dependencies like this, as part of the overall automated review framework. |
I created a test branch and a test PR to show the usage of a barking reviewdog in action... The check action is reporting the failed state. It does not leave comments as it's running in a fork, only showing the check report inline. It's fairly simple to use and only barks on files of the PR's changeset. Shall I include this in this PR? (Or is this a part of #7876 / #3950 ?) |
Hi @poikilotherm I'm testing this today but due to recent dataverse version change to 5.5, cannot build/deploy. Would you mind updating this branch from develop? Thanks. |
Done @kcondon |
What this PR does / why we need it:
I want to use modern Apache Commons Lang3 functionality (FailableStream) for #5989. (see usage in code)
To keep the scope of the PR for that issue smallish, I created this PR, as you folks prefer small steps.
The dependency update has been a long standing thing. See the issue.
Which issue(s) this PR closes:
Closes #6070
Special notes for your reviewer:
🐛 THERE IS A HIDDEN BUG.
Some tests for the CSV ingest fail.
Switching http://github.com/IQSS/dataverse/blob/eca8b8fd403002327439f81f54c6ed9edde391c7/src/main/java/edu/harvard/iq/dataverse/ingest/tabulardata/impl/plugins/csv/CSVFileReader.java#L211-L211 like this makes all tests pass:
Switching that function from Lang3 back to Lang2 should NOT make the tests pass. Playing around with the CSV files and the expected tests results makes the tests flaky with v2 and pass with v3. It looks like there is a hidden bug in the CSV parser not revealed by the tests. (The only change in
isNumeric()
from v2 -> v3 is that empty strings are also not counted as numeric. The test cases have no example covered by this change.)As I am not very familiar with the ingest code, I would appreciate some help.
Suggestions on how to test this:
Usual unit tests and integration tests. UI uses some escaping routines, but they have just been moved in upstream and there is no functional changes expected.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Nope.
Is there a release notes update needed for this change?:
I don't think so.
Additional documentation:
None.