-
Notifications
You must be signed in to change notification settings - Fork 493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dependency update: Apache Common Lang v2 -> v3 #6070
Comments
Hi @poikilotherm, do you plan to work on this one or should I keep it unassigned? |
I can submit a PR, but I thought about discussing this first? Maybe there is some technical issue with this or you guys want smaller chunks or whatever... Just wanted to be polite, ask and discuss before I do... 😉 If you @djbrooke want me to work on this and everybody at IQSS is fine with that, this should be a rather small thing to do... I'll take an assignment as a "go!". |
@poikilotherm I'm wondering if we should wait until we delete the all the locally build jars in "local_lib/com/lyncode" from the codebase before we take on this refactoring. These jars: https://github.com/IQSS/dataverse/tree/v4.15.1/local_lib/com/lyncode I found these by reading this comment in the pom.xml:
What I'm trying to say is that maybe we should just stick with the v2 version (unless there's a security issue) until we have time to move completely to v3. As I said in this issue, I'm worried about the war file getting bigger and bigger: Increasingly slow feedback loop for developers, increasingly large WAR files #5593 |
Whoops, I had some misunderstanding but @poikilotherm set me straight at http://irclog.iq.harvard.edu/dataverse/2019-08-06#i_102177 I'd like to back up to what problem we're trying to solve here. Originally, @poikilotherm said he'd like to use v3 of Apache Common Lang. As he stated above, v3 is already available in Dataverse because is it pulled in as a transitive dependency. He could use v3 right now but it's a bad practice to write code that relies on a transitive dependency. We could declare v3 as an explicit dependency in the pom and keep v2 in there as well but this is not desirable to @poikilotherm . He would rather change the version in the pom from v2 to v3. The second problem @poikilotherm is trying to solve is that v2 is old. It's flagged as "legacy" by its developers. I assume Dataverse has a lot of old dependencies like this. I don't know if there is any immediate risk in this case. I don't know of any security problems or anything. The third problem @poikilotherm is trying to solve is to help get the war file to be smaller. It is unknown how much smaller the war file will become. Those are the benefits. What are the costs or risks? There's @poikilotherm 's valuable time, of course, and he estimates that 87 Java files will be affected. That seems like a lot to me but he said he's willing to provide a list of what to test to QA. I guess we'll see how long that list before we can guess how much of a QA effort this will become. I'm not personally very enthusiastic about the benefits above (no new features or bug fixes) but if @poikilotherm wants to go ahead and make a pull request and summarize what features need to be tested, I'm happy to review it. |
Alright, my findings so far in develop@09fe94bd: Text occurences of Apache Commons Lang V3 find src/main/java src/test/java -name '*.java' -exec grep "commons\.lang3\." {} \; | sort | uniq -c | sort -n
1 import org.apache.commons.lang3.math.NumberUtils;
1 import org.apache.commons.lang3.StringUtils;
1 import org.apache.commons.lang3.text.WordUtils;
3 import org.apache.commons.lang3.StringUtils; So there is already bad code in place: usage of a transitive dependency... Text occurences of Apache Commons V2 $ find src/main/java src/test/java -name '*.java' -exec grep "commons\.lang\." {} \; | sort | uniq -c | sort -n
1 import org.apache.commons.lang.ArrayUtils;
1 import org.apache.commons.lang.builder.*;
1 import org.apache.commons.lang.builder.ToStringBuilder;
1 import org.apache.commons.lang.builder.ToStringStyle;
1 import org.apache.commons.lang.mutable.MutableBoolean;
1 import org.apache.commons.lang.mutable.MutableLong;
1 import org.apache.commons.lang.NotImplementedException;
1 import static org.apache.commons.lang.StringUtils.isEmpty;
2 import static org.apache.commons.lang.StringUtils.isNumeric;
4 import org.apache.commons.lang.RandomStringUtils;
8 import org.apache.commons.lang.StringEscapeUtils;
9 import org.apache.commons.lang.*;
56 import org.apache.commons.lang.StringUtils; Classes using Apache Commons Lang V2 Click to expand list of 81 files "using" it in source code...$ find src/main/java src/test/java -name '*.java' -exec grep -l "commons\.lang\." {} \; | sort -f | cut -f2,8- -d"/"
main/api/AbstractApiBean.java
main/api/datadeposit/SwordServiceBean.java
main/api/datadeposit/UrlManager.java
main/api/DatasetFieldServiceApi.java
main/api/Groups.java
main/api/HarvestingServer.java
main/api/imports/ImportDDIServiceBean.java
main/api/imports/ImportServiceBean.java
main/api/Search.java
main/authorization/AuthTestDataServiceBean.java
main/authorization/DataverseRolePermissionHelper.java
main/authorization/providers/builtin/DataverseUserPage.java
main/authorization/providers/builtin/PasswordEncryption.java
main/dataaccess/TabularSubsetGenerator.java
main/DataCitation.java
main/DataFileServiceBean.java
main/DataFileTag.java
main/DatasetField.java
main/DatasetFieldCompoundValue.java
main/DatasetFieldValidator.java
main/DatasetFieldValue.java
main/DatasetFieldValueValidator.java
main/DatasetPage.java
main/DatasetServiceBean.java
main/datasetutility/AddReplaceFileHelper.java
main/DatasetVersionDifference.java
main/DatasetVersionServiceBean.java
main/DataverseHeaderFragment.java
main/DataversePage.java
main/DOIDataCiteRegisterService.java
main/DvObjectServiceBean.java
main/EditDatafilesPage.java
main/engine/command/impl/ImportDatasetCommand.java
main/Guestbook.java
main/GuestbookPage.java
main/HandlenetServiceBean.java
main/harvest/client/HarvesterServiceBean.java
main/harvest/client/oai/OaiHandler.java
main/harvest/server/web/servlet/OAIServlet.java
main/HarvestingClientsPage.java
main/HarvestingSetsPage.java
main/ingest/IngestableDataChecker.java
main/ingest/tabulardata/impl/plugins/csv/CSVFileReader.java
main/ingest/tabulardata/impl/plugins/dta/DTAFileReader.java
main/ingest/tabulardata/impl/plugins/dta/NewDTAFileReader.java
main/ingest/tabulardata/impl/plugins/por/PORFileReader.java
main/ingest/tabulardata/impl/plugins/rdata/RDATAFileReader.java
main/ingest/tabulardata/impl/plugins/rdata/RTabFileParser.java
main/ingest/tabulardata/impl/plugins/sav/SAVFileReader.java
main/ingest/tabulardata/impl/plugins/xlsx/XLSXFileReader.java
main/ingest/tabulardata/InvalidData.java
main/MailServiceBean.java
main/ManageFilePermissionsPage.java
main/ManageGroupsPage.java
main/ManagePermissionsPage.java
main/mydata/DataRetrieverAPI.java
main/mydata/MyDataFilterParams.java
main/mydata/MyDataFinder.java
main/mydata/RoleTagRetriever.java
main/mydata/SolrQueryFormatter.java
main/NavigationWrapper.java
main/passwordreset/PasswordResetPage.java
main/RoleAssigneeServiceBean.java
main/RolePermissionFragment.java
main/rserve/RemoteDataFrameService.java
main/rserve/RJobRequest.java
main/rserve/VariableNameCheckerForR.java
main/search/AdvancedSearchPage.java
main/search/IndexServiceBean.java
main/search/SearchIncludeFragment.java
main/search/SearchUtil.java
main/ThemeWidgetFragment.java
main/UserServiceBean.java
main/util/FileSortFieldAndOrder.java
main/util/MarkupChecker.java
main/util/SumStatCalculator.java
main/validation/PasswordValidatorUtil.java
test/api/DatasetsIT.java
test/DataCitationTest.java
test/mydata/SolrQueryFormatterTest.java
test/validation/PasswordValidatorTest.java Static code analysis detects real usage $ jdeps -v target/classes | grep "commons\.lang\." | tr -s " " | cut -f4 -d" " | sort -f | uniq -c
3 org.apache.commons.lang.ArrayUtils
2 org.apache.commons.lang.builder.ToStringBuilder
2 org.apache.commons.lang.builder.ToStringStyle
1 org.apache.commons.lang.mutable.MutableBoolean
1 org.apache.commons.lang.mutable.MutableLong
1 org.apache.commons.lang.NotImplementedException
5 org.apache.commons.lang.RandomStringUtils
7 org.apache.commons.lang.StringEscapeUtils
66 org.apache.commons.lang.StringUtils Static code analysis shows which files are really using the library Click to view 79 classes using org.apache.commons.lang.*jdeps -v target/classes target/test-classes | grep "commons\.lang\." | tr -s " " | cut -f2 -d" " | sort -u | cut -f5- -d"."
api.AbstractApiBean
api.datadeposit.SwordServiceBean
api.datadeposit.UrlManager
api.DatasetFieldServiceApi
api.DatasetsIT
api.Groups
api.HarvestingServer
api.imports.ImportDDIServiceBean
api.imports.ImportServiceBean
api.Search
authorization.AuthTestDataServiceBean
authorization.DataverseRolePermissionHelper
authorization.providers.builtin.DataverseUserPage
authorization.providers.builtin.PasswordEncryption
dataaccess.TabularSubsetGenerator
DataCitation
DataCitationTest
DataFileServiceBean
DataFileTag
DatasetField
DatasetFieldCompoundValue
DatasetFieldValidator
DatasetFieldValue
DatasetFieldValueValidator
DatasetPage
DatasetServiceBean
datasetutility.AddReplaceFileHelper
DatasetVersionDifference
DatasetVersionServiceBean
DataverseHeaderFragment
DataversePage
DOIDataCiteRegisterService
DvObjectServiceBean
EditDatafilesPage
engine.command.impl.ImportDatasetCommand
GuestbookPage
HandlenetServiceBean
harvest.client.HarvesterServiceBean
harvest.client.oai.OaiHandler
HarvestingClientsPage
HarvestingSetsPage
harvest.server.web.servlet.OAIServlet
ingest.IngestableDataChecker
ingest.tabulardata.impl.plugins.csv.CSVFileReader
ingest.tabulardata.impl.plugins.dta.DTAFileReader
ingest.tabulardata.impl.plugins.dta.NewDTAFileReader
ingest.tabulardata.impl.plugins.por.PORFileReader
ingest.tabulardata.impl.plugins.rdata.RDATAFileReader
ingest.tabulardata.impl.plugins.rdata.RTabFileParser
ingest.tabulardata.impl.plugins.sav.SAVFileReader
ingest.tabulardata.impl.plugins.xlsx.XLSXFileReader
ingest.tabulardata.impl.plugins.xlsx.XLSXFileReader$SheetHandler
ingest.tabulardata.InvalidData
MailServiceBean
ManageGroupsPage
ManagePermissionsPage
mydata.DataRetrieverAPI
mydata.MyDataFilterParams
mydata.MyDataFinder
mydata.SolrQueryFormatter
mydata.SolrQueryFormatterTest$SolrQueryFormatterNoParamTest
NavigationWrapper
passwordreset.PasswordResetPage
RoleAssigneeServiceBean
RolePermissionFragment
rserve.RemoteDataFrameService
rserve.RJobRequest
rserve.VariableNameCheckerForR
search.AdvancedSearchPage
search.IndexServiceBean
search.SearchIncludeFragment
search.SearchUtil
ThemeWidgetFragment
UserServiceBean
util.FileSortFieldAndOrder
util.MarkupChecker
util.SumStatCalculator
validation.PasswordValidatorTest$Params
validation.PasswordValidatorUtil Unit test coverage for these classes Click to expand test coverage calculationfor I in `cat classes-oacl`
do
grep ",$I," target/site/jacoco/jacoco.csv | awk -F, '{ print $5/($4+$5)*100 " " $2 "." $3 }'
done | LC_ALL=C sort -n | sed -e "s/edu.harvard.iq.dataverse.//"
0 DOIDataCiteRegisterService
0 DatasetPage
0 DatasetVersionDifference
0 DataverseHeaderFragment
0 DataversePage
0 DvObjectServiceBean
0 EditDatafilesPage
0 GuestbookPage
0 HarvestingClientsPage
0 HarvestingSetsPage
0 ManageGroupsPage
0 ManagePermissionsPage
0 NavigationWrapper
0 RolePermissionFragment
0 ThemeWidgetFragment
0 api.DatasetFieldServiceApi
0 api.Groups
0 api.HarvestingServer
0 api.Search
0 api.datadeposit.SwordServiceBean
0 api.datadeposit.UrlManager
0 api.imports.ImportDDIServiceBean
0 api.imports.ImportServiceBean
0 authorization.AuthTestDataServiceBean
0 authorization.DataverseRolePermissionHelper
0 datasetutility.AddReplaceFileHelper
0 engine.command.impl.ImportDatasetCommand
0 harvest.client.HarvesterServiceBean
0 harvest.client.oai.OaiHandler
0 harvest.server.web.servlet.OAIServlet
0 ingest.tabulardata.InvalidData
0 ingest.tabulardata.impl.plugins.por.PORFileReader
0 ingest.tabulardata.impl.plugins.rdata.RDATAFileReader
0 ingest.tabulardata.impl.plugins.rdata.RTabFileParser
0 ingest.tabulardata.impl.plugins.xlsx.XLSXFileReader
0 mydata.DataRetrieverAPI
0 mydata.MyDataFilterParams
0 mydata.MyDataFinder
0 passwordreset.PasswordResetPage
0 rserve.RJobRequest
0 rserve.RemoteDataFrameService
0 rserve.VariableNameCheckerForR
0 search.AdvancedSearchPage
0 search.SearchIncludeFragment
0 util.SumStatCalculator
0.28393 MailServiceBean
0.397456 UserServiceBean
0.505433 search.IndexServiceBean
1.9802 HandlenetServiceBean
2.01845 DatasetServiceBean
3.03605 DatasetVersionServiceBean
3.55086 RoleAssigneeServiceBean
4.7619 authorization.providers.builtin.DataverseUserPage
8.22237 DataFileServiceBean
9.35412 api.AbstractApiBean
10.0428 dataaccess.TabularSubsetGenerator
13.1356 DatasetFieldCompoundValue
43.9465 DatasetField
45.696 ingest.tabulardata.impl.plugins.dta.DTAFileReader
46.2121 DataFileTag
49.2537 DatasetFieldValue
50.6716 ingest.tabulardata.impl.plugins.sav.SAVFileReader
50.8178 ingest.IngestableDataChecker
51.4436 search.SearchUtil
54.9595 DataCitation
65.2174 DatasetFieldValidator
73.1868 DatasetFieldValueValidator
85.3357 ingest.tabulardata.impl.plugins.dta.NewDTAFileReader
89.3822 ingest.tabulardata.impl.plugins.csv.CSVFileReader
91.4286 authorization.providers.builtin.PasswordEncryption
91.7208 validation.PasswordValidatorUtil
97.2477 util.MarkupChecker
97.4619 mydata.SolrQueryFormatter
100 util.FileSortFieldAndOrder |
…6070 Also switch from bulk imports of commons.lang3.* to single class style. Fixing for other parts like java.util, too.
IQSS#6070 Apache Commons Lang3 3.12.0 moved `StringEscapeUtils` to Apache Commons Text. Also, escapeHtml() and escapeXml() from Apache Commons Lang (v2) have been renamed to escapeHtml4() and escapeXml10() with v3.0
I just created a PR for this, as I want to use stuff from Lang3 (see PR). |
... to be safe. (IQSS#6070)
This is related to my other dependency housekeeping things. See #5288 and #5360.
While programming unit tests for #6000, I wanted to use
org.apache.commons.lang3.RandomStringUtils
. In Maven POM I saw that Dataverse is still onorg.apache.commons.lang
v2.6, which is dead since 2011, when 3.0 was released.See also: http://www.h-online.com/open/news/item/Apache-Commons-Lang-3-0-makes-a-break-with-the-past-1283761.html
Dataverse uses a lot of Commons Lang stuff. It should be refactored to use v3 of the lib, which gets included anyway, as it is a dep of
com.lyncode:xoai-common
. (seemvn dependency:tree
, grepping forcommons-lang
). Nothing else is relying on the v2 lib.Notes about upgrading from upstream project: https://commons.apache.org/proper/commons-lang/article3_0.html
The text was updated successfully, but these errors were encountered: