global: clean up duplicate table DOIs in production instance #790
Open
Description
When reindexing the QA instance after deploying PR #766, some of the records gave an exception:

```
sqlalchemy.exc.MultipleResultsFound: Multiple rows were found when exactly one was required
```

from the line:

I have changed this line in commit 319ff15 to make it tolerate multiple results. However, it should be investigated in more detail why there are multiple `DataSubmission` objects with the same `doi`. I found 6 examples:
- https://www.hepdata.net/record/78551 (`10.17182/hepdata.78551.v1/t3` appears twice)
- https://www.hepdata.net/record/77606 (`10.17182/hepdata.77606.v1/t54` appears twice)
- https://www.hepdata.net/record/78402 (`10.17182/hepdata.78402.v1/t29` appears twice)
- https://www.hepdata.net/record/80608 (`10.17182/hepdata.80608.v1/t14` appears twice)
- https://www.hepdata.net/record/77761 (`10.17182/hepdata.77761.v1/t3` appears twice)
- https://www.hepdata.net/record/76842 (`10.17182/hepdata.76842.v1/t3` appears twice)
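The failure mode can be sketched in SQLAlchemy: `Query.one()` raises `MultipleResultsFound` when two rows share a `doi`, while `Query.first()` tolerates duplicates. This is a minimal sketch, not the actual HEPData model — the column names and the in-memory SQLite setup are assumptions for illustration.

```python
# Minimal sketch of the reindexing failure: two DataSubmission rows with the
# same doi make .one() raise, while .first() returns the first match.
# The model here is a stand-in, not the real HEPData schema.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.exc import MultipleResultsFound
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class DataSubmission(Base):
    __tablename__ = "datasubmission"
    id = Column(Integer, primary_key=True)
    doi = Column(String)

engine = create_engine("sqlite://")  # throwaway in-memory database
Base.metadata.create_all(engine)

caught = False
with Session(engine) as session:
    # Insert a duplicated table DOI, as found on the six affected records.
    session.add_all([
        DataSubmission(doi="10.17182/hepdata.78551.v1/t3"),
        DataSubmission(doi="10.17182/hepdata.78551.v1/t3"),
    ])
    session.commit()

    query = session.query(DataSubmission).filter_by(
        doi="10.17182/hepdata.78551.v1/t3")
    try:
        query.one()  # raises: two rows match where exactly one was required
    except MultipleResultsFound:
        caught = True
    row_doi = query.first().doi  # tolerant: returns the first matching row

print("one() raised MultipleResultsFound:", caught)
print("first() returned:", row_doi)
```

Switching `.one()` to `.first()` (as in commit 319ff15) masks the symptom during reindexing, but the duplicate rows remain in the database.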
These all date from the early days of hepdata.net in 2017/2018, when the submission code was buggy and the procedure for replacing uploads was not carried out cleanly. It should be investigated how to clean up the database to remove the duplicate DOIs.
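As a first step in the cleanup, a `GROUP BY ... HAVING` query can enumerate every DOI held by more than one row, rather than relying on the six examples found during reindexing. A sketch using the standard-library `sqlite3` module on a toy copy of the data — the table and column names are assumed from the model name and may differ from the real schema:

```python
# Sketch of a duplicate-DOI audit query, demonstrated on an in-memory SQLite
# stand-in for the real database. Table/column names are assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datasubmission (id INTEGER PRIMARY KEY, doi TEXT)")
conn.executemany(
    "INSERT INTO datasubmission (doi) VALUES (?)",
    [
        ("10.17182/hepdata.78551.v1/t3",),
        ("10.17182/hepdata.78551.v1/t3",),  # duplicate from a messy re-upload
        ("10.17182/hepdata.78551.v1/t4",),
    ],
)

# List each DOI that appears on more than one DataSubmission row.
duplicates = conn.execute(
    "SELECT doi, COUNT(*) AS n FROM datasubmission"
    " GROUP BY doi HAVING COUNT(*) > 1"
).fetchall()
print(duplicates)  # → [('10.17182/hepdata.78551.v1/t3', 2)]
```

Once the affected rows are enumerated, deciding which of each pair to delete (and whether any dependent rows reference them) still needs manual investigation per record.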