-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better handling of duplicate headers when parsing chunks #1017
base: master
Are you sure you want to change the base?
Conversation
…ate-header More fix cursor for duplicate header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your PR. I've added some comments.
It is not clear for me what the new test is doing. IT will be great if you can explain it.
} | ||
}, | ||
complete: function() { | ||
assert(startsWithEtiamOrLorem); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure which is the objective of such test.
Could add more assertions? Can we just test the stepped variable to ensure it has steeped correctly?
@@ -164,6 +164,46 @@ describe('PapaParse', function() { | |||
}); | |||
}); | |||
|
|||
it('Checks cursor when file is large and has duplicate headers', function(done) { | |||
this.timeout(30000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the timeout required? I think it should be removed.
@@ -1508,7 +1508,17 @@ License: MIT | |||
if (duplicateHeaders) { | |||
var editedInput = input.split(newline); | |||
editedInput[0] = Array.from(headerMap).join(delim); | |||
// If we change the size of the input due to duplicate headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make the comment smaller?
What about: store the number of removed chars to correctly report meta.cursor
?
@@ -1517,12 +1527,7 @@ License: MIT | |||
for (var i = 0; i < rows.length; i++) | |||
{ | |||
row = rows[i]; | |||
// use firstline as row length may be changed due to duplicated headers | |||
if (i === 0 && firstLine !== undefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As firstline is no longer used, probably it can be removed as variable
The previous fix for cursor management when there are duplicate headers worked for the faseMode parse path, but not for the slow mode parse path (that handles quotes).
This change is more general and works for both parse paths.
I also spotted a bug in handling quotes at the end of the file when the headers change.
Added new tests for both of these issues.