error when comment character contained within CSV data #325
Description
Summary
Hi 👋 thanks for the great lib!
We are using the option { "comment": "#" } to remove a header section from the CSV file. That header consists of multiple lines beginning with '#' (as per bash syntax).
Motivation
The issue we face is that the hash (#) character may also appear as valid data within the body of some rows, which results in a fatal columns mismatch error.
For example:
# comment
# comment
col1,col2,col3
a,b,c
a,###,c
Alternative
My understanding of the documentation ("Treat all the characters after this one as a comment") is that both infix and prefix matching are currently supported, which makes sense for lines like this: a,b,c # this is a comment
In my case at least I was caught out by this because I assumed that the match was prefix only. I was expecting it to apply only to lines which begin with the comment string (as per bash).
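To make the difference concrete, here is a small model of the two semantics applied to the example above. This is only an illustration of my understanding, not csv-parse's actual implementation; `stripComment` and `fieldCounts` are hypothetical helpers, and the naive `split(',')` ignores quoting.

```javascript
// Illustration only: a model of infix vs prefix comment matching.
// This is NOT how csv-parse is implemented internally.
function stripComment(line, marker, mode) {
  if (mode === 'prefix') {
    // Prefix: drop the line only when it starts with the marker.
    return line.startsWith(marker) ? null : line;
  }
  // Infix: discard everything from the first marker onward.
  const idx = line.indexOf(marker);
  if (idx === -1) return line;
  const kept = line.slice(0, idx);
  return kept.length === 0 ? null : kept;
}

// Count comma-separated fields per surviving line (naive, no quote handling).
function fieldCounts(input, mode) {
  return input
    .split('\n')
    .map((line) => stripComment(line, '#', mode))
    .filter((line) => line !== null)
    .map((line) => line.split(',').length);
}

const sample = [
  '# comment',
  '# comment',
  'col1,col2,col3',
  'a,b,c',
  'a,###,c',
].join('\n');

console.log(fieldCounts(sample, 'infix'));  // [ 3, 3, 2 ]
console.log(fieldCounts(sample, 'prefix')); // [ 3, 3, 3 ]
```

Under the infix model the last row is cut down to `a,` and ends up with two fields instead of three, which matches the mismatch we see.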
Draft
What I'd love to have is the ability to control whether this is applied as an infix match or only as a prefix.
For example, if I were able to supply a regular expression I could use ^# to 'anchor' the string at the beginning of the row.
Additional context
We're using the stream API. I wasn't able to find the exact places in the code where this is implemented, but presumably it is handled in a streaming fashion and so may or may not have access to the newline, depending on where in the parser it is implemented.
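In the meantime, a workaround on our side could be to drop the '#' lines before the data reaches the parser. Below is a rough sketch of that idea using only Node's built-in modules. `createLineFilter` and `prefixCommentFilter` are hypothetical names of my own, not part of csv-parse, and the sketch assumes UTF-8 input and that no quoted field contains a newline followed by '#'.

```javascript
const { Transform } = require('node:stream');
const { StringDecoder } = require('node:string_decoder');

// Hypothetical helper (not part of csv-parse): a stateful line filter that
// drops lines starting with `marker` and passes every other line through.
function createLineFilter(marker) {
  let pending = ''; // trailing partial line carried over between chunks

  return {
    // Accept a chunk of text and return the complete non-comment lines in it.
    push(text) {
      const lines = (pending + text).split('\n');
      pending = lines.pop(); // last element is an incomplete line (or '')
      return lines
        .filter((line) => !line.startsWith(marker))
        .map((line) => line + '\n')
        .join('');
    },
    // Return whatever remains when the input ends without a final newline.
    flush() {
      const last = pending;
      pending = '';
      return last.startsWith(marker) ? '' : last;
    },
  };
}

// Wrap the filter in a Transform so it can sit in front of a stream parser.
function prefixCommentFilter(marker = '#') {
  const filter = createLineFilter(marker);
  const decoder = new StringDecoder('utf8'); // keeps multi-byte chars intact

  return new Transform({
    transform(chunk, _encoding, callback) {
      const out = filter.push(decoder.write(chunk));
      if (out) this.push(out);
      callback();
    },
    flush(callback) {
      const out = filter.push(decoder.end()) + filter.flush();
      if (out) this.push(out);
      callback();
    },
  });
}
```

The idea would be to pipe the file through `prefixCommentFilter('#')` and then into the parser with the comment option left unset, so a row like a,###,c reaches the parser untouched. Buffering the partial line between chunks is what gives the filter access to the newline regardless of where a chunk boundary falls.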
If you'd like to point me to the relevant places in the code, I might be able to draft a PR, although we'd need to discuss how best to change the JS API to allow users to configure whether infix matching is enabled.