Way to differentiate a file that ends without a newline vs a line that was truncated to the buffer size #3

dimo414 · 2022-10-03T05:02:23Z

When next_batch() returns a slice that doesn't end in the delimiter, there's no easy way to tell whether that's because there's nothing more to read or because the line is larger than the buffer.

You can call next_*() once more to see if it returns None, but you need to copy the previously returned line(s) before you can do so. This is tedious to get right (e.g. bstr does something like this).
You can configure the reader's capacity explicitly (since there isn't a capacity() method on LineReader) and then check whether the returned slice is the same size as that capacity. This is roundabout and can still produce false positives.

It would be great if it was apparent from the API whether the returned slice was incomplete or not, such as by returning an error or a different type that contained this bit.
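
To make the "different type" idea concrete, here is a rough sketch of what I have in mind. It uses only std and is not LineReader's actual implementation; `Line`, `FixedLineReader`, and their methods are hypothetical names made up for illustration, and the delimiter is hard-coded to `\n`.

```rust
use std::io::{self, Read};

/// Hypothetical return type: records why the slice stopped where it did.
#[derive(Debug, PartialEq, Eq)]
pub enum Line<'a> {
    /// The slice ends with the delimiter.
    Complete(&'a [u8]),
    /// The buffer filled up before a delimiter was found; more of this line follows.
    Truncated(&'a [u8]),
    /// The input ended without a trailing delimiter.
    Unterminated(&'a [u8]),
}

/// Hypothetical fixed-capacity line reader (an assumption, not the crate's code).
pub struct FixedLineReader<R> {
    inner: R,
    buf: Vec<u8>,
    start: usize, // first unconsumed byte
    end: usize,   // one past the last valid byte
    eof: bool,
}

impl<R: Read> FixedLineReader<R> {
    pub fn with_capacity(capacity: usize, inner: R) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { inner, buf: vec![0; capacity], start: 0, end: 0, eof: false }
    }

    pub fn next_line(&mut self) -> Option<io::Result<Line<'_>>> {
        loop {
            // Look for a delimiter among the unconsumed bytes.
            let found = self.buf[self.start..self.end]
                .iter()
                .position(|&b| b == b'\n');
            if let Some(pos) = found {
                let (s, e) = (self.start, self.start + pos + 1);
                self.start = e;
                return Some(Ok(Line::Complete(&self.buf[s..e])));
            }
            if self.eof {
                if self.start == self.end {
                    return None;
                }
                let (s, e) = (self.start, self.end);
                self.start = e;
                return Some(Ok(Line::Unterminated(&self.buf[s..e])));
            }
            // No delimiter yet: move the unconsumed bytes to the front to make room.
            if self.start > 0 {
                self.buf.copy_within(self.start..self.end, 0);
                self.end -= self.start;
                self.start = 0;
            }
            if self.end == self.buf.len() {
                // Buffer is full and holds no delimiter: hand back a truncated piece.
                let e = self.end;
                self.start = 0;
                self.end = 0;
                return Some(Ok(Line::Truncated(&self.buf[..e])));
            }
            match self.inner.read(&mut self.buf[self.end..]) {
                Ok(0) => self.eof = true,
                Ok(n) => self.end += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => {}
                Err(e) => return Some(Err(e)),
            }
        }
    }
}
```

With something like this the caller can match on the variant instead of guessing from the slice length, and a line that exactly fills the buffer but ends in the delimiter is still reported as `Complete`.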

Taking this a step further, would it be feasible/welcome to eliminate this limitation of LineReader (possibly as optional behavior)? For a caller that wants to support arbitrarily long lines there's really no option other than allocating enough memory to fit the whole line, so it seems like LineReader could just do this for the caller by resizing its buffer in response to overly-long lines.
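
For comparison, the growing-buffer behavior I'm describing is roughly what std's `BufRead::read_until` already does when it appends to a `Vec<u8>`. A minimal sketch, assuming a `\n` delimiter; `read_whole_line` is a hypothetical helper name, not something LineReader provides:

```rust
use std::io::{self, BufRead};

/// Reads the next line, however long, into `line`, which grows as needed.
///
/// Returns `Ok(None)` at end of input. Otherwise returns `Ok(Some(terminated))`,
/// where `terminated` is false only for a final line with no trailing delimiter.
pub fn read_whole_line<R: BufRead>(
    reader: &mut R,
    line: &mut Vec<u8>,
) -> io::Result<Option<bool>> {
    line.clear();
    // read_until appends to `line` until the delimiter or end of input,
    // regardless of the reader's internal buffer size.
    let n = reader.read_until(b'\n', line)?;
    if n == 0 {
        return Ok(None);
    }
    Ok(Some(line.ends_with(b"\n")))
}
```

The trade-off is that this copies every line into the caller's `Vec` and scans one line at a time, which gives up the batching that makes LineReader attractive, so resizing would probably be best as an opt-in fallback for over-long lines only.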

I might be able to contribute some of these changes if there's interest.
