Add (limited) support for streamed reading of large spreadsheets #296

Closed
cfsimplicity opened this issue Aug 8, 2022 · 0 comments

@cfsimplicity
Owner

When POI reads a spreadsheet, it loads the entire workbook into memory to allow random access to specific rows. Reading large files can therefore lead to out-of-memory errors.

POI does have a streaming API for reading XLSX but it is complex to use.

However, a third-party POI wrapper, excel-streaming-reader, now exists which makes this much easier.

Integrate this library and use it to support a new readLargeFile() method that can read large XLSX files with a lower risk of running out of memory.
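To illustrate why streaming helps, here is a minimal sketch of row-at-a-time reading using only the Java standard library. It is not how POI or excel-streaming-reader are implemented, and it is not the proposed readLargeFile() code. The class and method names (`StreamingRowsSketch`, `streamRows`, `tinyXlsx`) are hypothetical, the sheet path `xl/worksheets/sheet1.xml` is assumed, and only inline strings and raw `<v>` values are handled (shared strings are not resolved).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StreamingRowsSketch {

    // Walks one worksheet part of an XLSX (a zip of XML files) with a pull parser.
    // Each row is handed to the handler as soon as its closing tag is seen, so the
    // sketch itself only ever holds the current row. Returns the number of rows read.
    public static int streamRows(InputStream xlsx, String sheetPath,
                                 Consumer<List<String>> handler) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTDs needed here
        int rowCount = 0;
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals(sheetPath)) continue; // skip other parts
                XMLStreamReader xml = factory.createXMLStreamReader(zip);
                List<String> row = null;
                while (xml.hasNext()) {
                    int event = xml.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        String name = xml.getLocalName();
                        if (name.equals("row")) {
                            row = new ArrayList<>();
                        } else if (row != null && (name.equals("t") || name.equals("v"))) {
                            // <t> = inline string text, <v> = raw cell value
                            row.add(xml.getElementText());
                        }
                    } else if (event == XMLStreamConstants.END_ELEMENT
                            && xml.getLocalName().equals("row")) {
                        handler.accept(row);
                        row = null;
                        rowCount++;
                    }
                }
                break; // only the requested sheet is read
            }
        }
        return rowCount;
    }

    // Builds a tiny stand-in archive containing just one worksheet part.
    // It is not a complete, openable XLSX; it exists only to exercise streamRows.
    public static byte[] tinyXlsx() throws Exception {
        String sheet =
            "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">"
          + "<sheetData>"
          + "<row r=\"1\"><c r=\"A1\" t=\"inlineStr\"><is><t>name</t></is></c>"
          + "<c r=\"B1\" t=\"inlineStr\"><is><t>qty</t></is></c></row>"
          + "<row r=\"2\"><c r=\"A2\" t=\"inlineStr\"><is><t>apple</t></is></c>"
          + "<c r=\"B2\"><v>3</v></c></row>"
          + "</sheetData></worksheet>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("xl/worksheets/sheet1.xml"));
            zip.write(sheet.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        int rows = streamRows(new ByteArrayInputStream(tinyXlsx()),
                "xl/worksheets/sheet1.xml", row -> System.out.println(row));
        System.out.println("rows read: " + rows);
    }
}
```

Because rows are consumed in document order and then dropped, random access to an arbitrary row is not available in this style of reading, which is the trade-off behind the limitations listed below.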

Limitations

  • Only works with XLSX files
  • Only works with Lucee (ACF throws an exception for reasons I have been unable to fathom)
  • Can only return a query object, CSV or HTML (not a workbook object, because too many of the expected POI methods are unsupported by the wrapper)
  • Doesn't support all read() options, in particular specifying rows or columns (i.e. it reads the entire sheet)
@cfsimplicity cfsimplicity added this to the 3.5.0 milestone Aug 8, 2022
@cfsimplicity cfsimplicity self-assigned this Aug 8, 2022