Skip to content

Commit

Permalink
Fix get_file_size behavior inconsistency for folders
Browse files Browse the repository at this point in the history
Different OSes have different behavior when trying to fopen/fseek/ftell
a folder. On Linux, some systems return 0 size, some systems return an
error, and some systems return LONG_MAX. LONG_MAX is particularly
problematic because that causes spurious OOMs under address sanitizer.

Using fstat manually cleans this up, however it introduces a new
dependency on platform specific headers that we didn't have before, and
also has unclear behavior on 64-bit systems wrt 32-bit sizes which will
need to be tested further as I'm not certain if the behavior needs to be
special-cased only for MSVC/MinGW, which are currently not handled by
this path (unless MinGW defines __unix__...)
  • Loading branch information
zeux committed Apr 15, 2023
1 parent e383ce5 commit d3199a0
Show file tree
Hide file tree
Showing 2 changed files with 17 additions and 2 deletions.
17 changes: 16 additions & 1 deletion src/pugixml.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,11 @@
// For placement new
#include <new>

// For load_file
#if defined(__unix__) || defined(__APPLE__)
#include <sys/stat.h>
#endif

#ifdef _MSC_VER
# pragma warning(push)
# pragma warning(disable: 4127) // conditional expression is constant
Expand Down Expand Up @@ -4759,7 +4764,17 @@ PUGI_IMPL_NS_BEGIN
// we need to get length of entire file to load it in memory; the only (relatively) sane way to do it is via seek/tell trick
PUGI_IMPL_FN xml_parse_status get_file_size(FILE* file, size_t& out_result)
{
#if defined(PUGI_IMPL_MSVC_CRT_VERSION) && PUGI_IMPL_MSVC_CRT_VERSION >= 1400
#if defined(__unix__) || defined(__APPLE__)
// this simultaneously retrieves the file size and file mode (to guard against loading non-files)
struct stat st;
if (fstat(fileno(file), &st) != 0) return status_io_error;

// anything that's not a regular file doesn't have a coherent length
if (!S_ISREG(st.st_mode)) return status_io_error;

typedef off_t length_type;
length_type length = st.st_size;
#elif defined(PUGI_IMPL_MSVC_CRT_VERSION) && PUGI_IMPL_MSVC_CRT_VERSION >= 1400
// there are 64-bit versions of fseek/ftell, let's use them
typedef __int64 length_type;

Expand Down
2 changes: 1 addition & 1 deletion tests/test_document.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -589,7 +589,7 @@ TEST(document_load_file_wide_out_of_memory)
CHECK(result.status == status_out_of_memory || result.status == status_file_not_found);
}

#if defined(__APPLE__)
#if defined(__unix__) || defined(__APPLE__)
TEST(document_load_file_special_folder)
{
xml_document doc;
Expand Down

0 comments on commit d3199a0

Please sign in to comment.