Git can't read huge files out of a pack #2065

Closed
@filcab

Description

  • I was not able to find an open or closed issue matching what I'm seeing

Setup

  • Which version of Git for Windows are you using? Is it 32-bit or 64-bit?
$ git --version --build-options

git version 2.20.1.windows.1
cpu: x86_64
built from commit: 7c9fbc07db0e2939b36095df45864b8cda19b64f
sizeof-long: 4
sizeof-size_t: 8
  • Which version of Windows are you running? Vista, 7, 8, 10? Is it 32-bit or 64-bit?
$ cmd.exe /c ver

Microsoft Windows [Version 10.0.17134.523]
  • What options did you set as part of the installation? Or did you choose the
    defaults?
# One of the following:
> type "C:\Program Files\Git\etc\install-options.txt"
> type "C:\Program Files (x86)\Git\etc\install-options.txt"
> type "%USERPROFILE%\AppData\Local\Programs\Git\etc\install-options.txt"
$ cat /etc/install-options.txt

Editor Option: VIM
Custom Editor Path:
Path Option: CmdTools
SSH Option: OpenSSH
CURL Option: OpenSSL
CRLF Option: CRLFCommitAsIs
Bash Terminal Option: MinTTY
Performance Tweaks FSCache: Enabled
Use Credential Manager: Enabled
Enable Symlinks: Enabled
  • Any other interesting things about your environment that might be related
    to the issue you're seeing?

Nope.

Details

  • Which terminal/shell are you running Git from? e.g. Bash/CMD/PowerShell/other

bash

$ mkdir git-repo

$ cd git-repo/

$ git init .
Initialized empty Git repository in .../git-repo/.git/

$ dd if=/dev/zero of=test4G bs=4M count=1024
1024+0 records in
1024+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 2.5895 s, 1.7 GB/s

$ git add test4G

$ git commit -m hello
[master (root-commit) 636cdb5] hello
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 test4G

$ git ls-tree $(git log -1 --pretty="%T" HEAD)
100644 blob 451971a31ea5a207a10b391df2d5949910133565    test4G

$ git show 451971a31ea5a207a10b391df2d5949910133565 | wc -c
error: bad object header
fatal: packed object 451971a31ea5a207a10b391df2d5949910133565 (stored in .git/objects/pack/pack-43e2c696ce675c3ed09d82deeed262b870b6f27b.pack) is corrupt
0
  • What did you expect to occur after running these commands?

No errors when running git show $hash

  • What actually happened instead?

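git show fails with "error: bad object header" and reports the packed object as corrupt, so nothing is written to the pipe (wc -c prints 0).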

  • If the problem was occurring with a specific repository, can you provide the
    URL to that repository to help us with testing?

No need.

  • Additional notes:
    Opening the same folder with WSL's git (git version 2.17.1) works as expected. It looks like the problem occurs only when reading the packfile, not when writing it; the packfile itself is not corrupt:
$ git show 451971a31ea5a207a10b391df2d5949910133565 | wc -c
4294967296

This seems to be related to code like (in packfile.c):

int unpack_object_header(struct packed_git *p,
			 struct pack_window **w_curs,
			 off_t *curpos,
			 unsigned long *sizep)
{
	unsigned char *base;
	unsigned long left;
	unsigned long used;
	enum object_type type;

	/* use_pack() assures us we have [base, base + 20) available
	 * as a range that we can look at.  (Its actually the hash
	 * size that is assured.)  With our object header encoding
	 * the maximum deflated object size is 2^137, which is just
	 * insane, so we know won't exceed what we have been given.
	 */
	base = use_pack(p, w_curs, *curpos, &left);
	used = unpack_object_header_buffer(base, left, &type, sizep);
	if (!used) {
		type = OBJ_BAD;
	} else
		*curpos += used;

	return type;
}

curpos is off_t, which should be OK. But used is only an unsigned long, which is 32-bit on 64-bit Windows (https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models). The same is true of left and of the object size returned through sizep, which matches the "sizeof-long: 4" reported above.
Changing left and used to be off_t (and then changing all callees) should fix this. Would it be acceptable to add a test that generates a 4 GiB file when it runs?
I can try to fix this if I manage to get a build of git working on Windows. Should I then submit the patch to this project or to the main git project? (Other platforms might have this problem, but I'm not aware of any. Still, it's probably more correct to use off_t in this code.)

Thank you,
Filipe
