Respect Exif orientation in LaTeX, Docx, ODT output #10386

Open. Wants to merge 5 commits into base: main

Contributor

@silby silby commented Nov 14, 2024

This branch uses EXIF orientation metadata in image files to affect the output of images in Docx, ODT, and LaTeX writers. Addresses #10311 for these three writers only.
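
For orientation, here is a minimal sketch of how the eight standard EXIF orientation values can be decoded into a mirror-then-rotate description. The names `Rotation`, `Transform`, `fromExifOrientation`, and `swapsDimensions` are hypothetical and are not taken from this branch; the `ImageTransform` type in the PR may be shaped differently.

```haskell
-- Hypothetical types for illustration; not the PR's ImageTransform.
data Rotation = R0 | R90 | R180 | R270
  deriving (Show, Eq)

-- A transform is: optionally mirror horizontally, then rotate clockwise.
data Transform = Transform
  { tMirror :: Bool      -- mirror horizontally first
  , tRotate :: Rotation  -- then rotate clockwise by this amount
  } deriving (Show, Eq)

-- Decode the EXIF Orientation tag (values 1..8).
-- Unknown values fall back to "no transform".
fromExifOrientation :: Int -> Transform
fromExifOrientation n = case n of
  2 -> Transform True  R0
  3 -> Transform False R180
  4 -> Transform True  R180  -- vertical mirror = horizontal mirror + 180
  5 -> Transform True  R270
  6 -> Transform False R90
  7 -> Transform True  R90
  8 -> Transform False R270
  _ -> Transform False R0    -- 1 and anything unrecognized

-- Quarter turns exchange width and height, which matters for layout.
swapsDimensions :: Transform -> Bool
swapsDimensions t = tRotate t `elem` [R90, R270]
```

Values 5 through 8 are the ones that swap width and height, which is where the per-writer sizing problems described below come from.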

Having worked on this for a few days I am a bit dubious that this is in scope for Pandoc. On the one hand, it's probably somewhat common for users to encounter images with a flipped/rotated orientation, given the behavior of smartphone cameras and so on, and common to not know what's going on or how to fix it when you generate a PDF with Pandoc. On the other hand, the problem has to be addressed in a specific way for each output format and the results can be questionable.

The LaTeX case is particularly bad here, I think. For one thing, it introduces a need to read the image bytes in the LaTeX writer where it was previously unnecessary, and so I'm just ignoring all fetchItem exceptions and defaulting to no rotation. There's also no necessary relationship between the environment where pandoc is being run and the environment where LaTeX is being run: if someone has a rotated foo.jpg in their working directory when they run pandoc, then they execute LaTeX in a directory with an unrotated foo.jpg, have we done the right thing or not?
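
The "ignore fetch failures and default to no rotation" behavior can be sketched as follows. This is a simplification in plain `IO`, not the writer's actual code: `fetch` stands in for pandoc's `fetchItem`, `parse` stands in for whatever reads the EXIF tag, and both parameter names (as well as `orientationOrDefault` and `tryAny`) are hypothetical.

```haskell
import Control.Exception (SomeException, try)
import Data.Maybe (fromMaybe)

-- 'try' specialized to catch any exception.
tryAny :: IO a -> IO (Either SomeException a)
tryAny = try

-- EXIF orientation 1 means "display as stored".
defaultOrientation :: Int
defaultOrientation = 1

-- Run the (hypothetical) fetch action; if it throws, or if the bytes
-- carry no usable orientation, fall back to the default.
orientationOrDefault :: IO bytes -> (bytes -> Maybe Int) -> IO Int
orientationOrDefault fetch parse = do
  result <- tryAny fetch
  pure $ case result of
    Left _      -> defaultOrientation
    Right bytes -> fromMaybe defaultOrientation (parse bytes)
```

The fallback keeps the writer's output identical to the pre-PR behavior whenever the image is unavailable, at the cost of silently skipping the rotation.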

In the ODT case, there's no good-looking solution available without actually rotating the pixels of the image data. See the commit message for cc73c01.

I think the approach here is more or less reasonable but I don't wholeheartedly endorse it.

Incidentally, typst natively respects EXIF orientation, seemingly by actually rotating and flipping the pixels, see here

@silby silby force-pushed the exif branch 4 times, most recently from c00b4d1 to 72f7da0 Compare November 20, 2024 01:21
@silby silby changed the title wip: respect Exif orientation Respect Exif orientation in LaTeX, Docx, ODT output Nov 20, 2024
@silby silby marked this pull request as ready for review November 20, 2024 01:55
@silby
Contributor Author

silby commented Nov 22, 2024

libreoffice bug for the image rotation thing

Transforms images based on exif metadata
The desired size of the final image in the docx output wants to be based
on the rotated dimensions, not the original dimensions, but then we have
to set the extents of the image object based on the original aspect
ratio, so we have to swap those dimensions back.
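
A rough sketch of the swap-fit-swap idea described above, assuming the only constraint is a maximum display width. The functions `fitWidth` and `rotatedExtents` are hypothetical illustrations and are not the PR's `rotatedDesiredSizeInPoints`.

```haskell
import Data.Tuple (swap)

-- Scale a (width, height) pair down to a maximum width,
-- preserving the aspect ratio. Sizes that already fit are unchanged.
fitWidth :: Double -> (Double, Double) -> (Double, Double)
fitWidth maxW (w, h)
  | w > maxW  = (maxW, h * maxW / w)
  | otherwise = (w, h)

-- For a quarter-rotated image: fit using the rotated (displayed)
-- dimensions, then swap back so the extents describe the image
-- object in its original, unrotated aspect ratio.
rotatedExtents :: Bool -> Double -> (Double, Double) -> (Double, Double)
rotatedExtents quarterTurn maxW size
  | quarterTurn = swap (fitWidth maxW (swap size))
  | otherwise   = fitWidth maxW size
```

For a 600 by 400 image with a maximum width of 300, the unrotated case yields extents of 300 by 200, while the quarter-rotated case yields 450 by 300, which displays as 300 wide and 450 tall once rotated.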

An earlier iteration of this diff added the swapping to imageSize, but I
thought better of it. rotatedDesiredSizeInPoints is a new function
because T.P.ImageSize is in the public API and I didn't want to create
an API break.

Extensive experimentation with LibreOffice 24.8.3.2 on Mac didn't come
up with a way to create an ODT with an image that is displayed
uncropped, at the correct aspect ratio, with the correct size of
bounding box, after rotating 90° or 270°, while anchored "as character".

As of this commit, if the EXIF orientation says an image needs to be
quarter-rotated, the chosen solution uses a bounding box with the
original, unrotated dimensions of the image, which displays the image
correctly rotated and with the correct aspect ratio, but cropped. It is
unknown to this author whether this looks correct in software other
than LibreOffice.

Pandoc doesn't currently pull in any dependencies capable of rotating
the actual pixels in image data. Document authors needing to mitigate
this issue will have to edit their images themselves.

Ordinarily the LaTeX writer doesn't really need to get the bytes of the
images it's including, but we have to check them to get an EXIF
orientation. If the media fetch fails we're just punting and assuming a
default orientation.

It would be slightly dubious to bother doing this at all but LaTeX is
the default format for PDF output so it's worth making the effort.
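
One way a rotation could be expressed on the LaTeX side is via the `angle` key of graphicx's `\includegraphics`, which rotates counter-clockwise. The sketch below is an assumption about a possible rendering, not what this branch emits; `latexImage` is a hypothetical helper, it ignores mirroring, and it does no escaping of the file path.

```haskell
-- Render an \includegraphics command for an image that must be
-- rotated clockwise by the given number of degrees.
-- graphicx's 'angle' is counter-clockwise, so convert first.
latexImage :: Int -> FilePath -> String
latexImage cwDegrees path =
  "\\includegraphics" ++ opts ++ "{" ++ path ++ "}"
  where
    ccw = (360 - cwDegrees) `mod` 360
    opts
      | ccw == 0  = ""                          -- no rotation needed
      | otherwise = "[angle=" ++ show ccw ++ "]"
```

Because the angle is baked into the generated source, the result is only right if the file LaTeX later reads is the same one pandoc inspected, which is the environment mismatch raised in the PR description.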
@@ -451,3 +495,32 @@ webpSize opts img =
case AW.parseOnly pWebpSize img of
Left _ -> Nothing
Right sz -> Just sz { dpiX = fromIntegral $ writerDpi opts, dpiY = fromIntegral $ writerDpi opts}

imageTransform :: ByteString -> ImageTransform
Owner

imageTransform seems too general a name for a function that just operates on exif.
Also, all exported functions and types should have Haddock comments.
