Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support specifying output channels in io.image.read_image #2988

Merged
merged 16 commits into from
Nov 18, 2020

Conversation

datumbox
Copy link
Contributor

@datumbox datumbox commented Nov 11, 2020

Fixes #2948

I add a channels parameter that allows the users to specify the number of output channels while reading an image. The default value is 0 which leaves the image as-is and ensures the change is BC. The following public API methods of the torchvision.io.image package were updated:

  • decode_png(input, channels=0)
  • decode_jpeg(input, channels=0)
  • decode_image(input, channels=0)
  • read_image(path, channels=0)

There is a small update on the originally proposed pitch because I added support for grayscale transparency and handling for palette images. Here are the supported values:

  • channels=0 - leave as original (grayscale, palette, grayscale with alpha, rgb, rgb with alpha, CMYK etc)
  • channels=1 - Grayscale
  • channels=2 - Grayscale with Alpha (PNG only, not valid for JPEG)
  • channels=3 - RGB
  • channels=4 - RGB with Alpha (PNG only, not valid for JPEG)

The PR adds 3 JPEG assets with total size 7kb. These are used to test the supported conversions. It also removes a 900kb asset file which is no longer needed. The assets were produced using the following snippet:

from PIL import Image

# manually downloaded from https://pytorch.org/assets/images/pytorch-logo.png
original = 'pytorch-logo.png'
with Image.open(original) as img:
    img = img.convert("RGBA")
    img = img.resize((100, 100)) 
    img.convert("L").save("gray_pytorch.jpg")
    img.convert("RGB").save("rgb_pytorch.jpg")
    img.convert("CMYK").save("cmyk_pytorch.jpg")

Copy link
Contributor Author

@datumbox datumbox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments to explain parts of the implementation.`

test/test_cpp_models.py Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
test/test_image.py Outdated Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
torchvision/io/image.py Outdated Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
torchvision/csrc/cpu/image/readpng_cpu.cpp Show resolved Hide resolved
torchvision/io/image.py Show resolved Hide resolved
@datumbox
Copy link
Contributor Author

The failing tests on Travis are not related to this PR. I would like to rebase to master once #2985 is merged to ensure all tests still pass.

@datumbox datumbox changed the title [WIP] Support specifying output channels in io.image.read_image Support specifying output channels in io.image.read_image Nov 12, 2020
@datumbox datumbox requested a review from fmassa November 12, 2020 10:20
@codecov
Copy link

codecov bot commented Nov 12, 2020

Codecov Report

Merging #2988 (161dbce) into master (80f41f8) will decrease coverage by 0.02%.
The diff coverage is 75.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2988      +/-   ##
==========================================
- Coverage   73.39%   73.37%   -0.03%     
==========================================
  Files          99       99              
  Lines        8825     8825              
  Branches     1391     1391              
==========================================
- Hits         6477     6475       -2     
- Misses       1929     1931       +2     
  Partials      419      419              
Impacted Files Coverage Δ
torchvision/io/image.py 79.03% <75.00%> (-3.23%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 80f41f8...ada56a2. Read the comment docs.

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR, it looks great!

I would really prefer if we could unify the name of the arguments across jpeg / png, this way it is clear to the user that they both represent the same thing.

About channels=1 keeping the image as a palette type, although it makes sense I wonder if the expected behavior would be instead to convert it to a grayscale. Thoughts?

I've made a few other comments on the PR, let me know what you think

test/test_cpp_models.py Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
test/test_image.py Outdated Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
torchvision/csrc/cpu/image/readjpeg_cpu.cpp Outdated Show resolved Hide resolved
torchvision/csrc/cpu/image/readpng_cpu.cpp Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
test/test_image.py Show resolved Hide resolved
@andfoy
Copy link
Contributor

andfoy commented Nov 13, 2020

The usage of pth files is due to the difference between libjpeg and libjpeg-turbo on Windows and Mac, which right now we are not able to use.

@datumbox
Copy link
Contributor Author

@fmassa Thanks for the review. I marked as resolved anything that I either accept your proposal or is already covered discussed. I kept open anything that requires a second look. I'll send now another commit with the changes. I would appreciate to review the last remaining points.

@andfoy Thanks for providing the background story.

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot Vasilis!

@fmassa fmassa merged commit 4d6ba67 into pytorch:master Nov 18, 2020
@datumbox datumbox deleted the feature/channels_in_read_image branch November 18, 2020 11:50
bryant1410 pushed a commit to bryant1410/vision-1 that referenced this pull request Nov 22, 2020
* Adding output channels implementation for pngs.

* Adding tests for png.

* Adding channels in the API and documentation.

* Fixing formatting.

* Refactoring test_image.py to remove huge grace_hopper_517x606.pth file from assets and reduce duplicate code. Moving jpeg assets used by encode and write unit-tests on their separate folders.

* Adding output channels implementation for jpegs. Fix asset locations.

* Add tests for JPEG, adding the channels in the API and documentation and adding checks for inputs.

* Changing folder for unit-test.

* Fixing windows flakiness, removing duplicate test.

* Replacing components to channels.

* Adding reference for supporting CMYK.

* Minor changes: num_components to output_components, adding comments, fixing variable name etc.

* Reverting output_components to num_components.

* Replacing decoding with generic method on tests.

* Palette converted to Gray.
vfdev-5 pushed a commit to Quansight/vision that referenced this pull request Dec 4, 2020
* Adding output channels implementation for pngs.

* Adding tests for png.

* Adding channels in the API and documentation.

* Fixing formatting.

* Refactoring test_image.py to remove huge grace_hopper_517x606.pth file from assets and reduce duplicate code. Moving jpeg assets used by encode and write unit-tests on their separate folders.

* Adding output channels implementation for jpegs. Fix asset locations.

* Add tests for JPEG, adding the channels in the API and documentation and adding checks for inputs.

* Changing folder for unit-test.

* Fixing windows flakiness, removing duplicate test.

* Replacing components to channels.

* Adding reference for supporting CMYK.

* Minor changes: num_components to output_components, adding comments, fixing variable name etc.

* Reverting output_components to num_components.

* Replacing decoding with generic method on tests.

* Palette converted to Gray.
@fmassa fmassa mentioned this pull request Dec 8, 2020
facebook-github-bot pushed a commit that referenced this pull request Dec 8, 2020
Summary:
This image was moved to `test/assets/encode_jpeg` in #2988 but was not removed in this branch for some reason

Pull Request resolved: #3139

Reviewed By: datumbox

Differential Revision: D25395596

Pulled By: fmassa

fbshipit-source-id: a0afdec2d1da41e6743d7d723e71ffde442cf3a7
@datumbox datumbox mentioned this pull request Jan 5, 2021
13 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support specifying output channels (RGB vs grayscale) for io.image.read_image
4 participants