
[Fix] PARSeq pytorch fixes #1227

Merged
merged 6 commits into mindee:main on Jun 23, 2023

Conversation


@felixdittrich92 felixdittrich92 commented Jun 21, 2023

This PR:

  • contains fixes for the PyTorch version of PARSeq

  • fixes the attention and padding masks used for training

dummy run:

Validation loss decreased 0.407558 --> 0.322983: saving state...
Epoch 8/20 - Validation loss: 0.322983 (Exact: 82.78% | Partial: 83.81%)
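The mask fixes touch both the key-padding mask and the per-permutation attention mask. As a rough sketch of the broadcasting involved (shapes mirror the debug output in the comments below; the variable names are illustrative, not doctr's actual code):

```python
import torch

# A [batch, seq] key-padding mask (True = real token, False = padding)
# is expanded to [batch, 1, 1, seq] so it broadcasts against the
# per-head [batch, heads, seq, seq] attention scores.
padding_mask = torch.ones(1, 9, dtype=torch.bool)           # [1, 9]
key_padding_mask_expanded = padding_mask[:, None, None, :]  # [1, 1, 1, 9]
assert key_padding_mask_expanded.shape == (1, 1, 1, 9)
```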

@felixdittrich92
Contributor Author

@baudm

train:

tgt_perms: tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 9, 8, 7, 6, 5, 4, 3, 2, 1],
        [0, 6, 5, 2, 3, 4, 7, 1, 8, 9],
        [0, 8, 1, 7, 4, 3, 2, 5, 6, 9],
        [0, 5, 6, 3, 4, 8, 1, 7, 2, 9],
        [0, 2, 7, 1, 8, 4, 3, 6, 5, 9]], device='cuda:0', dtype=torch.int32)
padding_mask: tensor([[True, True, True, True, True, True, True, True, True]],
       device='cuda:0')
padding_mask.shape: torch.Size([1, 9])
key_padding_mask_expanded: tensor([[[[True, True, True, True, True, True, True, True, True]]]],
       device='cuda:0')
key_padding_mask_expanded.shape: torch.Size([1, 1, 1, 9])
target_mask: tensor([[1, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1]], device='cuda:0', dtype=torch.int32)
target_mask.shape: torch.Size([9, 9])
attn_mask_expanded: tensor([[[[ True, False, False, False, False, False, False, False, False],
          [ True,  True, False, False, False, False, False, False, False],
          [ True,  True,  True, False, False, False, False, False, False],
          [ True,  True,  True,  True, False, False, False, False, False],
          [ True,  True,  True,  True,  True, False, False, False, False],
          [ True,  True,  True,  True,  True,  True, False, False, False],
          [ True,  True,  True,  True,  True,  True,  True, False, False],
          [ True,  True,  True,  True,  True,  True,  True,  True, False],
          [ True,  True,  True,  True,  True,  True,  True,  True,  True]]]],
       device='cuda:0')
attn_mask_expanded.shape: torch.Size([1, 1, 9, 9])
mask: tensor([[[[1, 0, 0, 0, 0, 0, 0, 0, 0],
          [1, 1, 0, 0, 0, 0, 0, 0, 0],
          [1, 1, 1, 0, 0, 0, 0, 0, 0],
          [1, 1, 1, 1, 0, 0, 0, 0, 0],
          [1, 1, 1, 1, 1, 0, 0, 0, 0],
          [1, 1, 1, 1, 1, 1, 0, 0, 0],
          [1, 1, 1, 1, 1, 1, 1, 0, 0],
          [1, 1, 1, 1, 1, 1, 1, 1, 0],
          [1, 1, 1, 1, 1, 1, 1, 1, 1]]]], device='cuda:0', dtype=torch.int32)
mask.shape: torch.Size([1, 1, 9, 9])
target_mask: tensor([[1, 0, 1, 1, 1, 1, 1, 1, 1],
        [1, 0, 0, 1, 1, 1, 1, 1, 1],
        [1, 0, 0, 0, 1, 1, 1, 1, 1],
        [1, 0, 0, 0, 0, 1, 1, 1, 1],
        [1, 0, 0, 0, 0, 0, 1, 1, 1],
        [1, 0, 0, 0, 0, 0, 0, 1, 1],
        [1, 0, 0, 0, 0, 0, 0, 0, 1],
        [1, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 0, 0, 0, 0, 0, 0, 0, 0]], device='cuda:0', dtype=torch.int32)
target_mask.shape: torch.Size([9, 9])
attn_mask_expanded: tensor([[[[ True, False,  True,  True,  True,  True,  True,  True,  True],
          [ True, False, False,  True,  True,  True,  True,  True,  True],
          [ True, False, False, False,  True,  True,  True,  True,  True],
          [ True, False, False, False, False,  True,  True,  True,  True],
          [ True, False, False, False, False, False,  True,  True,  True],
          [ True, False, False, False, False, False, False,  True,  True],
          [ True, False, False, False, False, False, False, False,  True],
          [ True, False, False, False, False, False, False, False, False],
          [ True, False, False, False, False, False, False, False, False]]]],
       device='cuda:0')
attn_mask_expanded.shape: torch.Size([1, 1, 9, 9])
mask: tensor([[[[1, 0, 1, 1, 1, 1, 1, 1, 1],
          [1, 0, 0, 1, 1, 1, 1, 1, 1],
          [1, 0, 0, 0, 1, 1, 1, 1, 1],
          [1, 0, 0, 0, 0, 1, 1, 1, 1],
          [1, 0, 0, 0, 0, 0, 1, 1, 1],
          [1, 0, 0, 0, 0, 0, 0, 1, 1],
          [1, 0, 0, 0, 0, 0, 0, 0, 1],
          [1, 0, 0, 0, 0, 0, 0, 0, 0],
          [1, 0, 0, 0, 0, 0, 0, 0, 0]]]], device='cuda:0', dtype=torch.int32)
mask.shape: torch.Size([1, 1, 9, 9])
target_mask: tensor([[1, 0, 1, 1, 1, 1, 1, 1, 0],
        [1, 0, 0, 0, 0, 1, 1, 0, 0],
        [1, 0, 1, 0, 0, 1, 1, 0, 0],
        [1, 0, 1, 1, 0, 1, 1, 0, 0],
        [1, 0, 0, 0, 0, 0, 1, 0, 0],
        [1, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 0, 1, 1, 1, 1, 1, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1]], device='cuda:0', dtype=torch.int32)
target_mask.shape: torch.Size([9, 9])
attn_mask_expanded: tensor([[[[ True, False,  True,  True,  True,  True,  True,  True, False],
          [ True, False, False, False, False,  True,  True, False, False],
          [ True, False,  True, False, False,  True,  True, False, False],
          [ True, False,  True,  True, False,  True,  True, False, False],
          [ True, False, False, False, False, False,  True, False, False],
          [ True, False, False, False, False, False, False, False, False],
          [ True, False,  True,  True,  True,  True,  True, False, False],
          [ True,  True,  True,  True,  True,  True,  True,  True, False],
          [ True,  True,  True,  True,  True,  True,  True,  True,  True]]]],
       device='cuda:0')
attn_mask_expanded.shape: torch.Size([1, 1, 9, 9])
mask: tensor([[[[1, 0, 1, 1, 1, 1, 1, 1, 0],
          [1, 0, 0, 0, 0, 1, 1, 0, 0],
          [1, 0, 1, 0, 0, 1, 1, 0, 0],
          [1, 0, 1, 1, 0, 1, 1, 0, 0],
          [1, 0, 0, 0, 0, 0, 1, 0, 0],
          [1, 0, 0, 0, 0, 0, 0, 0, 0],
          [1, 0, 1, 1, 1, 1, 1, 0, 0],
          [1, 1, 1, 1, 1, 1, 1, 1, 0],
          [1, 1, 1, 1, 1, 1, 1, 1, 1]]]], device='cuda:0', dtype=torch.int32)
mask.shape: torch.Size([1, 1, 9, 9])
target_mask: tensor([[1, 0, 0, 0, 0, 0, 0, 0, 1],
        [1, 1, 0, 1, 1, 0, 0, 1, 1],
        [1, 1, 0, 0, 1, 0, 0, 1, 1],
        [1, 1, 0, 0, 0, 0, 0, 1, 1],
        [1, 1, 1, 1, 1, 0, 0, 1, 1],
        [1, 1, 1, 1, 1, 1, 0, 1, 1],
        [1, 1, 0, 0, 0, 0, 0, 0, 1],
        [1, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1]], device='cuda:0', dtype=torch.int32)
target_mask.shape: torch.Size([9, 9])
attn_mask_expanded: tensor([[[[ True, False, False, False, False, False, False, False,  True],
          [ True,  True, False,  True,  True, False, False,  True,  True],
          [ True,  True, False, False,  True, False, False,  True,  True],
          [ True,  True, False, False, False, False, False,  True,  True],
          [ True,  True,  True,  True,  True, False, False,  True,  True],
          [ True,  True,  True,  True,  True,  True, False,  True,  True],
          [ True,  True, False, False, False, False, False, False,  True],
          [ True, False, False, False, False, False, False, False, False],
          [ True,  True,  True,  True,  True,  True,  True,  True,  True]]]],
       device='cuda:0')
attn_mask_expanded.shape: torch.Size([1, 1, 9, 9])
mask: tensor([[[[1, 0, 0, 0, 0, 0, 0, 0, 1],
          [1, 1, 0, 1, 1, 0, 0, 1, 1],
          [1, 1, 0, 0, 1, 0, 0, 1, 1],
          [1, 1, 0, 0, 0, 0, 0, 1, 1],
          [1, 1, 1, 1, 1, 0, 0, 1, 1],
          [1, 1, 1, 1, 1, 1, 0, 1, 1],
          [1, 1, 0, 0, 0, 0, 0, 0, 1],
          [1, 0, 0, 0, 0, 0, 0, 0, 0],
          [1, 1, 1, 1, 1, 1, 1, 1, 1]]]], device='cuda:0', dtype=torch.int32)
mask.shape: torch.Size([1, 1, 9, 9])
target_mask: tensor([[1, 0, 0, 1, 1, 1, 1, 0, 1],
        [1, 1, 0, 1, 1, 1, 1, 1, 1],
        [1, 0, 0, 0, 0, 1, 1, 0, 0],
        [1, 0, 0, 1, 0, 1, 1, 0, 0],
        [1, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 0, 0, 0, 0, 1, 0, 0, 0],
        [1, 1, 0, 1, 1, 1, 1, 0, 1],
        [1, 0, 0, 1, 1, 1, 1, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1]], device='cuda:0', dtype=torch.int32)
target_mask.shape: torch.Size([9, 9])
attn_mask_expanded: tensor([[[[ True, False, False,  True,  True,  True,  True, False,  True],
          [ True,  True, False,  True,  True,  True,  True,  True,  True],
          [ True, False, False, False, False,  True,  True, False, False],
          [ True, False, False,  True, False,  True,  True, False, False],
          [ True, False, False, False, False, False, False, False, False],
          [ True, False, False, False, False,  True, False, False, False],
          [ True,  True, False,  True,  True,  True,  True, False,  True],
          [ True, False, False,  True,  True,  True,  True, False, False],
          [ True,  True,  True,  True,  True,  True,  True,  True,  True]]]],
       device='cuda:0')
attn_mask_expanded.shape: torch.Size([1, 1, 9, 9])
mask: tensor([[[[1, 0, 0, 1, 1, 1, 1, 0, 1],
          [1, 1, 0, 1, 1, 1, 1, 1, 1],
          [1, 0, 0, 0, 0, 1, 1, 0, 0],
          [1, 0, 0, 1, 0, 1, 1, 0, 0],
          [1, 0, 0, 0, 0, 0, 0, 0, 0],
          [1, 0, 0, 0, 0, 1, 0, 0, 0],
          [1, 1, 0, 1, 1, 1, 1, 0, 1],
          [1, 0, 0, 1, 1, 1, 1, 0, 0],
          [1, 1, 1, 1, 1, 1, 1, 1, 1]]]], device='cuda:0', dtype=torch.int32)
mask.shape: torch.Size([1, 1, 9, 9])
target_mask: tensor([[1, 0, 1, 0, 0, 0, 0, 1, 0],
        [1, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 0, 1, 0, 0, 1, 1],
        [1, 1, 1, 0, 0, 0, 0, 1, 1],
        [1, 1, 1, 1, 1, 0, 1, 1, 1],
        [1, 1, 1, 1, 1, 0, 0, 1, 1],
        [1, 0, 1, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 0, 0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1]], device='cuda:0', dtype=torch.int32)
target_mask.shape: torch.Size([9, 9])
attn_mask_expanded: tensor([[[[ True, False,  True, False, False, False, False,  True, False],
          [ True, False, False, False, False, False, False, False, False],
          [ True,  True,  True, False,  True, False, False,  True,  True],
          [ True,  True,  True, False, False, False, False,  True,  True],
          [ True,  True,  True,  True,  True, False,  True,  True,  True],
          [ True,  True,  True,  True,  True, False, False,  True,  True],
          [ True, False,  True, False, False, False, False, False, False],
          [ True,  True,  True, False, False, False, False,  True, False],
          [ True,  True,  True,  True,  True,  True,  True,  True,  True]]]],
       device='cuda:0')
attn_mask_expanded.shape: torch.Size([1, 1, 9, 9])
mask: tensor([[[[1, 0, 1, 0, 0, 0, 0, 1, 0],
          [1, 0, 0, 0, 0, 0, 0, 0, 0],
          [1, 1, 1, 0, 1, 0, 0, 1, 1],
          [1, 1, 1, 0, 0, 0, 0, 1, 1],
          [1, 1, 1, 1, 1, 0, 1, 1, 1],
          [1, 1, 1, 1, 1, 0, 0, 1, 1],
          [1, 0, 1, 0, 0, 0, 0, 0, 0],
          [1, 1, 1, 0, 0, 0, 0, 1, 0],
          [1, 1, 1, 1, 1, 1, 1, 1, 1]]]], device='cuda:0', dtype=torch.int32)
mask.shape: torch.Size([1, 1, 9, 9])
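The permutation-dependent query masks above follow a simple rule: a query position may attend a key position only if that key comes strictly earlier in the sampled permutation, and the first query row (BOS) and last key column (EOS) are sliced off. A minimal sketch consistent with the dumps above (illustrative only; doctr's real helper may differ in detail):

```python
import torch

def perm_query_mask(perm: torch.Tensor) -> torch.Tensor:
    """Boolean attention mask for one decoding permutation (sketch).

    full[q, k] is True when position k comes strictly before position q
    in `perm`, so q may attend k. Slicing [1:, :-1] drops the BOS query
    row and the EOS key column, giving the (L-1) x (L-1) masks printed
    in the debug output above.
    """
    sz = perm.numel()
    rank = torch.empty_like(perm)
    rank[perm] = torch.arange(sz, dtype=perm.dtype)  # rank of each position
    full = rank.unsqueeze(1) > rank.unsqueeze(0)     # strict precedence
    return full[1:, :-1]

# The identity permutation reproduces the plain causal mask of the first dump.
mask = perm_query_mask(torch.arange(10))
assert torch.equal(mask, torch.tril(torch.ones(9, 9, dtype=torch.bool)))
```

The reversed permutation `[0, 9, 8, ..., 1]` likewise reproduces the second `target_mask` dump row for row.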

@felixdittrich92
Contributor Author

val decode_autoregressive:

query_mask: tensor([[1, 0, 0],
        [1, 1, 0],
        [1, 1, 1]], device='cuda:0', dtype=torch.int32)
query_mask.shape: torch.Size([3, 3])
updated query_mask: tensor([[1, 0, 1],
        [1, 1, 0],
        [1, 1, 1]], device='cuda:0', dtype=torch.int32)
updated query_mask.shape: torch.Size([3, 3])
target_pad_mask: tensor([[[[ True, False, False]]]], device='cuda:0')
target_pad_mask.shape: torch.Size([1, 1, 1, 3])
mask: tensor([[[[1, 0, 0],
          [1, 0, 0],
          [1, 0, 0]]]], device='cuda:0', dtype=torch.int32)
mask.shape: torch.Size([1, 1, 3, 3])
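The final `mask` above is just the elementwise AND of the causal query mask with the key-padding mask, so padded key positions can never be attended at any decoding step. A toy 2-D reproduction of the combination shown in this dump (names mirror the log, not necessarily doctr's code; the real mask carries extra batch/head dimensions):

```python
import torch

query_mask = torch.tril(torch.ones(3, 3, dtype=torch.bool))   # causal
target_pad_mask = torch.tensor([[True, False, False]])        # 1 real token
mask = query_mask & target_pad_mask  # pad mask broadcasts across query rows
print(mask.int())
# tensor([[1, 0, 0],
#         [1, 0, 0],
#         [1, 0, 0]])
```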

@felixdittrich92
Contributor Author

Seems to work now ^^


@felixdittrich92 felixdittrich92 added this to the 0.6.1 milestone Jun 21, 2023
@felixdittrich92 felixdittrich92 added type: bug Something isn't working module: models Related to doctr.models framework: pytorch Related to PyTorch backend topic: text recognition Related to the task of text recognition labels Jun 21, 2023
@felixdittrich92 felixdittrich92 self-assigned this Jun 21, 2023
@felixdittrich92 felixdittrich92 marked this pull request as ready for review June 21, 2023 20:31

codecov bot commented Jun 21, 2023

Codecov Report

Merging #1227 (6755842) into main (fdd00a3) will decrease coverage by 0.15%.
The diff coverage is 48.38%.

@@            Coverage Diff             @@
##             main    #1227      +/-   ##
==========================================
- Coverage   93.68%   93.53%   -0.15%     
==========================================
  Files         154      154              
  Lines        6903     6915      +12     
==========================================
+ Hits         6467     6468       +1     
- Misses        436      447      +11     
Flag | Coverage Δ
unittests | 93.53% <48.38%> (-0.15%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files | Coverage Δ
doctr/models/recognition/vitstr/pytorch.py | 100.00% <ø> (ø)
doctr/models/recognition/parseq/pytorch.py | 73.58% <48.38%> (-4.79%) ⬇️

... and 7 files with indirect coverage changes

Collaborator

@charlesmindee charlesmindee left a comment


Thanks for that, LGTM

@felixdittrich92 felixdittrich92 merged commit b4b613a into mindee:main Jun 23, 2023
@felixdittrich92 felixdittrich92 deleted the parseq-torch-fixes branch June 23, 2023 08:24