Fixes to the "English-to-Spanish Translation with a Sequence-to-Sequence Transformer" Code Example #1997
Conversation
Given 30 epochs of training, the model never produced sensible output. Two examples:

1) Tom didn't like Mary. → [start] ha estoy qué
2) Tom called Mary and canceled their date. → [start] sola qué yo pasatiempo visto campo

When fitting the model, the following relevant warning was emitted:

```
UserWarning: The structure of `inputs` doesn't match the expected structure: ['encoder_inputs', 'decoder_inputs']. Received: the structure of inputs={'encoder_inputs': '*', 'decoder_inputs': '*'}
```

After the fix, the model outputs sentences that are close to proper Spanish:

1) That's what Tom told me. → [start] eso es lo que tom me dijo [end]
2) Does Tom like cheeseburgers? → [start] a tom le gustan las queso de queso [end]
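The warning points at the root cause: the model expects its inputs as the list ['encoder_inputs', 'decoder_inputs'], but the dataset delivered a dict. Below is a minimal sketch of the corrected dataset-formatting step, assuming the `eng_vectorization` and `spa_vectorization` TextVectorization layers from the example; the key change is returning the two inputs as a tuple in the model's input order rather than as a dict:

```python
def format_dataset(eng, spa):
    eng = eng_vectorization(eng)  # tokenize the English source sentence
    spa = spa_vectorization(spa)  # tokenize the Spanish target sentence
    # Teacher forcing: the decoder sees the target shifted right,
    # while the label is the target shifted left.
    # Returning the inputs as a tuple in the model's input order
    # (encoder_inputs, decoder_inputs) matches the expected structure
    # and avoids the UserWarning.
    return (eng, spa[:, :-1]), spa[:, 1:]
```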
The check essentially disables the mask calculation: this layer is the first one to receive the input, so it never has a previous mask to inherit. With this change the mask is now computed and passed on to the encoder. Looks like a regression error; the initial commit of the example looks very similar to this fix.
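For reference, a sketch of the layer in question, loosely following the example's `PositionalEmbedding` (padding token id 0); the fix amounts to deleting the early `return None` guard so the padding mask is always derived from the token ids:

```python
import tensorflow as tf
from tensorflow import keras

class PositionalEmbedding(keras.layers.Layer):
    def __init__(self, sequence_length, vocab_size, embed_dim, **kwargs):
        super().__init__(**kwargs)
        self.token_embeddings = keras.layers.Embedding(vocab_size, embed_dim)
        self.position_embeddings = keras.layers.Embedding(sequence_length, embed_dim)

    def call(self, inputs):
        positions = tf.range(start=0, limit=tf.shape(inputs)[-1], delta=1)
        return self.token_embeddings(inputs) + self.position_embeddings(positions)

    def compute_mask(self, inputs, mask=None):
        # The buggy version started with `if mask is None: return None`.
        # Since this layer is the first in the model, `mask` is always
        # None, so that guard meant no padding mask ever reached the
        # encoder. Deriving the mask from the token ids restores it.
        return tf.math.not_equal(inputs, 0)
```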
As per https://github.com/tensorflow/tensorflow/blob/6550e4bd80223cdb8be6c3afd1f81e86a4d433c3/tensorflow/python/keras/engine/base_layer.py#L965, the inputs should be passed as a list, not as kwargs. When this is done, both masks are received as a tuple in the `mask` argument.
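To see this behavior in isolation, here is a small self-contained probe (the `MaskProbe` layer is hypothetical, not part of the example), assuming tf.keras's automatic mask propagation: calling a layer with a single list of tensors delivers one padding mask per tensor in the `mask` argument, whereas passing the second tensor as a separate argument only forwards the first input's mask.

```python
import tensorflow as tf
from tensorflow import keras

class MaskProbe(keras.layers.Layer):
    """Toy layer that prints the mask structure Keras hands it."""
    def call(self, inputs, mask=None):
        # With list inputs, `mask` mirrors the input structure:
        # one padding mask per tensor.
        print("mask structure:", mask)
        first, second = inputs
        return first + second

emb = keras.layers.Embedding(input_dim=100, output_dim=8, mask_zero=True)
a = emb(tf.constant([[1, 2, 0]]))  # output carries a padding mask
b = emb(tf.constant([[3, 0, 0]]))  # output carries a padding mask

# Called with one list, the layer receives both masks together.
MaskProbe()([a, b])
```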
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up-to-date status, view the checks section at the bottom of the pull request.
Thanks for the fixes! The code looks good. Please add the generated files as well.
Perfect, here you go. The generated markdown file contained thousands of lines of progress-bar updates, so I removed them by hand and checked that the result is nice and readable.
LGTM, thank you. Good call on cleaning up the logs.
Many thanks, Tom! As soon as I can, I will test it.
This pull request addresses several issues in the "English-to-Spanish translation with a sequence-to-sequence Transformer" code example:
Swapped input tensors: the inputs were fed to the model in a structure that did not match its expected ['encoder_inputs', 'decoder_inputs'] layout, triggering the UserWarning above and scrambling what the model trained on.
Padding handling: an early-return check disabled the padding-mask computation in the first layer, and the decoder received its inputs as kwargs rather than a list, so the masks never propagated to the attention layers.
I identified these issues while using the example to build my own Keras-based transformer models, after noticing that the model wasn't producing meaningful translations. Fixing them ensures that the example functions properly, so others can use it as a reliable foundation and learning resource for their projects.
It seems these issues may have arisen due to regression errors over time, and this update aims to restore the example to its intended functionality.