Cannot call AttentionLayer with symbolic tensors and return_attention_scores=True in Keras 3 #20621

Closed
boweia opened this issue Dec 10, 2024 · 5 comments · Fixed by #20689
Labels
type:support User is asking for help / asking an implementation question. Stackoverflow would be better suited.


boweia commented Dec 10, 2024

In Keras 3 I cannot call the call method of AttentionLayer (or other attention layer classes) with a symbolic KerasTensor input and return_attention_scores=True.

I am using Python 3.12.1 and Keras 3.6.0

>>> in1 = keras.Input(shape=(10, 7))
>>> in2 = keras.Input(shape=(8, 7))
>>> attLayer = keras.layers.Attention()     
>>> out1, out2 = attLayer([in1, in2], return_attention_scores=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<venv_root>/lib/python3.12/site-packages/keras/src/backend/common/keras_tensor.py", line 167, in __iter__
    raise NotImplementedError(
NotImplementedError: Iterating over a symbolic KerasTensor is not supported.

The layer call method works when replacing the symbolic inputs with np.ndarray inputs.
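For comparison, here is a minimal sketch of the eager call that does work. The batch size of 2 and the random data are assumptions chosen only for illustration; the per-sample shapes match the symbolic inputs above.

```python
import numpy as np
import keras

# Eager inputs with the same per-sample shapes as the symbolic example.
# The batch size (2) and random values are illustrative assumptions.
query = np.random.rand(2, 10, 7).astype("float32")
value = np.random.rand(2, 8, 7).astype("float32")

attention = keras.layers.Attention()
output, scores = attention([query, value], return_attention_scores=True)

print(output.shape)  # (batch, query_len, dim)
print(scores.shape)  # (batch, query_len, value_len)
```

With concrete arrays the layer returns a real pair of tensors, so the tuple unpacking succeeds.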

Contributor

dhantule commented Dec 12, 2024

Hi @boweia, Thanks for reporting this.

You are trying to access attention scores from a symbolic KerasTensor, which is only a placeholder and does not hold any real data, whereas an eager tensor holds actual data and can be evaluated immediately in eager execution mode. The Attention layer computes attention scores when the model is executed with real data.
If we pass a symbolic tensor to a Keras model, it will produce an eager tensor during the execution phase. Once the model is defined and called with real data, you can get the attention scores.

import keras
from keras import Model, layers

class AttentionModel(Model):
    def __init__(self):
        super().__init__()
        self.attention = layers.Attention()

    def call(self, inputs):
        in1, in2 = inputs
        # Request both the attention output and the attention scores.
        out1, out2 = self.attention([in1, in2], return_attention_scores=True)
        return out1, out2

in1 = keras.Input(shape=(10, 7))
in2 = keras.Input(shape=(8, 7))

model = AttentionModel()
out1, out2 = model([in1, in2])

dhantule added the type:support and stat:awaiting response from contributor labels on Dec 12, 2024
Author

boweia commented Dec 14, 2024

Thanks @dhantule. What I'm unclear on is why the attention scores cannot be returned from the call method as a symbolic tensor, the same way the attention output is returned. Also, this workflow previously worked in TensorFlow 2.14 with Keras 2, and I only started seeing this error after recently updating to newer versions.

If defining a wrapper class is the only way to support this, I see that it can work, but it seems silly that it's necessary instead of purely using the Keras functional API to define a model.

Contributor

Hi @boweia,
In TensorFlow 2.14 with Keras 2, eager execution is enabled by default. In Keras 3 with TensorFlow 2.14 and beyond, eager execution is still supported, but the framework encourages graph execution (using tf.function) for better performance in production and scaling scenarios. In Keras 3, eager execution is no longer the default mode for all operations.
You can read more about migrating Keras 2 to Keras 3 here.

Contributor

> Hi @boweia, In TensorFlow 2.14 with Keras 2, eager execution is enabled by default. In Keras 3 with TensorFlow 2.14 and beyond, eager execution is still supported, but the framework encourages graph execution (using tf.function) for better performance in production and scaling scenarios. In Keras 3, eager execution is no longer the default mode for all operations. You can read more about migrating Keras 2 to Keras 3 here.

I don't think the issue is related to Graph or Eager execution. Calling layer.build along with layer.call will produce output and attention scores as Keras tensors.

import keras

in1 = keras.Input(shape=(10, 7))
in2 = keras.Input(shape=(8, 7))
attLayer = keras.layers.Attention()
# Build the layer explicitly, then bypass __call__ by invoking call() directly.
attLayer.build([in1.shape, in2.shape])
out1, out2 = attLayer.call([in1, in2], return_attention_scores=True)

I believe there is an issue with layer.__call__. It seems the kwargs are being ignored in the symbolic call. If someone confirms that this needs a fix, maybe I can look into it.

Or is there any reason for the current behaviour?
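Until the behaviour is confirmed either way, one possible way to write code that tolerates both cases is sketched below. It first tries the plain symbolic call and falls back to the build + call workaround only if unpacking raises the NotImplementedError from the original report. This is an untested-across-versions sketch, and the variable names are illustrative.

```python
import keras

query_in = keras.Input(shape=(10, 7))
value_in = keras.Input(shape=(8, 7))
attention = keras.layers.Attention()

try:
    # Succeeds if the symbolic call honours return_attention_scores.
    output, scores = attention(
        [query_in, value_in], return_attention_scores=True
    )
except NotImplementedError:
    # On affected versions a single KerasTensor comes back, so tuple
    # unpacking fails. Fall back to the build + call workaround above.
    attention.build([query_in.shape, value_in.shape])
    output, scores = attention.call(
        [query_in, value_in], return_attention_scores=True
    )

print(output.shape, scores.shape)
```

Either branch should leave `output` and `scores` as symbolic tensors, with the batch dimension left unspecified.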
