
XLNet bias fix on resize embeddings (cf #1124) #1162

Merged
merged 2 commits into master from xlnet-bias on Sep 2, 2019
Conversation

LysandreJik
Member

@LysandreJik commented Aug 31, 2019

Fixed an issue where the linear layer bias wasn't resized along with its weight when the embedding matrix was resized with XLNet (cf #1124).

This fix works for any model that needs to tie its weights between an embedding layer and a linear layer, where that linear layer has a bias.

@@ -327,6 +327,14 @@ def _tie_or_clone_weights(self, first_module, second_module):
         else:
             first_module.weight = second_module.weight

+        if hasattr(first_module, 'bias') and first_module.bias is not None:
+            first_module.bias.data = torch.nn.functional.pad(
+                first_module.bias.data,
+                (0, first_module.weight.shape[0] - first_module.bias.shape[0]),
+                'constant',
+                0
+            )
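As a standalone sketch of what this pad call does (hypothetical layer sizes, not the PR's actual test code), growing a linear layer's bias to match an already-resized weight matrix might look like:

```python
import torch
import torch.nn.functional as F

linear = torch.nn.Linear(8, 100)  # stand-in for the lm head
# pretend the vocabulary grew by 3 tokens and the weight was already resized
linear.weight = torch.nn.Parameter(torch.zeros(103, 8))

# the same padding trick: extend (or trim) the bias to match the weight's first dim
linear.bias.data = F.pad(
    linear.bias.data,
    (0, linear.weight.shape[0] - linear.bias.shape[0]),
    'constant',
    0,
)
print(linear.bias.shape)  # torch.Size([103])
```

The three new bias entries are zero-initialized, which matches the behavior the patch needs for newly added tokens.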
Member


It's a nice and concise way to do it, but I'm worried about two things here:

  • torch.nn.functional.pad is not present before PyTorch 1.2.0 (and I think we should aim to keep compatibility with 1.0.1+ for now, if possible).
  • when we reduce the size of the embeddings (which is supported right now), I believe this will break.

Member Author


I believe torch.nn.functional.pad was actually introduced much earlier: it appears in the documentation of version 1.0.0, and I've successfully run this code with torch 1.0.0 installed.

The pad function actually accepts negative padding values, in which case it removes the overflowing elements. In this scenario, it trims the last elements, just like the resize_token_embeddings method.

Here's an example, running on torch 1.0.0:

from pytorch_transformers import XLNetTokenizer, XLNetLMHeadModel

model: XLNetLMHeadModel = XLNetLMHeadModel.from_pretrained("xlnet-base-cased")
tok = XLNetTokenizer.from_pretrained("xlnet-base-cased")
print(model.lm_loss.bias.shape, model.lm_loss.weight.shape)
# torch.Size([32000]) torch.Size([32000, 768])

tok.add_tokens(["token"])
model.resize_token_embeddings(len(tok))
print(model.lm_loss.bias.shape, model.lm_loss.weight.shape)
# torch.Size([32001]) torch.Size([32001, 768])

model.resize_token_embeddings(len(tok) - 100)
print(model.lm_loss.bias.shape, model.lm_loss.weight.shape)
# torch.Size([31901]) torch.Size([31901, 768])
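The negative-padding behavior can also be checked directly on a plain tensor (a minimal sketch, independent of the model):

```python
import torch
import torch.nn.functional as F

bias = torch.arange(5, dtype=torch.float)  # tensor([0., 1., 2., 3., 4.])
grown = F.pad(bias, (0, 2))    # pads two zeros on the right
shrunk = F.pad(bias, (0, -2))  # negative value trims the last two elements
print(grown)   # tensor([0., 1., 2., 3., 4., 0., 0.])
print(shrunk)  # tensor([0., 1., 2.])
```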

Member


Damned, my doc search capabilities look quite rusty! Ok all good then :)

@thomwolf thomwolf merged commit 0287d26 into master Sep 2, 2019
@julien-c julien-c deleted the xlnet-bias branch December 18, 2019 01:38