Closed
Description
Hello,
How hard would it be to add GPT-J support?
At first glance it looks pretty similar to GPT-2. Would copying the GPT-2 code and adapting the specific layers be a good starting point?
It looks quite memory intensive, so the float16 version seems more usable in practice. Would supporting float16 out of the box be difficult?