Hi, thanks for the great work! I'm wondering if there are any plans to train the model on some datasets in the near future. Since there are currently no released weights for this model architecture, it would be incredibly helpful if training could be done, even at a smaller scale. This would allow the community to experiment further and potentially build on this architecture.
Hi, yes, I plan to train it on the COCO Captions dataset soon! Do you know of any story generation datasets that contain both image and text modalities? It should not be very large, something like what MNIST is for CV.
@Conless, if you're okay with it, could you prepare a preprocessing script for the dataset (please refer to the README for how the inputs are arranged for the model) and send a PR? Otherwise, I can start working on it in a couple of days, as I'm busy right now. :)