New classification datasets support for FLAVA #5108

Closed
@NicolasHug

Description

To support our colleagues' work on the FLAVA paper, and to foster collaborations in the multi-modal space, we would like to implement a few new datasets. Almost all of them are classification datasets but some also support other tasks like segmentation.

CC-ing @pmeier and @jdsgomes as previously discussed. We're on a fairly short timeline for this work, and ideally we would get all these in by end of January 2022.
I'm also wondering whether this is something that our open source contributors @oke-aditya @frgfm @zhiqwang could be interested in 🚀 ?

Implementing a new dataset

Implementing a dataset consists of 2 main things:

  • The dataset class with root, split, transform and target_transform parameters. When available, we should also support a download parameter (from what I checked, most of these are downloadable, except perhaps FER2013). See e.g. the MNIST class.
  • A test class which will generate automatic tests, e.g. this one for MNIST.
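
As a rough illustration of the two pieces above, here is a minimal sketch in plain Python. Everything in it is hypothetical: the class name `ToyClassificationDataset`, the `root/<split>/<class_name>/<file>` folder layout, the allowed split names, and the `inject_fake_data` helper are assumptions made for the example, not an existing torchvision API. A real contribution would subclass `torchvision.datasets.VisionDataset`, decode images rather than return raw bytes, and use the repository's own test utilities.

```python
from pathlib import Path
from typing import Any, Callable, Optional, Sequence, Tuple


class ToyClassificationDataset:
    """Hypothetical dataset; the on-disk layout root/<split>/<class>/<file> is assumed."""

    _SPLITS = ("train", "test")  # assumed split names for this sketch

    def __init__(
        self,
        root: str,
        split: str = "train",
        transform: Optional[Callable[[Any], Any]] = None,
        target_transform: Optional[Callable[[int], Any]] = None,
        download: bool = False,
    ) -> None:
        if split not in self._SPLITS:
            raise ValueError(f"split must be one of {self._SPLITS}, got {split!r}")
        self.root = Path(root)
        self.split = split
        self.transform = transform
        self.target_transform = target_transform

        if download:
            self._download()

        split_dir = self.root / split
        if not split_dir.is_dir():
            raise RuntimeError("Dataset not found. Pass download=True to fetch it.")

        # Class indices follow the sorted order of the class folder names.
        self.classes = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
        self.class_to_idx = {name: i for i, name in enumerate(self.classes)}
        self._samples = [
            (path, self.class_to_idx[cls])
            for cls in self.classes
            for path in sorted((split_dir / cls).iterdir())
            if path.is_file()
        ]

    def _download(self) -> None:
        # Placeholder: this sketch has no real archive URL to fetch.
        raise NotImplementedError("download is not implemented in this sketch")

    def __len__(self) -> int:
        return len(self._samples)

    def __getitem__(self, idx: int) -> Tuple[Any, Any]:
        path, target = self._samples[idx]
        sample: Any = path.read_bytes()  # a real dataset would decode an image here
        if self.transform is not None:
            sample = self.transform(sample)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return sample, target


def inject_fake_data(root: str, split: str, classes: Sequence[str], num_per_class: int) -> int:
    """Hypothetical test helper: write tiny fake files and return the expected length."""
    for cls in classes:
        class_dir = Path(root) / split / cls
        class_dir.mkdir(parents=True, exist_ok=True)
        for i in range(num_per_class):
            (class_dir / f"{i}.bin").write_bytes(b"fake")
    return len(classes) * num_per_class
```

The idea behind the test side is that fake data is generated in a temporary directory, so the tests never need the real download: the test builds the dataset from that directory and checks that its length and targets match what was injected.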

If there's some ambiguity in the choices to make, the reference to follow is VISSL, where most of these datasets are already supported.

For contributors

If you're interested in taking one of the datasets above, please comment below with "I'm working on dataset X" so that others don't pick the same one! :)

cc @pmeier
