How to Construct a Ground-Truth Test Dataset

Hi,

I noticed that you used SAM and Grounding DINO to generate segmentation masks.

Could you please explain how you merge the outputs of SAM and Grounding DINO to create the ground truth for the GranD-f dataset?
Additionally, could you describe the process of creating the final dense caption?
Is your method fully automated, or does it require manual verification?
I am interested in applying your method to the GranD dataset.

Thank you.
