Adding utility for selecting objects from datasets #616

Merged on Oct 20, 2020 (7 commits)

@brimoor brimoor commented Oct 20, 2020

Adds a fiftyone.utils.selection module with select_objects() and exclude_objects() utilities that can select/exclude Label instances as specified by a list of the following format:

# A list of dicts specifying the selected Labels
[
    {
        'sample_id': '5f8e4b4ec3545f3720656b0a',
        'field': 'ground_truth',
        'object_id': '5f45247cef00e6374aacbf65',
    },
    ....
]
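
A list in this format can be regrouped by sample and field before any filtering is applied. The sketch below is only an illustration of that idea, not the implementation in this PR; `group_selections` is a hypothetical helper name and it works on plain dicts, without FiftyOne.

```python
from collections import defaultdict


def group_selections(selected_objects):
    """Groups selection dicts as {sample_id: {field: {object_id, ...}}}.

    Hypothetical helper for illustration; not part of fiftyone.utils.selection.
    """
    grouped = defaultdict(lambda: defaultdict(set))
    for obj in selected_objects:
        grouped[obj["sample_id"]][obj["field"]].add(obj["object_id"])

    # Convert to plain dicts so that missing keys raise KeyError
    return {sample_id: dict(fields) for sample_id, fields in grouped.items()}


selected_objects = [
    {
        "sample_id": "5f8e4b4ec3545f3720656b0a",
        "field": "ground_truth",
        "object_id": "5f45247cef00e6374aacbf65",
    },
    {
        "sample_id": "5f8e4b4ec3545f3720656b0a",
        "field": "ground_truth",
        "object_id": "5f45247cef00e6374aacbf64",
    },
]

grouped = group_selections(selected_objects)
```

Grouping this way makes it cheap to look up which object IDs to keep or drop for a given sample and field.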

Example usage

import random

import fiftyone as fo
import fiftyone.utils.selection as fous
import fiftyone.zoo as foz


NUM_SAMPLES_TO_SELECT = 3
MAX_OBJECTS_PER_SAMPLE_TO_SELECT = 5


def count_detections(sample_collection, label_field):
    num_objects = 0
    for sample in sample_collection:
        num_objects += len(sample[label_field].detections)

    return num_objects


dataset = foz.load_zoo_dataset("quickstart")


# Generate some random object selections
selected_objects = []
for sample in dataset.take(NUM_SAMPLES_TO_SELECT):
    detections = sample.ground_truth.detections
    if not detections:
        # Nothing to select; random.randint(1, 0) would raise ValueError
        continue

    num_objects = random.randint(
        1, min(len(detections), MAX_OBJECTS_PER_SAMPLE_TO_SELECT)
    )
    for detection in random.sample(detections, num_objects):
        selected_objects.append(
            {
                "sample_id": sample.id,
                "field": "ground_truth",
                "object_id": detection.id,
            }
        )

print("Selected objects:\n%s\n" % fo.pformat(selected_objects))

# Get only selected objects
selected_view = fous.select_objects(dataset, selected_objects)

# Exclude selected objects
excluded_view = fous.exclude_objects(dataset, selected_objects)

total_objects = count_detections(dataset, "ground_truth")
num_selected_objects = len(selected_objects)
num_objects_in_selected_view = count_detections(selected_view, "ground_truth")
num_objects_in_excluded_view = count_detections(excluded_view, "ground_truth")
num_objects_excluded = total_objects - num_objects_in_excluded_view

print("Total objects: %d" % total_objects)
print("Selected objects: %d" % num_selected_objects)
print("Objects in selected view: %d" % num_objects_in_selected_view)
print("Objects in excluded view: %d (%d missing)" % (
    num_objects_in_excluded_view, num_objects_excluded
))

Example output

Selected objects:
[
    {
        'sample_id': '5f8e4b4ec3545f3720656b0a',
        'field': 'ground_truth',
        'object_id': '5f45247cef00e6374aacbf65',
    },
    {
        'sample_id': '5f8e4b4ec3545f3720656b0a',
        'field': 'ground_truth',
        'object_id': '5f45247cef00e6374aacbf64',
    },
    {
        'sample_id': '5f8e4b4ec3545f37206569fb',
        'field': 'ground_truth',
        'object_id': '5f452478ef00e6374aac9814',
    },
    {
        'sample_id': '5f8e4b4ec3545f37206569fb',
        'field': 'ground_truth',
        'object_id': '5f452478ef00e6374aac980b',
    },
    {
        'sample_id': '5f8e4b4ec3545f37206569fb',
        'field': 'ground_truth',
        'object_id': '5f452478ef00e6374aac9813',
    },
    {
        'sample_id': '5f8e4b4ec3545f37206569fb',
        'field': 'ground_truth',
        'object_id': '5f452478ef00e6374aac980f',
    },
    {
        'sample_id': '5f8e4b4cc3545f372065595c',
        'field': 'ground_truth',
        'object_id': '5f45246cef00e6374aac2ae5',
    },
]

Total objects: 1232
Selected objects: 7
Objects in selected view: 7
Objects in excluded view: 1225 (7 missing)
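
The counts above reflect a simple invariant: the selected view holds exactly the selected objects, and the excluded view holds everything else. Below is a minimal pure-Python sketch of those semantics on plain dicts. It is not how `select_objects()` and `exclude_objects()` are implemented; the function names are hypothetical, and dropping samples with no selected objects from the selected view is an assumption.

```python
def _selected_pairs(selected_objects, field):
    """Returns the set of (sample_id, object_id) pairs selected in `field`."""
    return {
        (obj["sample_id"], obj["object_id"])
        for obj in selected_objects
        if obj["field"] == field
    }


def select_from_dicts(samples, selected_objects, field):
    """Keeps only the selected objects (hypothetical stand-in, plain dicts)."""
    keep = _selected_pairs(selected_objects, field)
    view = []
    for sample in samples:
        objects = [o for o in sample[field] if (sample["id"], o["id"]) in keep]
        if objects:
            # Assumption: samples with no selected objects are omitted
            view.append({**sample, field: objects})

    return view


def exclude_from_dicts(samples, selected_objects, field):
    """Drops the selected objects but keeps every sample."""
    drop = _selected_pairs(selected_objects, field)
    return [
        {
            **sample,
            field: [o for o in sample[field] if (sample["id"], o["id"]) not in drop],
        }
        for sample in samples
    ]


def count_objects(samples, field):
    return sum(len(sample[field]) for sample in samples)


samples = [
    {"id": "s1", "ground_truth": [{"id": "a"}, {"id": "b"}, {"id": "c"}]},
    {"id": "s2", "ground_truth": [{"id": "d"}]},
]
selected = [{"sample_id": "s1", "field": "ground_truth", "object_id": "b"}]

selected_view = select_from_dicts(samples, selected, "ground_truth")
excluded_view = exclude_from_dicts(samples, selected, "ground_truth")
```

With this toy data, the selected and excluded counts sum to the total number of objects, mirroring the check performed in the example script.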

@brimoor brimoor added the feature Work on a feature request label Oct 20, 2020
@brimoor brimoor requested a review from a team October 20, 2020 02:37
@brimoor brimoor self-assigned this Oct 20, 2020
@brimoor brimoor requested a review from lethosor October 20, 2020 02:37

@benjaminpkane benjaminpkane left a comment

LGTM!


[
{
"sample_id": "5f8d254a27ad06815ab89df4",
Contributor

Does this also need frame_number?

Contributor Author

I haven't added that because filtering frame-level labels is not yet supported. I believe @benjaminpkane plans to work on that.

Contributor

Yes, I plan on starting work on this today.
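
For reference, one way the selection format could accommodate frame-level labels later is an optional `frame_number` key. This is purely a hypothetical sketch of that idea: the key's semantics and the `selection_key` helper are assumptions, not something this PR adds.

```python
def selection_key(obj):
    """Returns a hashable key for a selection dict.

    Assumption: `frame_number` is optional and is None for sample-level labels.
    """
    return (
        obj["sample_id"],
        obj.get("frame_number"),
        obj["field"],
        obj["object_id"],
    )


sample_level = {
    "sample_id": "5f8d254a27ad06815ab89df4",
    "field": "ground_truth",
    "object_id": "obj-1",  # placeholder ID for illustration
}
frame_level = {**sample_level, "frame_number": 3}

sample_key = selection_key(sample_level)
frame_key = selection_key(frame_level)
```

Keeping the key optional would let existing sample-level selections remain valid without changes.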

@@ -0,0 +1,75 @@
"""
Contributor

I think it would be fine (as well as consistent) to have this file in tests/unittests

Contributor Author

I didn't because it involves downloading the quickstart dataset (~25MB). If you're comfortable with that happening on every test run, feel free to move it from misc/ to unittests.

Contributor

Oh, I didn't see that. Looks like it's running in GitHub Actions and not taking too much time there, but separating it into another folder (like you did) is probably helpful for people running tests locally.
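
Besides keeping such tests in a separate folder, a download-heavy test can be made opt-in with the standard library's `unittest.skipUnless`. The following is a rough sketch of that alternative; the `RUN_DOWNLOAD_TESTS` environment variable and the test class are hypothetical and not part of this repository.

```python
import os
import unittest

# Hypothetical opt-in flag; not a FiftyOne setting
RUN_DOWNLOAD_TESTS = os.environ.get("RUN_DOWNLOAD_TESTS") == "1"


class SelectionTests(unittest.TestCase):
    @unittest.skipUnless(
        RUN_DOWNLOAD_TESTS, "requires downloading the quickstart dataset"
    )
    def test_select_objects(self):
        # The real test body (loading the dataset and checking the views)
        # would go here; it is omitted in this sketch
        pass


# Run the suite programmatically and collect the results
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SelectionTests)
result = unittest.TestResult()
suite.run(result)
```

Local runs would then skip the download by default, while CI could set the flag to exercise the full test.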

@brimoor brimoor merged commit 771f216 into develop Oct 20, 2020
@brimoor brimoor deleted the selections branch October 20, 2020 15:08