
Commit

add visuals
jacobmarks committed Apr 12, 2024
1 parent aed312a commit cb657cf
Showing 9 changed files with 119 additions and 8 deletions.
Binary file added docs/source/tutorials/images/sahi_base_model.gif
Binary file added docs/source/tutorials/images/sahi_dataset.jpg
Binary file added docs/source/tutorials/images/sahi_slices.gif
4 changes: 2 additions & 2 deletions docs/source/tutorials/index.rst
@@ -170,7 +170,7 @@ your datasets and turn your good models into *great models*.
:header: Small Object Detection with SAHI
:description: Detect small objects in your images with Slicing-Aided Hyper-Inference (SAHI) and FiftyOne.
:link: small_object_detection.html
:image: ../_static/images/tutorials/small_object_detection.png
:image: ../_static/images/tutorials/small_object_detection.jpg
:tags: Model-Evaluation,Model-Zoo

.. End of tutorial cards
@@ -216,4 +216,4 @@ your datasets and turn your good models into *great models*.
Zero-shot classification <zero_shot_classification.ipynb>
Data augmentation <data_augmentation.ipynb>
Clustering images <clustering.ipynb>
Small object detection with SAHI<small_object_detection.ipynb>
Detecting small objects<small_object_detection.ipynb>
123 changes: 117 additions & 6 deletions docs/source/tutorials/small_object_detection.ipynb
@@ -7,13 +7,50 @@
"# Detecting Small Objects with SAHI"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Teaser](../_static/images/tutorials/small_object_detection.jpg)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Object detection is one of the fundamental tasks in computer vision, but detecting small objects can be particularly challenging.\n",
"\n",
"In this walkthrough, you'll learn how to use a technique called SAHI (Slicing Aided Hyper Inference) in conjunction with state-of-the-art object detection models to improve the detection of small objects. We'll apply SAHI with Ultralytics' YOLOv8 model to detect small objects in the VisDrone dataset, and then evaluate these predictions to better understand how slicing impacts detection performance.\n",
"\n",
"It covers the following:\n",
"\n",
"- Loading the VisDrone dataset from the Hugging Face Hub\n",
"- Applying Ultralytics' YOLOv8 model to the dataset\n",
"- Using SAHI to run inference on slices of the images\n",
"- Evaluating model performance with and without SAHI"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup and Installation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For this walkthrough, we'll be using the following libraries:\n",
"\n",
"- `fiftyone` for dataset exploration and manipulation\n",
"- `huggingface_hub` for loading the VisDrone dataset\n",
"- `ultralytics` for running object detection with YOLOv8\n",
"- `sahi` for slicing aided hyper inference\n",
"\n",
"If you haven't already, install the latest versions of these libraries:"
]
},
{
"cell_type": "code",
"execution_count": 62,
@@ -31,6 +68,15 @@
"pip install -U fiftyone sahi ultralytics huggingface_hub --quiet"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's get started! 🚀\n",
"\n",
"First, import the necessary modules from FiftyOne:"
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -39,7 +85,6 @@
"source": [
"import fiftyone as fo\n",
"import fiftyone.zoo as foz\n",
"import fiftyone.brain as fob\n",
"import fiftyone.utils.huggingface as fouh\n",
"from fiftyone import ViewField as F"
]
@@ -48,7 +93,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll be taking advantage of FiftyOne's [Hugging Face Hub integration](https://docs.voxel51.com/integrations/huggingface.html#huggingface-hub) to load a subset of the [VisDrone dataset](https://github.com/VisDrone/VisDrone-Dataset) directly from the Hugging Face Hub:"
"Now, let's download some data. We'll be taking advantage of FiftyOne's [Hugging Face Hub integration](https://docs.voxel51.com/integrations/huggingface.html#huggingface-hub) to load a subset of the [VisDrone dataset](https://github.com/VisDrone/VisDrone-Dataset) directly from the [Hugging Face Hub](https://huggingface.co/docs/hub/en/index):"
]
},
{
@@ -71,6 +116,13 @@
"dataset = fouh.load_from_hub(\"jamarks/VisDrone2019-DET\", name=\"sahi-test\", max_samples=100, overwrite=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before adding any predictions, let's take a look at the dataset:"
]
},
{
"cell_type": "code",
"execution_count": 22,
@@ -88,6 +140,13 @@
"session = fo.launch_app(dataset)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![VisDrone](./images/sahi_dataset.jpg)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -99,7 +158,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To start off, let's run our standard inference pipeline with a YOLOv8 (large-variant) model. We can load the model from Ultralytics and then apply this directly to our FiftyOne dataset using `apply_model()`, thanks to [FiftyOne's Ultralytics integration](https://docs.voxel51.com/integrations/ultralytics.html):"
"Now that we know what our data looks like, let's run our standard inference pipeline with a YOLOv8 (large-variant) model. We can load the model from Ultralytics and then apply this directly to our FiftyOne dataset using `apply_model()`, thanks to [FiftyOne's Ultralytics integration](https://docs.voxel51.com/integrations/ultralytics.html):"
]
},
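The model-loading and inference cells are collapsed in this diff. A minimal sketch of that step, assuming the large YOLOv8 checkpoint `yolov8l.pt` and a hypothetical `base_model` label field (the `dataset` object comes from the cells above):

```python
from ultralytics import YOLO

# Load the large YOLOv8 variant; FiftyOne's Ultralytics integration
# accepts the model object directly in apply_model()
model = YOLO("yolov8l.pt")
dataset.apply_model(model, label_field="base_model")
```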
{
@@ -160,6 +219,13 @@
"session = fo.launch_app(dataset)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Base Model Predictions](./images/sahi_base_model.gif)"
]
},
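The cells that build `filtered_view` are collapsed in this diff. A hedged reconstruction based on the description below (aligning VisDrone labels with the model's COCO vocabulary and hiding low-confidence boxes); the mapping and threshold are illustrative, not the tutorial's exact values:

```python
from fiftyone import ViewField as F

# Illustrative mapping from VisDrone classes onto COCO-style names so
# ground truth and predictions can be compared class-for-class
mapping = {"pedestrian": "person", "people": "person", "van": "car"}
mapped_view = dataset.map_labels("ground_truth", mapping)

# Hide low-confidence predictions to reduce crowding in the App
filtered_view = mapped_view.filter_labels(
    "base_model", F("confidence") > 0.3, only_matches=False
)
```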
{
"cell_type": "markdown",
"metadata": {},
@@ -234,13 +300,27 @@
"session.view = filtered_view.view()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Filtered View](./images/sahi_base_model_predictions_filtered.jpg)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that the classes are aligned and we've reduced the crowding in our images, we can see that while the model does a pretty good job of detecting objects, it struggles with the small objects, especially people in the distance. This can happen with large images, as most detection models are trained on fixed-size images. As an example, YOLOv8 is trained on images with maximum side length $640$. When we feed it an image of size $1920$ x $1080$, the model will downsample the image to $640$ x $360$ before making predictions. This downsampling can cause small objects to be missed, as the model may not have enough information to detect them."
]
},
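The downsampling arithmetic above is easy to check:

```python
# A 1920 x 1080 image fed to a detector with max side length 640 is
# rescaled by 640 / 1920 = 1/3, so a 15-pixel-tall person in the
# original image occupies only ~5 pixels at inference time
scale = 640 / 1920
print(1920 * scale, 1080 * scale)  # 640.0 360.0
print(15 * scale)  # 5.0
```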
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Detecting Small Objects with SAHI"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -252,7 +332,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Detecting Small Objects with SAHI"
"<figure>\n",
" <img src=\"https://raw.githubusercontent.com/obss/sahi/main/resources/sliced_inference.gif\" alt=\"Alt text\" style=\"width:100%\">\n",
" <figcaption style=\"text-align:center; color:gray;\">Illustration of Slicing Aided Hyper Inference. Image courtesy of SAHI Github Repo.</figcaption>\n",
"</figure>"
]
},
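The SAHI inference cells are collapsed in this diff. A minimal sketch using SAHI's public API, assuming the same `yolov8l.pt` checkpoint and a hypothetical `small_slices` label field; the slice sizes and overlap ratios are illustrative:

```python
import fiftyone as fo
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Wrap the YOLOv8 checkpoint in SAHI's detection-model interface
detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolov8",
    model_path="yolov8l.pt",
    confidence_threshold=0.25,
)

for sample in dataset.iter_samples(progress=True, autosave=True):
    # Run the detector on overlapping slices and merge the results
    result = get_sliced_prediction(
        sample.filepath,
        detection_model,
        slice_height=320,
        slice_width=320,
        overlap_height_ratio=0.2,
        overlap_width_ratio=0.2,
    )
    # SAHI can export its merged predictions as FiftyOne detections
    sample["small_slices"] = fo.Detections(
        detections=result.to_fiftyone_detections()
    )
```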
{
@@ -742,6 +825,13 @@
"session = fo.launch_app(filtered_view, auto=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Sliced Model Predictions](./images/sahi_slices.gif)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -760,7 +850,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### FiftyOne's Evaluation API"
"### Using FiftyOne's Evaluation API"
]
},
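The evaluation cells are collapsed here. A sketch of how the comparison might run, with hypothetical prediction-field and eval-key names:

```python
# COCO-style evaluation: matches predictions to ground truth and
# stores per-detection TP/FP/FN tags under the given eval_key
base_results = filtered_view.evaluate_detections(
    "base_model", gt_field="ground_truth", eval_key="eval_base", compute_mAP=True
)
sahi_results = filtered_view.evaluate_detections(
    "small_slices", gt_field="ground_truth", eval_key="eval_sahi", compute_mAP=True
)
print(base_results.mAP(), sahi_results.mAP())
```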
{
@@ -859,7 +949,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Evaluation Performance on Small Objects"
"### Evaluating Performance on Small Objects"
]
},
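The construction of `small_boxes_view`, used below, is collapsed in this diff. One way to express the COCO definition of a small object (absolute area under $32^2$ pixels) in FiftyOne, assuming image metadata has been computed:

```python
from fiftyone import ViewField as F

dataset.compute_metadata()

# Bounding boxes are stored as relative [x, y, width, height], so
# convert to absolute pixel area using each image's dimensions
box_area = (
    F("bounding_box")[2] * F("$metadata.width")
    * F("bounding_box")[3] * F("$metadata.height")
)

small_boxes_view = filtered_view.filter_labels(
    "ground_truth", box_area < 32**2, only_matches=False
)
```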
{
@@ -897,6 +987,13 @@
"session.view = small_boxes_view.view()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Small Box View](./images/sahi_small_boxes_view.gif)"
]
},
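`high_conf_fp_view`, used in the next cell, is also built in collapsed cells. A hedged sketch, reusing the `eval_sahi` key from the evaluation sketch above with an illustrative confidence cutoff:

```python
from fiftyone import ViewField as F

# High-confidence false positives are often the most instructive
# failures to inspect by hand
high_conf_fp_view = filtered_view.filter_labels(
    "small_slices", (F("eval_sahi") == "fp") & (F("confidence") > 0.85)
)
```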
{
"cell_type": "code",
"execution_count": 112,
@@ -1092,6 +1189,13 @@
"session.view = high_conf_fp_view.view()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![False Positives View](./images/sahi_high_conf_fp_view.jpg)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -1123,6 +1227,13 @@
"You will also want to determine which evaluation metrics make the most sense for your use case!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Additional Resources"
]
},
{
"cell_type": "markdown",
"metadata": {},
