adds visuals

jainanisha90 · Apr 8, 2024 · 06af687
1 parent 40a17fd
Showing 13 changed files with 21 additions and 19 deletions.
diff --git a/docs/source/_static/images/tutorials/clustering.jpg b/docs/source/_static/images/tutorials/clustering.jpg
diff --git a/docs/source/tutorials/clustering.ipynb b/docs/source/tutorials/clustering.ipynb
@@ -7,6 +7,13 @@
     "# Clustering Images with Embeddings"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "![Clustering](./images/clustering_preview.jpg)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -94,8 +101,8 @@
     "\n",
     "**Hierarchical clustering**: These techniques seek to either:\n",
     "\n",
-    "1. Construct clusters by starting with individual points and iteratively combining clusters into larger composites or \n",
-    "2. Deconstruct clusters, starting with all objects in one cluster and iteratively diving clusters into smaller components.\n",
+    "1. *Construct* clusters by starting with individual points and iteratively combining clusters into larger composites or \n",
+    "2. *Deconstruct* clusters, starting with all objects in one cluster and iteratively dividing clusters into smaller components.\n",
     "\n",
     "Constructive techniques like [Agglomerative Clustering](https://scikit-learn.org/stable/modules/clustering.html#hierarchical-clustering) become computationally expensive as the dataset grows, but performance can be quite impressive for small-to-medium datasets and low-dimensional features."
    ]
@@ -253,7 +260,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**INSERT IMAGE**"
+    "![FiftyOne App](./images/clustering_dataset_in_app.jpg)"
    ]
   },
   {
@@ -307,7 +314,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**INSERT IMAGE**"
+    "![Embeddings Panel](./images/clustering_open_embeddings_panel.gif)"
    ]
   },
   {
@@ -328,7 +335,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**INSERT IMAGE**"
+    "![Compute Clusters](./images/clustering_compute_clusters_operator.gif)"
    ]
   },
   {
@@ -346,7 +353,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**INSERT IMAGE**"
+    "![Filtering Clusters](./images/clustering_filter_by_cluster_number.gif)"
    ]
   },
   {
@@ -360,7 +367,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**INSERT IMAGE**"
+    "![Coloring by Clusters](./images/clustering_color_by_cluster.gif)"
    ]
   },
   {
@@ -381,14 +388,14 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Returning to the initial set of clusters, let’s dig into one final area in the embeddings plot. Notice how a few images of people playing soccer got lumped into a cluster of primarily tennis images. This is because we passed 2D dimensionality reduced vectors into our clustering routine rather than the embedding vectors themselves. While 2D projections are helpful for visualization, and techniques like UMAP are fairly good at retaining structure, relative distances are not exactly preserved, and some information is lost. Suppose we instead pass our CLIP embeddings directly into our clustering computation with the same hyperparameters. In that case, these soccer images are assigned to the same cluster as the rest of the soccer images, along with other field sports like frisbee and baseball"
+    "Returning to the initial set of clusters, let’s dig into one final area in the embeddings plot. Notice how a few images of people playing soccer got lumped into a cluster of primarily tennis images. This is because we passed 2D dimensionality reduced vectors into our clustering routine rather than the embedding vectors themselves. While 2D projections are helpful for visualization, and techniques like UMAP are fairly good at retaining structure, relative distances are not exactly preserved, and some information is lost. Suppose we instead pass our CLIP embeddings directly into our clustering computation with the same hyperparameters. In that case, these soccer images are assigned to the same cluster as the rest of the soccer images, along with other field sports like frisbee and baseball:"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**INSERT IMAGE**"
+    "![UMAP Limitations](./images/clustering_umap_limitation.gif)"
    ]
   },
   {
@@ -411,14 +418,14 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**INSERT IMAGE**"
+    "![HDBSCAN Clusters](./images/clustering_hdbscan.gif)"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Note that for HDBSCAN, label ”-1” is given to all background images. These images are not merged into any of the final clusters."
+    "Note that for HDBSCAN, label `-1` is given to all background images. These images are not merged into any of the final clusters."
    ]
   },
   {
@@ -439,7 +446,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**INSERT IMAGE**"
+    "![Clustering Run Info](./images/clustering_get_clustering_info.jpg)"
    ]
   },
   {
@@ -498,7 +505,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**INSERT IMAGE**"
+    "![Labeling Clusters with GPT-4V](./images/clustering_gpt4v_labeling.gif)"
    ]
   },
   {
@@ -533,11 +540,6 @@
     "- **Clustering Hyperparameters**: We barely touched the number of clusters in this walkthrough. Your results may vary as you increase or decrease this number. For some techniques, like k-means clustering, there are heuristics you can use to [estimate the optimal number of clusters](https://www.analyticsvidhya.com/blog/2021/05/k-mean-getting-the-optimal-number-of-clusters/). Don’t stop there; experiment with other hyperparameters as well!\n",
     "- **Concept Modeling Techniques**: the built-in concept modeling technique in this walkthrough uses GPT-4V and some light prompting to identify each cluster's core concept. This is but one way to approach an open-ended problem. Try using [image captioning](https://github.com/jacobmarks/image-captioning) and [topic modeling](https://en.wikipedia.org/wiki/Topic_model), or create your own technique!"
    ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": []
   }
  ],
  "metadata": {

diff --git a/docs/source/tutorials/images/clustering_color_by_cluster.gif b/docs/source/tutorials/images/clustering_color_by_cluster.gif
diff --git a/docs/source/tutorials/images/clustering_compute_clusters_operator.gif b/docs/source/tutorials/images/clustering_compute_clusters_operator.gif
diff --git a/docs/source/tutorials/images/clustering_dataset_in_app.jpg b/docs/source/tutorials/images/clustering_dataset_in_app.jpg
diff --git a/docs/source/tutorials/images/clustering_filter_by_cluster_number.gif b/docs/source/tutorials/images/clustering_filter_by_cluster_number.gif
diff --git a/docs/source/tutorials/images/clustering_get_clustering_info.jpg b/docs/source/tutorials/images/clustering_get_clustering_info.jpg
diff --git a/docs/source/tutorials/images/clustering_gpt4v_labeling.gif b/docs/source/tutorials/images/clustering_gpt4v_labeling.gif
diff --git a/docs/source/tutorials/images/clustering_hdbscan.gif b/docs/source/tutorials/images/clustering_hdbscan.gif
diff --git a/docs/source/tutorials/images/clustering_open_embeddings_panel.gif b/docs/source/tutorials/images/clustering_open_embeddings_panel.gif
diff --git a/docs/source/tutorials/images/clustering_preview.jpg b/docs/source/tutorials/images/clustering_preview.jpg
diff --git a/docs/source/tutorials/images/clustering_umap_limitation.gif b/docs/source/tutorials/images/clustering_umap_limitation.gif
diff --git a/docs/source/tutorials/index.rst b/docs/source/tutorials/index.rst
@@ -163,7 +163,7 @@ your datasets and turn your good models into *great models*.
     :header: Clustering Images with Embeddings
     :description: Use embeddings to cluster images in your dataset and visualize the results in FiftyOne.
     :link: clustering.html
-    :image: ../_static/images/tutorials/clustering.png
+    :image: ../_static/images/tutorials/clustering.jpg
     :tags: App,Brain,Dataset-Curation,Embeddings,Visualization
 
 .. End of tutorial cards