[Templates] Reintroduce requirements.txt + temporary patch fixes (ray-project#34903)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
ShuN6211 · May 2, 2023 · f30f2ed
1 parent 9f60a09
Showing 13 changed files with 284 additions and 169 deletions.
diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
@@ -11,6 +11,7 @@
 # NOTE: Add @ray-project/ray-docs to all following docs subdirs.
 /doc/ @ray-project/ray-docs
 /doc/source/use-cases.rst @ericl @pcmoritz
+/doc/source/templates @justinvyu @sofianhnaide
 
 # ==== Ray core ====
 

diff --git a/doc/BUILD b/doc/BUILD
@@ -236,7 +236,10 @@ py_test_run_all_subdirectory(
 
 filegroup(
     name = "workspace_templates",
-    srcs = glob(["source/templates/tests/*.ipynb"]),
+    srcs = glob([
+        "source/templates/tests/**/*.ipynb",
+        "source/templates/tests/**/requirements.txt"
+    ]),
     visibility = ["//doc:__subpackages__"]
 )
 
@@ -255,7 +258,8 @@ py_test(
 
 py_test_run_all_notebooks(
     size = "large",
-    include = ["source/templates/tests/many_model_training.ipynb"],
+    # TODO(justinvyu): Merge tests/ with the regular versions of the templates.
+    include = ["source/templates/tests/02_many_model_training/many_model_training.ipynb"],
     exclude = [],
     data = ["//doc:workspace_templates"],
     tags = ["exclusive", "team:ml", "ray_air"],
@@ -267,8 +271,9 @@ py_test_run_all_notebooks(
 py_test_run_all_notebooks(
     size = "large",
     include = [
-        "source/templates/tests/batch_inference.ipynb",
-        "source/templates/tests/serving_stable_diffusion.ipynb"
+        # TODO(justinvyu): Merge tests/ with the regular versions of the templates.
+        "source/templates/tests/01_batch_inference/batch_inference.ipynb",
+        "source/templates/tests/03_serving_stable_diffusion/serving_stable_diffusion.ipynb"
     ],
     exclude = [],
     data = ["//doc:workspace_templates"],

diff --git a/doc/source/templates/01_batch_inference/batch_inference.ipynb b/doc/source/templates/01_batch_inference/batch_inference.ipynb
@@ -8,14 +8,14 @@
    "source": [
     "# Scaling Batch Inference with Ray Data\n",
     "\n",
-    "This template is a quickstart to using [Ray Data](https://docs.ray.io/en/latest/data/data.html) for batch inference. Ray Data is one of many libraries under the [Ray AI Runtime](https://docs.ray.io/en/latest/ray-air/getting-started.html). See [this blog post](https://www.anyscale.com/blog/model-batch-inference-in-ray-actors-actorpool-and-datasets) for more information on why and how you should perform batch inference with Ray!\n",
+    "This template is a quickstart to using [Ray Data](https://docs.ray.io/en/latest/data/dataset.html) for batch inference. Ray Data is one of many libraries under the [Ray AI Runtime](https://docs.ray.io/en/latest/ray-air/getting-started.html). See [this blog post](https://www.anyscale.com/blog/model-batch-inference-in-ray-actors-actorpool-and-datasets) for more information on why and how you should perform batch inference with Ray!\n",
     "\n",
     "This template walks through GPU batch prediction on an image dataset using a PyTorch model, but the framework and data format are there just to help you build your own application!\n",
     "\n",
     "At a high level, this template will:\n",
-    "1. [Load your dataset using Ray Data.](https://docs.ray.io/en/latest/data/creating-datastreams.html)\n",
-    "2. [Preprocess your dataset before feeding it to your model.](https://docs.ray.io/en/latest/data/transforming-datastreams.html)\n",
-    "3. [Initialize your model and perform inference on a shard of your dataset with a remote actor.](https://docs.ray.io/en/latest/data/transforming-datastreams.html#callable-class-udfs)\n",
+    "1. [Load your dataset using Ray Data.](https://docs.ray.io/en/latest/data/creating-datasets.html)\n",
+    "2. [Preprocess your dataset before feeding it to your model.](https://docs.ray.io/en/latest/data/transforming-datasets.html)\n",
+    "3. [Initialize your model and perform inference on a shard of your dataset with a remote actor.](https://docs.ray.io/en/latest/data/transforming-datasets.html#writing-user-defined-functions-udfs)\n",
     "4. [Save your prediction results.](https://docs.ray.io/en/latest/data/api/input_output.html)\n",
     "\n",
     "> Slot in your code below wherever you see the ✂️ icon to build a many model training Ray application off of this template!"
@@ -52,42 +52,46 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "770bbdc7",
-   "metadata": {},
+   "id": "9d49681f-baf0-4ed8-9740-5c4e38744311",
+   "metadata": {
+    "tags": []
+   },
    "outputs": [],
    "source": [
-    "!ray status"
+    "NUM_WORKERS: int = 4\n",
+    "NUM_GPUS_PER_WORKER: float = 1\n"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "9d49681f-baf0-4ed8-9740-5c4e38744311",
-   "metadata": {
-    "tags": []
-   },
+   "id": "770bbdc7",
+   "metadata": {},
    "outputs": [],
    "source": [
-    "NUM_WORKERS: int = 4\n",
-    "NUM_GPUS_PER_WORKER: float = 1\n"
+    "!ray status"
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "23321ba8",
    "metadata": {},
    "source": [
     "```{tip}\n",
-    "Try setting `NUM_GPUS_PER_WORKER` to a fractional amount! This will leverage Ray's fractional resource allocation, which means you can schedule multiple batch inference workers to happen on the same GPU.\n",
+    "Try setting `NUM_GPUS_PER_WORKER` to a fractional amount! This will leverage Ray's fractional resource allocation, which means you can schedule multiple batch inference workers to use the same GPU.\n",
     "```"
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "3b6f2352",
    "metadata": {},
    "source": [
-    "> ✂️ Replace this function with logic to load your own data with Ray Data."
+    "> ✂️ Replace this function with logic to load your own data with Ray Data.\n",
+    ">\n",
+    "> See [the Ray Data guide on creating datasets](https://docs.ray.io/en/latest/data/creating-datasets.html) to learn how to create a dataset based on the data type and file storage format."
    ]
   },
   {
@@ -97,7 +101,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "def load_ray_dataset() -> ray.data.Datastream:\n",
+    "def load_ray_dataset():\n",
     "    from ray.data.datasource.partitioning import Partitioning\n",
     "\n",
     "    s3_uri = \"s3://anonymous@air-example-data-2/imagenette2/val/\"\n",
@@ -163,7 +167,9 @@
    "outputs": [],
    "source": [
     "ds = ds.map_batches(preprocess, batch_format=\"numpy\")\n",
-    "ds.schema()\n"
+    "\n",
+    "print(\"Dataset schema:\\n\", ds.schema())\n",
+    "print(\"Number of images:\", ds.count())\n"
    ]
   },
   {
@@ -194,9 +200,9 @@
     "    def __call__(self, batch: Dict[str, np.ndarray]) -> Dict[str, np.ndarray]:\n",
     "        # <Replace this with your own model inference logic>\n",
     "        input_data = torch.as_tensor(batch[\"image\"], device=self.device)\n",
-    "        with torch.no_grad():\n",
-    "            result = self.model(input_data)\n",
-    "        return {\"predictions\": result.cpu().numpy()}\n"
+    "        with torch.inference_mode():\n",
+    "            pred = self.model(input_data)\n",
+    "        return {\"predicted_class_index\": pred.argmax(dim=1).detach().cpu().numpy()}\n"
    ]
   },
   {
@@ -218,8 +224,9 @@
     "    PredictCallable,\n",
     "    batch_size=128,\n",
     "    compute=ray.data.ActorPoolStrategy(\n",
-    "        # Fix the number of batch inference workers to a specified value.\n",
-    "        size=NUM_WORKERS,\n",
+    "        # Fix the number of batch inference workers to `NUM_WORKERS`.\n",
+    "        min_size=NUM_WORKERS,\n",
+    "        max_size=NUM_WORKERS,\n",
     "    ),\n",
     "    num_gpus=NUM_GPUS_PER_WORKER,\n",
     "    batch_format=\"numpy\",\n",
@@ -237,14 +244,23 @@
     "preds.schema()\n"
    ]
   },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "id": "2565ba08",
+   "metadata": {},
+   "source": [
+    "Show the first few predictions!"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
    "id": "8d606556",
    "metadata": {},
    "outputs": [],
    "source": [
-    "preds.take(1)\n"
+    "preds.take(5)\n"
    ]
   },
   {
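
Note on the `predicted_class_index` change above: the updated `__call__` returns one class index per image instead of the raw model output. A minimal pure-Python sketch of that reduction follows. The helper name `predicted_class_indices` and the sample scores are hypothetical, and it stands in for `pred.argmax(dim=1)` without depending on PyTorch.

```python
from typing import List, Sequence


def predicted_class_indices(logits: Sequence[Sequence[float]]) -> List[int]:
    """Return the index of the largest score in each row of a batch."""
    # For every row (one image), pick the column whose score is highest.
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


# Hypothetical batch: 2 images scored against 3 classes.
batch_logits = [[0.1, 2.5, -1.0], [3.2, 0.4, 0.9]]
print(predicted_class_indices(batch_logits))  # [1, 0]
```

Returning only the index keeps each output row small, which is likely why the notebook can afford to show `preds.take(5)` instead of a single record.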

diff --git a/doc/source/templates/02_many_model_training/many_model_training.ipynb b/doc/source/templates/02_many_model_training/many_model_training.ipynb
@@ -37,8 +37,7 @@
     "\n",
     "This template requires certain Python packages to be available to every node in the cluster.\n",
     "\n",
-    "> ✂️ Add your own package dependencies! You can specify bounds for package versions\n",
-    "> in the same format as a `requirements.txt` file.\n"
+    "> ✂️ Add your own package dependencies in the `requirements.txt` file!\n"
    ]
   },
   {
@@ -50,9 +49,21 @@
    },
    "outputs": [],
    "source": [
-    "requirements = [\n",
-    "    \"statsforecast==1.5.0\",\n",
-    "]\n"
+    "requirements_path = \"./requirements.txt\"\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "92161434",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "with open(requirements_path, \"r\") as f:\n",
+    "    requirements = f.read().strip().splitlines()\n",
+    "\n",
+    "print(\"Requirements:\")\n",
+    "print(\"\\n\".join(requirements))\n"
    ]
   },
   {
@@ -64,7 +75,9 @@
     "First, we may want to use these modules right here in our script, which is running on the head node.\n",
     "Install the Python packages on the head node using `pip install`.\n",
     "\n",
-    "You may need to restart this notebook kernel to access the installed packages.\n"
+    "```{note}\n",
+    "You may need to restart this notebook kernel to access the installed packages.\n",
+    "```\n"
    ]
   },
   {
@@ -74,9 +87,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "all_requirements = \" \".join(requirements)\n",
-    "\n",
-    "%pip install {all_requirements}\n"
+    "%pip install -r {requirements_path} --upgrade"
    ]
   },
   {
@@ -118,11 +129,12 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "b8fc83d0",
    "metadata": {},
    "source": [
-    "> ✂️ Replace this value to change the number of data partitions you will use. This will be total the number of Tune trials you will run!\n",
+    "> ✂️ Replace this value to change the number of data partitions you will use (<= 5000 for this dataset). This will be the total number of Tune trials you will run!\n",
     ">\n",
     "> Note that this template fits two models per data partition and reports the best performing one."
    ]
@@ -136,7 +148,7 @@
    },
    "outputs": [],
    "source": [
-    "NUM_DATA_PARTITIONS: int = 1000\n"
+    "NUM_DATA_PARTITIONS: int = 500\n"
    ]
   },
   {

diff --git a/doc/source/templates/02_many_model_training/requirements.txt b/doc/source/templates/02_many_model_training/requirements.txt
@@ -0,0 +1 @@
+statsforecast==1.5.0
diff --git a/doc/source/templates/03_serving_stable_diffusion/requirements.txt b/doc/source/templates/03_serving_stable_diffusion/requirements.txt
@@ -0,0 +1,10 @@
+accelerate==0.14.0
+diffusers==0.15.1
+matplotlib>=3.5.3,<=3.7.1
+numpy>=1.21.6,<=1.23.5
+Pillow==9.3.0
+scipy>=1.7.3,<=1.9.3
+tensorboard>=2.11.2,<=2.12.0
+torch==1.13.0
+torchvision==0.14.0
+transformers==4.28.1
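
Note on reading `requirements.txt`: the many model training notebook in this commit loads the file with `f.read().strip().splitlines()`, which would also keep comment lines and blank lines if someone adds them later. A slightly more defensive reader might look like the sketch below. The function name `read_requirements` and the demo file contents are hypothetical and not part of the commit. The sketch does not handle inline comments or `-r` includes.

```python
import tempfile
from pathlib import Path
from typing import List


def read_requirements(path: str) -> List[str]:
    """Read pip requirement specifiers, skipping blank lines and comments."""
    lines = Path(path).read_text().splitlines()
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


# Demo with a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "requirements.txt"
    demo.write_text("# pinned for the template\nstatsforecast==1.5.0\n\n")
    print(read_requirements(str(demo)))  # ['statsforecast==1.5.0']
```

The resulting list has the same shape as the `requirements` list the notebook prints, so it could be dropped in wherever that variable is used.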