{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "数据并行（选读）\n",
    "==========================\n",
    "**Authors**: [Sung Kim](https://github.com/hunkim) and [Jenny Kang](https://github.com/jennykang)\n",
    "\n",
    "在这个教程里，我们将学习如何使用 ``DataParallel`` 来使用多GPU。 \n",
    "\n",
    "PyTorch非常容易就可以使用多GPU，用如下方式把一个模型放到GPU上：\n",
    "\n",
    "```python\n",
    "\n",
    "    device = torch.device(\"cuda:0\")\n",
    "    model.to(device)\n",
    "```\n",
    " GPU:\n",
    "然后复制所有的张量到GPU上：\n",
    "```python\n",
    "\n",
    "    mytensor = my_tensor.to(device)\n",
    "```\n",
    "请注意，只调用``my_tensor.to(device)``并没有复制张量到GPU上，而是返回了一个copy。所以你需要把它赋值给一个新的张量并在GPU上使用这个张量。\n",
    "\n",
    "在多GPU上执行前向和反向传播是自然而然的事。\n",
    "但是PyTorch默认将只使用一个GPU。\n",
    "\n",
    "使用``DataParallel``可以轻易的让模型并行运行在多个GPU上。\n",
    "\n",
    "\n",
    "```python\n",
    "\n",
    "    model = nn.DataParallel(model)\n",
    "```\n",
    "这才是这篇教程的核心，接下来我们将更详细的介绍它。\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "导入和参数\n",
    "----------------------\n",
    "\n",
    "导入PyTorch模块和定义参数。\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "from torch.utils.data import Dataset, DataLoader\n",
    "\n",
    "# Parameters and DataLoaders\n",
    "input_size = 5\n",
    "output_size = 2\n",
    "\n",
    "batch_size = 30\n",
    "data_size = 100"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Device\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "虚拟数据集\n",
    "-------------\n",
    "\n",
    "制作一个虚拟（随机）数据集，\n",
    "你只需实现 `__getitem__`\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "class RandomDataset(Dataset):\n",
    "\n",
    "    def __init__(self, size, length):\n",
    "        self.len = length\n",
    "        self.data = torch.randn(length, size)\n",
    "\n",
    "    def __getitem__(self, index):\n",
    "        return self.data[index]\n",
    "\n",
    "    def __len__(self):\n",
    "        return self.len\n",
    "\n",
    "rand_loader = DataLoader(dataset=RandomDataset(input_size, data_size),\n",
    "                         batch_size=batch_size, shuffle=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "简单模型\n",
    "------------\n",
    "作为演示，我们的模型只接受一个输入，执行一个线性操作，然后得到结果。\n",
    "说明：``DataParallel``能在任何模型（CNN，RNN，Capsule Net等）上使用。\n",
    "\n",
    "\n",
    "我们在模型内部放置了一条打印语句来打印输入和输出向量的大小。\n",
    "\n",
    "请注意批次的秩为0时打印的内容。\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Model(nn.Module):\n",
    "    # Our model\n",
    "\n",
    "    def __init__(self, input_size, output_size):\n",
    "        super(Model, self).__init__()\n",
    "        self.fc = nn.Linear(input_size, output_size)\n",
    "\n",
    "    def forward(self, input):\n",
    "        output = self.fc(input)\n",
    "        print(\"\\tIn Model: input size\", input.size(),\n",
    "              \"output size\", output.size())\n",
    "\n",
    "        return output"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "创建一个模型和数据并行\n",
    "-----------------------------\n",
    "\n",
    "这是本教程的核心部分。\n",
    "\n",
    "首先，我们需要创建一个模型实例和检测我们是否有多个GPU。\n",
    "如果有多个GPU，使用``nn.DataParallel``来包装我们的模型。\n",
    "然后通过``model.to(device)``把模型放到GPU上。\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Model(\n",
       "  (fc): Linear(in_features=5, out_features=2, bias=True)\n",
       ")"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model = Model(input_size, output_size)\n",
    "if torch.cuda.device_count() > 1:\n",
    "    print(\"Let's use\", torch.cuda.device_count(), \"GPUs!\")\n",
    "    # dim = 0 [30, xxx] -> [10, ...], [10, ...], [10, ...] on 3 GPUs\n",
    "    model = nn.DataParallel(model)\n",
    "\n",
    "model.to(device)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "运行模型\n",
    "-------------\n",
    "\n",
    "现在可以看到输入和输出张量的大小。\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\tIn Model: input size torch.Size([30, 5]) output size torch.Size([30, 2])\n",
      "Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
      "\tIn Model: input size torch.Size([30, 5]) output size torch.Size([30, 2])\n",
      "Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
      "\tIn Model: input size torch.Size([30, 5]) output size torch.Size([30, 2])\n",
      "Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
      "\tIn Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])\n",
      "Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])\n"
     ]
    }
   ],
   "source": [
    "for data in rand_loader:\n",
    "    input = data.to(device)\n",
    "    output = model(input)\n",
    "    print(\"Outside: input size\", input.size(),\n",
    "          \"output_size\", output.size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "结果\n",
    "-------\n",
    "\n",
    "当没有或者只有一个GPU时，对30个输入和输出进行批处理，得到了期望的一样得到30个输入和输出，但是如果你有多个GPU，你得到如下的结果。\n",
    "\n",
    "\n",
    "2 GPUs\n",
    "~\n",
    "\n",
    "If you have 2, you will see:\n",
    "\n",
    ".. code:: bash\n",
    "\n",
    "    # on 2 GPUs\n",
    "    Let's use 2 GPUs!\n",
    "        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])\n",
    "        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])\n",
    "    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
    "        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])\n",
    "        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])\n",
    "    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
    "        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])\n",
    "        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])\n",
    "    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
    "        In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])\n",
    "        In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])\n",
    "    Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])\n",
    "\n",
    "3 GPUs\n",
    "~\n",
    "\n",
    "If you have 3 GPUs, you will see:\n",
    "\n",
    ".. code:: bash\n",
    "\n",
    "    Let's use 3 GPUs!\n",
    "        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])\n",
    "        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])\n",
    "        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])\n",
    "    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
    "        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])\n",
    "        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])\n",
    "        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])\n",
    "    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
    "        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])\n",
    "        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])\n",
    "        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])\n",
    "    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])\n",
    "    Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])\n",
    "\n",
    "8 GPUs\n",
    "~~\n",
    "\n",
    "If you have 8, you will see:\n",
    "\n",
    ".. code:: bash\n",
    "\n",
    "    Let's use 8 GPUs!\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])\n",
    "        In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])\n",
    "    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])\n",
    "        In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])\n",
    "        In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])\n",
    "        In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])\n",
    "        In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])\n",
    "        In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])\n",
    "    Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "总结\n",
    "-------\n",
    "\n",
    "DataParallel会自动的划分数据，并将作业发送到多个GPU上的多个模型。\n",
    "并在每个模型完成作业后，收集合并结果并返回。\n",
    "\n",
    "更多信息请看这里：\n",
    "https://pytorch.org/tutorials/beginner/former_torchies/parallelism_tutorial.html.\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Pytorch for Deeplearning",
   "language": "python",
   "name": "pytorch"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}