{
 "nbformat": 4,
 "nbformat_minor": 0,
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "# Introduction to PyTorch\n",
    "\n",
    "In this notebook you will learn the fundamentals of PyTorch, one of the most widely used deep learning frameworks in research and industry.\n",
    "\n",
    "By the end of this notebook you will be able to:\n",
    "- Create and manipulate **tensors** (PyTorch's core data structure)\n",
    "- Reshape tensors confidently\n",
    "- Build a **custom Dataset** and load data with a **DataLoader**\n",
    "- Apply **data augmentation** and preprocessing transforms\n",
    "\n",
    "---\n",
    "📌 **Prerequisites:** basic Python and NumPy familiarity."
   ],
   "metadata": {
    "id": "intro"
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 1. What is PyTorch?\n",
    "\n",
    "[PyTorch](https://pytorch.org/) is an open-source deep learning framework originally developed by Meta AI and now governed by the PyTorch Foundation. It is built around two key ideas:\n",
    "\n",
    "1. **Tensor computation** — like NumPy, but with GPU acceleration.\n",
    "2. **Automatic differentiation** — PyTorch tracks operations on tensors so it can compute gradients automatically (you will use this heavily when training models in the next notebook).\n",
    "\n",
    "To install PyTorch, visit https://pytorch.org/get-started/locally/ and follow the instructions for your system.\n",
    "\n",
    "```bash\n",
    "# Default install; use the selector on the page above for a CPU-only or CUDA-specific build\n",
    "pip install torch torchvision\n",
    "```"
   ],
   "metadata": {
    "id": "section1"
   }
  },
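  {
   "cell_type": "markdown",
   "source": [
    "The second idea above can be made concrete with a minimal sketch. The function $y = x^2 + 2x$ and the variable names are chosen purely for illustration and are not part of the original notebook.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "# requires_grad=True asks PyTorch to record the operations applied to x\n",
    "x = torch.tensor(3.0, requires_grad=True)\n",
    "y = x ** 2 + 2 * x      # y = x^2 + 2x\n",
    "\n",
    "y.backward()            # compute dy/dx and store it in x.grad\n",
    "print(x.grad)           # dy/dx = 2x + 2, which is 8 at x = 3\n",
    "```\n",
    "\n",
    "The gradient lands in `x.grad`, and the analytic derivative $2x + 2$ gives the same value."
   ],
   "metadata": {
    "id": "sketch_autograd"
   }
  },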
  {
   "cell_type": "markdown",
   "source": [
    "## 2. Working with Tensors\n",
    "\n",
    "A **tensor** is the fundamental data structure in PyTorch — think of it as a generalisation of a __________ array:\n",
    "\n",
    "| Concept | NumPy | PyTorch |\n",
    "|---------|-------|---------|\n",
    "| 1-D array | `np.array([1,2,3])` | `torch.tensor([1,2,3])` |\n",
    "| 2-D array | `np.zeros((3,4))` | `torch.zeros((3,4))` |\n",
    "| N-D array | `ndarray` | `Tensor` |\n",
    "\n",
    "The key advantage over NumPy: tensors can live on a **GPU** and support automatic differentiation."
   ],
   "metadata": {
    "id": "section2"
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "### 2.1 Creating Tensors\n",
    "\n",
    "Fill in the missing function names to create each type of tensor:"
   ],
   "metadata": {
    "id": "section2_1"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "import torch\n",
    "\n",
    "tensor_random = torch.rand((2, 3))\n",
    "tensor_zeros = torch.zeros((4, 4))\n",
    "tensor_ones = torch.ones((3, 3))\n",
    "tensor_list = torch.tensor([[1, 2, 3], [4, 5, 6]])\n",
    "\n",
    "print(\"Random:\\n\", tensor_random)\n",
    "print(\"Zeros:\\n\", tensor_zeros)\n",
    "print(\"Ones:\\n\", tensor_ones)\n",
    "print(\"From list:\\n\", tensor_list)\n"
   ],
   "metadata": {
    "id": "create_tensors"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "### 2.2 Basic Tensor Operations\n",
    "\n",
    "Fill in the correct operators and function name:\n",
    "\n",
    "- Element-wise addition: `c = a ______ b`\n",
    "- Element-wise multiplication: `d = a ______ b`\n",
    "- Matrix multiplication: `e = torch.________(a, b)`"
   ],
   "metadata": {
    "id": "section2_2"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])\n",
    "b = torch.tensor([[5.0, 6.0], [7.0, 8.0]])\n",
    "\n",
    "c = a + b\n",
    "d = a * b\n",
    "e = torch.matmul(a, b)\n",
    "\n",
    "print(\"a + b:\\n\", c)\n",
    "print(\"a * b (element-wise):\\n\", d)\n",
    "print(\"a @ b (matmul):\\n\", e)\n"
   ],
   "metadata": {
    "id": "tensor_ops"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "### 2.3 Inspecting Tensor Properties\n",
    "\n",
    "Three attributes you will check constantly when debugging:\n",
    "\n",
    "- **`.shape`** — dimensions of the tensor (e.g. `torch.Size([2, 3])`)\n",
    "- **`.dtype`** — element data type (e.g. `torch.float32`)\n",
    "- **`.device`** — where the tensor lives (`cpu` or `cuda:0`)\n",
    "\n",
    "Fill in the attribute names:"
   ],
   "metadata": {
    "id": "section2_3"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "tensor = torch.rand(2, 3)\n",
    "\n",
    "print(\"Shape:\", tensor.shape)\n",
    "print(\"Data Type:\", tensor.dtype)\n",
    "print(\"Device:\", tensor.device)\n"
   ],
   "metadata": {
    "id": "inspect_tensor"
   },
   "execution_count": null,
   "outputs": []
  },
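  {
   "cell_type": "markdown",
   "source": [
    "A common source of dtype surprises is how PyTorch infers the element type from Python values. The sketch below assumes the default dtype has not been changed with `torch.set_default_dtype`; the variable names are illustrative only.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "ints = torch.tensor([1, 2, 3])          # Python ints are stored as int64\n",
    "floats = torch.tensor([1.0, 2.0, 3.0])  # Python floats are stored as float32 by default\n",
    "print(ints.dtype, floats.dtype)\n",
    "\n",
    "# Convert explicitly when an operation needs floating-point input\n",
    "as_float = ints.to(torch.float32)\n",
    "print(as_float.dtype, as_float.shape, as_float.device)\n",
    "```\n",
    "\n",
    "Checking `.dtype` before a layer or loss call often explains errors about mismatched types."
   ],
   "metadata": {
    "id": "sketch_dtype"
   }
  },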
  {
   "cell_type": "markdown",
   "source": [
    "### 2.4 Moving Tensors Between CPU and GPU\n",
    "\n",
    "For large models, training on a GPU is often an order of magnitude or more faster than on a CPU, depending on the model and hardware. PyTorch makes it easy to move data between devices. The standard pattern is to detect the available device once at the top of your script and then move everything there.\n",
    "\n",
    "Fill in the missing method name and variable:"
   ],
   "metadata": {
    "id": "section2_4"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
    "print(\"Using device:\", device)\n",
    "\n",
    "tensor_gpu = tensor.to(device)\n",
    "print(\"Tensor device:\", tensor_gpu.device)\n"
   ],
   "metadata": {
    "id": "gpu_tensor"
   },
   "execution_count": null,
   "outputs": []
  },
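  {
   "cell_type": "markdown",
   "source": [
    "Two details of the device pattern are worth a short sketch: tensors can be allocated directly on the target device, and operands of an operation must live on the same device. The names `weights`, `inputs` and `result` are hypothetical. The code falls back to the CPU when no GPU is available.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
    "\n",
    "# Allocate directly on the target device instead of creating on the CPU and copying\n",
    "weights = torch.ones(2, 3, device=device)\n",
    "\n",
    "# .to() returns a tensor on `device` (the same tensor if it is already there)\n",
    "inputs = torch.rand(2, 3).to(device)\n",
    "\n",
    "result = weights * inputs          # both operands are on the same device\n",
    "\n",
    "# NumPy only understands CPU memory, so move back before converting\n",
    "result_np = result.cpu().numpy()\n",
    "print(result_np.shape)\n",
    "```\n",
    "\n",
    "Mixing a CPU tensor with a GPU tensor in one operation raises a `RuntimeError`, which is why moving everything to `device` up front is the usual habit."
   ],
   "metadata": {
    "id": "sketch_device"
   }
  },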
  {
   "cell_type": "markdown",
   "source": [
    "## 3. Tensor Reshaping\n",
    "\n",
    "Reshaping is one of the most common operations you will perform — for example, flattening an image before passing it to a linear layer, or adding a batch dimension.\n",
    "\n",
    "| Method | What it does |\n",
    "|--------|--------------|\n",
    "| `.view(shape)` | Reinterpret the data with a new shape. **Zero-copy**, but requires a compatible (typically contiguous) memory layout |\n",
    "| `.reshape(shape)` | Returns a view when possible and copies otherwise, so it also works on non-contiguous tensors (safer default) |\n",
    "| `.squeeze()` | Remove all dimensions of size 1 |\n",
    "| `.unsqueeze(dim)` | Insert a new dimension of size 1 at position `dim` |\n",
    "| `.permute(dims)` | Reorder dimensions (useful when converting between channel-first and channel-last) |\n",
    "\n",
    "💡 Use `-1` in a shape to let PyTorch infer that dimension automatically."
   ],
   "metadata": {
    "id": "section3"
   }
  },
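  {
   "cell_type": "markdown",
   "source": [
    "The difference between `.view()` and `.reshape()` in the table above is easiest to see on a non-contiguous tensor. The following is a small sketch with illustrative variable names; a transpose is used here only as a convenient way to obtain a non-contiguous layout.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "t = torch.arange(6).reshape(2, 3)\n",
    "v = t.view(3, 2)          # a view: shares storage with t\n",
    "v[0, 0] = 100\n",
    "print(t[0, 0])            # the change is visible through t as well\n",
    "\n",
    "tt = t.t()                # transposing gives a non-contiguous tensor\n",
    "print(tt.is_contiguous())\n",
    "\n",
    "try:\n",
    "    tt.view(-1)           # view cannot express this layout without copying\n",
    "except RuntimeError as err:\n",
    "    print('view failed:', type(err).__name__)\n",
    "\n",
    "flat = tt.reshape(-1)     # reshape falls back to a copy when it has to\n",
    "print(flat.shape)\n",
    "```\n",
    "\n",
    "Because a view shares memory with its source, in-place edits through one are visible through the other. That is worth keeping in mind when choosing between the two methods."
   ],
   "metadata": {
    "id": "sketch_view_reshape"
   }
  },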
  {
   "cell_type": "markdown",
   "source": [
    "Fill in the blanks to perform the described reshape operations:"
   ],
   "metadata": {
    "id": "section3_task1"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "t = torch.arange(24)\n",
    "print(\"Original shape:\", t.shape)\n",
    "\n",
    "t_2d = t.reshape(4, 6)\n",
    "print(\"Reshaped to (4, 6):\\n\", t_2d)\n",
    "\n",
    "t_flat = t_2d.reshape(-1)\n",
    "print(\"Flattened:\", t_flat.shape)\n",
    "\n",
    "t_3d = t.reshape(2, 3, 4)\n",
    "print(\"Reshaped to (2, 3, 4):\\n\", t_3d)\n"
   ],
   "metadata": {
    "id": "reshape_basic"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "Fill in the blanks for `squeeze` and `unsqueeze`:"
   ],
   "metadata": {
    "id": "section3_task2"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "t = torch.rand(3, 1, 4)\n",
    "print(\"Original shape:\", t.shape)\n",
    "\n",
    "t_sq = t.squeeze()\n",
    "print(\"After squeeze:\", t_sq.shape)\n",
    "\n",
    "t_unsq = t_sq.unsqueeze(0)\n",
    "print(\"After unsqueeze(0):\", t_unsq.shape)\n"
   ],
   "metadata": {
    "id": "squeeze_unsqueeze"
   },
   "execution_count": null,
   "outputs": []
  },
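  {
   "cell_type": "markdown",
   "source": [
    "One pitfall with `squeeze()` is that calling it without an argument removes *every* dimension of size 1, including a batch dimension that happens to hold a single sample. The sketch below assumes an MNIST-like `(B, C, H, W)` layout for illustration.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "batch = torch.rand(1, 1, 28, 28)       # (B, C, H, W) with a single image\n",
    "\n",
    "print(batch.squeeze().shape)           # drops B and C: torch.Size([28, 28])\n",
    "print(batch.squeeze(1).shape)          # drops only C: torch.Size([1, 28, 28])\n",
    "\n",
    "# unsqueeze goes the other way: add a batch dimension to a single image\n",
    "image = torch.rand(1, 28, 28)\n",
    "print(image.unsqueeze(0).shape)        # torch.Size([1, 1, 28, 28])\n",
    "```\n",
    "\n",
    "Passing the dimension explicitly keeps the result predictable when the batch size varies."
   ],
   "metadata": {
    "id": "sketch_squeeze_dim"
   }
  },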
  {
   "cell_type": "markdown",
   "source": [
    "Fill in the `permute` call to convert from channel-first `(B, C, H, W)` to channel-last `(B, H, W, C)`:"
   ],
   "metadata": {
    "id": "section3_task3"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "img = torch.rand(8, 3, 32, 32)\n",
    "img_hwc = img.permute(0, 2, 3, 1)\n",
    "print(\"Channel-first:\", img.shape)\n",
    "print(\"Channel-last: \", img_hwc.shape)\n"
   ],
   "metadata": {
    "id": "permute"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 4. Datasets and DataLoaders\n",
    "\n",
    "PyTorch separates **what your data is** from **how it is served to the model**:\n",
    "\n",
    "- A **`Dataset`** defines *what* your data looks like — it knows how many samples there are and how to retrieve the i-th one.\n",
    "- A **`DataLoader`** wraps a Dataset and handles *how* data is delivered — batching, shuffling, parallel loading.\n",
    "\n",
    "```\n",
    "Dataset  →  DataLoader  →  Training loop\n",
    " (storage)    (batching)\n",
    "```\n",
    "\n",
    "A Dataset represents the __________ data, defining how each sample is stored and accessed.\n",
    "\n",
    "A DataLoader helps in __________ the data into manageable batches for training.\n",
    "\n",
    "### Why not just load everything into a list?\n",
    "Many real datasets (for example ImageNet, with roughly 1.28 million training images) are too large to hold in RAM. The Dataset + DataLoader pattern lets you load only what you need, when you need it."
   ],
   "metadata": {
    "id": "section4"
   }
  },
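  {
   "cell_type": "markdown",
   "source": [
    "The Dataset → DataLoader flow described above can be tried without any files on disk. This sketch uses `TensorDataset`, a built-in Dataset that wraps in-memory tensors. The synthetic data and variable names are assumptions made for illustration.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "from torch.utils.data import TensorDataset, DataLoader\n",
    "\n",
    "# Synthetic data: 10 samples with 4 features each, plus one integer label per sample\n",
    "features = torch.rand(10, 4)\n",
    "labels = torch.arange(10)\n",
    "\n",
    "dataset = TensorDataset(features, labels)    # what the data is\n",
    "loader = DataLoader(dataset, batch_size=4)   # how it is served\n",
    "\n",
    "print(len(dataset))                          # number of samples\n",
    "for x_batch, y_batch in loader:\n",
    "    print(x_batch.shape, y_batch.tolist())\n",
    "```\n",
    "\n",
    "The loop yields batches of 4, 4 and 2 samples. The last batch is smaller because 10 is not a multiple of 4."
   ],
   "metadata": {
    "id": "sketch_dataset_flow"
   }
  },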
  {
   "cell_type": "markdown",
   "source": [
    "## 5. Creating a Custom Dataset\n",
    "\n",
    "To create a custom Dataset, subclass `torch.utils.data.Dataset` and implement **two methods** (plus an `__init__` to hold whatever state you need):\n",
    "\n",
    "- `__len__()` — returns the total number of samples\n",
    "- `__getitem__(index)` — returns the sample at position `index`\n",
    "\n",
    "The DataLoader will call these methods internally. Fill in the missing return value:"
   ],
   "metadata": {
    "id": "section5"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "from torch.utils.data import Dataset, DataLoader\n",
    "from PIL import Image\n",
    "import os\n",
    "\n",
    "class CustomImageDataset(Dataset):\n",
    "    def __init__(self, image_dir, transform=None):\n",
    "        self.image_dir = image_dir\n",
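    "        # Sort for a deterministic order (os.listdir returns entries in arbitrary order).\n",
    "        # Assumes the directory contains only image files.\n",
    "        self.image_files = sorted(os.listdir(image_dir))\n",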
    "        self.transform = transform\n",
    "\n",
    "    def __len__(self):\n",
    "        return len(self.image_files)\n",
    "\n",
    "    def __getitem__(self, index):\n",
    "        img_path = os.path.join(self.image_dir, self.image_files[index])\n",
    "        image = Image.open(img_path).convert(\"RGB\")\n",
    "\n",
    "        if self.transform:\n",
    "            image = self.transform(image)\n",
    "\n",
    "        return image\n"
   ],
   "metadata": {
    "id": "custom_dataset"
   },
   "execution_count": null,
   "outputs": []
  },
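  {
   "cell_type": "markdown",
   "source": [
    "To see the two methods at work without needing an image directory, here is a toy Dataset. `SquaresDataset` is a hypothetical class invented for this sketch; sample `i` is simply the pair `(i, i**2)`.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "from torch.utils.data import Dataset\n",
    "\n",
    "class SquaresDataset(Dataset):\n",
    "    # Hypothetical toy dataset: sample i is the pair (i, i squared)\n",
    "\n",
    "    def __init__(self, n):\n",
    "        self.n = n\n",
    "\n",
    "    def __len__(self):\n",
    "        return self.n\n",
    "\n",
    "    def __getitem__(self, index):\n",
    "        x = torch.tensor(float(index))\n",
    "        return x, x ** 2\n",
    "\n",
    "ds = SquaresDataset(5)\n",
    "print(len(ds))     # calls __len__\n",
    "print(ds[3])       # calls __getitem__(3)\n",
    "```\n",
    "\n",
    "Indexing and `len()` are all a DataLoader needs from a map-style Dataset like this one."
   ],
   "metadata": {
    "id": "sketch_toy_dataset"
   }
  },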
  {
   "cell_type": "markdown",
   "source": [
    "## 6. Using DataLoader\n",
    "\n",
    "Key `DataLoader` parameters:\n",
    "\n",
    "| Parameter | Description |\n",
    "|-----------|-------------|\n",
    "| `batch_size` | How many samples per batch |\n",
    "| `shuffle` | Randomise order each epoch (use `True` for training, `False` for validation) |\n",
    "| `num_workers` | Number of parallel processes for loading (0 = main process only) |\n",
    "\n",
    "Fill in sensible values for a training DataLoader:"
   ],
   "metadata": {
    "id": "section6"
   }
  },
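  {
   "cell_type": "markdown",
   "source": [
    "How `batch_size` determines the number of batches can be checked on a small synthetic dataset. The sketch also uses `drop_last`, a `DataLoader` parameter that is not listed in the table above, to show what happens to an incomplete final batch.\n",
    "\n",
    "```python\n",
    "import math\n",
    "import torch\n",
    "from torch.utils.data import TensorDataset, DataLoader\n",
    "\n",
    "dataset = TensorDataset(torch.arange(100))\n",
    "\n",
    "loader = DataLoader(dataset, batch_size=32, shuffle=True)\n",
    "print(len(loader), math.ceil(100 / 32))    # 4 batches; the last one holds 4 samples\n",
    "\n",
    "loader_drop = DataLoader(dataset, batch_size=32, shuffle=True, drop_last=True)\n",
    "print(len(loader_drop))                    # 3 batches; the incomplete one is dropped\n",
    "\n",
    "sizes = [batch[0].shape[0] for batch in loader]\n",
    "print(sizes)\n",
    "```\n",
    "\n",
    "Shuffling changes which samples land in each batch, not how many batches there are."
   ],
   "metadata": {
    "id": "sketch_batch_count"
   }
  },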
  {
   "cell_type": "code",
   "source": [
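    "# from torchvision import transforms\n",
    "\n",
    "# # Resize + ToTensor so every sample is a tensor of the same shape;\n",
    "# # the default collate function cannot batch raw PIL images.\n",
    "# to_tensor = transforms.Compose([transforms.Resize((128, 128)), transforms.ToTensor()])\n",
    "# dataset = CustomImageDataset(image_dir=\"path/to/images\", transform=to_tensor)\n",
    "\n",
    "# dataloader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=4)\n",
    "\n",
    "# for batch in dataloader:\n",
    "#     print(batch.shape)  # torch.Size([32, 3, 128, 128]) for a full batch\n",
    "#     break\n",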
    "\n",
    "print(\"Uncomment the code above once you have a real image directory.\")\n"
   ],
   "metadata": {
    "id": "dataloader"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 7. Loading a Built-in Dataset (MNIST)\n",
    "\n",
    "`torchvision.datasets` provides ready-made loaders for many standard datasets (MNIST, CIFAR-10, ImageNet, …). Several of them, including MNIST, can be downloaded automatically, which makes this the quickest way to get started without managing files yourself.\n",
    "\n",
    "Before loading, we define a **transform pipeline** — a sequence of preprocessing steps applied to every sample on the fly.\n",
    "\n",
    "Fill in the correct transform names and the expected dataset length:"
   ],
   "metadata": {
    "id": "section7"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "import torchvision.transforms as transforms\n",
    "from torchvision import datasets\n",
    "\n",
    "transform = transforms.Compose([\n",
    "    transforms.ToTensor(),                       # PIL image -> float tensor in [0, 1]\n",
    "    transforms.Normalize(mean=[0.5], std=[0.5])  # one value per channel; maps [0, 1] to [-1, 1]\n",
    "])\n",
    "\n",
    "train_dataset = datasets.MNIST(root='data', train=True, transform=transform, download=True)\n",
    "\n",
    "train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)\n",
    "\n",
    "print(\"Training samples:\", len(train_dataset))  # Expected: 60000\n",
    "print(\"Number of batches:\", len(train_loader))\n",
    "\n",
    "images, labels = next(iter(train_loader))\n",
    "print(\"Batch shape:\", images.shape)\n",
    "print(\"Labels shape:\", labels.shape)\n"
   ],
   "metadata": {
    "id": "mnist"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 8. Data Augmentation & Preprocessing\n",
    "\n",
    "**Data augmentation** artificially increases the diversity of your training set by applying random transformations — without collecting new data. Common benefits:\n",
    "- Reduces overfitting\n",
    "- Makes the model more robust to real-world variation (different lighting, orientations, etc.)\n",
    "\n",
    "⚠️ Important: augmentations should only be applied to the **training set**, not to validation/test sets.\n",
    "\n",
    "Fill in the missing transform names:"
   ],
   "metadata": {
    "id": "section8"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "import torchvision.transforms as transforms\n",
    "\n",
    "train_transform = transforms.Compose([\n",
    "    transforms.Resize((128, 128)),\n",
    "    transforms.RandomHorizontalFlip(),\n",
    "    transforms.RandomRotation(degrees=15),\n",
    "    transforms.ColorJitter(brightness=0.2, contrast=0.2),\n",
    "    transforms.ToTensor(),\n",
    "    transforms.Normalize(mean=[0.5], std=[0.5])\n",
    "])\n",
    "\n",
    "val_transform = transforms.Compose([\n",
    "    transforms.Resize((128, 128)),\n",
    "    transforms.ToTensor(),\n",
    "    transforms.Normalize(mean=[0.5], std=[0.5])\n",
    "])\n"
   ],
   "metadata": {
    "id": "augmentation"
   },
   "execution_count": null,
   "outputs": []
  },
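  {
   "cell_type": "markdown",
   "source": [
    "Random transforms are re-sampled every time a sample is accessed, so the same image can look different from one epoch to the next. The sketch below imitates that behaviour with plain `torch`. `random_hflip` is a hypothetical stand-in written for this example, not the torchvision implementation.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "def random_hflip(img, p=0.5):\n",
    "    # Hypothetical stand-in for a random flip: mirror the last (width) axis with probability p\n",
    "    if torch.rand(1).item() < p:\n",
    "        return torch.flip(img, dims=[-1])\n",
    "    return img\n",
    "\n",
    "torch.manual_seed(0)                           # only to make the sketch repeatable\n",
    "img = torch.arange(6.0).reshape(1, 2, 3)       # a tiny (C, H, W) image\n",
    "\n",
    "views = [random_hflip(img) for _ in range(8)]  # eight accesses to the same sample\n",
    "n_flipped = sum(not torch.equal(v, img) for v in views)\n",
    "print(n_flipped, 'of 8 views were flipped')\n",
    "```\n",
    "\n",
    "This is also why the validation pipeline above contains no random steps: evaluation should see the same input every time."
   ],
   "metadata": {
    "id": "sketch_random_aug"
   }
  },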
  {
   "cell_type": "markdown",
   "source": [
    "### Applying Transforms and Exploring the Dataset\n",
    "\n",
    "Let's make this concrete. Below we apply three different transform pipelines to MNIST and compare what comes out:\n",
    "\n",
    "1. **No transform** — raw PIL image\n",
    "2. **Minimal** — `ToTensor()` only\n",
    "3. **Full augmentation** — resize, flip, rotation, normalise\n",
    "\n",
    "We then inspect dataset size, batch shapes, pixel value ranges, and visualise a sample batch."
   ],
   "metadata": {
    "id": "apply_transforms_md"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "import torch\n",
    "import torchvision.transforms as transforms\n",
    "from torchvision import datasets\n",
    "from torch.utils.data import DataLoader\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "raw_dataset = datasets.MNIST(root='data', train=True, transform=None, download=True)\n",
    "\n",
    "minimal_transform = transforms.Compose([\n",
    "    transforms.ToTensor()\n",
    "])\n",
    "minimal_dataset = datasets.MNIST(root='data', train=True, transform=minimal_transform, download=False)\n",
    "\n",
    "augment_transform = transforms.Compose([\n",
    "    transforms.Resize((32, 32)),\n",
    "    transforms.RandomHorizontalFlip(),  # demonstration only: mirroring changes the meaning of most digits\n",
    "    transforms.RandomRotation(degrees=15),\n",
    "    transforms.ToTensor(),\n",
    "    transforms.Normalize(mean=[0.5], std=[0.5])\n",
    "])\n",
    "augmented_dataset = datasets.MNIST(root='data', train=True, transform=augment_transform, download=False)\n",
    "\n",
    "test_dataset = datasets.MNIST(root='data', train=False, transform=minimal_transform, download=False)\n",
    "\n",
    "print(\"=== Dataset sizes ===\")\n",
    "print(f\"Training samples : {len(minimal_dataset):,}\")\n",
    "print(f\"Test samples     : {len(test_dataset):,}\")\n",
    "print(f\"Total            : {len(minimal_dataset) + len(test_dataset):,}\")\n",
    "print(f\"Classes          : {minimal_dataset.classes}\")\n",
    "print(f\"Number of classes: {len(minimal_dataset.classes)}\")\n"
   ],
   "metadata": {
    "id": "apply_transforms_code1"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "code",
   "source": [
    "print(\"=== Single sample inspection ===\")\n",
    "\n",
    "raw_img, label = raw_dataset[0]\n",
    "print(f\"Raw PIL image  : type={type(raw_img).__name__}, size={raw_img.size}, label={label}\")\n",
    "\n",
    "min_img, label = minimal_dataset[0]\n",
    "print(f\"Minimal tensor : shape={min_img.shape}, dtype={min_img.dtype}\")\n",
    "print(f\"                 min={min_img.min():.3f}, max={min_img.max():.3f}\")\n",
    "\n",
    "aug_img, label = augmented_dataset[0]\n",
    "print(f\"Augmented tensor: shape={aug_img.shape}, dtype={aug_img.dtype}\")\n",
    "print(f\"                  min={aug_img.min():.3f}, max={aug_img.max():.3f}\")\n",
    "print()\n",
    "# Q: Why is the min of the augmented tensor negative but the minimal tensor's min is 0?\n",
    "# A: The augmented pipeline applies Normalize(mean=0.5, std=0.5) which shifts [0,1] to [-1,1].\n"
   ],
   "metadata": {
    "id": "apply_transforms_code2"
   },
   "execution_count": null,
   "outputs": []
  },
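  {
   "cell_type": "markdown",
   "source": [
    "The answer to the question in the cell above comes down to simple arithmetic: `Normalize(mean, std)` computes `(x - mean) / std` per channel. The sketch below reproduces that arithmetic with plain tensors, using made-up pixel values, and then inverts it.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "mean, std = 0.5, 0.5\n",
    "pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])   # values in [0, 1], as ToTensor() produces\n",
    "\n",
    "normalised = (pixels - mean) / std             # same arithmetic as Normalize(mean, std)\n",
    "print(normalised)                              # 0 -> -1, 0.5 -> 0, 1 -> 1\n",
    "\n",
    "restored = normalised * std + mean             # inverse, useful before plotting\n",
    "print(torch.allclose(restored, pixels))\n",
    "```\n",
    "\n",
    "Multiplying by `std` and adding `mean` brings the values back to `[0, 1]`, which is the range image plotting functions typically expect for float data."
   ],
   "metadata": {
    "id": "sketch_normalize_math"
   }
  },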
  {
   "cell_type": "code",
   "source": [
    "loader_minimal   = DataLoader(minimal_dataset,   batch_size=32, shuffle=True)\n",
    "loader_augmented = DataLoader(augmented_dataset, batch_size=32, shuffle=True)\n",
    "\n",
    "imgs_min, labels = next(iter(loader_minimal))\n",
    "print(f\"Minimal   batch — images: {imgs_min.shape}, labels: {labels.shape}\")\n",
    "# Expected shape: (32, 1, 28, 28)\n",
    "\n",
    "imgs_aug, labels = next(iter(loader_augmented))\n",
    "print(f\"Augmented batch — images: {imgs_aug.shape}, labels: {labels.shape}\")\n",
    "# Expected shape: (32, 1, 32, 32)\n",
    "\n",
    "print(f\"\\nBatches per epoch (batch_size=32): {len(loader_minimal)}\")\n",
    "\n",
    "# Class distribution within the augmented batch drawn above\n",
    "unique, counts = torch.unique(labels, return_counts=True)\n",
    "for cls, cnt in zip(unique.tolist(), counts.tolist()):\n",
    "    print(f\"  Class {cls}: {cnt} sample(s)\")\n"
   ],
   "metadata": {
    "id": "apply_transforms_code3"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "code",
   "source": [
    "imgs, labels = next(iter(loader_augmented))\n",
    "\n",
    "fig, axes = plt.subplots(2, 8, figsize=(14, 4))\n",
    "fig.suptitle(\"Sample batch — augmented MNIST\", fontsize=11)\n",
    "\n",
    "for i, ax in enumerate(axes.flat):\n",
    "    img = imgs[i].squeeze().numpy()\n",
    "    img = (img * 0.5) + 0.5  # undo Normalize(mean=0.5, std=0.5) so pixels are back in [0, 1]\n",
    "    ax.imshow(img, cmap='gray', vmin=0, vmax=1)\n",
    "    ax.set_title(str(labels[i].item()), fontsize=9)\n",
    "    ax.axis('off')\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()\n"
   ],
   "metadata": {
    "id": "apply_transforms_code4"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Summary\n",
    "\n",
    "| Concept | Key class / function |\n",
    "|---------|---------------------|\n",
    "| Create tensor | `torch.tensor()`, `torch.rand()`, `torch.zeros()`, `torch.ones()` |\n",
    "| Reshape tensor | `.reshape()`, `.view()`, `.squeeze()`, `.unsqueeze()`, `.permute()` |\n",
    "| Move to GPU | `.to(device)` |\n",
    "| Custom dataset | Subclass `Dataset`, implement `__len__` and `__getitem__` |\n",
    "| Load in batches | `DataLoader(dataset, batch_size=…, shuffle=…)` |\n",
    "| Preprocessing | `transforms.Compose([…])` |\n",
    "\n",
    "**Next notebook:** building neural networks with `nn.Module` and writing a training loop."
   ],
   "metadata": {
    "id": "summary"
   }
  }
 ]
}