{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Object Segmentations\n\nEvery shared stimulus image is accompanied by object-level\nsegmentation masks: for each noun that the upstream detector found in\nthe image, there is one binary ``(1000, 1000)`` mask per detected\ninstance of that noun. For example, an image of a person playing\npiano carries masks for ``\"hand\"`` (4 instances), ``\"piano\"`` (1\ninstance), and so on.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>Segmentations are provided <strong>for the shared stimulus set only</strong>\n   (the 1,492 images viewed by every subject). Subject-unique images\n   do not carry masks. Use <code>sub.segmentations.has_image(trial)</code> to\n   check before retrieval; <code>nouns()</code> and <code>for_image()</code> safely\n   return empty results for uncovered images.</p></div>\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Bind the quickstart's data directory\n\nThis example reuses the same data directory as the quickstart\nexample (``plot_01_quickstart``); no functional data is needed beyond\nthe stimulus images. ``download_segmentations()`` pulls the\ndataset-wide segmentation HDF5 file and its metadata CSV (a few MB total)\nthe first time it runs and is a no-op afterwards.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import os\n\nfrom laion_fmri.config import dataset_initialize\nfrom laion_fmri.download import download_segmentations\n\ndata_dir = os.environ.get(\n    \"LAION_FMRI_EXAMPLE_DATA_DIR\",\n    os.path.join(os.getcwd(), \"laion_fmri_quickstart\"),\n)\nos.makedirs(data_dir, exist_ok=True)\ndataset_initialize(data_dir)\n\n# Segmentations are a dataset-wide derivative; pull them on the\n# first run. Idempotent -- a no-op once the local files are\n# already present.\ndownload_segmentations()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Browsing masks from the stimulus side\n\n``stim.segmentations`` exposes three accessors for an image:\n``nouns(image)`` returns the noun list, ``for_image(image)``\nreturns the per-mask metadata rows (one row per detected\ninstance, with score and bounding info), and\n``get(image, noun)`` returns a single binary ``(1000, 1000)``\n``uint8`` mask. The cell below exercises each in turn.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import laion_fmri\n\nstim = laion_fmri.load_stimuli()\n\nimage_name = \"shared_12rep_LAION_cluster_1003_i0.jpg\"\n\n# Which nouns appear in this image?\nnouns = stim.segmentations.nouns(image_name)\nprint(f\"Nouns in {image_name}: {nouns}\")\n\n# All masks for one image, as a metadata slice (one row per mask):\ndf = stim.segmentations.for_image(image_name)\nprint(df[[\"noun\", \"instance_id\", \"score\", \"localized\"]].head())\n\n# Fetch a single mask -- shape (1000, 1000), dtype uint8, values in {0, 1}:\nmask = stim.segmentations.get(image_name, nouns[0])\nprint(f\"\\n'{nouns[0]}' mask: shape={mask.shape}, dtype={mask.dtype}, \"\n      f\"covered pixels={int(mask.sum())}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Overlaying a mask on the image\n\nThe block below tints mask pixels with a soft red, then renders\nthe original image and the tinted overlay side by side. The\nmatplotlib render is commented out so the gallery doesn't\nredistribute stimulus content -- uncomment it to inspect the\noverlay locally.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import numpy as np\n\nimg = np.array(stim.images.get(image_name))\noverlay = img.copy()\n# Boolean selector for the masked pixels, computed once and reused.\nselected = mask == 1\n# Soft red tint where the mask is set: 55% original pixel, 45% red.\noverlay[selected] = (\n    0.55 * img[selected] + 0.45 * np.array([230, 25, 75])\n).astype(np.uint8)\n\n# import matplotlib.pyplot as plt\n# fig, axes = plt.subplots(1, 2, figsize=(10, 5))\n# axes[0].imshow(img)\n# axes[0].set_title(\"original\")\n# axes[0].axis(\"off\")\n# axes[1].imshow(overlay)\n# axes[1].set_title(f\"'{nouns[0]}' mask overlay\")\n# axes[1].axis(\"off\")\n# plt.tight_layout()\n# plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Subject-level access: masks per trial\n\nOn the subject side, segmentations are addressed by **trial index**\n(rows of ``sub.metadata``). Because masks ship only for the shared\nstimulus set, ``nouns()`` returns ``[]`` for any trial whose image\nwas a subject-unique stimulus.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "sub = laion_fmri.load_subject(\"sub-01\")\n\nn_covered = sum(\n    sub.segmentations.has_image(t) for t in range(len(sub.metadata))\n)\nprint(f\"Trials whose image carries masks: {n_covered} / {len(sub.metadata)}\")\n\n# What nouns did sub-01 see across their first 5 trials?\nfor trial in range(5):\n    print(f\"  trial {trial}: {sub.segmentations.nouns(trial)}\")"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.12.13"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}