Object Segmentations¶

Every shared stimulus image is accompanied by object-level segmentation masks: for each noun that the upstream detector found in the image, there is one binary (1000, 1000) mask per detected instance of that noun. For example, an image of a person playing piano carries masks for "hand" (4 instances), "piano" (1 instance), and so on.

Note

Segmentations are provided for the shared stimulus set only (the 1,492 images viewed by every subject). Subject-unique images do not carry masks. Use sub.segmentations.has_image(trial) to check before retrieval; nouns() and for_image() safely return empty results for uncovered images.

Bind the quickstart’s data directory¶

This script reuses the same data directory as Quick Start; no functional data is needed beyond the stimulus images. download_segmentations() pulls the dataset-wide segmentation HDF5 + metadata CSV (a few MB total) the first time it runs and is a no-op afterwards.

import os

from laion_fmri.config import dataset_initialize
from laion_fmri.download import download_segmentations

data_dir = os.environ.get(
    "LAION_FMRI_EXAMPLE_DATA_DIR",
    os.path.join(os.getcwd(), "laion_fmri_quickstart"),
)
os.makedirs(data_dir, exist_ok=True)
dataset_initialize(data_dir)

# Segmentations are a dataset-wide derivative; pull them on the
# first run. Idempotent -- a no-op once the local files are
# already present.
download_segmentations()

[laion-fmri] segmentations already up to date.

{'h5': PosixPath('/path/to/laion-fmri-data/stimuli/task-images_desc-segmentations.h5'), 'metadata': PosixPath('/path/to/laion-fmri-data/stimuli/task-images_desc-segmentations_metadata.csv')}

Browsing masks from the stimulus side¶

stim.segmentations exposes three accessors for an image: nouns(image) returns the noun list, for_image(image) returns the per-mask metadata rows (one row per detected instance, with score and bounding info), and get(image, noun) returns a single binary (1000, 1000) uint8 mask. The cell below exercises each in turn.

import laion_fmri

stim = laion_fmri.load_stimuli()

image_name = "shared_12rep_LAION_cluster_1003_i0.jpg"

# Which nouns appear in this image?
nouns = stim.segmentations.nouns(image_name)
print(f"Nouns in {image_name}: {nouns}")

# All masks for one image, as a metadata slice (one row per mask):
df = stim.segmentations.for_image(image_name)
print(df[["noun", "instance_id", "score", "localized"]].head())

# Fetch a single mask -- shape (1000, 1000), dtype uint8, values in {0, 1}:
mask = stim.segmentations.get(image_name, nouns[0])
print(f"\n'{nouns[0]}' mask: shape={mask.shape}, dtype={mask.dtype}, "
      f"covered pixels={int(mask.sum())}")

Nouns in shared_12rep_LAION_cluster_1003_i0.jpg: ['delicate floral nail art', 'fingers', 'hand', 'short , pale pink polished nails']
                       noun  instance_id     score  localized
0  delicate floral nail art            0  0.772657          1
1  delicate floral nail art            2  0.795340          1
2                   fingers            0  0.889126          1
3                   fingers            2  0.751304          1
4                   fingers            4  0.917879          1

'delicate floral nail art' mask: shape=(1000, 1000), dtype=uint8, covered pixels=12499

Overlaying a mask on the image¶

The block below tints mask pixels with a soft red, then renders the original image and the tinted overlay side-by-side. The matplotlib render is commented out so the gallery doesn’t redistribute stimulus content – uncomment it to inspect the overlay locally.

import numpy as np

img = np.array(stim.images.get(image_name))
overlay = img.copy()
# Soft red tint where the mask is set.
overlay[mask == 1] = (
    0.55 * img[mask == 1] + 0.45 * np.array([230, 25, 75])
).astype(np.uint8)

# import matplotlib.pyplot as plt
# fig, axes = plt.subplots(1, 2, figsize=(10, 5))
# axes[0].imshow(img)
# axes[0].set_title("original")
# axes[0].axis("off")
# axes[1].imshow(overlay)
# axes[1].set_title(f"'{nouns[0]}' mask overlay")
# axes[1].axis("off")
# plt.tight_layout()
# plt.show()

Subject-level access: masks per trial¶

On the subject side, segmentations are addressed by trial index (rows of sub.metadata). Because masks ship only for the shared stimulus set, nouns() returns [] for any trial whose image was a subject-unique stimulus.

sub = laion_fmri.load_subject("sub-01")

n_covered = sum(
    sub.segmentations.has_image(t) for t in range(len(sub.metadata))
)
print(f"Trials whose image carries masks: {n_covered} / {len(sub.metadata)}")

# What nouns did sub-01 see across their first 5 trials?
for trial in range(5):
    print(f"  trial {trial}: {sub.segmentations.nouns(trial)}")

Trials whose image carries masks: 469 / 1044
  trial 0: []
  trial 1: []
  trial 2: ['sky', 'palm trees']
  trial 3: []
  trial 4: []

Total running time of the script: (0 minutes 2.194 seconds)

Gallery generated by Sphinx-Gallery