laion_fmri.segmentations

Access per-stimulus object-segmentation masks for the LAION-fMRI stimuli.

The segmentations describe what objects appear in each stimulus and where: every stimulus has zero or more nouns associated with it, and every noun has one or more spatial masks (one per detected instance of that noun in the image). For example, an image of a person playing piano might carry masks for "hand" (4 instances), "piano" (1 instance), and "sheet music" (1 instance).

Files on disk:

stimuli/
  task-images_desc-segmentations.h5            (N, H, W) uint8, gzip+shuffle
  task-images_desc-segmentations_metadata.csv  one row per mask

The HDF5 file holds a single dataset, masks, of shape (N, 1000, 1000), stored as binary uint8.

The CSV columns are mask_row, image_name, noun, instance_id, score, box_x0, box_y0, box_x1, box_y1, localized, mask_file. localized is 1 when the detector returned a bounding box and the mask covers less than 99% of the image, and 0 otherwise. Filter on it to drop "concept present but not localized" entries.
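The rule behind the localized flag can be sketched in a few lines of NumPy. This is only an illustration of the rule as stated above, not the code that produced the shipped CSV; the helper names coverage and is_localized are hypothetical, and the small 10 x 10 masks stand in for the real 1000 x 1000 ones.

```python
import numpy as np


def coverage(mask: np.ndarray) -> float:
    """Fraction of pixels that are set in a binary mask."""
    return float(np.count_nonzero(mask)) / mask.size


def is_localized(mask: np.ndarray, has_box: bool, max_coverage: float = 0.99) -> bool:
    """Hypothetical re-statement of the documented rule:
    a bounding box exists AND the mask covers less than 99% of the image."""
    return bool(has_box) and coverage(mask) < max_coverage


# Small synthetic masks (assumption: stand-ins for real dataset rows).
partial = np.zeros((10, 10), dtype=np.uint8)
partial[2:6, 3:7] = 1                      # 16 of 100 pixels set
full = np.ones((10, 10), dtype=np.uint8)   # covers the whole image

flags = (
    is_localized(partial, has_box=True),   # box + small mask
    is_localized(full, has_box=True),      # box, but full-image mask
    is_localized(partial, has_box=False),  # no box at all
)
```

Only the first case counts as localized under this reading; a full-image mask or a missing box both yield 0.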

Quick start

>>> import laion_fmri
>>> stim = laion_fmri.load_stimuli()
>>> stim.segmentations.nouns("shared_12rep_LAION_cluster_1003_i0.jpg")
['fingers', 'hand', 'pullover', ...]
>>> mask = stim.segmentations.get(
...     "shared_12rep_LAION_cluster_1003_i0.jpg", "fingers", instance=0
... )
>>> mask.shape, mask.dtype
((1000, 1000), dtype('uint8'))
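Once a mask is in hand, it behaves like any other (H, W) uint8 array. The sketch below shows two common follow-ups, computing a tight bounding box and blanking out everything outside the mask. It uses a synthetic mask and a synthetic white image rather than real data, and the helpers mask_bbox and apply_mask are hypothetical names, not part of laion_fmri. The half-open (x0, y0, x1, y1) convention is a choice made here and may differ from the box_* columns in the metadata.

```python
import numpy as np


def mask_bbox(mask: np.ndarray):
    """Tight bounding box (x0, y0, x1, y1) of a binary mask, half-open on x1/y1.

    Returns None for an empty mask.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1


def apply_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out every pixel of an (H, W, C) image that lies outside the mask."""
    return image * mask[..., None].astype(image.dtype)


# Synthetic stand-ins with the documented mask shape and dtype.
mask = np.zeros((1000, 1000), dtype=np.uint8)
mask[200:400, 300:600] = 1
image = np.full((1000, 1000, 3), 255, dtype=np.uint8)

bbox = mask_bbox(mask)
masked = apply_mask(image, mask)
```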

Classes

Segmentations([data_dir])

Lazy reader for the per-stimulus segmentation masks.

class laion_fmri.segmentations.Segmentations(data_dir=None)

Bases: object

Lazy reader for the per-stimulus segmentation masks.

Opens the HDF5 file once on first access and keeps the handle open for the lifetime of the instance. Use as a context manager to release the handle explicitly:

with Segmentations() as seg:
    arr = seg.get("img.jpg", "hand")

Parameters:

data_dir (str or Path, optional) – Override the configured data directory. Defaults to laion_fmri.config.get_data_dir().
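The open-on-first-access, close-on-exit behavior described above is a general pattern. The toy class below illustrates it with a plain binary file instead of HDF5; LazyReader and its attributes are hypothetical and say nothing about how Segmentations is actually implemented.

```python
import os
import tempfile


class LazyReader:
    """Toy lazy reader: the file is opened on first use, not in __init__."""

    def __init__(self, path):
        self._path = path
        self._fh = None  # no handle until something is read

    @property
    def handle(self):
        # Open once, then reuse the same handle for later reads.
        if self._fh is None:
            self._fh = open(self._path, "rb")
        return self._fh

    def read(self, n):
        return self.handle.read(n)

    def close(self):
        # Safe to call more than once.
        if self._fh is not None:
            self._fh.close()
            self._fh = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    with open(path, "wb") as f:
        f.write(b"mask-bytes")

    reader = LazyReader(path)
    opened_before = reader._fh is not None   # nothing opened yet
    with reader:
        first = reader.read(4)               # triggers the open
        opened_during = reader._fh is not None
    opened_after = reader._fh is not None    # released on exit
```

Outside a with block, the same effect is obtained by calling close() explicitly when done.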

close() → None

Release the HDF5 handle.

for_image(image_name: str) → DataFrame

Metadata slice (all masks) for one image.

Returns an empty DataFrame (not an error) when the image has no segmentations – which is the case for all subject-unique stimuli, since masks ship only for the shared set.

get(image_name: str, noun: str, instance: int = 0) → ndarray

Return the mask for (image, noun, instance).

Parameters:
  • image_name (str) – Stimulus filename (e.g. "shared_..._1003_i0.jpg").

  • noun (str) – One of the nouns associated with image_name (see nouns()).

  • instance (int, default 0) – Which detected instance of noun. 0 is the highest-scored detection.

Returns:

(H, W) uint8 binary mask.

Return type:

np.ndarray

Raises:

KeyError – If the image has no segmentations (e.g. it’s a subject-unique image; only shared stimuli are covered) or the requested (noun, instance) doesn’t exist.
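Because get() raises KeyError for uncovered images and for missing (noun, instance) pairs, callers that loop over mixed shared and subject-unique stimuli may prefer a wrapper that returns None instead. The sketch below is one way to write it; get_mask_or_none is a hypothetical helper, and _StubSegmentations is a stand-in used only so the example runs without the dataset. It mimics nothing beyond the documented get() signature and KeyError behavior.

```python
def get_mask_or_none(seg, image_name, noun, instance=0):
    """Return seg.get(...) or None when the mask does not exist."""
    try:
        return seg.get(image_name, noun, instance=instance)
    except KeyError:
        return None


class _StubSegmentations:
    """Stand-in for demonstration only; not the real reader."""

    def __init__(self, masks):
        self._masks = masks  # {(image_name, noun, instance): mask}

    def get(self, image_name, noun, instance=0):
        # A missing key raises KeyError, like the documented behavior.
        return self._masks[(image_name, noun, instance)]


# Invented entry; the nested list stands in for an (H, W) array.
stub = _StubSegmentations({("a.jpg", "hand", 0): [[0, 1], [1, 1]]})

found = get_mask_or_none(stub, "a.jpg", "hand")
missing_image = get_mask_or_none(stub, "subject_unique.jpg", "hand")
missing_instance = get_mask_or_none(stub, "a.jpg", "hand", instance=5)
```

With the real reader, checking has_image() first distinguishes "image not covered" from "noun or instance not present", which this wrapper deliberately collapses into one case.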

has_image(image_name: str) → bool

True if image_name has at least one segmentation mask.

Masks ship only for the shared stimulus set, so this returns False for any subject-unique image.

images() → list[str]

All image names that have at least one mask, in metadata order.

property metadata: DataFrame

One row per mask.

Columns: mask_row, image_name, noun, instance_id, score, box_x0, box_y0, box_x1, box_y1, localized, mask_file.
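Since the metadata is a regular DataFrame with one row per mask, ordinary pandas operations apply. As a rough sketch, the snippet below counts localized instances per (image, noun) pair. The table is built by hand with invented values and only a subset of the documented columns; with real data, the metadata property would supply the frame.

```python
import pandas as pd

# Invented rows using a subset of the documented columns.
meta = pd.DataFrame(
    {
        "mask_row": [0, 1, 2, 3],
        "image_name": ["a.jpg", "a.jpg", "a.jpg", "b.jpg"],
        "noun": ["hand", "hand", "piano", "dog"],
        "instance_id": [0, 1, 0, 0],
        "score": [0.9, 0.7, 0.8, 0.6],
        "localized": [1, 1, 1, 0],
    }
)

# Keep only masks flagged as localized, then count instances per noun.
localized = meta[meta["localized"] == 1]
counts = localized.groupby(["image_name", "noun"]).size()
counts_dict = counts.to_dict()
```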

nouns(image_name: str, localized_only: bool = True) → list[str]

Nouns detected in image_name.

Returns an empty list (not an error) when the image has no segmentations – the case for all subject-unique stimuli, since masks ship only for the shared set.

Parameters:
  • image_name (str)

  • localized_only (bool, default True) – If True, drop nouns for which no detection was localized (no bounding box, or a full-image mask). Set to False to include them.
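One reading of the localized_only semantics is that a noun is kept when at least one of its rows has localized equal to 1. The function below sketches that reading over plain dict rows; nouns_for_image is a hypothetical helper, the rows are invented, and returning the nouns in sorted order is a choice made here rather than documented behavior.

```python
def nouns_for_image(rows, image_name, localized_only=True):
    """Nouns for one image, optionally dropping never-localized ones."""
    any_localized = {}  # noun -> True if any detection was localized
    for row in rows:
        if row["image_name"] != image_name:
            continue
        is_loc = int(row["localized"]) == 1
        any_localized[row["noun"]] = any_localized.get(row["noun"], False) or is_loc
    return sorted(
        noun
        for noun, loc in any_localized.items()
        if loc or not localized_only
    )


# Invented metadata rows (only the columns this sketch needs).
rows = [
    {"image_name": "a.jpg", "noun": "hand", "localized": 1},
    {"image_name": "a.jpg", "noun": "hand", "localized": 0},
    {"image_name": "a.jpg", "noun": "music", "localized": 0},
    {"image_name": "b.jpg", "noun": "dog", "localized": 1},
]

default_nouns = nouns_for_image(rows, "a.jpg")
all_nouns = nouns_for_image(rows, "a.jpg", localized_only=False)
no_nouns = nouns_for_image(rows, "subject_unique.jpg")
```

Here "hand" survives the default filter because one of its two detections is localized, while "music" appears only when localized_only is False.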