laion_fmri.embeddings
Access pretrained image embeddings for the LAION-fMRI stimuli.
Use after the embeddings have been downloaded via
laion_fmri.download.download_embeddings() (or
laion-fmri download-embeddings).
The embeddings live as one HDF5 file per model, sitting next to the stimulus images:
stimuli/
task-images_stimuli.h5
task-images_metadata.csv
task-images_desc-CLIP_embeddings.h5
task-images_desc-DINOv2_embeddings.h5
task-images_desc-PEcore_embeddings.h5
task-images_desc-SigLIP2_embeddings.h5
Each file has three datasets of length 25,052: embedding (the
(N, feature_dim) float16 matrix), image_ids (image filenames),
and valid (per-row validity flag). All four files share the same
image_ids order.
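As a minimal sketch of how the three datasets relate, the helper below keeps only the rows flagged as valid. It accepts any mapping that exposes the dataset names listed above, so an open h5py.File should work as well as the toy dictionary used here. The name read_valid_rows is hypothetical and not part of laion_fmri. Two things are assumptions: that a truthy valid entry means "row is usable", and that filenames may come back as bytes.

```python
import numpy as np


def read_valid_rows(store):
    """Return (image_ids, embeddings) restricted to rows flagged valid.

    `store` is any mapping with 'embedding', 'image_ids' and 'valid'
    entries that support `[...]` indexing (NumPy arrays, h5py datasets).
    Assumption: a truthy `valid` entry marks a usable row.
    """
    valid = np.asarray(store["valid"][...]).astype(bool)
    raw_ids = np.asarray(store["image_ids"][...])
    embedding = np.asarray(store["embedding"][...])

    # HDF5 string datasets are often returned as bytes; decode if needed.
    ids = np.array(
        [i.decode("utf-8") if isinstance(i, bytes) else str(i) for i in raw_ids]
    )
    return ids[valid], embedding[valid]


# Toy stand-in for one embeddings file (3 images, 4 features).
toy_store = {
    "embedding": np.arange(12, dtype=np.float16).reshape(3, 4),
    "image_ids": np.array([b"a.jpg", b"b.jpg", b"c.jpg"]),
    "valid": np.array([True, False, True]),
}
ids, emb = read_valid_rows(toy_store)
print(ids.tolist(), emb.shape)
```

With the real files, the same call should work on an open handle, for example inside `with h5py.File(path, "r") as f:`, provided h5py is installed.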
You normally do not construct Embeddings directly. Reach it
through the Stimuli hub:
>>> import laion_fmri
>>> stim = laion_fmri.load_stimuli()
>>> stim.embeddings["CLIP"].shape # (25052, 1024)
>>> stim.embeddings.get("CLIP", "shared_12rep_LAION_cluster_1003_i0.jpg")
For subject-aligned arrays, use the Subject namespace:
>>> sub = laion_fmri.load_subject("sub-01")
>>> features = sub.embeddings.all("CLIP") # (n_trials, D)
Module Attributes
- laion_fmri.embeddings.AVAILABLE_MODELS = ('CLIP', 'DINOv2', 'PEcore', 'SigLIP2')
Models shipped with the LAION-fMRI release. The label is the BIDS
desc-token used in the filename.
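To illustrate the naming convention, the sketch below maps each label to the filename pattern shown in the stimuli/ listing. The helper embedding_filename and the data/stimuli directory are hypothetical. The library resolves these paths itself, so this is only useful when inspecting the files by hand.

```python
from pathlib import Path

AVAILABLE_MODELS = ("CLIP", "DINOv2", "PEcore", "SigLIP2")


def embedding_filename(model):
    """Build the per-model HDF5 filename from its desc- label."""
    return f"task-images_desc-{model}_embeddings.h5"


# Hypothetical data directory, for illustration only.
stimuli_dir = Path("data") / "stimuli"
paths = {m: stimuli_dir / embedding_filename(m) for m in AVAILABLE_MODELS}
print(paths["CLIP"].name)
```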
Functions
- laion_fmri.embeddings.load_embeddings(models='all', data_dir=None) → Embeddings
Return a lazy embedding reader for one or more models.
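The models argument takes several shapes: the default 'all', a single label, or an iterable of labels. The sketch below shows one way such an argument could be normalized into a tuple of labels. normalize_models is a hypothetical helper and not the library's implementation; the validation against AVAILABLE_MODELS is an assumption based on the documented "subset of AVAILABLE_MODELS" constraint.

```python
AVAILABLE_MODELS = ("CLIP", "DINOv2", "PEcore", "SigLIP2")


def normalize_models(models="all"):
    """Turn 'all', a single label, or an iterable of labels into a tuple."""
    if isinstance(models, str):
        labels = AVAILABLE_MODELS if models == "all" else (models,)
    else:
        labels = tuple(models)

    # Reject labels that are not shipped with the release.
    unknown = [m for m in labels if m not in AVAILABLE_MODELS]
    if unknown:
        raise ValueError(f"unknown model label(s): {unknown}")
    return labels


print(normalize_models())
print(normalize_models("CLIP"))
print(normalize_models(["CLIP", "SigLIP2"]))
```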
Classes
- class laion_fmri.embeddings.Embeddings(models, data_dir=None)
Bases: object
Lazy reader for one or more model embedding files.
Opens each model’s HDF5 on first access and keeps the handle open for the lifetime of the instance. Use as a context manager to explicitly release the handles:
with Stimuli() as stim:
    v = stim.embeddings.get("CLIP", "img.jpg")
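The open-on-first-access pattern described here can be sketched generically. The class below is an illustration of the idea and not the Embeddings implementation: LazyHandles is a hypothetical name, and it opens plain text files so the example runs without HDF5 support. Passing something like `lambda p: h5py.File(p, "r")` as opener would be the assumed equivalent for the real files.

```python
import tempfile
from pathlib import Path


class LazyHandles:
    """Open one file per key on first access; close them all on exit."""

    def __init__(self, paths, opener=open):
        self._paths = dict(paths)
        self._opener = opener
        self._handles = {}

    def __getitem__(self, key):
        # Open lazily and cache the handle for later lookups.
        if key not in self._handles:
            self._handles[key] = self._opener(self._paths[key])
        return self._handles[key]

    def close(self):
        for handle in self._handles.values():
            handle.close()
        self._handles.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "CLIP.txt"
    path.write_text("placeholder")
    with LazyHandles({"CLIP": path}) as handles:
        opened_before = len(handles._handles)  # nothing opened yet
        handle = handles["CLIP"]               # first access opens the file
        opened_after = len(handles._handles)
    closed_after_exit = handle.closed          # released by __exit__
```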
- Parameters:
models (str or iterable[str]) – Model labels this handle covers (subset of AVAILABLE_MODELS). A single string such as "CLIP" is accepted.
data_dir (str or Path, optional) – Override the configured data directory.
- get(model: str, image_name) → ndarray
Return embedding row(s) for one or many image names.
- Parameters:
model (str) – One of AVAILABLE_MODELS.
image_name (str or sequence of str) – One image filename or a list/array of filenames.
- Returns:
(feature_dim,) if a single name was passed, otherwise (n, feature_dim) in the requested order.
- Return type:
np.ndarray
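The return-shape rule above can be made concrete with a small stand-in. The function lookup_rows below is hypothetical and works on in-memory arrays. It only mirrors the documented behaviour (one name gives a 1-D row, several names give a 2-D array in the requested order) and makes no claim about how get is implemented.

```python
import numpy as np


def lookup_rows(embedding, image_ids, image_name):
    """Return one row for a single name, or stacked rows for many names."""
    row_of = {str(name): i for i, name in enumerate(image_ids)}
    if isinstance(image_name, str):
        return embedding[row_of[image_name]]
    # Preserve the caller's order; repeated names yield repeated rows.
    rows = [row_of[str(name)] for name in image_name]
    return embedding[rows]


# Toy data: 3 images, 2 features each.
embedding = np.arange(6, dtype=np.float16).reshape(3, 2)
image_ids = ["a.jpg", "b.jpg", "c.jpg"]

single = lookup_rows(embedding, image_ids, "b.jpg")
many = lookup_rows(embedding, image_ids, ["c.jpg", "a.jpg", "c.jpg"])
print(single.shape, many.shape)
```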
- property image_ids: ndarray
Image filenames in embedding row order (shared across models).
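Because the row order is shared, features from different models can be combined column-wise without re-indexing. The sketch below uses toy arrays in place of the real matrices. The feature widths are made up, and the cast to float32 is a choice made here to avoid float16 precision loss in later arithmetic, not something the library requires.

```python
import numpy as np

# Toy stand-ins: row i of every matrix refers to image_ids[i].
image_ids = np.array(["a.jpg", "b.jpg", "c.jpg"])
features = {
    "CLIP": np.ones((3, 4), dtype=np.float16),
    "DINOv2": np.zeros((3, 2), dtype=np.float16),
}

# Stack the per-model features side by side, one row per image.
combined = np.concatenate(
    [features[m].astype(np.float32) for m in ("CLIP", "DINOv2")], axis=1
)
print(combined.shape)
```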