Quickstart

This page is a quick introduction to the LAION-fMRI dataset. For the full reference covering every download path and option, see Data Access.

Install

python -m pip install "laion-fmri @ git+https://github.com/ViCCo-Group/LAION-fMRI.git@main"

Point the package at a local data directory (the place every download will land):

mkdir -p ./laion_fmri_data
laion-fmri config --data-dir ./laion_fmri_data

Download one subject

This pulls the GLMsingle single-trial betas, noise ceilings, ROI masks, and per-session trial tables for sub-01. The bucket is public, so no AWS credentials are needed; the first call only asks you to accept the CC0 license.

laion-fmri download --subject sub-01

Re-running is safe: the downloader is idempotent and only fetches files whose local size doesn't match the bucket. Pass --n-jobs 4 for faster parallel transfers, or --ses ses-01 to narrow the download to a single session.
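
The size-based skip described above can be pictured with a short sketch. This is not the package's actual downloader: `files_to_fetch` and the `remote_sizes` mapping are hypothetical names, the file names and sizes are made up, and the real tool may apply additional checks.

```python
from pathlib import Path
import tempfile


def files_to_fetch(remote_sizes: dict[str, int], data_dir: Path) -> list[str]:
    """Return the relative paths that need (re)downloading.

    A file is fetched when it is missing locally or when its local size
    differs from the size reported for the remote copy.
    """
    pending = []
    for rel_path, remote_size in sorted(remote_sizes.items()):
        local = data_dir / rel_path
        if not local.is_file() or local.stat().st_size != remote_size:
            pending.append(rel_path)
    return pending


# Toy demonstration with made-up file names and sizes.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "complete.bin").write_bytes(b"x" * 10)  # size matches the remote
    (data_dir / "partial.bin").write_bytes(b"x" * 3)    # interrupted transfer
    remote = {"complete.bin": 10, "partial.bin": 10, "missing.bin": 5}
    pending = files_to_fetch(remote, data_dir)

print(pending)  # ['missing.bin', 'partial.bin']
```

Because a second run with all files complete yields an empty list, repeating the download costs only the size checks.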

Load betas in Python

from laion_fmri import load_subject

sub = load_subject("sub-01")

betas = sub.get_betas(session="ses-01")          # (n_trials, n_voxels)
trials = sub.get_trial_info(session="ses-01")    # pandas DataFrame

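The trials table has one row per beta volume. Its label column holds the stimulus image filename and serves as the join key to the stimulus metadata. To restrict the betas to an ROI, or to voxels above a noise-ceiling threshold, pass the corresponding argument to the same get_betas call: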

betas_ffa = sub.get_betas(session="ses-01", roi="FFA1")
betas_face = sub.get_betas(session="ses-01", roi="face")  # union of face ROIs
betas_lo = sub.get_betas(session="ses-01", roi="all", nc_threshold=0.2)
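
Conceptually, an ROI union combined with a noise-ceiling cutoff is a boolean mask over the voxel axis. The NumPy sketch below uses made-up arrays (`roi_masks`, `noise_ceiling`) and assumes that a threshold keeps voxels whose noise ceiling is strictly above it; the package's own data layout and comparison rule may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_voxels = 4, 6
betas = rng.standard_normal((n_trials, n_voxels))  # stand-in for real betas

# Toy per-voxel data; ROI membership and values are illustrative only.
roi_masks = {
    "FFA1": np.array([1, 1, 0, 0, 0, 0], dtype=bool),
    "OFA": np.array([0, 1, 1, 0, 0, 0], dtype=bool),
}
noise_ceiling = np.array([0.5, 0.1, 0.3, 0.9, 0.0, 0.25])

# Union of several ROIs: a voxel is kept if it belongs to any of them.
face_mask = np.logical_or.reduce(list(roi_masks.values()))

# Assumed rule: keep voxels whose noise ceiling is strictly above the cutoff.
reliable_mask = noise_ceiling > 0.2

# Combine both criteria and index the voxel axis.
selected = betas[:, face_mask & reliable_mask]
print(selected.shape)  # (4, 2)
```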

The full accessor grammar - categories, mask combinations, surface ROIs, streaming mode for memory-tight machines - is in Load.

Get the stimulus images

Stimulus images require accepting a Data Use Agreement. The package walks you through the form on first request and caches the resulting access token for subsequent calls:

laion-fmri download-stimuli

Then load them in Python:

from laion_fmri import load_stimuli

stim = load_stimuli()
stim.metadata.head()                                 # pandas DataFrame
img = stim.image("shared_12rep_LAION_cluster_1003_i0.jpg")  # PIL.Image
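
Because the `label` column of the trials table holds the stimulus filename, trials can be joined to the stimulus metadata with an ordinary pandas merge. The sketch below uses small stand-in DataFrames rather than the real tables; the metadata column names `filename` and `category` are assumptions, so inspect `stim.metadata.columns` for the actual ones.

```python
import pandas as pd

# Stand-in for sub.get_trial_info(...): one row per beta volume.
trials = pd.DataFrame({
    "label": ["example_a.jpg", "example_b.jpg", "example_a.jpg"],
    "run": [1, 1, 2],
})

# Stand-in for stim.metadata; column names are hypothetical.
metadata = pd.DataFrame({
    "filename": ["example_a.jpg", "example_b.jpg"],
    "category": ["animal", "vehicle"],
})

# A left join keeps every trial in its original order; validate= raises
# if a filename appears more than once in the metadata.
merged = trials.merge(
    metadata,
    left_on="label",
    right_on="filename",
    how="left",
    validate="many_to_one",
)
print(merged[["label", "run", "category"]])
```

Since the row order of `merged` follows `trials`, its rows stay aligned with the rows of the beta matrix for the same session.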

Pretrained image embeddings (OpenCLIP, DINOv2, PE Core, SigLIP2) are public and do not need the DUA:

laion-fmri download-embeddings

from laion_fmri import load_embeddings
emb = load_embeddings("CLIP")
emb["CLIP"][0]                                       # (1024,) float16
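
The comment above indicates that the vectors are float16. Half precision is compact but loses accuracy when many values are summed, so a common precaution is to cast to float32 before normalizing or computing similarities. This is sketched here on a random stand-in array, not on the real embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for an embedding matrix: 5 images x 1024 dimensions, float16.
emb_f16 = rng.standard_normal((5, 1024)).astype(np.float16)

# Cast up first so norms and dot products accumulate in float32.
emb = emb_f16.astype(np.float32)
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit-length rows

# Cosine similarity between all pairs of images.
similarity = emb @ emb.T
print(similarity.shape)  # (5, 5)
```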

Next steps

Dataset at a Glance - full dataset overview and “what files do I need”
GLMsingle Beta Estimates - details on the beta estimates and noise ceilings
Stimulus Set - stimulus set, image metadata, and image access
Stimulus Derivatives - embeddings, captions, and segmentations
Train / Test Splits - predefined train/test partitions
Load - full Subject API reference
FAQ - common questions