Dataset at a Glance

This page gives you a quick overview of everything in LAION-fMRI and helps you find the files you need.

The launch release covers 5 participants, 150 main image-viewing fMRI sessions (30 per participant), 25,052 distinct images, and 1,492 images shared by every participant. In total, 165 fMRI sessions were acquired (33 per participant); the supplemental sessions expand the shared image set to about 2,200 images and will be released later.
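The session counts above follow from simple per-participant arithmetic. The sketch below reproduces them, assuming that every session beyond the 30 main sessions per participant is a supplemental session and that the shared images are counted among the 25,052 distinct images.

```python
# Counts quoted in the overview above.
participants = 5
main_per_participant = 30
total_per_participant = 33

main_sessions = participants * main_per_participant    # 150 in the launch release
total_sessions = participants * total_per_participant  # 165 acquired overall

# Assumption: all sessions beyond the main ones are supplemental.
supplemental_sessions = total_sessions - main_sessions
supplemental_per_participant = total_per_participant - main_per_participant

# Share of the distinct images seen by every participant (roughly 6%).
shared_fraction = 1492 / 25052

print(main_sessions, total_sessions, supplemental_sessions)
print(f"shared fraction: {shared_fraction:.3f}")
```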

What’s Included

The dataset contains raw and preprocessed functional MRI data (see fMRI Data and Preprocessing), anatomical scans and FreeSurfer reconstructions (see Anatomical Data), single-trial GLMsingle beta estimates (see GLMsingle Beta Estimates), the stimulus set (see Stimulus Set) and derived stimulus files (see Stimulus Derivatives), ROI masks (see ROIs) derived from retinotopy (see Retinotopy) and functional localizers (see Functional Localizers), and predefined train/test splits (see Train / Test Splits).

Detailed BIDS file trees, coordinate-space tables, and per-ROI summaries will be added in an upcoming documentation update.

What Files Do I Need?

Not everyone needs the full dataset. Start from your use case below to find the relevant files and documentation pages.

Encoding / decoding models

This is the most common use case (e.g., for participants in the Algonauts challenge).

| What you need | Details |
| --- | --- |
| Single-trial beta estimates | derivatives/glmsingle-tedana/sub-XX/ - see GLMsingle Beta Estimates |
| Stimulus images & metadata | stimuli/ - see Stimulus Set |
| Stimulus derivatives | Embeddings, captions, and object segmentations - see Stimulus Derivatives |
| Train / test splits | Predefined splits for model evaluation - see Train / Test Splits |
| ROI masks (optional) | derivatives/rois/ - see ROIs |
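To show how the files listed above typically fit together, here is a minimal sketch of a linear encoding model. It uses synthetic arrays only: `X_*` stands in for stimulus embeddings and `Y_*` for single-trial betas of the same images. The array sizes, the ridge penalty, and the helper names `fit_ridge` and `voxelwise_correlation` are illustrative assumptions, not part of the dataset or its tooling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (NOT the real files): rows are images,
# X columns are embedding features, Y columns are voxels.
n_train, n_test, n_features, n_voxels = 200, 50, 16, 30
true_weights = rng.normal(size=(n_features, n_voxels))
X_train = rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_features))
Y_train = X_train @ true_weights + 0.5 * rng.normal(size=(n_train, n_voxels))
Y_test = X_test @ true_weights + 0.5 * rng.normal(size=(n_test, n_voxels))


def fit_ridge(X, Y, alpha):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)


def voxelwise_correlation(Y_true, Y_pred):
    """Pearson correlation between measured and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


# Fit on the training images, score on the held-out images.
weights = fit_ridge(X_train, Y_train, alpha=1.0)
scores = voxelwise_correlation(Y_test, X_test @ weights)
print(f"median test correlation: {np.median(scores):.2f}")
```

With real data, the train and test rows would come from the predefined splits rather than from a random generator, and the penalty would normally be chosen by cross-validation on the training set.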

RSA / pattern similarity analyses

| What you need | Details |
| --- | --- |
| Single-trial betas | derivatives/glmsingle-tedana/sub-XX/ - see GLMsingle Beta Estimates |
| Stimulus metadata & categories | stimuli/task-images_metadata.csv - see Stimulus Set |
| ROI masks | derivatives/rois/ - see ROIs |
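As a rough illustration of what these inputs are used for, the sketch below builds a representational dissimilarity matrix (RDM) from response patterns, using 1 minus the Pearson correlation as the distance. The patterns and the three category labels are synthetic and hypothetical; in practice they would be ROI-masked betas and categories taken from the stimulus metadata.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for ROI-masked betas: 3 hypothetical categories,
# 10 images each, 50 voxels. Images of a category share a prototype pattern.
n_per_category, n_voxels = 10, 50
prototypes = rng.normal(size=(3, n_voxels))
labels = np.repeat(np.arange(3), n_per_category)
patterns = prototypes[labels] + 0.5 * rng.normal(size=(labels.size, n_voxels))

# RDM entry (i, j) = 1 - Pearson correlation between the patterns of images i and j.
rdm = 1.0 - np.corrcoef(patterns)

# Compare mean dissimilarity within and between categories.
same_category = labels[:, None] == labels[None, :]
off_diagonal = ~np.eye(labels.size, dtype=bool)
within = rdm[same_category & off_diagonal].mean()
between = rdm[~same_category].mean()
print(f"within: {within:.2f}, between: {between:.2f}")
```

Correlation distance is only one common choice; other dissimilarity measures slot into the same structure.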

Retinotopic or localizer analyses

| What you need | Details |
| --- | --- |
| Retinotopic maps | derivatives/retinotopy/ - see Retinotopy |
| Functional localizer contrasts | derivatives/localizers/ - see Functional Localizers |
| ROI masks | derivatives/rois/ - see ROIs |
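An ROI mask is usually applied by boolean indexing into a volume. The sketch below assumes a binary 3D mask on the same voxel grid as a 4D array whose last axis indexes trials; both arrays are synthetic, and the grid size and mask location are made up for illustration. Whether the released masks are volumetric and share a grid with a given derivative should be checked on the ROIs page.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: a 4D array (x, y, z, trials) and a 3D boolean mask.
betas = rng.normal(size=(4, 5, 6, 10)).astype(np.float32)
roi_mask = np.zeros((4, 5, 6), dtype=bool)
roi_mask[1:3, 2:4, 0:2] = True  # 2 * 2 * 2 = 8 voxels in this toy ROI

# The mask must match the spatial grid of the data it is applied to.
assert betas.shape[:3] == roi_mask.shape

# Indexing with a 3D mask collapses the spatial axes: (n_voxels, n_trials).
roi_betas = betas[roi_mask]

# Transpose to the (n_trials, n_voxels) layout most analysis code expects.
roi_betas = roi_betas.T
print(roi_betas.shape)
```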

Preprocessing from scratch

Use these files if you want to run your own preprocessing pipeline instead of using the provided NORDIC/tedana outputs.

| What you need | Details |
| --- | --- |
| Raw BOLD data | sub-XX/func/ - see fMRI Data |
| T1w anatomical scans | sub-XX/anat/ - see Anatomical Data |
| Event timing files | sub-XX/func/*_events.tsv - see Experimental Design |
| Diffusion data (if needed) | sub-XX/dwi/ - see Diffusion Data |
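Event timing files are tab-separated tables, so the standard library is enough to read them. The sketch below parses a small in-memory example; `onset` and `duration` are the columns BIDS requires, while the `trial_type` values and all timings are made up for illustration. The actual columns are described under Experimental Design.

```python
import csv
import io

# Made-up sample in the shape of a BIDS events file (NOT real dataset contents).
sample_events = (
    "onset\tduration\ttrial_type\n"
    "0.0\t2.5\timage\n"
    "4.0\t2.5\timage\n"
    "8.0\t2.5\tblank\n"
)


def read_events(handle):
    """Parse a tab-separated events table into a list of dicts with float timings."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "onset": float(row["onset"]),
            "duration": float(row["duration"]),
            "trial_type": row["trial_type"],
        }
        for row in reader
    ]


events = read_events(io.StringIO(sample_events))
image_onsets = [e["onset"] for e in events if e["trial_type"] == "image"]
print(image_onsets)
```

To read a file from disk, open it with `open(path, newline="")` and pass the handle to `read_events`.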

Tip: For details on the preprocessing we already ran, see Preprocessing. For MRI acquisition parameters, see MRI Acquisition.

Data Formats

| Data type | Format | Notes |
| --- | --- | --- |
| MRI volumes | NIfTI (.nii.gz) | 3D (anatomical) or 4D (functional/betas) |
| Metadata | JSON (.json) | BIDS sidecar files |
| Events / behavioral | TSV (.tsv) | Tab-separated, BIDS-compliant |
| Stimulus images | HDF5 (.h5) | Single packed file with raw JPEG bytes, row-aligned to task-images_metadata.csv and accessible by image name through laion_fmri.load_stimuli() |
| Stimulus metadata | CSV | stimuli/task-images_metadata.csv - see Stimulus Set |
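"Row-aligned" means that row i of the metadata CSV describes entry i of the packed image file. The sketch below illustrates that idea with in-memory stand-ins only: the column name `image_name`, the file names, and the byte strings are hypothetical, and a plain list replaces the HDF5 dataset. For real data, laion_fmri.load_stimuli() is the documented way to access images by name; its signature is not shown on this page, so it is not called here.

```python
import csv
import io

# Hypothetical metadata table; the real column names may differ.
metadata_csv = "image_name\nimg_a.jpg\nimg_b.jpg\nimg_c.jpg\n"

# Stand-in for the packed image entries: one bytes object per metadata row.
# Real JPEG data starts with the bytes FF D8 and ends with FF D9.
packed_jpegs = [
    b"\xff\xd8A\xff\xd9",
    b"\xff\xd8B\xff\xd9",
    b"\xff\xd8C\xff\xd9",
]

names = [row["image_name"] for row in csv.DictReader(io.StringIO(metadata_csv))]
assert len(names) == len(packed_jpegs)  # row alignment requires equal lengths

# Map each image name to its row index, then use that index for the bytes.
row_of = {name: index for index, name in enumerate(names)}


def get_image_bytes(name):
    """Return the raw bytes stored at the metadata row of the given image name."""
    return packed_jpegs[row_of[name]]


raw = get_image_bytes("img_b.jpg")
is_jpeg = raw[:2] == b"\xff\xd8"
print(row_of["img_b.jpg"], is_jpeg)
```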