Dataset Initialization
One-time setup before working with the LAION-fMRI dataset.
This example walks through the steps a new user takes the first time they use the package:
1. Configure the local data directory.
2. Read the licenses you’ll be asked to accept on first download.
3. Confirm you can reach the bucket and see what it contains.
Downloads themselves are covered by the quick start.
Initialize the data directory
Pick a location with enough disk space. The choice is persisted
so subsequent sessions pick it up automatically – you don’t need
to call dataset_initialize again from the same machine.
import os
from laion_fmri.config import dataset_initialize, get_data_dir
# If you already accepted the licenses in another example, the
# following cells just confirm the configuration -- you won't be
# re-prompted.
data_dir = os.environ.get(
    "LAION_FMRI_EXAMPLE_DATA_DIR",
    os.path.join(os.getcwd(), "laion_fmri_quickstart"),
)
os.makedirs(data_dir, exist_ok=True)
dataset_initialize(data_dir)
print(f"Configured: {get_data_dir()}")
Configured: /path/to/laion-fmri-data
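Before initializing, it can be worth checking that the chosen location really has room. The following is a minimal sketch using only the standard library. The helper name has_free_space and the 50 GiB threshold are illustrative assumptions: neither is part of laion_fmri, and the threshold is not the dataset's actual size.

```python
import os
import shutil


def has_free_space(path, required_gb):
    """Return True if the filesystem holding `path` has at least `required_gb` GiB free."""
    # Walk up to the nearest existing ancestor so the check also works
    # before the data directory itself has been created.
    probe = os.path.abspath(path)
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:  # reached the filesystem root
            break
        probe = parent
    free_bytes = shutil.disk_usage(probe).free
    return free_bytes >= required_gb * 1024**3


candidate = os.path.join(os.getcwd(), "laion_fmri_quickstart")
# 50 GiB is a placeholder threshold, not the real dataset size.
print(has_free_space(candidate, required_gb=50))
```

Replace the threshold with the size of the subset you actually plan to download.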
Inspect the license texts
Two licenses apply:
- The dataset license (CC0 1.0) covers the brain and participant data.
- The stimulus license (closed, research-only) covers the stimulus images.
Below we print the body of each so you can read the terms in
advance. The actual prompt, which asks you to type "I AGREE",
appears in the next cell.
from laion_fmri._constants import (
    LICENSE_AGREEMENT_BODY,
    STIMULI_LICENSE_BODY,
)
print(LICENSE_AGREEMENT_BODY)
print("---")
print(STIMULI_LICENSE_BODY)
=== LAION-fMRI Dataset License (CC0 1.0) ===
The brain imaging and participant data in the LAION-fMRI dataset are
released under the Creative Commons Zero (CC0 1.0) Public Domain
Dedication. You are free to copy, modify, distribute, and use the
data for any purpose, including commercial, without asking permission.
Full license text: https://creativecommons.org/publicdomain/zero/1.0/
NOTE: Stimulus images are NOT covered by CC0. They are subject to a
separate, restrictive license. You will be prompted to accept it if
you choose to download stimuli.
---
=== LAION-fMRI Stimulus License ===
The LAION-fMRI stimulus images are provided under a closed license.
All rights are reserved by the original copyright holders.
You may ONLY use these images for non-commercial academic research.
All other uses are strictly prohibited. In particular, you may NOT:
1. Share, redistribute, or make the images available to others.
2. Use the images for any commercial purpose.
3. Use the images to train, fine-tune, or evaluate commercial
AI/ML models or services.
4. Create derivative works from the images for any purpose
other than non-commercial academic research.
Full terms: https://laion-fmri.hebartlab.com/terms
Accept the licenses
This is the same prompt-and-write-marker flow that
laion_fmri.download.download() triggers internally on its
first call. accept_licenses(include_stimuli=True) prompts
you to type I AGREE for both the dataset license and the
stimulus license, then records your acceptance under
{data_dir}/.laion_fmri/ so future download(...) calls
don’t ask again.
If you decline either prompt, the helper raises an exception.
That exception signals that you opted out, and downstream
download(...) calls will refuse to run for the
corresponding data.
from laion_fmri.download import accept_licenses
accept_licenses(include_stimuli=True)
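To make the prompt-and-write-marker idea concrete, here is a minimal, self-contained sketch of such a flow. It is not the package's implementation: the function accept_license_sketch, the LicenseDeclined exception, and the marker file name are all hypothetical, and only the .laion_fmri/ directory name is taken from the description above.

```python
import os
import tempfile


class LicenseDeclined(RuntimeError):
    """Hypothetical exception; the package's own exception type may differ."""


def accept_license_sketch(data_dir, name, ask=input):
    """Prompt once for `name`, then remember acceptance with a marker file."""
    marker_dir = os.path.join(data_dir, ".laion_fmri")
    marker = os.path.join(marker_dir, f"{name}.accepted")  # hypothetical file name
    if os.path.exists(marker):
        return marker  # already accepted, so no re-prompt
    answer = ask(f'Type "I AGREE" to accept the {name} license: ')
    if answer.strip() != "I AGREE":
        raise LicenseDeclined(f"{name} license was not accepted")
    os.makedirs(marker_dir, exist_ok=True)
    with open(marker, "w", encoding="utf-8") as fh:
        fh.write("accepted\n")
    return marker


# Demonstration with scripted answers instead of interactive input.
with tempfile.TemporaryDirectory() as tmp:
    accept_license_sketch(tmp, "dataset", ask=lambda _: "I AGREE")
    # The second call finds the marker and never consults `ask`.
    accept_license_sketch(tmp, "dataset", ask=lambda _: "no")
    try:
        accept_license_sketch(tmp, "stimuli", ask=lambda _: "no")
    except LicenseDeclined as exc:
        print(f"Declined: {exc}")
```

Passing the prompt function as a parameter keeps the sketch testable without a terminal; the real helper may well read from standard input directly.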
Confirm bucket access
The bucket is public, so discovery works without any credential setup. The functions below query the bucket directly and tell you what is available in the dataset regardless of what you have downloaded – a quick way to confirm that initialization is complete and the bucket is reachable from your network.
from laion_fmri.discovery import describe, get_subjects
print(f"Subjects in bucket: {get_subjects()}")
describe()
Subjects in bucket: ['sub-01', 'sub-03', 'sub-05', 'sub-06', 'sub-07']
LAION-fMRI Dataset
Bucket: s3://laion-fmri
Subjects: 5 (sub-01, sub-03, sub-05, sub-06, sub-07)
ROIs: EBA, FBA, FFA1, FFA2, IPCS, IPS0, LO1, LO2, MPA, MST, MT, OFA, OPA, PPA, SPCS, TO1, TO2, V1d, V1v, V2d, V2v, V3A, V3B, V3d, V3v, VO1, VO2, VWFA1, VWFA2, hV4, laionEVC, laiondorsal, laiongeneral, laionlateral, laionventral, lobjects, mfswords, pSTSfaces, pSTSwords, vobjects
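The subject IDs in the output are not contiguous (there is no sub-02 or sub-04), so it may help to validate the subjects you intend to request against the discovered list before downloading. The helper below is a hypothetical sketch, not part of laion_fmri; the list is hard-coded from the output above so the snippet runs without network access.

```python
def missing_subjects(wanted, available):
    """Return the requested subject IDs that are not in `available`, in order."""
    available_set = set(available)
    return [s for s in wanted if s not in available_set]


# Subject list copied from the example output; in practice this
# would be the return value of get_subjects().
available = ["sub-01", "sub-03", "sub-05", "sub-06", "sub-07"]
print(missing_subjects(["sub-01", "sub-02", "sub-03"], available))
# prints ['sub-02']
```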