Name: The Human Sleep Project
Published: June 2, 2026
License: https://github.com/bdsp-core/bdsp-license-and-dua

Database Credentialed Access

Qichen Li, Shenghan Wen, Haoqi Sun, Wolfgang Ganglberger, Ayush Tripathi, Niels Turley, Samuel Waters, Arnav Gupta, Aditya Gupta, Manohar Ghanta, Bruce Nearing, Han Wu, Katie L. Stone, Chad Robichaux, Zhiyong Zhang, Qiao Li, Gauri Ganjoo, Christine Tsien Silvers, Bharath Gunapati, Kiran Maski, Samaneh Nasiri, Dennis Hwang, Lynn Marie Trotti, Umakanth Katwa, Gari D. Clifford, Emmanuel Mignot, Robert J. Thomas, M. Brandon Westover

Version: 3.0

When using this resource, please cite:
Li, Q., Wen, S., Sun, H., Ganglberger, W., Tripathi, A., Turley, N., Waters, S., Gupta, A., Gupta, A., Ghanta, M., Nearing, B., Wu, H., Stone, K. L., Robichaux, C., Zhang, Z., Li, Q., Ganjoo, G., Silvers, C. T., Gunapati, B., ... Westover, M. B. (2026). The Human Sleep Project (version 3.0). Brain Data Science Platform. https://doi.org/10.60508/m3sw-rz13.

MLA	Li, Qichen, et al. "The Human Sleep Project" (version 3.0). Brain Data Science Platform (2026), https://doi.org/10.60508/m3sw-rz13.
APA	Li, Q., Wen, S., Sun, H., Ganglberger, W., Tripathi, A., Turley, N., Waters, S., Gupta, A., Gupta, A., Ghanta, M., Nearing, B., Wu, H., Stone, K. L., Robichaux, C., Zhang, Z., Li, Q., Ganjoo, G., Silvers, C. T., Gunapati, B., ... Westover, M. B. (2026). The Human Sleep Project (version 3.0). Brain Data Science Platform. https://doi.org/10.60508/m3sw-rz13.
Chicago	Li, Qichen, Wen, Shenghan, Sun, Haoqi, Ganglberger, Wolfgang, Tripathi, Ayush, Turley, Niels, Waters, Samuel, Gupta, Arnav, Gupta, Aditya, Ghanta, Manohar, Nearing, Bruce, Wu, Han, Stone, Katie L., Robichaux, Chad, Zhang, Zhiyong, Li, Qiao, Ganjoo, Gauri, Silvers, Christine Tsien, Gunapati, Bharath, Maski, Kiran, Nasiri, Samaneh, Hwang, Dennis, Trotti, Lynn Marie, Katwa, Umakanth, Clifford, Gari D., Mignot, Emmanuel, Thomas, Robert J., and M. Brandon Westover. "The Human Sleep Project" (version 3.0). Brain Data Science Platform (2026). https://doi.org/10.60508/m3sw-rz13.
Harvard	Li, Q., Wen, S., Sun, H., Ganglberger, W., Tripathi, A., Turley, N., Waters, S., Gupta, A., Gupta, A., Ghanta, M., Nearing, B., Wu, H., Stone, K. L., Robichaux, C., Zhang, Z., Li, Q., Ganjoo, G., Silvers, C. T., Gunapati, B., Maski, K., Nasiri, S., Hwang, D., Trotti, L. M., Katwa, U., Clifford, G. D., Mignot, E., Thomas, R. J., and Westover, M. B. (2026) 'The Human Sleep Project' (version 3.0), Brain Data Science Platform. Available at: https://doi.org/10.60508/m3sw-rz13.
Vancouver	Li Q, Wen S, Sun H, Ganglberger W, Tripathi A, Turley N, Waters S, Gupta A, Gupta A, Ghanta M, Nearing B, Wu H, Stone K L, Robichaux C, Zhang Z, Li Q, Ganjoo G, Silvers C T, Gunapati B, Maski K, Nasiri S, Hwang D, Trotti L M, Katwa U, Clifford G D, Mignot E, Thomas R J, Westover M B. The Human Sleep Project (version 3.0). Brain Data Science Platform. 2026. Available from: https://doi.org/10.60508/m3sw-rz13.

Abstract

Study objectives: Sleep research has been hindered because existing polysomnography (PSG) datasets are typically single-center, age-specific, or epidemiological, with limited scale and health-relevant diversity, which reduces generalizability. The goal here is to assemble a large multi-center clinical PSG dataset, enabling scalable and generalizable research in sleep medicine and clinical neuroscience.

Methods: The Human Sleep Project (HSP) dataset, 119,234 overnight sleep recordings from more than 90,000 unique patients, integrates clinical PSG data collected from five major academic medical centers across the United States (sites are de-identified and referenced by cohort ID). The studies include a standard attended polysomnography montage at a minimum, as well as both clinical annotations and automatically generated standard annotations. The data are organized according to the Brain Imaging Data Structure (BIDS) standard and linked with de-identified demographic and electronic health record (EHR) information.

Results: The resulting HSP dataset spans the full human lifespan (infancy to age >90 years), and links PSG data to diagnoses across 22 distinct ICD-10 clinical categories, including neurology, cardiology, and endocrinology. The aggregated data reveal consistent, age-dependent changes in sleep architecture, e.g., with increased age, a decrease in total sleep time and N3, and an increase in the stage shift derived Sleep Fragmentation Index (SFI). The standardization process successfully harmonized technical and scoring heterogeneity across sites, providing a consistent resource for large-scale analysis.

Conclusions: The Human Sleep Project releases harmonized signals, manual and automated annotations, and per-session quality metadata for 119,234 overnight sleep recordings (115,129 full polysomnograms and 4,105 home sleep apnea tests) across five centers, enabling reproducible sleep research at a scale not previously available.

Background

Sleep is a fundamental biological process essential for cognitive function, emotional regulation, and overall health1–4. Disruptions in sleep are associated with a wide range of neurological and systemic disorders, including Alzheimer’s disease, stroke, and other neurodegenerative conditions5–8. Vast numbers of conventional clinical polysomnograms (PSGs) and various forms of “reduced” recordings are collected but remain trapped in archives, inaccessible to current analytic methods that rely on massive amounts of data.

Routine PSGs capture brain electrical activity, multi-site muscle tone, eye movements, respiration, and cardiac signals. However, existing sleep datasets are typically limited in scope, being small, single-center, focused solely on adult or pediatric populations9–11, or focusing on epidemiologic community cohorts with relatively limited pathology12–15 or clinical symptoms. These limitations hinder the development of robust, generalizable models for sleep analysis, particularly those based on machine learning and artificial intelligence (AI).

This report presents the Human Sleep Project (HSP), which addresses these gaps by providing a large-scale, multi-center PSG dataset. The current release includes over 119,000 sleep recordings collected according to the specifications of the American Academy of Sleep Medicine (AASM)16 from approximately 90,000 unique subjects, from newborns to the 10th decade, collected across five institutions in the United States of America. The dataset includes PSG signals, human annotations for sleep staging, arousals, apnea events, and limb movements, as well as demographic information and links to electronic health records. All data are standardized using the Brain Imaging Data Structure (BIDS) format and fully de-identified17,18. An automated scoring system generated standardized, high-quality, automatic annotations of sleep stages, arousals, apnea events, and limb movements to complement institutional human scoring.

Methods

Overview and Data Sources

The HSP integrates clinical polysomnography (PSG) data from five U.S. academic medical centers. To protect privacy, sites are de-identified and referenced throughout by cohort ID (S0001, I0002, I0003, I0004, I0006). The pediatric cohort has been described previously by Tripathi et al.38. Each study includes synchronized physiological signals and human or automated annotations of sleep stages, arousals, and respiratory and limb-movement events. All data were de-identified under the HIPAA Safe Harbor standard and made publicly accessible (see website: bdsp.io)19 under an approved IRB protocol (#2022P000417), which provided a waiver of informed consent.

Signal Acquisition and Manual Scoring

PSG recordings followed AASM guidelines16,20, using standard montages encompassing electroencephalogram (EEG; F4-M1, F3-M2, C4-M1, C3-M2, O2-M1, O1-M2), electrooculogram (EOG), chin electromyogram (EMG), tibialis anterior EMG, electrocardiogram (ECG), respiratory effort, airflow (thermistor and nasal pressure), and SpO₂. Montage composition and hypopnea scoring criteria varied across cohorts, as summarized in Supplementary Methods SM.1 (acquisition and scoring) and Supplementary Results SR.1.5 (event definitions and distributions). Sleep stages were manually annotated in 30-s epochs (Wake (W), Non-REM 1 (N1), Non-REM 2 (N2), Non-REM 3 (N3), Rapid Eye Movement (REM)). Respiratory events were classified as obstructive, central, or mixed apnea; obstructive, central, or mixed hypopnea; or respiratory-effort–related arousal (RERA), annotated with onset times and durations. Arousals and limb movements were likewise annotated with onset times and durations when available.

Scoring Provenance

All manual annotations in HSP were generated in routine clinical practice (not for research) at each contributing sleep center, by Registered Polysomnographic Technologists (RPSGTs) or Registered Sleep Technologists (RSTs), following the American Academy of Sleep Medicine (AASM) scoring manual in effect at the time of acquisition, with physician sign-off by a board-certified sleep medicine specialist. All five sites are AASM-accredited and follow institutional quality-assurance procedures that include periodic inter-scorer reliability review and re-scoring of flagged studies. Scoring was performed fully manually (not semi-automatically). Because HSP was assembled retrospectively from clinical archives spanning multiple years at each site, individual-scorer identifiers, years of experience, and per-PSG attribution were not preserved in a structured form during routine clinical workflow and therefore vary across sites and across the multi-year acquisition windows of each cohort; they cannot be reconstructed at the per-PSG level for the full corpus.

Data De-identification

Each patient and PSG session was assigned anonymized identifiers to enable longitudinal analyses while preserving privacy21. Dates were randomly shifted per subject to maintain chronological order without preserving real-world time references. All direct identifiers were removed. Details of the de-identification workflow and Safe Harbor compliance appear in Supplementary Methods SM.1.

Preprocessing and Validation

All European Data Format (EDF)22 files underwent automated integrity checks using the MNE 1.11.0 library23, supplemented with visual audits. Annotation files were temporally aligned to the EDF signals via the measurement start time (“meas_date” in the EDF files). Intermittent misalignments (typically ~10 s) were detected and corrected to ensure 1:1 correspondence between annotations and physiological data. Representative full-night hypnograms and spectrograms were visually inspected for quality assurance. Preprocessing is described in detail in Supplementary Methods SM.2.
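As an illustration of the alignment step, the sketch below maps an annotation's clock time onto a sample index relative to the measurement start time. It is a minimal example rather than the project's pipeline: the function name onset_to_sample and the offset_s parameter (standing in for a detected misalignment correction) are hypothetical, and annotation onsets are assumed to be available as absolute timestamps on the same de-identified clock as the recording.

```python
from datetime import datetime, timedelta


def onset_to_sample(onset, meas_date, sfreq, offset_s=0.0):
    """Map an absolute annotation onset to a sample index in the recording.

    onset, meas_date: datetime objects on the same (de-identified) clock.
    sfreq: sampling rate in Hz.
    offset_s: hypothetical correction for a detected misalignment, in seconds.
    """
    seconds = (onset - meas_date).total_seconds() + offset_s
    if seconds < 0:
        raise ValueError("annotation precedes the start of the recording")
    return int(round(seconds * sfreq))


# Example: an event scored 5 minutes after the recording started, at 200 Hz.
meas_date = datetime(2000, 1, 1, 22, 0, 0)
onset = meas_date + timedelta(minutes=5)
sample = onset_to_sample(onset, meas_date, sfreq=200.0)
```

In practice the measurement start time would be read from the EDF header, and any correction would come from the misalignment detection described above.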

Data Standardization

All PSG recordings were converted from EDF to HDF5 format with standardized metadata to support efficient computational access and large-scale analysis. Signal processing for preparing the HDF5 files included channel-name normalization, uniform referencing of EEG/EOG channels (left to M2, right to M1, midline to average mastoids), resampling to 200 Hz using polyphase filtering, scaling of SpO₂ to 0–100%, and computation of derived channels such as heart rate, airflow, and PAP status as needed24.

Each HDF5 file corresponds to a single PSG session and contains multichannel physiological signals organized in a signals group, along with synchronized, per-sample annotations stored in a separate annotations group. Annotations for sleep stages, arousals, respiratory events, and limb movements are represented as binary or integer-coded sequences aligned to each signal sample, ensuring precise temporal correspondence with the underlying physiological data. File-level metadata includes the sampling rate, channel units, and measurement start time. This derivative format provides a standardized, compressed alternative to raw EDF files, enabling direct indexing, consistent alignment, and seamless integration into automated analysis workflows across cohorts. A comprehensive cross-site channel mapping dictionary, covering all raw-to-standardized label transformations, is provided in Table S8.

Each PSG session in HSP carries a measurement date and time, stored both in the per-session metadata CSV and as an attribute embedded in the corresponding HDF5 signal file. For patients with more than one PSG in the corpus, chronological information is fully preserved, and the inter-study interval can be computed directly from the session timestamps grouped by participant identifier.

The protocol type of each study (diagnostic, split-night, PAP titration, and related categories) materially affects the interpretation of sleep-architecture and respiratory metrics. To quantify this, we classified each session by protocol type from the annotation content of each recording (sleep-stage annotations, PAP pressure events, and technologist log text). Of the 115,129 PSG sessions, a study-type label was produced for 103,987 (90.3%); the remainder lacked the annotation content required for classification. Among the PSG sessions, 57.0% were diagnostic, 15.3% PAP titration, and 5.2% split-night, so that 20.6% (23,693 sessions) involved PAP administration during the recording; 12.1% were other or unclear, concentrated in the cohort with the largest manual-annotation gap (I0004).

PAP-titration and split-night studies will skew sleep-architecture and OSA-event metrics if not stratified. In the current release, users can already identify and exclude PAP-treated periods at the sample level using the per-sample PAP status channel in the standardized HDF5 files, which records whether PAP is being delivered at each sample. To make session-level filtering more convenient, a per-session study_type field summarizing this classification will be added to the per-session metadata in the next release. We note that this annotation-based protocol classification is independent of the channel-based HSAT designation used above (sessions lacking EEG channels); the two use different definitions and need not agree at the per-session level.
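The inter-study interval computation mentioned above can be sketched as follows. This is an illustrative example only: the function name is hypothetical, and ISO 8601 timestamp strings are an assumption about how session times are serialized in the metadata. Because dates are shifted per subject, differences between sessions of the same participant remain meaningful even though absolute dates are not.

```python
from collections import defaultdict
from datetime import datetime


def inter_study_intervals(sessions):
    """Return {patient_id: [whole days between consecutive studies]}.

    sessions: iterable of (patient_id, timestamp_string) pairs. ISO 8601
    strings are an assumption about the metadata format.
    """
    by_patient = defaultdict(list)
    for patient_id, stamp in sessions:
        by_patient[patient_id].append(datetime.fromisoformat(stamp))

    intervals = {}
    for patient_id, stamps in by_patient.items():
        stamps.sort()  # chronological order within each participant
        intervals[patient_id] = [
            (later - earlier).days for earlier, later in zip(stamps, stamps[1:])
        ]
    return intervals


# Hypothetical rows: P1 has two studies, P2 has one.
example = [
    ("P1", "2001-03-01T21:30:00"),
    ("P2", "2001-05-10T22:00:00"),
    ("P1", "2001-06-09T21:45:00"),
]
result = inter_study_intervals(example)
```

Participants with a single study map to an empty list, which makes it straightforward to select only those with repeat recordings.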

Signal Quality Assessment

An automated signal-quality assessment pipeline was applied to every session in HSP. For each in-scope channel, the recording was partitioned into 30-second epochs and screened for six artifact modes: flat-line (epoch standard deviation below class-specific threshold; disabled for EMG, where REM atonia and quiet-sleep baselines are physiologically flat, and disabled for SpO2, where oxygen saturation is normally constant during stable non-REM sleep), saturation (5% of samples at percentile extremes; biopotentials only), high amplitude (peak-to-peak above class-specific physiological limits; biopotentials only), 50/60-Hz power-line interference (band power exceeding 40%; high-sampling-rate biopotentials only), high-frequency noise above 70 Hz (band power exceeding 50%; high-sampling-rate biopotentials only), and NaN. SpO2 channels received an additional out-of-range check that rejected an epoch when fewer than 70% of its samples were within [50, 110]%. Signal units were auto-detected per channel (volts vs microvolts; fractional vs percent for SpO2) and rescaled before thresholding so that the same artifact rules applied consistently across recordings exported in different units. SpO2 channels whose data were effectively missing (90% zeros, or a median below 5%, indicating fractional or zero-filled output) were recorded as a separate availability flag and excluded from the usable-hours roll-up.
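To make the SpO2 handling concrete, the sketch below rescales fractional saturation values to percent and then applies the out-of-range rule described above (an epoch is rejected when fewer than 70% of its samples lie within [50, 110]%). The function names and the median-based unit heuristic are assumptions for illustration; the actual unit-detection logic of the pipeline is not specified here.

```python
def to_percent(samples):
    """Rescale fractional SpO2 (0 to 1) to percent; leave percent data as is.

    The "median <= 1.5" heuristic is an assumption, not the pipeline's rule.
    """
    if not samples:
        return []
    ordered = sorted(samples)
    median = ordered[len(ordered) // 2]
    if median <= 1.5:
        return [value * 100.0 for value in samples]
    return list(samples)


def spo2_epoch_usable(samples, low=50.0, high=110.0, min_fraction=0.70):
    """Return False when fewer than min_fraction of samples are in range."""
    if not samples:
        return False
    in_range = sum(1 for value in samples if low <= value <= high)
    return in_range / len(samples) >= min_fraction


# An epoch exported as fractions, with 20% dropout (zeros).
epoch = [0.97] * 80 + [0.0] * 20
usable = spo2_epoch_usable(to_percent(epoch))
```

Rescaling before thresholding is what lets one rule serve recordings exported in different units, as the text notes.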

An epoch was deemed usable if no artifact rule fired. Per-class usable hours were taken as the maximum across channels in each modality class (EEG, EOG, EMG, Resp, SpO2, ECG), matching the Sleep Heart Health Study (SHHS) Reading Center "at least one channel" convention. Each session was then assigned a five-level Likert grade adapted from the absolute-duration version of the SHHS Reading Center grading scheme14,44, based on the per-class usable hours: Outstanding (5; all six modality classes usable for more than 5 hours), Excellent (4; at least one EEG, one EOG, EMG, oximetry, and respiratory channel usable for more than 5 hours), Good (3; at least one EEG, oximetry, and one respiratory channel usable for more than 5 hours), Fair (2; the same channels usable for more than 4 hours), and Poor (1; less than 4 hours of usable data on at least one of EEG, oximetry, or respiratory channels). In addition to the categorical Likert grade, a continuous quality score in the range [0, 1] was computed. Both the Likert grade and continuous score are released as columns (likert_scale, quality_score) in the per-session metadata. Across the corpus, the majority of PSGs were of high quality (77.5% graded Outstanding on the analyzable subset); the per-cohort distribution is provided in Supplementary Table S2.
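The grading rules above can be written as a small decision function. The sketch below is an illustration of the stated thresholds, not the released implementation; the class labels used as dictionary keys and the treatment of boundary cases (for example, exactly 4 hours falling through to Poor) are assumptions.

```python
CORE_CLASSES = ("EEG", "SpO2", "Resp")


def class_usable_hours(channel_hours):
    """Per-class usable hours: the maximum across channels in each class.

    channel_hours: {class_name: [usable hours for each channel]}.
    """
    return {cls: max(hours, default=0.0) for cls, hours in channel_hours.items()}


def likert_grade(usable):
    """Map per-class usable hours to the five-level grade (5 = Outstanding)."""

    def all_above(classes, hours):
        return all(usable.get(cls, 0.0) > hours for cls in classes)

    if all_above(("EEG", "EOG", "EMG", "Resp", "SpO2", "ECG"), 5):
        return 5  # Outstanding
    if all_above(("EEG", "EOG", "EMG", "Resp", "SpO2"), 5):
        return 4  # Excellent
    if all_above(CORE_CLASSES, 5):
        return 3  # Good
    if all_above(CORE_CLASSES, 4):
        return 2  # Fair
    return 1  # Poor


# Hypothetical session: one good EEG channel, but a short ECG trace.
usable = class_usable_hours({
    "EEG": [6.5, 2.0], "EOG": [6.0], "EMG": [6.1],
    "Resp": [5.5], "SpO2": [6.0], "ECG": [3.0],
})
grade = likert_grade(usable)
```

Taking the maximum per class mirrors the "at least one channel" convention: a single clean EEG channel is enough for the EEG class to count as usable.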

Automated Annotation (CAISR)

To enhance annotation consistency, the Complete AI Sleep Report (CAISR) framework24 (https://github.com/NAIL-NeurologyAILab/CAISR-App) was applied across all valid sessions. CAISR requires specific channels for each task. Preprocessing, filtering, resampling, and normalization were tailored per task (sleep staging, arousal, respiratory, limb movement). Task-specific preprocessing and annotation procedures are described in detail in Supplementary Methods SM.3. Of the 115,129 recorded sessions across cohorts, approximately 80,000 contained the required channels and were therefore eligible for CAISR processing. These sessions yielded four synchronized annotation layers: (1) 30-s sleep stages, (2) 1-s apnea/RERA events, (3) 0.5-s arousals, and (4) 1-s limb movements.

Overlap with the CAISR training set. Of the five HSP cohorts, only S0001 contributed to CAISR development as reported by Nasiri et al.24. 21,004 sessions from 14,941 participants from S0001 were part of the CAISR training set; this corresponds to the cohort previously described in the original CAISR publication and is now incorporated into S0001 in the present release. Cohorts I0002, I0003, I0004, and I0006 were entirely held out from CAISR development. To allow downstream users to identify and stratify by training-set membership, a Boolean field ‘caisr_training_set’ has been added to the session-level metadata CSV for every cohort.

Automated Annotation Consistency and Complementarity

The primary purpose of applying CAISR was to augment the dataset’s analytic value by providing a standardized, reproducible set of automated annotations to complement human scoring. These AI-derived labels were not primarily evaluated as performance benchmarks in this work, but rather are provided as harmonized reference annotations. Performance of CAISR relative to human expert scoring and other AI models has been reported elsewhere24.

To facilitate downstream comparisons and meta-analyses, human and CAISR annotations were aligned at a uniform temporal resolution (30-second epochs for sleep staging and 1-second intervals for event-level annotations). To ensure inter-cohort consistency, respiratory events were harmonized into unified categories. Obstructive, central, and mixed hypopneas were grouped simply as 'hypopneas.' This consolidation also accounts for variations in scoring criteria; events captured under both the restrictive (4% desaturation) and liberal (3% desaturation or arousal) definitions are collapsed into this single 'hypopnea' designation. These harmonized annotations form the foundation for reproducible analyses of sleep physiology, enabling standardized benchmarking across future studies using the HSP corpus. Technical details regarding category alignment, temporal harmonization, and uncertainty quantification are provided in Supplementary Methods SM.3.3–SM.3.4.
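One way to picture the event-level harmonization is to rasterize scored events onto a 1-second grid while collapsing hypopnea subtypes into a single category. The sketch below is a simplified illustration: the label strings, the function names, and the rule that later events overwrite earlier ones on overlap are all assumptions, and the project's actual procedure is the one documented in the Supplementary Methods.

```python
import math

# Hypopnea subtypes collapse to one label; other labels pass through.
HARMONIZED = {
    "obstructive hypopnea": "hypopnea",
    "central hypopnea": "hypopnea",
    "mixed hypopnea": "hypopnea",
}


def harmonize(label):
    """Return the unified category for a respiratory event label."""
    return HARMONIZED.get(label, label)


def events_to_seconds(events, total_seconds):
    """Rasterize (onset_s, duration_s, label) events onto a 1-second grid.

    Returns one entry per second: the harmonized label, or None. Where
    events overlap, the later one in the list wins (a simplification).
    """
    grid = [None] * total_seconds
    for onset, duration, label in events:
        start = max(0, math.floor(onset))
        stop = min(total_seconds, math.ceil(onset + duration))
        for second in range(start, stop):
            grid[second] = harmonize(label)
    return grid


events = [(2.5, 11.0, "central hypopnea"), (20, 10, "obstructive apnea")]
grid = events_to_seconds(events, total_seconds=40)
```

With both human and automated events on the same grid, second-by-second agreement can be computed without further alignment.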

Annotation source per figure and table. To make the underlying annotation source unambiguous: cohort composition and sleep-stage summaries (Figures 4 and 5; sleep indices, stage proportions; Supplementary Tables S6a-S6d) are based on manual human scoring, restricted to the 84,582 sessions with a parseable human-scored hypnogram. Event-level summaries in Figure 6 and Supplementary Tables S7a-S7d use human scoring as the primary source, with CAISR-derived events shown alongside where indicated. The concordance analyses and qualitative visualization case studies (Table 2, Figures 7–10, and Supplementary Results SR.1) are, by construction, comparisons of the two annotation sources on the subset of sessions where both are available. The age-associated norms in Figures 11 and 12 are derived from CAISR annotations, since uniform automated labels are required for fitting smooth lifespan trajectories across cohorts; this is stated explicitly in the captions and the corresponding text. For each session in the released data, both annotation sources are provided when both are available: every session with a retrievable human-scored hypnogram includes those annotations, and every session that contained the channels required by CAISR additionally includes the automated annotations.

HSP can support analyses beyond conventional sleep staging and event counts: adult PSGs with available C4-M1 sleep EEG were processed with the previously described Sleep Brain Health model48, which returns a 1024-dimensional latent representation, a brain health score, and predicted cognition scores.

Data Description

The Human Sleep Project (HSP) v3.0 release comprises 119,234 overnight sleep recordings from 90,166 unique patients, spanning infancy through age > 90 years, collected at five U.S. clinical sleep laboratories. Of these, 115,129 are full polysomnograms (PSGs) and 4,105 are home sleep apnea tests (HSATs).

Cohorts and demographics

The five cohorts are de-identified throughout the dataset and referenced only by cohort ID. Population type is included for context (pediatric vs adult):

Cohort	Population	N subjects	Age (mean ± SD)	Female %
S0001	Adult sleep clinic	19,569	52.3 ± 16.9	42.7
I0002	Adult sleep clinic	12,713	56.6 ± 16.9	45.2
I0003	Pediatric sleep clinic	12,825	8.0 ± 6.0	42.3
I0004	Adult sleep clinic	29,488	50.1 ± 21.1	41.2
I0006	Adult sleep clinic	12,984	54.5 ± 15.5	51.4

Per-cohort metadata are provided in structured CSV files (e.g., S0001_psg_metadata_2025-09-05.csv). Detailed age-stratified demographics and per-cohort study counts per patient are in Supplementary Table S1 of the accompanying manuscript.

Annotations

Each session includes synchronized physiological signals and annotations of sleep stages, arousals, respiratory events, and limb movements. Annotations come from two sources:

Human scoring by Registered Polysomnographic Technologists (RPSGTs) or Registered Sleep Technologists (RSTs) following AASM guidelines, with physician sign-off by a board-certified sleep medicine specialist. All five sites are AASM-accredited. Available for 84,582 of 115,129 sessions (73.5%). Sleep stages are scored in 30-second epochs (Wake, N1, N2, N3, REM).
CAISR automated annotations (Complete AI Sleep Report) provide a uniform, reproducible annotation layer for ~80,000 sessions that contain the required channels: 30-second sleep stages, 1-second apnea/RERA events, 0.5-second arousals, 1-second limb movements. A Boolean field caisr_training_set in the per-session metadata identifies the 21,004 S0001 sessions that contributed to CAISR development, so users can perform unbiased CAISR benchmarking on the 94,000+ held-out sessions.

Clinical linkage (ICD-10)

HSP is linked to longitudinal ICD-10 diagnoses spanning 22 categories (neurologic, cardiometabolic, respiratory, psychiatric, endocrine, and more), with each ICD code released alongside its encounter date so researchers can filter diagnoses temporally relative to the sleep study. Sleep-specific ICD breakdowns (G47.x subcategories) are summarized in Supplementary Table S4.
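Temporal filtering of diagnoses relative to a sleep study might look like the sketch below. The function name, the tuple layout, and the one-year lookback are hypothetical, and the example assumes that encounter dates and study dates for a given patient share the same de-identification shift so that their differences are meaningful.

```python
from datetime import date, timedelta


def diagnoses_before_study(diagnoses, study_date, lookback_days=365):
    """Keep ICD-10 codes recorded within a lookback window up to the study.

    diagnoses: iterable of (icd10_code, encounter_date) pairs.
    """
    earliest = study_date - timedelta(days=lookback_days)
    return [
        code for code, encounter in diagnoses
        if earliest <= encounter <= study_date
    ]


# Hypothetical patient record.
study = date(2001, 6, 1)
record = [
    ("G47.33", date(2001, 1, 15)),  # within the year before the study
    ("I10", date(1999, 3, 2)),      # too early
    ("E11.9", date(2001, 7, 1)),    # after the study
]
prior = diagnoses_before_study(record, study)
```

The same pattern, with the inequality reversed, selects diagnoses that first appear after the study for outcome-oriented analyses.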

Acquisition hardware and montage by cohort

Recordings come from five sleep labs that together span more than one generation of hardware and software. Device-level identifiers were removed from the EDF headers during de-identification, but the original sampling frequencies and channel sets are preserved, so hardware variation — both between and within cohorts — is directly observable from the released files. The hardware and software entries below are reported by cohort and are approximate unless stated otherwise.

Cohort	EEG montage (from data)	Acquisition hardware & software (reported)
S0001	Stored referenced to the contralateral mastoid	Natus (more recent), Grass (earlier)
I0002	Stored referenced to the contralateral mastoid	Embla REMlogic
I0003	Monopolar; mastoids (M1, M2) stored as separate channels	Embla REMlogic (predominant)
I0004	Monopolar; mastoids (M1, M2) stored as separate channels	Grass amps → Sandman and SD32+ amps → Sandman; later mix of SD32+ → Sandman and SOMNOmedics SomnoScreen → DOMINO. See acquisition history below.
I0006	Monopolar; mastoids (M1, M2) stored as separate channels	Earlier studies: Embla / REMlogic. More recent: Philips G3 (Respironics G3)

EEG referencing — two groups

The clearest data-visible difference between cohorts is how the EEG is referenced. EEG is provided as recorded and has not been re-referenced; the choice of reference is left to the user and will affect downstream analyses:

Referenced to the contralateral mastoid at acquisition: S0001 and I0002
Stored monopolar, with mastoids (M1, M2) as separate channels: I0003, I0004, and I0006 (re-referencing left to the user)
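For the monopolar cohorts, re-referencing to the contralateral mastoid amounts to a sample-wise subtraction. The sketch below is a minimal illustration using the derivations named in the Methods (for example C3-M2 and C4-M1); the bare electrode labels used as dictionary keys are an assumption and may differ from the channel names stored in the files.

```python
# Contralateral mastoid for each EEG electrode (left scalp to M2, right to M1).
CONTRALATERAL = {
    "F3": "M2", "C3": "M2", "O1": "M2",
    "F4": "M1", "C4": "M1", "O2": "M1",
}


def rereference(signals, channel):
    """Return the channel minus its contralateral mastoid, sample by sample.

    signals: {channel_name: list of samples in the same unit}.
    """
    reference = CONTRALATERAL[channel]
    active, ref = signals[channel], signals[reference]
    if len(active) != len(ref):
        raise ValueError("channel and reference differ in length")
    return [a - r for a, r in zip(active, ref)]


# Toy monopolar data in microvolts.
signals = {"C3": [10.0, 12.0, 8.0], "M2": [1.0, 2.0, 3.0]}
c3_m2 = rereference(signals, "C3")
```

Applying this only to the monopolar cohorts would bring them into line with the cohorts that were referenced at acquisition, although the choice of reference remains with the user.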

I0004 acquisition history (within-cohort variation)

I0004 recordings span several hardware and software generations:

Batch 1 (mixed front-ends): Grass amps and SD32+ amps were both digitized into and exported from Sandman. Hardware-level differences (CMRR, anti-alias filter shape, baseline noise floor) persist even though exported EDFs look identical.
Batch 2 (homogeneous): SD32+ → Sandman across all recordings. This is the most internally consistent batch.
Batch 3 (cross-platform split): one subset continued on SD32+ → Sandman; another transitioned to ambulatory SOMNOmedics SomnoScreen → DOMINO. Differences in filtering implementation between the two platforms are a potential confound for ML models.

SD32+ amplifiers had configurable hardware filters whose settings may have changed over the collection period. SD32+ is a Natus/Embla amplifier.

Per-session quality grades

An automated signal-quality assessment is provided for every session. Six artifact modes are screened (flat-line, saturation, high amplitude, 50/60 Hz line interference, high-frequency noise above 70 Hz, NaN) and each session is assigned a five-level Likert grade (Outstanding / Excellent / Good / Fair / Poor), released as the likert_scale column in per-session metadata. A continuous quality_score in [0, 1] is also released. Across the corpus, 77.5% of PSGs are graded Outstanding on the analyzable subset.

Study-type classification

Each session is classified by protocol type (diagnostic, PAP titration, split-night, etc.) from annotation content. Of the 115,129 PSG sessions, a study-type label was produced for 103,987 (90.3%): 57.0% diagnostic, 15.3% PAP titration, 5.2% split-night, 12.1% other or unclear. Users can identify and exclude PAP-treated periods at the sample level via the per-sample PAP status channel in the standardized HDF5 files (when available).
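Sample-level exclusion of PAP-treated periods can be sketched as finding the contiguous runs where the PAP status channel is off. The example below assumes a per-sample status coded as 0 (off) and non-zero (on); that coding and the function name are assumptions, so check the file metadata before relying on them.

```python
def pap_free_segments(pap_status):
    """Return [start, stop) sample-index ranges where PAP is off.

    pap_status: per-sample sequence, assumed 0 when PAP is not delivered.
    """
    segments = []
    start = None
    for index, status in enumerate(pap_status):
        if status == 0 and start is None:
            start = index  # a PAP-free run begins
        elif status != 0 and start is not None:
            segments.append((start, index))  # the run ends before this sample
            start = None
    if start is not None:
        segments.append((start, len(pap_status)))
    return segments


# Toy status channel: PAP switched on for two samples mid-recording.
segments = pap_free_segments([0, 0, 1, 1, 0, 0, 0])
```

The resulting index ranges can be used to slice any signal or annotation sequence sampled at the same rate, for instance to analyze only the diagnostic portion of a split-night study.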

Dataset folder structure (BIDS-compatible)

HSP follows the Brain Imaging Data Structure (BIDS) specification (version 1.7.0+) for organizing multi-site EEG data. The folder hierarchy has four levels:

bids/
├── dataset_description.json
├── participants.json
├── participants.tsv
├── README
└── sub-<SiteIdPatientId>/
    └── ses-<NN>/
        ├── sub-<SiteIdPatientId>_ses-<NN>_scans.tsv
        └── eeg/
            ├── sub-<SiteIdPatientId>_ses-<NN>_task-psg_annotations.tsv
            ├── sub-<SiteIdPatientId>_ses-<NN>_task-psg_channels.tsv
            ├── sub-<SiteIdPatientId>_ses-<NN>_task-psg_eeg.edf
            ├── sub-<SiteIdPatientId>_ses-<NN>_task-psg_eeg.json
            └── sub-<SiteIdPatientId>_ses-<NN>_task-psg_pre.csv

Subject IDs combine the site identifier and a BDSP patient ID. Each session corresponds to a separate PSG, labeled chronologically.
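Given the naming pattern in the tree, the paths for one session can be assembled programmatically. The sketch below is illustrative: the function name is hypothetical, and the subject identifier and session label are passed through verbatim because their exact formatting (for example, zero padding of the session number) is not assumed here.

```python
from pathlib import PurePosixPath


def session_files(subject_id, session, root="bids"):
    """Build the expected file paths for one PSG session.

    subject_id: combined site and patient identifier, used verbatim.
    session: session label, used verbatim after "ses-".
    """
    sub, ses = f"sub-{subject_id}", f"ses-{session}"
    session_dir = PurePosixPath(root) / sub / ses
    eeg_dir = session_dir / "eeg"
    stem = f"{sub}_{ses}_task-psg"
    return {
        "scans": session_dir / f"{sub}_{ses}_scans.tsv",
        "edf": eeg_dir / f"{stem}_eeg.edf",
        "sidecar": eeg_dir / f"{stem}_eeg.json",
        "channels": eeg_dir / f"{stem}_channels.tsv",
        "annotations": eeg_dir / f"{stem}_annotations.tsv",
    }


# Hypothetical identifiers, for illustration only.
files = session_files("S0001123", "1")
```

Building paths this way also yields a per-session prefix, which is useful for downloading a single recording rather than a whole cohort.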

Brain health scores

Adult PSGs with available C4-M1 sleep EEG were processed with the previously described Sleep Brain Health model, which returns a 1024-dimensional latent representation, a brain health score, and predicted cognition scores. Brain health scores declined with age across adult HSP recordings (Pearson r = −0.49). These are included in the standardized files (when available) for downstream analysis.

Standardized HDF5 (planned derivative)

A standardized HDF5 derivative is being prepared alongside the BIDS-compliant EDF files. The HDF5 derivative will offer channel-name normalization, uniform EEG/EOG referencing (left to M2, right to M1, midline to average mastoids), 200 Hz resampling, scaled SpO₂ on [0, 100]%, derived channels (heart rate, airflow, PAP status), and per-sample annotation alignment. Check the release notes for current availability of the HDF5 derivative on S3.

Usage Notes

How to access

Access to HSP data is mediated by an S3 Access Point on the underlying BDSP repository bucket. After your credentialed access is approved, point your tools at the access point (by ARN or by alias) rather than at a direct bucket URL.

# Sync the PSG/bids tree for one cohort
aws s3 sync \
  s3://arn:aws:s3:us-east-1:184438910517:accesspoint/bdsp-credentialed-access-point/PSG/bids/S0001/ \
  ./S0001/

# Or via the access-point alias form
aws s3 sync \
  s3://bdsp-credentialed-ac-azoj8m45e3tggdoiobxwom8trs59euse1b-s3alias/PSG/bids/S0001/ \
  ./S0001/

Note: this is a multi-hundred-GB sync per cohort. Use a per-subject or per-session prefix when possible.

Per-session metadata

Per-cohort metadata CSVs (S0001_psg_metadata_2025-09-05.csv, etc.) list every session with columns including BDSPPatientID, SessionID, CreationTime (de-identified), BidsFolder, HasAnnotations, HasStaging, StudyType, AgeAtVisit, SexDSC, likert_scale (quality grade), quality_score, and caisr_training_set (CAISR training-set membership flag). Use these to filter at the session level before downloading PSG payloads.
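Session-level filtering before download could be sketched as below, using only the column names listed above. How the Boolean columns are serialized (True/False versus 1/0) and whether likert_scale is stored as an integer from 1 to 5 are assumptions, and the rows in the example are invented for illustration.

```python
import csv
import io

TRUE_VALUES = ("true", "1")


def select_sessions(csv_text, min_grade=3):
    """Return BidsFolder values for staged, CAISR-held-out sessions.

    Keeps rows with HasStaging true, caisr_training_set false, and a
    likert_scale of at least min_grade.
    """
    selected = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        in_training = row["caisr_training_set"].strip().lower() in TRUE_VALUES
        staged = row["HasStaging"].strip().lower() in TRUE_VALUES
        if staged and not in_training and int(row["likert_scale"]) >= min_grade:
            selected.append(row["BidsFolder"])
    return selected


# Invented rows, for illustration only.
sample_csv = (
    "BDSPPatientID,SessionID,BidsFolder,HasStaging,likert_scale,caisr_training_set\n"
    "101,1,sub-S0001101,True,5,True\n"
    "102,1,sub-I0002102,True,4,False\n"
    "103,1,sub-I0002103,False,5,False\n"
    "104,1,sub-I0004104,True,2,False\n"
)
folders = select_sessions(sample_csv)
```

For a real cohort file, the same function can be fed the file contents read from disk; the returned folder names can then drive a targeted download.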

CAISR automated annotations

CAISR-derived annotations are released alongside human annotations when both are available. The caisr_training_set Boolean in per-session metadata identifies the 21,004 S0001 sessions used to train CAISR; the remaining ~94,000 sessions across I0002, I0003, I0004, and I0006 were held out and are suitable for unbiased benchmarking.

CAISR application code: github.com/NAIL-NeurologyAILab/CAISR-App.

Documentation and helper code

Documentation on how to read and work with the BIDS-format files is available at github.com/bdsp-core/sleep_general.

Manufacturer hardware reference (general)

General device-level specifications for the hardware listed in the per-cohort table (in the Data Description) are given below. The actual sampling rate and resolution of each recording are preserved in the released files and should be read from file headers directly rather than from these maxima.

Manufacturer (example models)	Max hardware fs	ADC	Hardware HPF	Hardware LPF	Core software
Natus (Embla NDx / SDx, incl. SD32+)	≤ 4000 Hz	24-bit	DC or 0.16 Hz	≤ 400 Hz	SleepWorks / REMlogic / Sandman
Grass (Comet-PLUS / AS40-PLUS)	800 Hz	16-bit (~0.06 µV/bit)	0.5 Hz (−3 dB)	100 Hz (−3 dB)	TWin
Embla legacy (S4500 / N7000)	500 Hz / 512 Hz	16-bit	DC to 0.3 Hz	≤ 400 Hz	REMlogic
Philips Respironics (Alice 6 / G3)	≤ 2000 Hz (storage ≤ 500 Hz)	16-bit	DC to 0.32 Hz	≤ 300 Hz	Sleepware G3
SOMNOmedics (SomnoScreen)	varies	varies	active hardware filtering	—	DOMINO

Software defaults applied during scoring inside REMlogic / Polysmith / Sleepware G3 / Sandman / DOMINO typically follow AASM: EEG/EOG ~0.3 Hz HPF / 35 Hz LPF; EMG ~10 Hz / 100 Hz; ECG ~1 Hz / 30 Hz. These are defaults, not guarantees — at least one site reports not filtering certain channels (respiratory effort in particular), and configurable amplifier filters add further variation not captured in file headers.

Release Notes

Version 3.0 (May 2026) — major expansion accompanying the manuscript submitted in May 2026. Adds three new cohorts (I0003 [pediatric], I0004, I0006) alongside the two cohorts (S0001, I0002) previously released in v2. Total corpus grows to 119,234 recordings (115,129 full PSGs + 4,105 HSATs) from 90,166 unique patients spanning the entire human lifespan. Introduces CAISR automated annotations (sleep stages, arousals, respiratory events, limb movements), per-session signal quality grades (Likert + continuous score), brain health scores and 1024-d latent EEG representations, full ICD-10 linkage with encounter dates (22 categories), and standardized per-cohort metadata CSVs. Standardized HDF5 derivative is in preparation and will appear in a subsequent point release.

Ethics

All data were de-identified under the HIPAA Safe Harbor standard and made publicly accessible (see website: bdsp.io)19 under an approved IRB protocol (#2022P000417), which provided a waiver of informed consent.

Acknowledgements

This work was funded by grants from the NIH (R01HL161253).

Conflicts of Interest

Dr. Westover is a co-founder, scientific advisor, and consultant to, and has a personal equity interest in, Beacon Biosignals. Dr. Clifford has received research funding from the NSF, NIH, Amazon Research, the Alzheimer’s Association, the Center for Discovery, CurePSP, the Gates Foundation, Google.org, the Hood Foundation, the Michael J. Fox Foundation, LifeBell AI, NextSense Inc., the One Mind Foundation, the Rett Research Foundation, and the Tides Foundation, as well as unrestricted donations from AliveCor Inc., Google, the Gordon and Betty Moore Foundation, MathWorks, and Microsoft Research. Dr. Clifford has advisory roles and financial interests in AliveCor Inc. and NextSense Inc. He is also the CTO of MindChild Medical with significant stock. These relationships are unconnected to the current work. Dr. Thomas is co-inventor of 1) the cardiopulmonary sleep spectrogram to assess sleep stability/quality and sleep apnea, licensed by the Beth Israel Deaconess Medical Center to MyCardio, LLC; 2) a patent for Enhanced Expiratory Rebreathing Space to treat high loop gain sleep apnea; and 3) a patent for estimating respiratory self-similarity for detection of high loop gain sleep apnea. He also reports general sleep medicine consulting for GLG Councils, Guidepoint, Beacon Biosignals, and Jazz Pharmaceuticals. Dr. Stone reports grant funding from Eli Lilly and is a consultant for Axsome Therapeutics. Dr. Maski 1) is a consultant for Alkermes, Avadel, Harmony Biosciences, Jazz Pharmaceuticals, and Takeda Pharmaceuticals; 2) has grant funding from Harmony Biosciences and Jazz Pharmaceuticals; 3) is DSMB chair for Idorsia; and 4) is a collaborator on clinical trials sponsored by Alkermes and Takeda. These relationships are unconnected to the current work.

Access

Access Policy:
Only credentialed users who sign the DUA can access the files.

License (for files):
BDSP Credentialed Health Data License 1.5.0

Data Use Agreement:
BDSP Credentialed Health Data Use Agreement

Discovery

DOI:
https://doi.org/10.60508/m3sw-rz13

Project Website:
https://github.com/bdsp-core/CAISR

Versions

2.0 - Nov. 1, 2023
3.0 - June 2, 2026

Files

This is a restricted-access resource. To access the files, you must fulfill all of the following requirements:

be a credentialed user
sign the data use agreement for the project
