Database Restricted Access

The Human Sleep Project

M Brandon Westover, Valdery Moura Junior, Robert Thomas, Sydney Cash, Samaneh Nasiri, Haoqi Sun, Aditya Gupta, Jonathan Rosand, Manohar Ghanta, Wolfgang Ganglberger, Umakanth Katwa, Katie Stone, Zhiyong Zhang, Gauri Ganjoo, Thijs E Nassi PhD Candidate, Ruoqi Wei, Dennis Hwang, Lynn Marie Trotti, Ankit Parekh, ErikJan Meulenbrugge, Emmanuel Mignot, Rhoda Au, Gari Clifford, David Rapoport

Published: May 23, 2023. Version: 1.0

When using this resource, please cite:
Westover, M. B., Moura Junior, V., Thomas, R., Cash, S., Nasiri, S., Sun, H., Gupta, A., Rosand, J., Ghanta, M., Ganglberger, W., Katwa, U., Stone, K., Zhang, Z., Ganjoo, G., Nassi PhD Candidate, T. E., Wei, R., Hwang, D., Trotti, L. M., Parekh, A., ... Rapoport, D. (2023). The Human Sleep Project (version 1.0). Brain Data Science Platform.


The Human Sleep Project (HSP) sleep physiology dataset is a growing collection of clinical polysomnography (PSG) recordings. Beginning with PSG recordings from ~19K patients evaluated at the Massachusetts General Hospital, the HSP will grow over the coming years to include data from >200K patients, as well as from people evaluated outside of the clinical setting.


The HSP dataset is being used to develop CAISR (Complete AI Sleep Report), a collection of deep neural networks, rule-based algorithms, and signal processing approaches designed to provide better-than-human detection of conventional PSG scoring metrics, including sleep stages, arousals, apnea and hypopnea events and their subtypes, and periodic limb movements.

Beyond conventional scoring, the HSP dataset is intended to support research seeking to identify "hidden" information within the brain's activity during sleep that can be used to directly measure brain health. These brain health indicators include measures of risk for common neurologic diseases, including cerebrovascular disease, Alzheimer's disease, and related neurodegenerative diseases of aging, as well as indicators of response to therapies, including lifestyle interventions (e.g. diet, meditation, exercise) and pharmacologic interventions.

Over time we will add further data to enable research on the relationships between sleep and health, including medical diagnoses, medical testing and imaging results, brain images (MRI, CT, PET), genetics, and omics data.


As of 4/1/2023, the dataset includes 25,941 PSG recordings from the Massachusetts General Hospital’s (MGH) Sleep Lab in the Sleep Division. The PSG recordings were captured following the AASM standards and include thirteen signals: six channels of electroencephalography (EEG) at F3-M2, F4-M1, C3-M2, C4-M1, O1-M2, and O2-M1, based on the International 10/20 System; electrooculography (EOG) on the left side (EEG and EOG are referenced to the contralateral ear lobe); electromyography (EMG) measured at the chin; two channels of respiration signals from the abdomen and chest; airflow; oxygen saturation (SaO2); and one ECG channel recorded below the right clavicle near the sternum and over the left lateral chest wall. All signals except SaO2 are measured at (or resampled to) a sampling frequency of 200 Hz. SaO2 signals have been upsampled to 200 Hz using sample and hold so that all signals are synchronized. All signals are measured in microvolts.
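To make the sample-and-hold step concrete, the following is a minimal sketch of that kind of upsampling, not the project's actual preprocessing code. The function name `sample_and_hold`, the toy values, and the 1 Hz native SaO2 rate are assumptions made for illustration; the description above only states that the target rate is 200 Hz.

```python
def sample_and_hold(values, src_hz, dst_hz):
    """Upsample by repeating each sample (zero-order hold).

    Assumes dst_hz is an integer multiple of src_hz.
    """
    if dst_hz % src_hz != 0:
        raise ValueError("dst_hz must be an integer multiple of src_hz")
    factor = dst_hz // src_hz
    out = []
    for v in values:
        # Hold each original value for `factor` output samples.
        out.extend([v] * factor)
    return out


# Hypothetical SaO2 trace at an assumed native rate of 1 Hz.
spo2_1hz = [97, 96, 96, 95]
spo2_200hz = sample_and_hold(spo2_1hz, src_hz=1, dst_hz=200)
print(len(spo2_200hz))  # 4 input samples * 200 = 800 output samples
```

Because values are held rather than interpolated, the upsampled trace is piecewise constant and contains no values that were absent from the original recording.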

All HSP data is shared under protocols reviewed by appropriate local Institutional Review Boards (IRBs). Data is deidentified following the Safe Harbor Method.

Data Description

  • Sleep stages were annotated by certified sleep technologists as part of routine clinical care, according to the American Academy of Sleep Medicine (AASM) manual for the scoring of sleep. Stages were annotated in contiguous 30-second intervals, and include: wakefulness (W), non-REM stage 1 (N1), non-REM stage 2 (N2), non-REM stage 3 (N3), and rapid eye movement (REM) sleep.
  • Arousals are annotated and classified as spontaneous, as respiratory effort-related arousals (RERA), or as arousals associated with other events, including bruxism (teeth grinding), hypoventilation, hypopneas, apneas (central, obstructive, and mixed), vocalizations, snores, and periodic leg movements.
  • Respiratory events are scored as obstructive apnea, central apnea, mixed apnea, hypopnea, and respiratory effort-related arousal. 
  • Periodic limb movements and isolated limb movements are scored. 
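As a rough illustration of how these annotations relate to the signals, the sketch below maps 30-second epochs onto 200 Hz sample ranges, totals the time per stage, and computes an apnea-hypopnea index (events per hour of sleep). The stage labels follow the list above; the function names, the toy hypnogram, and the event count are hypothetical, and the dataset's own scoring conventions may differ in detail.

```python
EPOCH_SEC = 30                          # epoch length stated in the description
FS_HZ = 200                             # sampling frequency stated in the description
SAMPLES_PER_EPOCH = EPOCH_SEC * FS_HZ   # 6000 samples per epoch


def epoch_sample_range(epoch_index):
    """Return the half-open [start, stop) sample range of a 0-based epoch."""
    start = epoch_index * SAMPLES_PER_EPOCH
    return start, start + SAMPLES_PER_EPOCH


def minutes_per_stage(hypnogram):
    """Sum the time spent in each stage, in minutes."""
    totals = {}
    for stage in hypnogram:
        totals[stage] = totals.get(stage, 0.0) + EPOCH_SEC / 60
    return totals


def apnea_hypopnea_index(n_events, hypnogram):
    """Apneas plus hypopneas per hour of sleep (all non-W epochs)."""
    sleep_epochs = sum(1 for stage in hypnogram if stage != "W")
    if sleep_epochs == 0:
        raise ValueError("no sleep epochs in hypnogram")
    return n_events * 3600 / (sleep_epochs * EPOCH_SEC)


# Toy hypnogram, one label per 30-second epoch (hypothetical data).
hypnogram = ["W", "W", "N1", "N2", "N2", "N3", "N3", "REM"]

print(epoch_sample_range(2))       # samples covered by the third epoch
print(minutes_per_stage(hypnogram))
print(apnea_hypopnea_index(2, hypnogram))
```

The toy recording is only a few minutes long, so the resulting index is not clinically meaningful; the point is the arithmetic linking epochs, samples, and event rates.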

Usage Notes

The PSG signal data is available in .mat files, and the annotations are available as .csv files. We have ensured the de-identification of all files on the platform, with no names or real dates included. Dates have been shifted to protect the privacy of the participants.
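For orientation only, here is a small sketch of summarizing an annotation file with Python's standard `csv` module. The column names (`start_sec`, `duration_sec`, `event`) and the event labels are hypothetical placeholders, not the dataset's actual schema; consult the project documentation for the real file layout.

```python
import csv
import io
from collections import Counter

# Toy annotation content; the column names and labels are assumptions.
SAMPLE_CSV = "\n".join([
    "start_sec,duration_sec,event",
    "120.0,15.0,hypopnea",
    "300.5,12.0,obstructive apnea",
    "610.0,20.0,hypopnea",
    "615.0,5.0,arousal",
])


def summarize_events(csv_text):
    """Count events and total their durations (seconds) per event label."""
    counts = Counter()
    total_sec = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row["event"]] += 1
        total_sec[row["event"]] += float(row["duration_sec"])
    return counts, total_sec


counts, total_sec = summarize_events(SAMPLE_CSV)
for label in sorted(counts):
    print(label, counts[label], total_sec[label])
```

For a file on disk, the same function could be fed the text returned by reading the file, or adapted to pass an open file handle directly to `csv.DictReader`.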

We have provided detailed documentation on how to read and work with the files, which can be found on our GitHub repository.

Code for automated scoring of events will be available on the CAISR GitHub repository.

Release Notes

By requesting access to the data, you agree not to download, copy, repost, publish, or otherwise share any work that uses the data, in full or in part, without written consent from the BDSP leadership team. This condition will be loosened at a later date.


Data collection and sharing for the HSP is performed under Institutional Review Board (IRB) approvals and data sharing agreements among participating hospitals, with waiver of the requirement for informed consent. HSP data is generated as part of usual patient care. All data is deidentified. 


The Human Sleep Project has received support from the Glenn Foundation and the American Federation for Aging Research (AFAR) through the 2018 Glenn / AFAR Award for Medical Research Breakthroughs in Gerontology (BIG), the American Academy of Sleep Medicine (AASM) through a 2019 Strategic Research Award, the National Institutes of Health (NIH) (R01NS102190, R01NS102574, R01NS107291, RF1AG064312, RF1NS120947, R01AG073410, R01HL161253, R01NS126282, R01AG073598), the National Science Foundation (NSF 2014431), and the Henry and Allison McCance Center for Brain Health.

Conflicts of Interest

MBW is a co-founder of Beacon Biosignals. Beacon Biosignals did not contribute funding and played no role in this work.


Access Policy:
Only registered users who sign the specified data use agreement can access the files.

License (for files):
BDSP Restricted Health Data License 1.0.0

Data Use Agreement:
BDSP Restricted Health Data Use Agreement

Versions:
  • 1.0 - May 23, 2023
  • 2.0 - Nov. 1, 2023