Name: MORGOTH 1.0: A Foundation Model for Clinical EEG - Data and Code
Published: June 2, 2026
License: https://github.com/bdsp-core/bdsp-license-and-dua

Database Credentialed Access

Chenxi Sun, Ioannis Karakis, Aline Herlopian, Marcus Ng, Gamaleldin Osman, Zubeda Sheikh, Olga Taraschenko, Jiyeoun Yoo, Brian Appavu, Lakshman Arcot Jayagopal, Peter Kaplan, Jong Woo Lee, Olga Selioutski, Hiba Haider, Jonathan Halford, Daniel Hoch, Alice Lam, Fabio Nascimento, Jay Pathmanathan, Sarah Schmitt, Pia De Stefano, Urs Fisch, Jeroen Gijs, Humberto Castro-Lima, Wan-Yee Kong, Mackenzie Cervenka, Monica Dhakar, Safoora Fatima, Nicolas Gaspard, Emily Gilmore, Susan Herman, Manisha Holmes, Emily Johnson, Carlos F. Muniz, Eric Rosenthal, Andres Rodriguez, Rani Sarkis, Mouhsin Shafi, Christa Swisher, Mohammad Tabaeizadeh, Selim Benbadis, Fonda Chan, Catherine Chu, Marjan Dolatshahi, Adam Greenblatt, Roohi Katyal, Chinasa Nwankwo, Edilberto Amorim, William O. Tatum, Dan Weber, Tobias Loddenkemper, Jurriaan Peters, Umakanth Katwa, Kiran Maski, Robert Thomas, Shenda Hong, Doyle Yuan, Sydney Cash, Andrew Cole, Daniel Goldenholz, Charlotte Stow, Jennifer Kim, Sahar Zafar, Aaron F Struck, M Brandon Westover, Jin Jing

Published: June 2, 2026. Version: 1.0.0

When using this resource, please cite:

MLA	Sun, Chenxi, et al. "MORGOTH 1.0: A Foundation Model for Clinical EEG - Data and Code" (version 1.0.0). Brain Data Science Platform (2026), https://doi.org/10.60508/v7ky-5g40.
APA	Sun, C., Karakis, I., Herlopian, A., Ng, M., Osman, G., Sheikh, Z., Taraschenko, O., Yoo, J., Appavu, B., Arcot Jayagopal, L., Kaplan, P., Lee, J. W., Selioutski, O., Haider, H., Halford, J., Hoch, D., Lam, A., Nascimento, F., Pathmanathan, J., ... Jing, J. (2026). MORGOTH 1.0: A Foundation Model for Clinical EEG - Data and Code (version 1.0.0). Brain Data Science Platform. https://doi.org/10.60508/v7ky-5g40.
Chicago	Sun, Chenxi, Karakis, Ioannis, Herlopian, Aline, Ng, Marcus, Osman, Gamaleldin, Sheikh, Zubeda, Taraschenko, Olga, Yoo, Jiyeoun, Appavu, Brian, Arcot Jayagopal, Lakshman, Kaplan, Peter, Lee, Jong Woo, Selioutski, Olga, Haider, Hiba, Halford, Jonathan, Hoch, Daniel, Lam, Alice, Nascimento, Fabio, Pathmanathan, Jay, Schmitt, Sarah, De Stefano, Pia, Fisch, Urs, Gijs, Jeroen, Castro-Lima, Humberto, Kong, Wan-Yee, Cervenka, Mackenzie, Dhakar, Monica, Fatima, Safoora, Gaspard, Nicolas, Gilmore, Emily, Herman, Susan, Holmes, Manisha, Johnson, Emily, Muniz, Carlos F., Rosenthal, Eric, Rodriguez, Andres, Sarkis, Rani, Shafi, Mouhsin, Swisher, Christa, Tabaeizadeh, Mohammad, Benbadis, Selim, Chan, Fonda, Chu, Catherine, Dolatshahi, Marjan, Greenblatt, Adam, Katyal, Roohi, Nwankwo, Chinasa, Amorim, Edilberto, Tatum, William O., Weber, Dan, Loddenkemper, Tobias, Peters, Jurriaan, Katwa, Umakanth, Maski, Kiran, Thomas, Robert, Hong, Shenda, Yuan, Doyle, Cash, Sydney, Cole, Andrew, Goldenholz, Daniel, Stow, Charlotte, Kim, Jennifer, Zafar, Sahar, Struck, Aaron F, Westover, M Brandon, and Jin Jing. "MORGOTH 1.0: A Foundation Model for Clinical EEG - Data and Code" (version 1.0.0). Brain Data Science Platform (2026). https://doi.org/10.60508/v7ky-5g40.
Harvard	Sun, C., Karakis, I., Herlopian, A., Ng, M., Osman, G., Sheikh, Z., Taraschenko, O., Yoo, J., Appavu, B., Arcot Jayagopal, L., Kaplan, P., Lee, J. W., Selioutski, O., Haider, H., Halford, J., Hoch, D., Lam, A., Nascimento, F., Pathmanathan, J., Schmitt, S., De Stefano, P., Fisch, U., Gijs, J., Castro-Lima, H., Kong, W., Cervenka, M., Dhakar, M., Fatima, S., Gaspard, N., Gilmore, E., Herman, S., Holmes, M., Johnson, E., Muniz, C. F., Rosenthal, E., Rodriguez, A., Sarkis, R., Shafi, M., Swisher, C., Tabaeizadeh, M., Benbadis, S., Chan, F., Chu, C., Dolatshahi, M., Greenblatt, A., Katyal, R., Nwankwo, C., Amorim, E., Tatum, W. O., Weber, D., Loddenkemper, T., Peters, J., Katwa, U., Maski, K., Thomas, R., Hong, S., Yuan, D., Cash, S., Cole, A., Goldenholz, D., Stow, C., Kim, J., Zafar, S., Struck, A. F., Westover, M. B., and Jing, J. (2026) 'MORGOTH 1.0: A Foundation Model for Clinical EEG - Data and Code' (version 1.0.0), Brain Data Science Platform. Available at: https://doi.org/10.60508/v7ky-5g40.
Vancouver	Sun C, Karakis I, Herlopian A, Ng M, Osman G, Sheikh Z, Taraschenko O, Yoo J, Appavu B, Arcot Jayagopal L, Kaplan P, Lee J W, Selioutski O, Haider H, Halford J, Hoch D, Lam A, Nascimento F, Pathmanathan J, Schmitt S, De Stefano P, Fisch U, Gijs J, Castro-Lima H, Kong W, Cervenka M, Dhakar M, Fatima S, Gaspard N, Gilmore E, Herman S, Holmes M, Johnson E, Muniz C F, Rosenthal E, Rodriguez A, Sarkis R, Shafi M, Swisher C, Tabaeizadeh M, Benbadis S, Chan F, Chu C, Dolatshahi M, Greenblatt A, Katyal R, Nwankwo C, Amorim E, Tatum W O, Weber D, Loddenkemper T, Peters J, Katwa U, Maski K, Thomas R, Hong S, Yuan D, Cash S, Cole A, Goldenholz D, Stow C, Kim J, Zafar S, Struck A F, Westover M B, Jing J. MORGOTH 1.0: A Foundation Model for Clinical EEG - Data and Code (version 1.0.0). Brain Data Science Platform. 2026. Available from: https://doi.org/10.60508/v7ky-5g40.

Additionally, please cite the original publication:

Sun C, Karakis I, Herlopian A, et al. Toward Unified and Comprehensive Automated EEG Interpretation: Multi-center Development and Validation of an EEG Foundation Model. Lancet Digital Health, in press (2026).

Abstract

Background. Electroencephalography (EEG) is essential for neurological diagnosis, but expert interpretation is limited globally, and existing AI methods address narrow tasks. This study aimed to develop and externally validate a broadly applicable foundation model capable of expert-level performance across diverse EEG tasks and clinical settings.

Methods. We developed MORGOTH (Multi-domain Omnibus for Reading and Generalizing Over THorough EEG interpretation), a foundation model that supports broad clinical interpretation across all major settings. In this multi-center study, we developed MORGOTH using EEGs from 18,677 patients across four hospitals and validated it internally on 13,334 patients and externally on 1,573 patients from 48 institutions, spanning diverse clinical settings and ages (0–90+ years). Test datasets annotated by 6–30 experts enabled inter-rater reliability (IRR) analysis by comparing model–expert and expert–expert agreement. MORGOTH was compared against both human experts and state-of-the-art models using area under the curve (AUC) and the percentage of experts' operating points under the curve (EUC) for receiver operating characteristic (ROC) and precision-recall (PR) curves, as well as inter-rater reliability and statistical calibration.

Findings. MORGOTH achieved expert-level performance with AUC-ROC scores of 0.86–0.98 across 17 EEG findings. It outperformed an average of 90% of experts on 5 of 7 multi-expert–annotated datasets and exceeded at least 20% of experts on each of the 17 tasks. Event-level performance was especially strong for seizure detection (EUC=95%), rhythmic and periodic pattern classification, and spike detection (EUC=100%). IRR analysis showed MORGOTH matched or exceeded expert consensus. External validation confirmed consistent performance with modest declines from internal to external test sets (event-level AUC −1.2%, EUC −3.3%; EEG-level AUC −2.1%, EUC −9.5%). MORGOTH also remained robust across age, sex, and moderate channel loss.

Interpretation. MORGOTH advances automated EEG interpretation with expert-level performance across clinical settings, offering improved diagnostic accuracy in low-resource environments and greater efficiency in high-volume centers.

Background

Electroencephalography (EEG) is a cornerstone diagnostic tool for evaluating patients with neurological disorders. When interpreted by skilled clinicians, EEG provides critical information that guides diagnosis and therapeutic decision-making. Despite the prevalence of conditions requiring EEG interpretation — with epilepsy alone affecting more than 70 million people worldwide — expertise in clinical EEG interpretation remains scarce. Most EEGs are interpreted by physicians without specialized fellowship training, and most hospitals cannot provide real-time EEG monitoring. This expertise gap contributes to preventable harm; EEG misinterpretation is the leading cause of epilepsy misdiagnosis.

AI has the potential to address these unmet clinical needs by providing accurate EEG interpretation where expertise is unavailable, reducing expert workload, and improving diagnostic consistency. However, existing AI approaches for EEG interpretation have addressed only limited aspects of this complex task. Among recent models, SCORE-AI focuses exclusively on routine outpatient EEGs, SPaRCNet targets only critical care settings, SpikeNet addresses only epileptiform discharge detection, and U-Sleep performs only sleep staging. No prior work has simultaneously addressed all clinical EEG settings and the full spectrum of clinically relevant EEG patterns. Additionally, external validation of these models has been limited, particularly regarding algorithm performance relative to expert inter-rater reliability.

Foundation models adaptable to diverse tasks have gained attention, as shown by the success of large language models. Inspired by this, EEG foundation models such as LaBraM, EEGFormer, and BrainBERT have emerged since 2023. However, they have primarily focused on non-clinical domains, with limited relevance to clinical EEG interpretation.

Our work addresses these limitations through MORGOTH (Multi-domain Omnibus for Reading and Generalizing Over THorough EEG interpretation), a unified and generalizable foundation model for EEG interpretation across all clinical settings — including routine outpatient clinics, epilepsy monitoring units (EMU), and critical care environments. Unlike previous approaches, our system performs both event-level detection and EEG-level interpretation within a single architecture. We expanded the range of detectable patterns to include seizures, epileptiform discharges, rhythmic and periodic patterns along the ictal–interictal–injury continuum (IIIC), pathological slowing, and sleep stage classification.

Methods

Study Design and Data Sources

This study was conducted with ethical approval from the appropriate institutional review boards and was approved by the Beth Israel Deaconess Medical Center (BIDMC) IRB (protocols #2022P000481 and #2022P000417), with a waiver of informed consent for retrospective analysis. All data were collected and analyzed in accordance with relevant ethical guidelines and regulations. MORGOTH was developed and validated using EEG datasets from diverse sources (see Table 1).

Model Development Framework

MORGOTH automatically processes 19-channel EEG recorded with the international 10–20 system and sampled at 200 Hz (data at other rates are resampled). Signals are bandpass filtered (0.5–70 Hz), notch filtered at 50/60 Hz, converted to a common average montage, clipped at ±500 μV, and normalized to [−1, 1]. Missing channels are imputed with the mean of available channels. Although the EKG channel is excluded, the model can reject EKG artifacts.
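
The preprocessing steps described above can be sketched in Python roughly as follows. This is an illustration, not the repository's implementation: the filter order, the notch quality factor, the order of operations, the use of NaN rows to mark missing channels, and scaling by the clip value to reach [−1, 1] are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, iirnotch, filtfilt, resample_poly

TARGET_FS = 200   # Hz, as stated in the text
CLIP_UV = 500.0   # microvolts, as stated in the text


def preprocess(eeg_uv, fs, line_freq=60.0):
    """Preprocess a (19, n_samples) array in microvolts.

    Rows filled with NaN are treated as missing channels (an assumed convention).
    """
    x = np.asarray(eeg_uv, dtype=float).copy()

    # Impute missing channels with the mean of the available channels.
    missing = np.isnan(x).any(axis=1)
    if missing.all():
        raise ValueError("no usable channels")
    x[missing] = x[~missing].mean(axis=0)

    # Resample to 200 Hz when the native rate differs (integer rates assumed).
    if fs != TARGET_FS:
        x = resample_poly(x, TARGET_FS, int(fs), axis=1)

    # Bandpass 0.5-70 Hz; a 4th-order zero-phase Butterworth is an assumption.
    sos = butter(4, [0.5, 70.0], btype="bandpass", fs=TARGET_FS, output="sos")
    x = sosfiltfilt(sos, x, axis=1)

    # Notch at the mains frequency (50 or 60 Hz); Q=30 is an assumption.
    b, a = iirnotch(line_freq, Q=30.0, fs=TARGET_FS)
    x = filtfilt(b, a, x, axis=1)

    # Common average montage: subtract the across-channel mean at each sample.
    x = x - x.mean(axis=0, keepdims=True)

    # Clip at +/-500 uV, then divide by the clip value so outputs lie in [-1, 1].
    return np.clip(x, -CLIP_UV, CLIP_UV) / CLIP_UV
```

For a recording sampled at 256 Hz in a 50 Hz mains region, a call would look like preprocess(raw, fs=256, line_freq=50.0). The repository applies its own preprocessing automatically, so this sketch is mainly useful for understanding what the model expects.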

The model comprises a tokenizer, an EEG Transformer, and specialized task heads. The tokenizer converts continuous EEG into 8,192 discrete tokens via contrastive learning and vector quantization. The EEG Transformer includes 12 encoder blocks with multi-head attention and temporal/spatial position encoding. Event-level tasks use fully connected heads for pattern detection, while EEG-level tasks use convolutional and attention-based heads for whole-recording classification.
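
To make the vector quantization step concrete, here is a minimal sketch of a nearest-neighbour codebook lookup. It assumes that "8,192" refers to the size of the token vocabulary (codebook); the embedding dimension of 64, the random codebook, and the function name are hypothetical and are not taken from the MORGOTH code.

```python
import numpy as np


def quantize(embeddings, codebook):
    """Return, for each row of embeddings (n, d), the index of the nearest codebook row (k, d)."""
    # Squared Euclidean distance expanded as ||e||^2 - 2 e.c + ||c||^2.
    d2 = (
        (embeddings ** 2).sum(axis=1, keepdims=True)
        - 2.0 * embeddings @ codebook.T
        + (codebook ** 2).sum(axis=1)
    )
    return d2.argmin(axis=1)


rng = np.random.default_rng(0)
codebook = rng.normal(size=(8192, 64))  # 8,192 codes; dimension 64 is hypothetical

# Three patch embeddings that sit very close to known codes.
patches = codebook[[5, 42, 8191]] + 0.01 * rng.normal(size=(3, 64))
tokens = quantize(patches, codebook)
print(tokens)
```

In a trained tokenizer the codebook is learned jointly with the encoder; the lookup itself stays this simple.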

MORGOTH performs 7 event-level tasks: 3 binary classifications (normal vs. abnormal, burst suppression, spike detection), 2 three-class tasks (slowing: focal vs. generalized vs. none; spike localization: focal vs. generalized vs. none), 1 five-class sleep staging task (AASM-defined), and 1 six-class seizure/IIIC task (seizure, LPD, GPD, LRDA, GRDA, other). EEG-level tasks include 17 binary classifications, one per EEG finding. Labels for training and evaluation are assigned to 10-second segments (event-level), 1-second segments (event-level spikes), 10-minute segments (EEG-level internal datasets), or full recordings (EEG-level external datasets). The model supports variable-length EEG inputs at inference.
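
The task inventory above can be written down as a small lookup table, which is a convenient way to check the counts. The task keys and the labels for the binary tasks are hypothetical names chosen for this sketch; the class lists for slowing, spike localization, and seizure/IIIC come from the text, and the five sleep stages are the standard AASM stages.

```python
from collections import Counter

# Event-level tasks and their classes. Dictionary keys are hypothetical identifiers.
EVENT_LEVEL_TASKS = {
    "normal_vs_abnormal": ["normal", "abnormal"],
    "burst_suppression": ["absent", "present"],
    "spike_detection": ["no spike", "spike"],
    "slowing": ["focal", "generalized", "none"],
    "spike_localization": ["focal", "generalized", "none"],
    "sleep_staging": ["W", "N1", "N2", "N3", "REM"],
    "seizure_iiic": ["seizure", "LPD", "GPD", "LRDA", "GRDA", "other"],
}

# EEG-level tasks: one binary classification per finding.
N_EEG_LEVEL_FINDINGS = 17

# How many tasks have 2, 3, 5, or 6 classes.
class_count_histogram = Counter(len(c) for c in EVENT_LEVEL_TASKS.values())
print(len(EVENT_LEVEL_TASKS), dict(class_count_histogram))
```

The histogram reproduces the breakdown in the text: three binary tasks, two three-class tasks, one five-class task, and one six-class task.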

Model Training

We used a two-stage training approach: first, pretraining the tokenizer and transformer on EEGs from 14,500 HEEDB patients using self-supervised learning; then, fine-tuning task-specific heads on expert-labeled datasets. For multi-expert labeled datasets, we used soft labels (expert vote distributions) to approximate class probabilities and capture consensus. To address class imbalance and sample difficulty, we applied focal loss with curriculum learning. Training was optimized using AdamW with learning rate scheduling and early stopping.
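
As an illustration of how soft labels and focal loss can be combined, the sketch below turns expert votes into a class distribution and applies a focal weighting to the soft-label cross-entropy. The exact formulation used for MORGOTH is not given in the text, so the loss form, the value gamma = 2, and the function names are assumptions; curriculum learning is not shown.

```python
import numpy as np


def soft_labels(votes, n_classes):
    """Convert a list of expert votes (class indices) into a vote distribution."""
    counts = np.bincount(votes, minlength=n_classes).astype(float)
    return counts / counts.sum()


def soft_focal_loss(logits, target, gamma=2.0):
    """Focal loss with a soft target: -sum_c q_c * (1 - p_c)^gamma * log p_c.

    With gamma = 0 this reduces to ordinary soft-label cross-entropy.
    """
    z = logits - logits.max()              # stabilise the softmax
    log_p = z - np.log(np.exp(z).sum())    # log-softmax
    p = np.exp(log_p)
    return float(-(target * (1.0 - p) ** gamma * log_p).sum())


# Example: three experts label a segment; two vote class 0, one votes class 1.
q = soft_labels([0, 0, 1], n_classes=3)
logits = np.array([2.0, 0.5, -1.0])
print(q, soft_focal_loss(logits, q))
```

The focal factor (1 − p)^gamma shrinks the contribution of classes the model already predicts confidently, which is how this kind of loss emphasises hard or rare examples.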

Validation Strategies

Internal validation: holdout data from HEEDB-Test, IIIC-Test, SN2-Test, MGH-PSG-Test, and BCH-PSG datasets, with no patient overlap with training data.
External validation: 6 independent datasets from 48 institutions, including HEP, SAI, ON, TUH, UPenn, and MASS.
Inter-rater reliability (IRR): multi-expert annotated datasets including MoE (21 experts), IIIC-Test (30), SN2-Test (24), HEP (13), SAI (14), ON (15), UPenn (6).
Comparison against SOTA models: SpikeNet for spike detection; SPaRCNet and the Kaggle HMS competition winner for seizure and IIIC; U-Sleep for sleep staging; a burst suppression detector; SCORE-AI for EEG-level tasks; and EEG foundation models LaBraM, EEGFormer, BrainBERT on TUH benchmarks.

Statistical Analysis

Model performance was evaluated using Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves. To compare with experts, we computed Experts Under the Curve (EUC) — the fraction of expert operating points (sensitivity–specificity or precision–recall) that lie below the model's ROC or PR curve. We calculated 95% confidence intervals using 10,000 bootstrap iterations. Reliability was reported using Cohen's κ, comparing expert–expert and expert–model agreement patterns. Calibration was assessed by fitting parametric models to derive a calibration index ranging from −1 (maximal under-calling) to 1 (maximal over-calling); we applied Platt scaling and isotonic regression to recalibrate outputs.
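
The EUC definition above can be sketched for the ROC case as follows. This is a simplified illustration rather than the study's code: it treats an expert lying exactly on the curve as "under" it, interpolates the model curve linearly between thresholds, and does not include the bootstrap confidence intervals. Function names are hypothetical.

```python
import numpy as np


def roc_curve_points(y_true, scores):
    """Return (fpr, tpr) arrays over all thresholds, starting at (0, 0)."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    y = np.asarray(y_true)[order]
    tps = np.cumsum(y)
    fps = np.cumsum(1 - y)
    tpr = np.concatenate([[0.0], tps / tps[-1]])
    fpr = np.concatenate([[0.0], fps / fps[-1]])
    return fpr, tpr


def euc_roc(y_true, scores, expert_fpr, expert_tpr):
    """Fraction of expert (FPR, TPR) operating points on or below the model ROC curve."""
    fpr, tpr = roc_curve_points(y_true, scores)
    # Keep the highest TPR at each distinct FPR so the curve is a function of FPR.
    ufpr = np.unique(fpr)
    utpr = np.array([tpr[fpr == f].max() for f in ufpr])
    model_tpr = np.interp(expert_fpr, ufpr, utpr)
    return float(np.mean(np.asarray(expert_tpr) <= model_tpr))


# Toy example with two hypothetical experts at FPR = 0.25.
y = [1, 0, 1, 0]
s = [0.9, 0.8, 0.7, 0.6]
print(euc_roc(y, s, expert_fpr=[0.25, 0.25], expert_tpr=[0.9, 0.6]))
```

In the toy example the model curve passes through TPR = 0.75 at FPR = 0.25, so one expert lies above it and one below, giving an EUC of 0.5. The same idea applies to PR curves with precision and recall in place of TPR and FPR.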

Table 1. Dataset statistics

Numbers represent total counts; age is reported as median [min, max]. "Female" is the proportion of female patients. "Labels/sample" indicates how many experts labeled each sample. "EEGs" shows total recordings (a patient may have multiple). Ethnicity is the proportion of participants self-identifying as White, Black, American Indian/Alaska Native, Asian, Native Hawaiian/Pacific Islander, Multiracial, or Other. "Routine", "ICU", "EMU", and "Sleep Lab" indicate the number of recordings from each setting.

Dataset	Hospitals	Experts	Labels/sample	Patients	EEGs	Female	Age	Routine	ICU	EMU	Sleep Lab	Findings
Pre-training dataset
HEEDB-Pretrain	4	4	1 [1,3]	14,500	14,500	50%	49 [0,100]	7,793	2,942	765	0	All 17 findings
Fine-tuning datasets
HEEDB-Train	4	1	1 [1,3]	10,851	10,851	49%	49 [0,100]	6,695	3,064	591	0	All 17 findings
MGH-PSG-Train	1	7	1 [1,1]	1,000	1,000	50%	60 [2,94]	0	0	0	1,000	5 sleep stages
IIIC-Train	1	124	3 [1,30]	1,940	1,940	49%	54 [0,96]	367	1,295	243	0	Seizure and 4 IIICs
SN2-Train	2	24	8 [1,23]	4,886	4,886	49%	53 [0,99]	3,578	899	131	0	1-second spike
Internal validation datasets
HEEDB-Test	4	1	1 [1,3]	8,527	8,527	51%	59 [0,100]	4,464	3,404	278	0	All 17 findings
IIIC-Test	2	30	3 [1,30]	758	758	48%	60 [0,95]	135	565	45	0	Seizure and 4 IIICs
SN2-Test	2	24	8 [8,23]	208	208	48%	37 [0,96]	127	64	17	0	1-second spike
MGH-PSG-Test	1	7	1 [1,1]	1,000	1,000	50%	59 [5,90]	0	0	0	1,000	5 sleep stages
BCH-PSG	1	5	1 [1,1]	1,000	1,000	50%	7 [0,35]	0	0	0	1,000	5 sleep stages
MoE-Internal	3	21	13 [10,19]	1,841	2,125	53%	54 [0,95]	597	898	272	0	Slowing, burst suppression, spike location, seizure+IIIC, awake/N1/N2
External validation datasets
MoE-External	4	21	13 [10,19]	409	636	30%	63 [16,93]	0	409	0	0	As above
HEP	29	13	3 [3,3]	143	143	—	—	143	0	0	0	1-second spike
MASS	3	1	1 [1,1]	200	200	51%	40 [18,76]	0	0	0	200	5 sleep stages
TUH-Test	1	1	1 [1,1]	552	552	51%	52 [9,90+]	mix of EMU, ICU, routine				4 tasks
UPenn	3	6	6 [6,6]	69	69	100%	51 [39,63]	0	0	0	69	5 sleep stages
SAI	3	14	11 [11,11]	100	100	61%	26 [0,95]	100	0	0	0	Normal, slowing, spike location
ON	5	15	15 [15,15]	100	100	53%	26 [0,99]	100	0	0	0	Slowing, spike location

Data Description

This release contains the labeled EEG datasets and pretraining recordings used to develop and validate MORGOTH, plus the inference and training code at github.com/bdsp-core/morgoth. The MORGOTH-EEG corpus complements the parent Harvard Electroencephalography Database (HEEDB v4.1) publication by exposing the precise data splits, expert-annotated segments, and pretrained checkpoints used in the manuscript.

Data layout on S3

Files are hosted in the bdsp-opendata-credentialed bucket under prefix morgoth1/. Access is mediated via the BDSP credentialed access point — users approved for credentialed access receive read permission on the access-point ARN.

s3://bdsp-opendata-credentialed/morgoth1/
└── data/
    ├── pretrain/
    │   └── sub-I*_ses-1_*.mat        # 14,500 10-minute EEG files, 19 channels @ 200 Hz, ~19 MB each
    └── internal_dataset/
        ├── ABNORMAL/                       # binary EEG-level: abnormal
        ├── NORMAL/                         # binary EEG-level: normal
        ├── BS/                             # burst suppression
        ├── FOCALSLOWING/, GENSLOWING/      # slowing (3-class)
        ├── SEIZURE/, LPD/, GPD/, LRDA/, GRDA/, IIIC/    # seizure / ictal-interictal continuum
        ├── BIPD/, BIRD/                    # bilateral independent periodic discharges; brief potentially ictal rhythmic discharges
        ├── PDR/, POSTS/, BETS/             # background and benign-variant labels
        ├── AWAKE/, N1_19Channel/, N2_19Channel/, ... # sleep stages (19-channel mode)
        └── Morgoth_test_dataset_10s/       # consolidated 10-second test slices
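
The variable names stored inside the .mat files are not documented here, so a cautious first step is to inspect a file before loading it. The sketch below uses scipy.io.whosmat for that purpose; the helper names and the "data" variable in the synthetic demo are hypothetical. It also assumes the files are in a MATLAB format that scipy can read; files saved as MATLAB v7.3 (HDF5) would need a library such as h5py instead.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat, whosmat

# A 10-minute, 200 Hz recording should contain 200 * 600 = 120,000 samples per channel.
EXPECTED_SAMPLES = 200 * 600


def describe_mat(path):
    """List (name, shape, class) for each variable without loading the data."""
    return whosmat(path)


def load_first_2d_array(path):
    """Return the name and contents of the first 2-D variable in the file."""
    for name, shape, _ in whosmat(path):
        if len(shape) == 2:
            return name, loadmat(path, variable_names=[name])[name]
    raise KeyError("no 2-D variable found")


# Demo on a small synthetic file; "data" is a hypothetical variable name.
tmp_path = os.path.join(tempfile.mkdtemp(), "example.mat")
savemat(tmp_path, {"data": np.zeros((19, 2000))})
print(describe_mat(tmp_path))
name, arr = load_first_2d_array(tmp_path)
print(name, arr.shape)
```

On a real pretraining file, one would expect one dimension of the array to be 19 and the other to be EXPECTED_SAMPLES, but this should be checked rather than assumed.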

Code repository

Inference and training code is at github.com/bdsp-core/morgoth (CC BY-NC 4.0). Pretrained checkpoints and a small test data sample are distributed via the Dropbox link in the repository README. The model accepts raw EEG in both EDF and MAT formats; preprocessing (bandpass, resampling, montage, clipping, normalization, epoching) is applied automatically.

Tasks supported by MORGOTH

Event-level (10-second segments; 1-second segments for spike detection): normal/abnormal, burst suppression, spike detection, slowing (focal/generalized/none), spike localization (focal/generalized/none), sleep staging (5-class), seizure and IIIC (seizure/LPD/GPD/LRDA/GRDA/other).
EEG-level (10-minute or full-recording): 17 binary findings.

Usage Notes

Getting started

Clone the repository and create a conda environment (recommended: Python 3.12, PyTorch 2.4, CUDA 12.4):

git clone https://github.com/bdsp-core/morgoth.git
cd morgoth
conda create -n morgoth python=3.12 -y
conda activate morgoth
pip install -r requirements.txt
conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia

Running inference

Three entry points are provided in the repository:

continuous_event_level.sh — sliding-window event-level prediction over continuous EEG
discrete_event_level.sh — segment-by-segment event-level prediction (10-second segments)
EEG_level.sh — whole-recording EEG-level classification (consumes 1-second sliding-window event-level outputs)

CPU-only variants (EEG_level_cpu.sh, EEG_level_windows_cpu.bat) are available for Windows or CPU-only systems.
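
To illustrate what sliding-window prediction over continuous EEG involves, the sketch below enumerates 10-second windows advanced in 1-second steps. The window and step lengths are one reading of the descriptions above, not values confirmed from the scripts, and the function name is hypothetical; the repository's scripts should be treated as the reference.

```python
import numpy as np


def sliding_windows(eeg, fs=200, window_s=10, step_s=1):
    """Yield (start_second, window) pairs over a (channels, samples) array."""
    win, step = int(window_s * fs), int(step_s * fs)
    for start in range(0, eeg.shape[1] - win + 1, step):
        yield start / fs, eeg[:, start:start + win]


# One minute of 19-channel EEG at 200 Hz (zeros used as a placeholder signal).
recording = np.zeros((19, 200 * 60))
windows = list(sliding_windows(recording))
print(len(windows), windows[0][1].shape, windows[-1][0])
```

A 60-second recording yields 51 overlapping windows, the last one starting at second 50. Each window would be passed to the event-level heads, and the resulting sequence of per-second outputs is what an EEG-level head would consume.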

Training from scratch

To train MORGOTH on your own data, prepare a dataset in .h5 (pretraining) and .pkl (fine-tuning) format, modify data_provider.py accordingly, then run pretrain.sh, train_classification.sh (event-level heads), and train_EEG_level_head.sh (EEG-level heads).

Data access

Datasets in s3://bdsp-opendata-credentialed/morgoth1/ are accessible to credentialed BDSP users via the bdsp-credentialed-access-point S3 access point. After completing the BDSP credentialed Data Use Agreement and receiving access, data can be retrieved with the AWS CLI:

aws s3 ls s3://bdsp-opendata-credentialed/morgoth1/ --profile bdsp-credentialed
aws s3 sync s3://bdsp-opendata-credentialed/morgoth1/data/internal_dataset/SEIZURE/ ./SEIZURE/ --profile bdsp-credentialed
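
When several label folders are needed, the sync command above can be generated per folder. The sketch below only builds and prints the command lines, using the bucket path and profile name shown in this section; the helper name and the choice of folders are hypothetical, and nothing is downloaded unless the commands are run.

```python
import shlex

BUCKET_PREFIX = "s3://bdsp-opendata-credentialed/morgoth1/data/internal_dataset"


def sync_command(label, dest_root=".", profile="bdsp-credentialed"):
    """Build the argument list for syncing one label folder to a local directory."""
    return [
        "aws", "s3", "sync",
        f"{BUCKET_PREFIX}/{label}/",
        f"{dest_root}/{label}/",
        "--profile", profile,
    ]


for label in ["SEIZURE", "LPD", "GPD"]:
    print(shlex.join(sync_command(label)))
```

To execute a command from Python rather than copy it into a shell, the list can be passed to subprocess.run(cmd, check=True), provided the AWS CLI is installed and the profile is configured.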

Release Notes

Version 1.0.0 — initial release accompanying acceptance of the MORGOTH manuscript at Lancet Digital Health (2026). This release contains the labeled EEG datasets and pretraining recordings used to develop and validate the MORGOTH foundation model. Inference and training code are available at github.com/bdsp-core/morgoth under CC BY-NC 4.0. The final journal citation will be added once the article is in print.

Ethics

Acknowledgements

The authors thank the patients and clinical EEG teams at the participating institutions, and the dozens of expert annotators whose multi-rater labels enabled the inter-rater reliability analyses central to this work.

Conflicts of Interest

Per-author disclosures are listed below. Authors not listed report no relevant conflicts.

M. Brandon Westover is a co-founder of, scientific advisor and consultant to, and holder of a personal equity interest in Beacon Biosignals, and has received NIH funding (RFG064312, RF1NS120947, R01AG073410, R01HL161253, R01NS126282, R01AG073598, R01NS131347, R01NS130119).
Jin Jing receives author royalties from Springer Publishing.
Aaron F. Struck received a grant from Ceribell and NIH (R01NS111022).
Ioannis Karakis serves as a consultant for Epitel, Ceribell, Neurotech, UCB, and GSK.
Tobias Loddenkemper is listed on patents and patent applications related to the detection, prediction, diagnosis, and treatment of neurological conditions, including epilepsy and seizures (US patents 12150771, 11564617, 10959662, 10278608, and applications 20230397876, 20230386025, 20230141496, 20210038143, 20200383627, 20190298248, 20180206776, 20150216436). His lab has received NIH grant funding (U44NS121562, R01NS111022, R01NS088627, R21NS101381, U01NS090415, U24NS107200, K23NS076550), as well as device donations from Epitel and Empatica and travel support from AAN, ACNS, and American Education Services. His lab is supported by international academic fellows. These activities are not related to this work.
Andres A. Rodriguez Ruiz serves on the advisory board of SK Life Science and holds ownership in Rodzi Homes LLC and Lourbeth LLC. These activities are not related to this work.
Kiran Maski has received NIH funding (R61NS130215-02).
Aline Herlopian has received consultancy fees from UCB and Medtronic, honoraria from Natus, and royalties from Springer Nature; she has received research funding from NIH (R01NS132121-01A1; R21HD115285, AWD0012923) and the Swebilius Foundation.
Marcus C. Ng receives author royalties from Demos Publishing and research grants from Eisai Canada and Paladin, and serves as a consultant for UCB Canada and Eisai Canada (proceeds donated to a hospital charity foundation); he has received CIHR funding (PJT178217, DC0190GP).
Ji Yeoun Yoo receives author royalties from Elsevier and has received NIH funding (1UH3NS109557-01A1).
Jay Pathmanathan receives salary from Beacon Biosignals and owns stock in Beacon Biosignals.
Mackenzie C. Cervenka has served as a consultant and speaker for Nutricia North America/Danone and a speaker for Nestle Health Science, and receives author royalties from Demos/Springer Publishing Company.
Nicolas Gaspard serves on the scientific board of Bioserenity. He has received research funding from the Fonds National pour la Recherche Scientifique, INNOVIRIS, Fonds Erasme pour la Recherche Médicale, and Fonds Jaumotte, all provided to his institution.
Robert Joseph Thomas is co-inventor and patent holder of the ECG/PPG-derived sleep spectrogram, licensed by BIDMC to MyCardio, LLC, and receives royalties through BIDMC. The Positive Airway Pressure Gas Modulator for treatment of central/complex sleep apnea is an unlicensed patent. He is also co-inventor of a licensed (by BIDMC to Sleepcare Asia) submitted patent for respiratory self-similarity and inventor of a licensed (by BIDMC to Sleepcare Asia) Enhanced Expiratory Rebreathing Space for high-loop-gain sleep apnea; these two patents currently have no royalties. He consults for Guidepoint Global and GLG Councils. He is also co-inventor of licensed auto-CPAP software to DeVilbiss-Drive, without current royalties.
Daniel M. Goldenholz is an unpaid advisor for Epilepsy AI and Eysz and has been a paid advisor for Magic Leap. He has received speaker fees from AAN, AES, ACNS, NNS, and AI in Epilepsy and Neurology and has previously been a paid consultant for Neuro Event Labs, IDR, LivaNova, Health Advances, Duke University, and Bloom Insights. He has received research funding from NIH and BIDMC (NINDS K23NS124656). These activities are not related to this work.
Peter W. Kaplan serves as an expert witness on qEEG, EEG, neurology, and epilepsy; is a member of the IFCN Executive Committee; consults for Natus; and receives author royalties from Demos and Wiley-Blackwell.
Jong Woo Lee is co-founder of Soterya Inc (with no financial ties), has performed contract work for Teladoc, and has received an investigator-initiated study grant from SK Biopharmaceuticals.
Olga Selioutski is a council member of the American Clinical Neurophysiology Society (ACNS) and has received research support through a subaward on an NIH grant (R01NS131967).
Hiba A. Haider receives author royalties from UpToDate Inc. and Springer Publishing. These activities are not related to this work. She has received funding from NINDS (5R21NS137117) and NCATS (5UG3TR004501).
Susan T. Herman reports grants to her institution from Neuroelectrics, the Epilepsy Foundation, and the NORSE Institute, and has received research support through a subaward on an NIH SBIR grant (NS121559).
Emily L. Johnson serves as an associate editor for Neurology and has received research funding from NIH (K23AG063899).
Jonathan J. Halford has received research funding from the Veterans Affairs Office of Research and Development (I01HX003107-01A2).
Brian L. Appavu has received research grant support from NIH (R01NS133037) and the Pediatric Epilepsy Research Foundation, with funding provided to his institution.
Eric S. Rosenthal has received research grant support from NIH/NINDS (R01NS117904) and NIH/OD (OT2OD032701), with funding provided to his institution.
Olga Taraschenko receives salary and research support from NIH (P20GM130447), the Cognitive Neuroscience and Development of Aging (CONDA) Award, and the DHHS LB606 Nebraska Stem Cell Grant.
Emily J. Gilmore has received NIH research grant support (R01NS117904), with funding provided to her institution.
Mouhsin M. Shafi has received research funding from NIH (R01AG060987, R01EB032820).
Edilberto Amorim has received research funding from NIH (K23NS119794, 1OT2OD032701, R01NS128342), the Department of Defense (HT9425-23-1-0242, HT9425-25-1-0170, W81XWH-19-1-0861, W81XWH-21-C-0075), the American Heart Association (AMFDP 843457, 20CDA35310297, 24DIVSUP1274116), the Regents of the University of California, Cures Within Reach (2022CAL-Amorim), the Zoll Foundation, and the Hellman Foundation.
Jennifer A. Kim has received research funding from NINDS (K23NS112596, R01NS117904, R01NS126282), the Brain Aneurysm Foundation, and the Swebilius Foundation.
William O. Tatum reports disclosures filed with the journal.

References

Sun C, Jing J, Turley N, et al. Harvard Electroencephalography Database: A comprehensive clinical electroencephalographic resource from four Boston hospitals. Epilepsia (2025). https://doi.org/10.1111/epi.18487
Tveit J, Aurlien H, Plis S, et al. Automated interpretation of clinical electroencephalograms using artificial intelligence. JAMA Neurology 80(8):805-812 (2023). https://doi.org/10.1001/jamaneurol.2023.1645
Jing J, Ge W, Hong S, et al. Development of expert-level classification of seizures and rhythmic and periodic patterns during EEG interpretation. Neurology 100(17):e1750-e1762 (2023). https://doi.org/10.1212/WNL.0000000000207127
Li J, Goldenholz DM, Alkofer M, et al. Expert-level detection of epilepsy markers in EEG on short and long timescales. NEJM AI 2(7) (2025). https://doi.org/10.1056/AIoa2401221
Perslev M, Darkner S, Kempfner L, et al. U-Sleep: Resilient high-frequency sleep staging. NPJ Digital Medicine 4:72 (2021).
Jing J, Ge W, Struck AF, et al. Interrater reliability of expert electroencephalographers identifying seizures and rhythmic and periodic patterns in EEGs. Neurology 100(17):e1737-e1749 (2023). https://doi.org/10.1212/WNL.0000000000201670
Jiang W, Zhao L, Lu B. Large brain model for learning generic representations with tremendous EEG data in BCI. ICLR (2024).
Chen Y, Ren K, Song K, et al. EEGFormer: Towards transferable and interpretable large-scale EEG foundation model. arXiv:2401.10278 (2024).
Wang C, Subramaniam V, Yaari AU, et al. BrainBERT: Self-supervised representation learning for intracranial recordings. arXiv:2302.14367 (2023).
Berry RB, Quan SF, Abreu AR, et al. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications, v2.6. American Academy of Sleep Medicine (2020).
Lin TY, Goyal P, Girshick R, et al. Focal loss for dense object detection. ICCV:2980-2988 (2017). https://doi.org/10.1109/ICCV.2017.324
O'Reilly C, Gosselin N, Carrier J, et al. Montreal Archive of Sleep Studies: An open-access resource for instrument benchmarking and exploratory research. J Sleep Res 23(6):628-635 (2014).
Jing J, Lin Z, Yang C, et al. HMS - Harmful Brain Activity Classification. Kaggle Competition (2024). https://kaggle.com/competitions/hms-harmful-brain-activity-classification
Platt J. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers (1999).
Zadrozny B, Elkan C. Transforming classifier scores into accurate multiclass probability estimates. ACM SIGKDD (2002).

Parent Projects

MORGOTH 1.0: A Foundation Model for Clinical EEG - Data and Code was derived from:

Harvard Electroencephalography Database v4.1

Please cite them when using this project.

Access

Access Policy:
Only credentialed users who sign the DUA can access the files.

License (for files):
BDSP Credentialed Health Data License 1.5.0

Data Use Agreement:
BDSP Credentialed Health Data Use Agreement

Required training:

Discovery

DOI:
https://doi.org/10.60508/v7ky-5g40

Files

This is a restricted-access resource. To access the files, you must fulfill all of the following requirements:

be a credentialed user
sign the data use agreement for the project
