Database Restricted Access

On-demand EEG education through competition: learning to identify IEDs

Jaden D. Barfuss, Fabio Nascimento, Erik Duhaime, Srishti Kapur, Ioannis Karakis, Marcus Ng, Aline Herlopian, Alice Lam, Douglas Maus, Jonathan Halford, Sydney Cash, M Brandon Westover, Jin Jing

Published: April 18, 2026. Version: 1.0.0


When using this resource, please cite:
Barfuss, J. D., Nascimento, F., Duhaime, E., Kapur, S., Karakis, I., Ng, M., Herlopian, A., Lam, A., Maus, D., Halford, J., Cash, S., Westover, M. B., & Jing, J. (2026). On-demand EEG education through competition: learning to identify IEDs (version 1.0.0). Brain Data Science Platform. https://doi.org/10.60508/xcm9-3k12.

Additionally, please cite the original publication:

Barfuss JD, Nascimento FA, Duhaime E, Kapur S, Karakis I, Ng M, Herlopian A, Lam A, Maus D, Halford JJ, Cash S, Westover MB, Jing J. On-demand EEG education through competition - A novel, app-based approach to learning to identify interictal epileptiform discharges. Clin Neurophysiol Pract. 2023;8:177-186.

Abstract

Objective. Misinterpretation of EEGs harms patients, yet few resources exist to help trainees practice interpreting EEGs. We therefore sought to evaluate a novel educational tool to teach trainees how to identify interictal epileptiform discharges (IEDs) on EEG.

Methods. We created a public EEG test within the iOS app DiagnosUs using a pool of 13,262 candidate IEDs. Users were shown a candidate IED on EEG and asked to rate it as epileptiform (IED) or not (non-IED). They were given immediate feedback based on a gold standard. Learning was analyzed using a parametric model. We additionally analyzed IED features that best correlated with expert ratings.

Results. Our analysis included 901 participants. Users achieved a mean improvement of 13% over 1,000 questions and an ending accuracy of 81%. Users and experts appeared to rely on a similar set of IED morphologic features when analyzing candidate IEDs. We additionally identified particular types of candidate EEGs that remained challenging for most users even after substantial practice.

Conclusions. Users improved in their ability to properly classify candidate IEDs through repeated exposure and immediate feedback.

Significance. This app-based learning activity has great potential to be an effective supplemental tool to teach neurology trainees how to accurately identify IEDs on EEG.


Background

A large portion of EEG studies in the United States are read by general neurologists without post-residency or fellowship training in neurophysiology or epilepsy. Given the well-known challenges of distinguishing benign from pathological EEG features, coupled with documented worldwide deficiencies in neurology residency EEG education, EEG misinterpretation is not uncommon in real-world practice. Inaccurate EEG reads may result in serious consequences for patients, especially in scenarios where a diagnosis of epilepsy and initiation of antiseizure medication rely on EEG results.

Misinterpretation of EEG due to under-calling (failure to recognize IEDs when present) or over-calling (mistakenly reporting benign transients as IEDs) can lead to misdiagnosis and harm to patients. Improving EEG education is thus a necessary step to improving epilepsy patient care.

This project releases the deidentified raw EEG candidates, gold-standard expert labels, survey responses, and aggregated performance scores from an iOS-app-based EEG teaching tool (DiagnosUs). The tool engaged 2,270 participants worldwide in learning to recognize IEDs via repeated practice with immediate feedback, and the released dataset is intended to support further research into EEG education, crowdsourcing for label validation, and tools to improve IED-recognition skill.


Methods

Gold standard. The 13,262 candidate IEDs used in the app come from a prior study (Jing et al., JAMA Neurol 2020) in which eight fellowship-trained physicians independently classified each candidate as epileptiform (IED) or not. For the DiagnosUs-based study, a candidate was considered “positive” if at least 3 of the 8 original experts classified it as an IED. The team additionally evaluated performance on three categories based on the degree of expert agreement: clear IEDs (6–8 of 8 votes), clear non-IEDs (0–2 of 8 votes), and unclear IEDs (3–5 of 8 votes).
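The vote-count thresholds above can be expressed as a small helper when working with the per-candidate expert votes in LUT_spikes.xlsx; a minimal sketch (the function names are our own, not part of the released code):

```python
def consensus_category(n_votes: int) -> str:
    """Map the number of expert IED votes (out of 8) to the
    agreement category used in the study."""
    if not 0 <= n_votes <= 8:
        raise ValueError("votes must be between 0 and 8")
    if n_votes >= 6:
        return "clear IED"        # 6-8 of 8 experts voted IED
    if n_votes >= 3:
        return "unclear IED"      # 3-5 of 8 experts voted IED
    return "clear non-IED"        # 0-2 of 8 experts voted IED

def is_positive(n_votes: int) -> bool:
    """A candidate counts as 'positive' if at least 3 of 8 experts voted IED."""
    return n_votes >= 3
```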

IED scoring contest. Participants on the DiagnosUs iOS app (Centaur Labs, Boston, MA) were shown 10-second EEG epochs in bipolar montage as static images, with a vertical red rectangle highlighting the candidate IED. Users voted yes (IED) or no (non-IED) and received instant feedback based on the expert-consensus gold standard. Prior to playing, users completed a short training tutorial and 25 practice questions (not counted in performance). Time-limited leaderboard contests with prizes ranging from $0.50 to $75 were intermittently launched to maintain user engagement. The competition was free and open to anyone with an iOS device.

Statistical analysis. Three analyses were performed: (1) a parametric learning-rate analysis modeling each user's accuracy as p(n) = a · (n/N)^b, where n is the question number, N = 5,000, a is the asymptotic accuracy (the modeled accuracy at n = N), and b is the learning rate; (2) a feature analysis correlating 23 IED morphological features with participant voting behavior, compared against the same features' correlations with expert voting; and (3) an expert-vs-crowd disagreement analysis qualitatively characterizing outlier cases where participants strongly disagreed with the expert consensus.
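The power-law model p(n) = a · (n/N)^b is linear in log space (log p = log a + b · log(n/N)), so it can be fit per user by ordinary least squares; a minimal sketch with numpy on synthetic data (the log-space least-squares choice is our assumption — the paper's exact fitting procedure may differ):

```python
import numpy as np

def fit_learning_curve(n, p, N=5000):
    """Fit p(n) = a * (n / N) ** b by least squares in log space:
    log p = log a + b * log(n / N).  Returns (a, b)."""
    x = np.log(np.asarray(n, dtype=float) / N)
    y = np.log(np.asarray(p, dtype=float))
    b, log_a = np.polyfit(x, y, 1)   # slope = b, intercept = log a
    return np.exp(log_a), b

# Synthetic check: data generated with a = 0.85, b = 0.05
n = np.arange(1, 1001)
p = 0.85 * (n / 5000) ** 0.05
a_hat, b_hat = fit_learning_curve(n, p)
```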


Data Description

The dataset is hosted at s3://bdsp-opendata-restricted/spike-learning-centaur/ and contains the following components:

Top-level files:

  • LUT_spikes.xlsx — Lookup table mapping each candidate IED to its 8-expert gold-standard votes and candidate identifier.
  • SpikeTestData_new9_sampled.mat — Sampled test data used to derive performance benchmarks for the “additional 9” (A9) expert raters.
  • eeg-survey.csv — Survey instrument distributed to participants, capturing demographic and professional-background information.
  • eeg-survey-responses.csv — Survey responses from participants, including professional role and EEG-reading experience.
  • final_Scores.xlsx — Final per-participant performance scores and accuracy summaries.

raw_eeg/ — 13,262 MATLAB .mat files, one per candidate IED. Each file contains a 10-second, multi-channel scalp EEG window used as a single question in the DiagnosUs app, together with metadata identifying the candidate and the expert-consensus vote. Filenames follow the pattern b{batch}_SEG_{candidate_id}.mat. These match the 13,262 candidate IEDs described in the companion paper (Jing et al., JAMA Neurol 2020) and the companion “Measuring Expertise in Identifying Interictal Epileptiform Discharges” BDSP dataset.
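The filename convention can be parsed with a short helper when iterating over raw_eeg/; a minimal sketch (whether candidate ids are purely numeric or zero-padded is not specified above, so the id is returned as a string and the exact pattern is an assumption):

```python
import re

# Assumed pattern: b{batch}_SEG_{candidate_id}.mat, batch numeric
_FNAME_RE = re.compile(r"b(?P<batch>\d+)_SEG_(?P<candidate_id>.+)\.mat")

def parse_segment_filename(name: str):
    """Extract (batch, candidate_id) from a raw_eeg/ filename.

    Raises ValueError if the name does not match the convention.
    """
    m = _FNAME_RE.fullmatch(name)
    if m is None:
        raise ValueError(f"unrecognized filename: {name!r}")
    return int(m.group("batch")), m.group("candidate_id")
```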

Total: 13,274 files, approximately 2.36 GB. All EEG waveforms have been de-identified; all patient-level identifiers have been removed per HIPAA Safe Harbor.


Usage Notes

Code to reproduce the manuscript analyses — learning-rate modeling, feature correlation, outlier categorization, and figures — is available at https://github.com/bdsp-core/spike-learning-game.

The .mat files in raw_eeg/ can be loaded with scipy.io.loadmat (Python) or directly in MATLAB. The gold-standard labels and per-segment metadata live in LUT_spikes.xlsx; survey information is in eeg-survey-responses.csv; per-participant final scores are in final_Scores.xlsx.
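A minimal loading sketch in Python (the variable names stored inside each .mat file are not enumerated above, so the example simply strips MATLAB housekeeping keys and returns whatever remains):

```python
from scipy.io import loadmat

def load_candidate(path):
    """Load one candidate-IED .mat file from raw_eeg/, dropping MATLAB
    housekeeping keys such as __header__ and __version__."""
    data = loadmat(path)
    return {k: v for k, v in data.items() if not k.startswith("__")}

# Usage (requires the downloaded dataset; path shown is illustrative):
# seg = load_candidate("raw_eeg/b1_SEG_<candidate_id>.mat")
# print({k: getattr(v, "shape", None) for k, v in seg.items()})
```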

Typical use cases:

  • Replicating or extending the learning-rate analysis across different subgroups of raters.
  • Benchmarking automated IED-detection or IED-characterization models against a large crowdsourced-labeled dataset.
  • Developing EEG-education tools (apps, web games, or resident-training modules) on top of the same 13,262-candidate IED corpus used by the DiagnosUs iOS app.
  • Studying the “wisdom of the crowd” versus expert consensus for ambiguous neurophysiology classification tasks.

The DiagnosUs iOS app used in the original study is available at https://www.diagnosus.com.


Ethics

Preparation of the data and public sharing of the deidentified candidate-IED images on the DiagnosUs iOS app was conducted under an IRB-approved protocol. Data were deidentified and obtained from users who volunteered to participate in the public EEG test. Use of the app did not require users to provide written informed consent.


Acknowledgements

This study received no funding.


Conflicts of Interest

E. Duhaime and S. Kapur of Centaur Labs developed and have a financial interest in the DiagnosUs app used to collect the data. J. Barfuss, F. Nascimento, I. Karakis, M. Ng, A. Herlopian, A. Lam, D. Maus, J. Halford, S. Cash, M. Westover, and J. Jing report no disclosures.


Access

Access Policy:
Only registered users who sign the specified data use agreement can access the files.

License (for files):
BDSP Restricted Health Data License 1.0.0

Data Use Agreement:
BDSP Restricted Health Data Use Agreement

