Resources


Model Credentialed Access

Automated extraction of stroke severity from unstructured electronic health records using natural language processing

Marta Fernandes, M Brandon Westover, Aneesh Singhal, Sahar Zafar

This project automatically extracts NIHSS scores from unstructured electronic health records using natural language processing.

nihss nlp stroke

Published: Oct. 2, 2025. Version: 1.0.0


Model Credentialed Access

Automated extraction of post-stroke functional outcomes from unstructured electronic health records

Marta Fernandes, Kaileigh Gallagher, Niels Turley, Aditya Gupta, M Brandon Westover, Aneesh Singhal, Sahar Zafar

This project aims to automatically extract modified Rankin Scale (mRS) scores for a post-stroke patient population from unstructured electronic health records using natural language processing.

stroke natural language processing machine learning modified rankin scale

Published: Oct. 2, 2025. Version: 1.0.0


Model Open Access

Automated phenotyping of mild cognitive impairment and Alzheimer's disease and related dementias using electronic health records

Ruoqi Wei, Niels Turley, Aditya Gupta, Manohar Ghanta, Robert Thomas, Sahar Zafar, Haoqi Sun, M Brandon Westover

An MCI/ADRD EHR phenotyping model trained with a Python scikit-learn pipeline and saved in joblib format.

Published: Sept. 25, 2025. Version: 1.1


Database Credentialed Access

The Brain Imaging and Neurophysiology Database (BIND)

Charlotte Maschke, Peter Hadar, Yicheng Zhang, Jian Li, Gauri Ganjoo, Andrew Hoopes, Alessandro Guazzo, Aditya Gupta, Manohar Ghanta, Bruce Nearing, Christine Tsien Silvers, Bharath Gunapati, Robert Thomas, Jennifer Kim, Shibani Mukerji, Adrian Dalca, Sahar Zafar, Alice Lam, Emmanuel Mignot, M Brandon Westover

BIND Database 1: neuroimaging data (MRI, CT, PET, SPECT) that can be paired with EEG and PSG recordings found in the Harvard EEG Database (https://bdsp.io/content/harvard-eeg-db/4.1/). LLMs helped categorize pathology.

ct mri brain imaging

Published: Sept. 9, 2025. Version: 1.0


Database Credentialed Access

The Boston Children’s Hospital Sleep Corpus

Ayush Tripathi, Wolfgang Ganglberger, Haoqi Sun, Callison Alcott, Niels Turley, Rebecca Fitzgerald, Ayan Mitra, Samuel Waters, Arnav Gupta, Aditya Gupta, Manohar Ghanta, Valdery Moura Junior, Samaneh Nasiri, Bruce Nearing, Katie Stone, Emmanuel Mignot, Dennis Hwang, Matthew Reyna, Zuzana Koscova, Chad Robichaux, Zhiyong Zhang, Qiao Li, Gauri Ganjoo, Lynn Marie Trotti, Gari Clifford, Christine Tsien Silvers, Bharath Gunapati, Robert Thomas, M Brandon Westover, Kiran Maski, Umakanth Katwa

The Boston Children’s Hospital (BCH) Sleep Corpus comprises 15,695 fully annotated pediatric polysomnography (PSG) recordings collected between 2010 and 2024.

Published: Aug. 6, 2025. Version: 1.0.1


Database Restricted Access

Cerebrospinal Fluid Testing for Neuroinvasive West Nile Virus and Measures to Improve Guideline Adherence

Carson Quinn, Karan Singh, Erik Klontz, Isaac Solomon, Shibani Mukerji

De-identified dataset of 1,304 adult encounters describing cerebrospinal fluid (CSF) West Nile virus (WNV) testing patterns at two Mass General Brigham hospitals (2016-2023). Includes demographics, immune status, CSF/serum labs, WNV PCR and IgM results, and guideline-adherence flags.

Published: July 31, 2025. Version: 1.0.0


Database Credentialed Access

Harvard-Emory ECG Database

Zuzana Koscova, Valdery Moura Junior, Matthew Reyna, Shenda Hong, Aditya Gupta, Manohar Ghanta, Reza Sameni, Aaron Aguirre, Qiao Li, Sahar Zafar, Gari Clifford, M Brandon Westover

The Harvard-Emory ECG Database (HEEDB) is a large collection of 12-lead electrocardiography (ECG) recordings.

Published: July 28, 2025. Version: 4.0


Database Credentialed Access

Identification of patients with epilepsy using automated electronic health records phenotyping - Data and Code

Marta Fernandes, Sahar Zafar, M Brandon Westover

Code and data for identifying patients with epilepsy using automated electronic health record phenotyping.

nlp ehr epilepsy

Published: June 5, 2025. Version: 1.0


Database Credentialed Access

Automated phenotyping of mild cognitive impairment and dementias using electronic health records - Data and Code

Ruoqi Wei, Robert Thomas, M Brandon Westover, Haoqi Sun

Data and code to reproduce the results in "Automated phenotyping of mild cognitive impairment and dementias using electronic health records".

nlp ad mci

Published: June 5, 2025. Version: 1.0


Database Credentialed Access

NIDX: A Machine Learning Approach for Identifying People with Neuroinfectious Diseases in Electronic Health Records

Arjun Singh, Shadi Sartipi, Haoqi Sun, Niels Turley, Sahar Zafar, Sudeshna Das, Marta Fernandes, M Brandon Westover, Shibani Mukerji

A machine learning approach to accurately identify neuroinfectious diseases from clinical notes.

electronic health records ehr phenotyping neuroinfectious diseases natural language processing

Published: May 31, 2025. Version: 1.0