Database Credentialed Access

Automated Extraction of Seizures and Ictal-Interictal Continuum Patterns from EEG Reports - Data and Code

Shadi Sartipi, Deena S. Godfrey, Alexandra-Maria Tauțan, Marta P. Fernandes, Manohar Ghanta, Aditya Gupta, Bruce Nearing, Jennifer A. Kim, Aaron F. Struck, M. Brandon Westover, Sahar F. Zafar

Published: June 1, 2026. Version: 1.0.0


When using this resource, please cite:
Sartipi, S., Godfrey, D. S., Tauțan, A., Fernandes, M. P., Ghanta, M., Gupta, A., Nearing, B., Kim, J. A., Struck, A. F., Westover, M. B., & Zafar, S. F. (2026). Automated Extraction of Seizures and Ictal-Interictal Continuum Patterns from EEG Reports - Data and Code (version 1.0.0). Brain Data Science Platform. https://doi.org/10.60508/j6w0-v279.

Abstract

Background: Electroencephalography (EEG) reports are essential to critical-care brain monitoring but remain largely inaccessible for large-scale research studies due to their free-text format. Manual extraction of seizure and ictal-interictal continuum pattern information is time-consuming and limits population-level studies in neurocritical care.

Objective: To develop and validate an automated, cross-institutional pipeline for structured extraction of seizure and ictal-interictal continuum pattern information from EEG reports using large language models.

Methods: We developed a five-stage pipeline comprising: (1) text normalization, (2) rule-based pattern filtering, (3) segment-focused prompt generation, (4) large language model inference for extracting structured attributes, and (5) post-processing. The pipeline was applied to 177,902 EEG reports from four institutions (three adult, one pediatric). Performance was evaluated using a balanced set of 1,500 expert-reviewed samples, extracting seizure presence, count, burden, and timing, as well as ictal-interictal continuum pattern frequency and prevalence. Patterns included lateralized periodic discharges, generalized periodic discharges, and lateralized and generalized rhythmic delta activity.

Results: The pipeline achieved high performance across institutions for seizure (average accuracy 0.94) and IIC pattern detection (LPD: 0.97 [95% CI: 0.94–0.99], GPD: 0.96 [0.92–0.98], GRDA/LRDA: 0.97 [0.94–0.99]). Seizure sensitivity was 0.72 at MGB (95% CI: 0.66–0.77) and 0.99 at BIDMC/BCH. Frequency and prevalence extraction showed strong agreement with mean values 0.94 and 0.93, respectively. High-frequency discharges (≥2 Hz) were significantly associated with seizure occurrence within 24 and 48 hours, supporting their prognostic value. The model maintained consistent performance across diverse report styles and institutions, enabling large-scale, automated EEG phenotyping for neurocritical care research.

Conclusions: This study provides an effective, scalable alternative to waveform-based seizure analysis, enabling structured EEG interpretation using only clinical narratives. By linking early ictal-interictal continuum characteristics to seizure progression, the approach supports real-time risk stratification in intensive care unit settings and accelerates large-scale research into EEG-based outcome prediction.

Keywords: Electroencephalography reports; ictal-interictal continuum pattern; large language model inference; rule-based pattern.


Background

Continuous electroencephalography (cEEG) plays a vital role in seizure detection, ischemia detection, and prognostication in patients with acute brain injuries and altered mental status [1], [2], [3], [4]. Up to 20–40% of patients with acute brain injuries and altered mental status are found to have nonconvulsive seizures and rhythmic and periodic EEG patterns that fall on the ictal–interictal continuum (IIC) [2], [5]. Patterns such as lateralized periodic discharges (LPDs), generalized periodic discharges (GPDs), and lateralized rhythmic delta activity (LRDA) are increasingly recognized as markers of brain dysfunction and prognosis in critically ill patients [6], [7], [8].

Despite the clinical significance of these EEG patterns, large-scale studies investigating treatment of seizures and other IIC patterns have been limited due to the time-consuming nature of extracting structured data from EEG reports [9]. Typically, EEG interpretations are recorded as unstructured free-text reports, which vary widely in terminology, formatting, and level of detail depending on the institution and the individual clinician [10], [11]. Manual chart review is labor-intensive, time-consuming, unscalable, and requires trained experts, posing a significant roadblock to population-level studies aimed at understanding the prognostic implications of EEG patterns or evaluating treatment responses in heterogeneous patient cohorts [12], [13].

Numerous approaches have been developed to automate EEG interpretation, and they fall into two main categories: waveform-based and report-based methods [14], [15]. Waveform-based models apply machine learning frameworks to detect seizures or classify EEG waveforms directly from raw data [16], [17]. These methods can achieve high accuracy but depend on access to large volumes of raw EEG data, which is not always available, and often require substantial computational resources. Alternative efforts, such as the American Clinical Neurophysiology Society (ACNS) standardized terminology initiatives, have focused on harmonizing EEG annotation protocols [2]. Text-based approaches using natural language processing (NLP) have shown promise in extracting seizure-related information from EEG reports. Some have employed rule-based systems or keyword extraction [18], [19], [20], while others have applied machine learning and transformer-based models [21], [22].

Although these prior approaches perform well, they have important limitations. Many NLP pipelines focus only on binary seizure detection, overlooking critical contextual features such as seizure frequency, duration, and burden. In addition, there are no pipelines for extracting periodic, rhythmic, or IIC patterns, or their frequency and prevalence [23], [24], [25]. Moreover, most have been validated on small datasets or at a single institution, limiting generalizability. Furthermore, large language models (LLMs) used in recent studies are often prone to hallucination, particularly when tasked with extracting information that is only implicitly mentioned or inconsistently reported [26]. There remains a need for a validated method that can extract both the presence and the characteristics of seizures and of periodic, rhythmic, and IIC patterns from EEG reports across diverse hospital systems.

To address these challenges, we developed a cross-institutional pipeline that leverages recent advances in LLMs for structured extraction of seizure and IIC features from unstructured EEG reports. Our pipeline combines preprocessing, pattern-focused segmentation, rule-based filtering, LLM prompting, and post-processing to generate high-resolution structured outputs. Our approach explicitly extracts detailed seizure and IIC characteristics, including frequency, prevalence, and timing, while operating locally on clinical reports, ensuring privacy preservation. Our contributions are threefold: (1) we introduce a scalable, reproducible pipeline for automated EEG report parsing across diverse hospital systems; (2) we validate this approach on expert-annotated samples across three adult sites and one pediatric site; (3) we show that early IIC patterns, especially high-frequency discharges, are associated with increased seizure risk, highlighting their utility in neurocritical care triage and monitoring. Together, this work provides a framework for unlocking large-scale EEG datasets to advance research in neurocritical care.


Methods

Overview

In this study, we analyzed EEG reports from four healthcare institutions to extract information on seizures and on periodic, rhythmic, and other IIC patterns from unstructured clinical text. The study was done under research protocols approved by the Mass General Brigham and Beth Israel Deaconess Medical Center IRBs (MGB: 2023P000478, 2024P002630, BIDMC: 2022P000417, 2022P000481). Our approach consisted of two primary tasks: (1) identifying the presence of specific EEG patterns and (2) extracting detailed attributes related to each pattern. By combining the outputs of these steps, we generated structured data capturing seizure presence, count, timing, duration, and burden. For periodic and rhythmic patterns, we extracted presence, frequency, and prevalence.

Dataset

We used EEG report data from four healthcare institutions: Massachusetts General Hospital and Brigham and Women’s Hospital (both part of Mass General Brigham, MGB), Beth Israel Deaconess Medical Center (BIDMC), and Boston Children’s Hospital (BCH). For MGB and BIDMC, we used EEG reports from adult inpatient hospitalizations between 1998 and 2023. The BCH dataset included inpatient and outpatient EEG reports from 2007 to 2024. Table 1 summarizes report counts, patient age, gender, and EEG findings across institutions.

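Table 1. Summary of data across all cohorts.

| Category | Count |
|---|---|
| MGB | 79932 |
| BIDMC | 48978 |
| BCH | 48992 |
| Total | 177902 |
| Unique Patients | 56580 |

| Age | MGB | BIDMC | BCH | Total | Percentage |
|---|---|---|---|---|---|
| [0, 10) | 0 | 0 | 3921 | 3921 | 6.93% |
| [10, 20) | 341 | 102 | 10536 | 10979 | 19.40% |
| [20, 30) | 1248 | 657 | 8358 | 10263 | 18.13% |
| [30, 40) | 1544 | 703 | 1467 | 3714 | 6.56% |
| [40, 50) | 2193 | 963 | 53 | 3209 | 5.67% |
| [50, 60) | 3846 | 1655 | 3 | 5504 | 9.72% |
| [60, 70) | 5435 | 2148 | 4 | 7587 | 13.40% |
| [70, 80) | 4813 | 1907 | 0 | 6720 | 11.88% |
| [80, 90) | 2499 | 1371 | 0 | 3870 | 6.84% |
| [90, ∞) | 540 | 273 | 0 | 813 | 1.44% |

| Gender | MGB | BIDMC | BCH | Total | Percentage |
|---|---|---|---|---|---|
| Male | 12323 | 5063 | 13375 | 30761 | 54.37% |
| Female | 10135 | 4714 | 10946 | 25795 | 45.59% |
| Unknown | 1 | 2 | 3 | 6 | 0.01% |

| EEG Findings | MGB | BIDMC | BCH | Total | Percentage |
|---|---|---|---|---|---|
| Seizure | 8688 | 5653 | 8169 | 22510 | 12.65% |
| LPD | 9658 | 2124 | 172 | 11954 | 6.72% |
| GPD | 8156 | 2295 | 60 | 10511 | 5.91% |
| LRDA | 4602 | 736 | 56 | 5394 | 3.03% |
| GRDA | 12431 | 2288 | 130 | 14849 | 8.35% |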

Five-stage pipeline

We developed a five-stage pipeline: preprocessing, rule-based pattern detection, segment-focused prompt generation, LLM inference, and post-processing. Figure 1 summarizes the steps.

Figure 1. Overview of the proposed pipeline.

In the first stage, text preprocessing was applied to standardize the input reports. This included whitespace normalization, conversion to lowercase, removal of non-informative lines, and elimination of redundant symbols such as asterisks and exclamation points. These steps are commonly used in clinical NLP to reduce noise and improve text parsing efficiency [15].
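The preprocessing steps described above can be sketched as follows. This is a minimal illustration and not the released `help_functions.py` code: the function name `normalize_report` is hypothetical, and the rule that a line without any letters or digits counts as "non-informative" is an assumption.

```python
import re


def normalize_report(text: str) -> str:
    """Lowercase a report, strip redundant symbols, and drop empty lines."""
    cleaned_lines = []
    for line in text.lower().splitlines():
        # Remove redundant symbols such as asterisks and exclamation points.
        line = re.sub(r"[*!]+", " ", line)
        # Collapse runs of whitespace into single spaces.
        line = re.sub(r"\s+", " ", line).strip()
        # Assumed rule: a line with no letters or digits carries no information.
        if not re.search(r"[a-z0-9]", line):
            continue
        cleaned_lines.append(line)
    return "\n".join(cleaned_lines)


if __name__ == "__main__":
    print(normalize_report("***EEG  REPORT***\n\n-----\nLPDs   at 1 Hz!!"))
```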

Next, the processed text was segmented into contextually meaningful units referred to as “epochs” representing blocks of EEG descriptions relevant to specific time periods of EEG monitoring (typically 8-12 hours in duration). EEG patterns were categorized using the ACNS 2021 standardized terminology [2]. We applied regular expressions to detect explicit mentions of seizures, LPD, GPD, LRDA, GRDA, and IIC patterns. Table 2 lists the specific expressions used for periodic and rhythmic pattern identification. This step limited downstream model inference to the most relevant portions of each report.
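A rough sketch of the rule-based detection step is shown below. It uses only a small subset of the Table 2 expressions; the word boundaries, the optional plural "s", and the names `PATTERN_EXPRESSIONS` and `detect_patterns` are assumptions made for illustration. The released `list_*Patterns.txt` files remain the reference for the full expression lists.

```python
import re

# Subset of the Table 2 expressions; word boundaries and plural forms are assumptions.
PATTERN_EXPRESSIONS = {
    "LPD": [r"laterali[sz]ed periodic discharge", r"\blpds?\b", r"\bpleds?\b"],
    "GPD": [r"generali[zs]ed periodic discharge", r"\bgpds?\b", r"\bgpeds?\b"],
    "LRDA": [r"lateralized rhythmic delta activity", r"\blrda\b", r"\btirda\b"],
    "GRDA": [r"generali[zs]ed rhythmic delta activity", r"\bgrda\b", r"\bfirda\b"],
}

# One compiled alternation per pattern keeps the lookup simple.
COMPILED = {
    name: re.compile("|".join(expressions))
    for name, expressions in PATTERN_EXPRESSIONS.items()
}


def detect_patterns(segment: str) -> dict[str, bool]:
    """Flag which patterns are explicitly mentioned in a report segment."""
    text = segment.lower()
    return {name: bool(regex.search(text)) for name, regex in COMPILED.items()}


if __name__ == "__main__":
    print(detect_patterns("Frequent LPDs at 1 Hz; occasional GRDA."))
```

Note that this flags mentions only. A phrase such as "no GRDA" would still be flagged, so deciding presence versus absence is left to the later stages.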

The extracted segments were then refined using stemming and embedded into structured prompts [27] specifically designed to extract key pattern-related attributes.

Two examples of the prompt formats are as follows.

For periodic, rhythmic, and IIC patterns:

You are a neurology expert. Extract {pattern name} details exactly as stated in the EEG report. Provide the result in JSON format.

{pattern name}_frequency: in Hz (maximum 4 Hz); return “none” if not mentioned.

{pattern name}_prevalence: choose from rare, occasional, frequent, abundant, continuous, intermittent, or none; return “none” if not directly mentioned.

EEG Report: {insert report text here}

Provide only the JSON object

For seizures:

You are a neurology expert. Extract seizure details from the EEG report. Provide the result in JSON format.

seizures: “yes” if seizures are present, “no” otherwise

seizures_number: total number of seizures (“none” if not mentioned)

seizures_time_stamps: list of times or time ranges (e.g., 9–10 PM); “none” if not available

seizures_duration: duration in seconds; “none” if not provided

seizure_burden: frequency as described in the report (e.g., “6/hour”, “every 10 minutes”); “none” if not directly mentioned

EEG Report: {insert report text here}

Provide only the JSON object.
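As an illustration of how such a template might be filled in and how the reply might be parsed, consider the sketch below. The helper names `build_pattern_prompt` and `parse_json_reply` are hypothetical, and returning an empty dictionary when no JSON object can be parsed is an assumption rather than the notebook's documented behavior.

```python
import json

# Template text follows the pattern prompt shown above (straight quotes used here).
PATTERN_PROMPT = (
    "You are a neurology expert. Extract {pattern} details exactly as stated "
    "in the EEG report. Provide the result in JSON format.\n"
    '{pattern}_frequency: in Hz (maximum 4 Hz); return "none" if not mentioned.\n'
    "{pattern}_prevalence: choose from rare, occasional, frequent, abundant, "
    'continuous, intermittent, or none; return "none" if not directly mentioned.\n'
    "EEG Report: {report}\n"
    "Provide only the JSON object"
)


def build_pattern_prompt(pattern: str, segment: str) -> str:
    """Insert the pattern name and the report segment into the template."""
    return PATTERN_PROMPT.format(pattern=pattern, report=segment)


def parse_json_reply(reply: str) -> dict:
    """Parse the outermost {...} span of a model reply; return {} on failure."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return {}
    try:
        parsed = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}


if __name__ == "__main__":
    print(build_pattern_prompt("lpd", "lpds at 1 hz over the left hemisphere"))
    print(parse_json_reply('{"lpd_frequency": "1", "lpd_prevalence": "none"}'))
```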

Table 2. The expressions based on ACNS 2021 terminology.

| Pattern | Expressions |
|---|---|
| LPD | laterali[sz]ed periodic discharge, lpd, periodic lateralized discharge, pld, periodic lateralized epileptiform discharge, pled, periodic discharges with extension over left/right, periodic and further evolve into… rhythmic discharges |
| GPD | generali[zs]ed periodic discharges, gpd, periodic generalized discharges, generalized periodic epileptiform discharge, gped, periodic generalized epileptiform discharge generalized …sharps … with … triphasic |
| LRDA | lateralized rhythmic delta activity, lrda, temporal intermittent rhythmic delta activity, tirda, polymorphic delta slowing over the temporal lobes bilaterally, lrda+s |
| GRDA | generali[zs]ed rhythmic delta activity, grda, frontal intermittent rhythmic delta activity, firda, oirda, grda+s |

This structured prompting approach was designed to minimize ambiguity and guide the model to extract only the relevant information with a high degree of precision.

Post-Processing

The post-processing stage involved several steps to ensure the accuracy and consistency of the extracted information. Throughout, a value of “none” means that the information is not available in the report, whereas “0” means that the pattern is absent.

First, despite instructing the language model to extract only the information explicitly mentioned in the EEG report, the model occasionally inferred features, particularly pattern prevalence, even when they were not directly stated. To mitigate this issue and reduce the risk of hallucinated content, we validated all extracted attributes by checking whether the corresponding expression appeared verbatim in the segmented text. If the extracted information was not directly supported by the source text, we replaced it with a “none” value.
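The verbatim check described above could look roughly like the following. This is a sketch under simplifying assumptions: it uses a plain substring test, so a value of "1" would also match inside "10", and the function name `validate_against_source` is hypothetical. The released notebook may apply stricter matching.

```python
def validate_against_source(extracted: dict, segment: str) -> dict:
    """Keep an extracted value only if it appears verbatim in the segment."""
    text = segment.lower()
    validated = {}
    for key, value in extracted.items():
        value_str = str(value).strip().lower()
        if value_str in ("none", "0"):
            # "none" (not reported) and "0" (pattern absent) pass through unchanged.
            validated[key] = value_str
        elif value_str and value_str in text:
            validated[key] = value_str
        else:
            # Not supported by the source text: treat as not reported.
            validated[key] = "none"
    return validated


if __name__ == "__main__":
    reply = {"lpd_frequency": "1", "lpd_prevalence": "frequent"}
    print(validate_against_source(reply, "LPDs at 1 Hz over the left hemisphere"))
```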

Second, we addressed inconsistencies in the chronological order of EEG reports (epochs) for a single hospital encounter. This issue was particularly common in MGB data, where the order of epochs did not always align with the sequence of their timestamps. To resolve this, we flagged such encounters and re-sorted the epochs by their associated timestamps, ensuring that each patient’s data followed a consistent temporal order.
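A minimal sketch of the flag-and-re-sort step follows. The epoch representation (a dictionary with a `start` timestamp) and the function name `reorder_epochs` are assumptions for illustration only.

```python
from datetime import datetime


def reorder_epochs(epochs: list[dict]) -> tuple[list[dict], bool]:
    """Return epochs sorted by start time, plus a flag for out-of-order input."""
    starts = [epoch["start"] for epoch in epochs]
    # An encounter is flagged when any epoch starts before its predecessor.
    was_out_of_order = any(a > b for a, b in zip(starts, starts[1:]))
    # sorted() is stable, so epochs with equal timestamps keep their order.
    return sorted(epochs, key=lambda epoch: epoch["start"]), was_out_of_order


if __name__ == "__main__":
    encounter = [
        {"id": "b", "start": datetime(2020, 1, 2, 8)},
        {"id": "a", "start": datetime(2020, 1, 1, 20)},
    ]
    ordered, flagged = reorder_epochs(encounter)
    print([epoch["id"] for epoch in ordered], flagged)
```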

Third, we accounted for the fact that multiple EEG reports could be associated with a single hospital admission. These reports often came from different departments and may have contained overlapping or partial information. To unify these data, we used hospital admission time to group and aggregate reports from the same encounter. If a feature was marked as “none” (not mentioned in the report) in some epochs but reported in others, we imputed the missing values using the most frequently observed value among the other epochs. For instance, if the LRDA prevalence across aggregated epochs was recorded as [rare, occasional, none, none, occasional, 0], we updated the missing values to reflect the most frequently observed prevalence, resulting in [rare, occasional, occasional, occasional, occasional, 0].
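The imputation example in this paragraph can be reproduced with a short sketch. The function name `impute_missing` is hypothetical, and how ties between equally frequent values are broken is an assumption (here, the value seen first wins).

```python
from collections import Counter


def impute_missing(values: list[str]) -> list[str]:
    """Replace "none" entries with the most frequently reported value."""
    # Only actually reported values count; "none" and "0" are excluded.
    observed = [v for v in values if v not in ("none", "0")]
    if not observed:
        return list(values)
    most_common = Counter(observed).most_common(1)[0][0]
    # "0" (pattern absent) is left untouched; only "none" is filled in.
    return [most_common if v == "none" else v for v in values]


if __name__ == "__main__":
    print(impute_missing(["rare", "occasional", "none", "none", "occasional", "0"]))
```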

Fourth, to provide consistency in reporting across all sites, we standardized the epoch duration to a 24-hour window. For patients with multiple epochs within the same 24-hour period, we consolidated findings using the following rules:

  • A pattern was considered present if it appeared in any of the reports within the window.
  • The total number of seizures was calculated as the sum of seizures recorded across epochs.
  • For frequency, we selected the highest reported value.
  • For prevalence, we chose the most severe category, based on the following hierarchy: rare < occasional < intermittent < frequent < abundant < continuous.

All other pattern attributes were aggregated accordingly to reflect the most complete representation of the patient’s EEG during that 24-hour period.
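The consolidation rules for a 24-hour window can be sketched as below. The record layout (keys `present`, `seizure_count`, `frequency_hz`, `prevalence`) and the function name `consolidate_window` are hypothetical; only the four rules themselves come from the text.

```python
# Severity hierarchy from the text, from least to most severe.
PREVALENCE_ORDER = ["rare", "occasional", "intermittent", "frequent", "abundant", "continuous"]


def consolidate_window(epochs: list[dict]) -> dict:
    """Merge all epochs that fall within one 24-hour window into a single record."""
    frequencies = [e["frequency_hz"] for e in epochs if e.get("frequency_hz") is not None]
    prevalences = [e["prevalence"] for e in epochs if e.get("prevalence") in PREVALENCE_ORDER]
    return {
        # Present if the pattern appears in any report within the window.
        "present": any(e.get("present", False) for e in epochs),
        # Seizure counts are summed across epochs.
        "seizure_count": sum(e.get("seizure_count", 0) for e in epochs),
        # Highest reported frequency wins.
        "frequency_hz": max(frequencies) if frequencies else None,
        # Most severe prevalence category wins.
        "prevalence": max(prevalences, key=PREVALENCE_ORDER.index) if prevalences else "none",
    }


if __name__ == "__main__":
    window = [
        {"present": False, "seizure_count": 1, "frequency_hz": None, "prevalence": "none"},
        {"present": True, "seizure_count": 2, "frequency_hz": 1.5, "prevalence": "occasional"},
        {"present": True, "seizure_count": 0, "frequency_hz": 2.0, "prevalence": "frequent"},
    ]
    print(consolidate_window(window))
```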

These post-processing steps were essential to ensure that the extracted features accurately reflected the content of the original reports and were harmonized across multiple institutions and reporting styles.

We conducted all experiments on a Windows system using the Ollama interface to run LLaMA for inference [28], [29].
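For readers who want to try a similar local setup, the sketch below sends one prompt to a locally running Ollama server over its HTTP `/api/generate` endpoint. Several details are assumptions and not taken from the study: the model tag `llama3` (the text does not state which LLaMA version was used), the zero temperature, the timeout, and the helper names `build_payload` and `query_ollama`.

```python
import json
import urllib.request

# Default address of a local Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, model: str = "llama3") -> dict:
    """Build a non-streaming request body that asks for JSON output."""
    return {
        "model": model,                  # assumed model tag; adjust to your installation
        "prompt": prompt,
        "stream": False,                 # return one complete reply, not chunks
        "format": "json",                # ask Ollama to constrain output to JSON
        "options": {"temperature": 0},   # assumed setting for repeatable extraction
    }


def query_ollama(prompt: str, model: str = "llama3", timeout: int = 120) -> str:
    """Send the prompt to the local server and return the generated text."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `query_ollama(...)` requires a running Ollama server with the chosen model already pulled. Because everything runs on the local machine, report text is not sent to an external service.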

Model Performance Verification Based on Expert Annotation

To validate the performance of our pipeline, we conducted an expert-reviewed evaluation of randomly selected epochs. We selected 100 samples per pattern per site, with balanced class representation (i.e., 50 positive and 50 negative samples per site). This resulted in 300 samples per pattern and a total of 1,500 expert-annotated samples across all five patterns.

Performance metrics included accuracy (Acc), sensitivity (Sens), and specificity (Spec). To account for variability in model estimates, we computed 95% confidence intervals (CI) for each metric using binomial proportion confidence limits. For each EEG pattern, a misclassification of its presence/absence also led to associated feature-level errors (e.g., frequency, prevalence), ensuring the evaluation reflected both detection and descriptive accuracy.
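The metrics and a binomial confidence interval can be computed as sketched below. The text does not say which binomial method was used, so the Wilson score interval here is an assumption chosen for illustration; the counts in the usage example are made up and are not study results.

```python
import math


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denominator = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denominator
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Illustrative counts only: 50 positive and 50 negative samples at one site.
    print(detection_metrics(tp=45, fp=2, tn=48, fn=5))
    print(wilson_interval(successes=93, n=100))
```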


Data Description

This BDSP project hosts the deidentified data and code that accompany Sartipi et al. (manuscript under review). The companion S3 folder s3://bdsp-opendata-credentialed/eeg-report-extraction/ contains three CSV files of extracted EEG findings and a code directory with the extraction pipeline.

Data files

Three CSV files of deidentified EEG findings, one per source institution:

| File | Source | Columns | Size |
|---|---|---|---|
| data/bch_eeg_findings_deidentified.csv | Boston Children's Hospital (BCH, pediatric) | 26 | 6.4 MB |
| data/bidmc_eeg_findings_deidentified.csv | Beth Israel Deaconess Medical Center (BIDMC, adult) | 21 | 3.5 MB |
| data/mgb_eeg_findings_deidentified.csv | Mass General Brigham (MGB, adult; deidentified subset) | 31 | 12 MB |

Each row corresponds to a single EEG epoch / report segment. Patient identifiers have been replaced with BDSP-style integer IDs (BDSPPatientID); dates have been shifted (ShiftedDateOfBirth, ShiftedDate, etc.) to prevent re-identification while preserving relative intervals.
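Because the shifted dates preserve relative intervals, age at the time of the EEG can still be derived from a shifted birth date and a shifted EEG date. The sketch below assumes that both values are ISO-formatted date strings (`YYYY-MM-DD`); the actual format in the CSV files should be checked first, and the function names are hypothetical. The bins mirror the decade bins of Table 1.

```python
from datetime import date


def age_in_years(shifted_dob: str, shifted_eeg_date: str) -> int:
    """Whole years between two ISO dates; shifting both dates equally keeps this valid."""
    dob = date.fromisoformat(shifted_dob)
    eeg = date.fromisoformat(shifted_eeg_date)
    had_birthday = (eeg.month, eeg.day) >= (dob.month, dob.day)
    return eeg.year - dob.year - (0 if had_birthday else 1)


def age_bin(age: int) -> str:
    """Map an age in years to a decade bin such as "[10, 20)"."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= 90:
        return "[90, ∞)"
    lower = (age // 10) * 10
    return f"[{lower}, {lower + 10})"


if __name__ == "__main__":
    age = age_in_years("2000-03-10", "2015-03-09")
    print(age, age_bin(age))
```

With pandas, these helpers could be applied row by row to columns such as ShiftedDateOfBirth and ShiftedDate, provided those columns exist under those names in the file being analyzed.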

Common columns across institutions

Each CSV contains the following groups of extracted attributes:

  • Demographics: BDSPPatientID, ShiftedDateOfBirth, Gender/SexDSC, and date/time fields for the EEG and (where applicable) hospital admission and discharge.
  • Seizures: presence, count, timestamps, duration, burden.
  • IIC patterns (each of: GPD, GRDA, LPD, LRDA, BIRD where reported): presence, frequency, prevalence, and (for MGB/BCH) the LLM-emitted response string for downstream audit.

Column schemas differ slightly across institutions because of source-report formatting differences; refer to the file headers and the per-institution sections of the manuscript for full column definitions.

Coverage caveat

The MGB CSV in this release contains the deidentified-shareable subset of the MGB data analyzed in the manuscript, in accordance with MGB's data-sharing rules. The BIDMC and BCH CSVs match the manuscript counts.

Code

The code/ folder contains the extraction pipeline:

  • code_eeg_annotation.ipynb — main Jupyter notebook implementing the five-stage pipeline (preprocessing, rule-based filtering, segment-focused prompt generation, LLM inference, post-processing).
  • help_functions.py — shared utility functions (text normalization, pattern matching, structured-attribute parsing).
  • list_szPatterns0x.txt, list_szPatterns1.txt, list_szPatternsR.txt — seizure-pattern keyword lists used in rule-based filtering.
  • list_gpdPatterns.txt, list_grdaPatterns.txt, list_lpdPatterns.txt, list_lrdaPatterns.txt — pattern keyword lists for each IIC pattern (GPD, GRDA, LPD, LRDA).

Usage Notes

Code on GitHub

The pipeline code is mirrored at https://github.com/bdsp-core/eeg-report-extraction (public, browseable without credentialed access). Clone for the canonical reference; data still requires credentialed access via the S3 paths below.

Loading the data

import pandas as pd

bch   = pd.read_csv("data/bch_eeg_findings_deidentified.csv")
bidmc = pd.read_csv("data/bidmc_eeg_findings_deidentified.csv")
mgb   = pd.read_csv("data/mgb_eeg_findings_deidentified.csv")

print(bch.shape, bidmc.shape, mgb.shape)
print(bch.columns.tolist())

Reproducing the extraction pipeline

The notebook code/code_eeg_annotation.ipynb walks through the five-stage pipeline end-to-end, applying it to a sample input. To run it on your own EEG-report corpus:

  1. Open the notebook in JupyterLab (Python ≥3.10). Install pandas, numpy, and an LLM client. The Methods section reports inference with LLaMA run through Ollama; other chat-completion-compatible clients should work with minor adjustments.
  2. Point the input path to your own EEG-report CSV or TSV file, with one report per row.
  3. The pattern-keyword files (list_*Patterns*.txt) drive the rule-based filtering stage; adjust them if your reports use different terminology.
  4. Configure the model name and any credentials your LLM client requires. Smaller models may degrade extraction quality, especially on rare pattern types.

Suggested analyses

  • Reproduce the Table 1 demographic and pattern-prevalence breakdowns by institution.
  • Reproduce the 24-/48-hour seizure association analysis for high-frequency IIC patterns (≥2 Hz) reported in the manuscript.
  • Cross-validate extraction against a separate manually-annotated set in your own institution to estimate transferability.

Accessing the S3 data

If you've been granted credentialed access through bdsp.io, you can download the files via the AWS CLI:

aws s3 sync s3://bdsp-opendata-credentialed/eeg-report-extraction/ ./eeg-report-extraction/

Or via the access-point alias if your team uses one:

aws s3 sync s3://bdsp-credentialed-pr-fymwc8rqh9fzdisq7om7eiq9wutqhuse1b-s3alias/eeg-report-extraction/ ./

Release Notes

Version 1.0.0 — initial release accompanying Sartipi et al. (manuscript under review). Contains deidentified extracted EEG findings from three institutions (BCH, BIDMC, MGB) plus the LLM-based extraction pipeline.


Ethics

The study was done under research protocols approved by the Mass General Brigham and Beth Israel Deaconess Medical Center IRBs (MGB: 2023P000478, 2024P002630, BIDMC: 2022P000417, 2022P000481).


Acknowledgements

Funding: This work was supported by grants from the NIH (RF1AG064312, RF1NS120947, R01AG073410, R01HL161253, R01NS126282, R01AG073598, R01NS131347, R01NS130119).


Conflicts of Interest

Dr. Westover is a co-founder, scientific advisor, consultant to, and has personal equity interest in Beacon Biosignals. Dr. Zafar receives publishing royalties from Springer and Wolters Kluwer.


Access

Access Policy:
Only credentialed users who sign the DUA can access the files.

License (for files):
BDSP Credentialed Health Data License 1.5.0

Data Use Agreement:
BDSP Credentialed Health Data Use Agreement
