Database Open Access

VE-CAM-S: Visual EEG-Based Grading of Delirium Severity and Associations with Clinical Outcomes

Ryan Tesh Haoqi Sun Jin Jing Mike Westmeijer Anudeepthi Neelagiri Subapriya Rajan Parimala Velpula Krishnamurthy Pooja Sikka Syed Quadri Michael Leone Luis Paixao Ezhil Panneerselvam Christine Eckhardt Aaron F Struck Peter Kaplan Oluwaseun Akeju Daniel Jones Eyal Kimchi M Brandon Westover

Published: Jan. 5, 2024. Version: 1.0


When using this resource, please cite:
Tesh, R., Sun, H., Jing, J., Westmeijer, M., Neelagiri, A., Rajan, S., Velpula Krishnamurthy, P., Sikka, P., Quadri, S., Leone, M., Paixao, L., Panneerselvam, E., Eckhardt, C., Struck, A. F., Kaplan, P., Akeju, O., Jones, D., Kimchi, E., & Westover, M. B. (2024). VE-CAM-S: Visual EEG-Based Grading of Delirium Severity and Associations with Clinical Outcomes (version 1.0). Brain Data Science Platform. https://doi.org/10.60508/jfpq-mj80.

Additionally, please cite the original publication:

Tesh RA, Sun H, Jing J, Westmeijer M, Neelagiri A, Rajan S, Krishnamurthy PV, Sikka P, Quadri SA, Leone MJ, Paixao L, Panneerselvam E, Eckhardt C, Struck AF, Kaplan PW, Akeju O, Jones D, Kimchi EY, Westover MB. VE-CAM-S: Visual EEG-Based Grading of Delirium Severity and Associations With Clinical Outcomes. Crit Care Explor. 2022 Jan 18;4(1):e0611. doi: 10.1097/CCE.0000000000000611. PMID: 35072078; PMCID: PMC8769081.

Abstract

This is a dataset and code to accompany a published prospective, observational cohort study, which used machine learning to develop the Visual EEG Confusion Assessment Method Severity (VE-CAM-S). VE-CAM-S is a physiological grading scale that quantifies the severity of delirium or coma secondary to acute encephalopathy. VE-CAM-S scores are well calibrated with the severity of delirium symptoms and coma and are associated with clinical outcomes, including in-hospital and 3-month mortality and functional disability at hospital discharge.


Background

Delirium is an acute neuropsychiatric syndrome characterized by a disturbance of attention and awareness. Even though more than 20% of hospitalized older adults experience delirium, delirium is often missed by healthcare professionals because of its variable presentation. 

Delirium is a manifestation of underlying acute encephalopathy and exists on a continuum between subsyndromal delirium and coma. Within this spectrum, the severity of delirium symptoms is associated with increased mortality, longer hospital stays, and cognitive and functional deterioration. 

While clinical tools have been developed to standardize delirium evaluation, even validated delirium severity scales such as the CAM-S are subjective and prone to inter-rater variability. Additionally, these scales often draw inflexible distinctions between patients whose clinical manifestation of an underlying acute encephalopathy at a given moment is more consistent with delirium, a syndrome of impaired attention and awareness with clear diagnostic criteria, and those whose manifestation is more consistent with coma, a syndrome primarily operationalized in clinical scales as decreased responsiveness to environmental stimuli. However, both delirium and coma are potential manifestations of the same acute encephalopathies, and patients can fluctuate between these states. A physiologic measure of the full breadth of manifestations of acute encephalopathy, uniting delirium and coma, could overcome challenges to monitoring patients with acute encephalopathy and could potentially provide important prognostic information.

Numerous studies have documented characteristic electroencephalographic changes in patients with delirium. However, the delirium literature to date has been relatively isolated from the long-standing physiologic literature on encephalopathy, which includes several clinical neurophysiologic scales proposed to grade the degree of encephalopathy as revealed by EEG. These prior EEG grading studies, in turn, have been limited by their focus on narrowly defined patient populations, small sample sizes, evaluation of only limited sets of EEG features, or the use of primarily qualitative analytic tools.

In this study we use machine learning on a comprehensive set of visually-assessable EEG features in a large and heterogeneous clinical cohort to develop the Visual EEG CAM-S (VE-CAM-S), a physiologic grading scale to quantify the symptom severity in the full spectrum of acute encephalopathy including delirium and coma. We identified a minimal subset of nine EEG features that reliably characterize symptom severity in the context of a grading system where weighted points are assigned for the presence of each EEG feature. We demonstrate that the VE-CAM-S scores are not only well calibrated with symptom severity across both delirium and coma, but also associated with important clinical outcomes, including in-hospital and 3-month mortality and functional disability at the time of hospital discharge.


Methods

Study design, setting, and participants:

We conducted a single-center, prospective observational cohort study of adult inpatients undergoing clinical EEG recording to assess brain activity. Adult inpatients were considered for evaluation from all wards, including medical, surgical, and neurologic floors, as well as ICUs. The study was conducted from August 2015 to December 2019, with a temporary pause in recruitment from January 2017 to September 2018 due to research staff availability. Patients were excluded prior to evaluation if they were deaf, severely aphasic, developmentally delayed, or non-English speaking (if non-comatose), or if their goals of care focused primarily on comfort measures. Prior to data analysis, patients were also excluded if there were technical difficulties with EEG that precluded clinical interpretation (eFigure 1 of the published manuscript). The study design is compatible with STROBE guidelines.

Standard Protocol Approvals, Registrations, And Patient Consents:

This study of human subjects was approved by the Mass General Brigham Institutional Review Board (IRB approval # 2012P001929), including review of EEG and other clinical data. The Partners Healthcare Human Research Committee provided a waiver of written consent for this study.

Clinical Assessment:

Patients were assessed at the bedside by study staff during active clinical EEG recording or as soon as possible if limited by patient or staff availability. Each evaluation was conducted by a single member of the study team. Study staff were unaware of the EEG results at the time of delirium assessment. Staff were trained to perform assessments through a combination of didactics, literature review, in-person case reviews, and ongoing discussions.

A one-time evaluation of mental status was conducted, using a structured interview to determine the severity of delirium symptoms with the Confusion Assessment Method-Severity (CAM-S, Long Form: 0-19) (10). The CAM-S scores the severity of ten delirium-related features: 1) Acute change/fluctuating course, 0-1; 2) Inattention, 0-2; 3) Altered level of consciousness, 0-2; 4) Disorganized thinking, 0-2; 5) Disorientation, 0-2; 6) Memory impairment, 0-2; 7) Perceptual disturbances, 0-2; 8) Psychomotor agitation, 0-2; 9) Psychomotor retardation, 0-2; 10) Altered sleep-wake cycle, 0-2. Additionally, the Richmond Agitation Sedation Scale (RASS; normal = 0) was used to assess level of arousal (29), and we collected the Age-adjusted Charlson Comorbidity Index (ACCI, calculated via the medical record) (30).

For descriptive purposes only, patients were classified into clinical states: delirium according to the CAM framework (31), and coma if they had a RASS score of -4 or -5. To analyze the severity of all patients collectively, patients not assessable due to deep sedation or coma were assigned a CAM-S score of 15 out of 19 for machine learning model development. This score was chosen a priori because, in a hierarchical framework of consciousness, intact contents of consciousness (that is, attention) are not possible without a sufficient level of consciousness (that is, arousal) (32-34). We therefore assigned maximal points to features reflecting functions that cannot be intact without an appropriate level of consciousness (e.g. negative symptoms), but assigned no points to features that can only occur with an appropriate level of consciousness (positive symptoms, specifically feature 7, perceptual disturbances such as hallucinations, and feature 8, psychomotor agitation), yielding a score of 15. This practice is consistent with our previously published work, in which patients with coma were assigned maximal CAM-S short form scores (35). Expanded information on the evaluation questions and rules used for delirium symptom severity scoring in each category is given in eTables 1 and 2.
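
The arithmetic behind the assigned score of 15 can be checked with a short sketch. The feature names and maximum points come from the CAM-S Long Form list above; the dictionary and variable names are illustrative and are not taken from the study code.

```python
# Maximum points per CAM-S Long Form feature, as listed in the text above.
CAM_S_LF_MAX = {
    "acute_change_or_fluctuating_course": 1,
    "inattention": 2,
    "altered_level_of_consciousness": 2,
    "disorganized_thinking": 2,
    "disorientation": 2,
    "memory_impairment": 2,
    "perceptual_disturbances": 2,
    "psychomotor_agitation": 2,
    "psychomotor_retardation": 2,
    "altered_sleep_wake_cycle": 2,
}

# Positive symptoms (features 7 and 8) receive no points in patients
# who cannot be assessed because of deep sedation or coma.
NOT_SCORED_IN_COMA = {"perceptual_disturbances", "psychomotor_agitation"}

max_score = sum(CAM_S_LF_MAX.values())
coma_score = sum(
    points for name, points in CAM_S_LF_MAX.items()
    if name not in NOT_SCORED_IN_COMA
)

print(max_score)   # 19
print(coma_score)  # 15
```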

EEG Recordings and Visual Interpretation:

Clinical EEGs were recorded with Ag/AgCl scalp electrodes using the standard international 10-20 electrode placement by qualified EEG technicians and read and reported clinically by neurophysiologists using the 2012 ACNS Critical Care EEG terminology (36). As part of routine clinical practice, all EEG recordings were reviewed by two clinical experts (fellow and attending physician electroencephalographers) before reports were finalized and published in the electronic medical record. Patient evaluations were done prior to clinical interpretation of the EEGs. Although clinical EEG readers had access to routine clinical data, they were blinded to the results of the research evaluation.

Clinical EEG reports were reviewed to identify the presence of a wide range of findings (eTable 3), including background/rhythm abnormalities, periodic patterns, sporadic discharges, and seizure activity. For scoring, the reported EEG epoch containing the time of patient evaluation was chosen. If a patient was unable to be evaluated during the EEG recording, the nearest reported EEG epoch to the evaluation time was then chosen.

VE-CAM-S Model Development:

For the VE-CAM-S model, the visual EEG features defined in eTable 3 were used as inputs to predict the clinically assessed CAM-S Long Form (LF) score, 0-19. As our dataset included only three patients with CAM-S LF scores >15, we capped all scores at 15, so that the model prediction ranges from 0 to 15.

The model was fitted by adjusting its coefficients (i.e. points) so that, for each pair of patients A and B, the model discriminates whether the CAM-S LF for patient A is higher than that for patient B. To incorporate medical domain knowledge and to reduce collinearity among the inputs, we imposed several a priori constraints: (1) ElasticNet penalty: encourages some points to be 0 when they do not improve prediction; (2) integer constraint: points had to be integers so that practitioners can easily add them up when an EEG feature is seen; (3) sign and severity constraints: certain points must be 0 or positive, and certain patterns of severe encephalopathy were set a priori to receive maximal points, as specified in eTable 3; and (4) ordinal constraints: focal/unilateral delta slowing was constrained to have points ≥ focal/unilateral theta slowing.
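
As a rough illustration of the pairwise idea, the sketch below caps CAM-S LF scores at 15, sums integer points for the EEG features present in each recording, and measures how often the point score orders a pair of patients the same way as their CAM-S LF scores. It is not the study's training procedure: the ElasticNet penalty and the constrained optimization are omitted, and the three binary features, the toy data, and the point values are hypothetical.

```python
from itertools import combinations

CAP = 15  # CAM-S LF scores above 15 are capped, as described above


def cap_scores(cam_s_lf):
    """Cap each CAM-S LF score at 15."""
    return [min(score, CAP) for score in cam_s_lf]


def point_score(features, points):
    """Sum the integer points of the EEG features present (1) in one recording."""
    return sum(p for present, p in zip(features, points) if present)


def pairwise_concordance(X, y, points):
    """Fraction of patient pairs with different capped CAM-S LF scores whose
    ordering is reproduced by the point score. Tied point scores count as 0.5."""
    y = cap_scores(y)
    scores = [point_score(row, points) for row in X]
    concordant, total = 0.0, 0
    for a, b in combinations(range(len(y)), 2):
        if y[a] == y[b]:
            continue  # pairs with equal severity carry no ordering information
        total += 1
        if scores[a] == scores[b]:
            concordant += 0.5
        elif (scores[a] > scores[b]) == (y[a] > y[b]):
            concordant += 1.0
    return concordant / total if total else float("nan")


# Toy data: 4 recordings x 3 hypothetical binary EEG features.
X = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
y = [0, 4, 9, 17]    # the score of 17 is capped to 15
points = [1, 2, 3]   # hypothetical non-negative integer points

print(pairwise_concordance(X, y, points))  # 1.0
```

A training procedure in this spirit would search for the integer points that make such pairwise orderings as accurate as possible, subject to the constraints listed above.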

The model was trained using five-fold nested cross validation (CV), consisting of outer and inner CV (eFigure 2). Outer CV reports an unbiased out-of-sample performance and inner CV selects the best model parameters. The outer CV splits the dataset into five folds, where each fold in turn was used to estimate out-of-sample performance (testing set), and the other four folds combined were used to train the model (outer training set). In each step of the outer CV, we performed inner CV by further splitting the outer training set (80% of whole data) into five folds to select model parameters. Note that these model parameters are not trainable and have to be specified before model training, hence they are also called hyperparameters. There are two hyperparameters: the strength of the ElasticNet penalty, selected from 10^-3, 10^-2, …, 10^1 (5 choices); and the 0-point encouragement parameter, selected from 0.5, 0.6, …, 0.9 (5 choices). In total, there are 25 choices. The choice of hyperparameters that maximized the Spearman’s correlation, averaged across the five inner testing folds, was selected.
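
The nested structure and the 25-entry hyperparameter grid can be sketched with the standard library alone. This is a minimal sketch of the splitting logic only: the function names and the sample count of 100 are placeholders, the folds are not shuffled, and model fitting is left as comments because the study's training code is not reproduced here.

```python
from itertools import product

# Hyperparameter grid described above: 5 x 5 = 25 combinations.
penalty_strengths = [10.0 ** k for k in range(-3, 2)]  # 1e-3, 1e-2, ..., 1e1
zero_point_params = [0.5, 0.6, 0.7, 0.8, 0.9]
grid = list(product(penalty_strengths, zero_point_params))


def k_fold_indices(indices, k=5):
    """Split a list of indices into k (train, test) pairs of near-equal size."""
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for m in range(k) if m != i for j in folds[m]]
        splits.append((train, folds[i]))
    return splits


def nested_cv_splits(n_samples, k=5):
    """Yield (outer_train, outer_test, inner_splits) for each outer fold.
    The inner splits are built from the outer training indices only."""
    for outer_train, outer_test in k_fold_indices(list(range(n_samples)), k):
        yield outer_train, outer_test, k_fold_indices(outer_train, k)


n_outer = 0
for outer_train, outer_test, inner_splits in nested_cv_splits(100):
    n_outer += 1
    # For each grid entry: fit on each inner training fold, score on the
    # matching inner testing fold, and average over the five inner folds.
    # Then refit on outer_train with the best entry and evaluate once on
    # outer_test, which was never used for hyperparameter selection.

print(len(grid), n_outer)  # 25 5
```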

Next, data from the inner five folds (80% of whole data) were combined to re-train the model with the selected hyperparameters. We then transformed the model output probability for each level of CAM-S LF to the actual occurrence frequency in the dataset (this is called calibration). Only at this point was the trained model applied to the outer testing set (20% of whole data held out in outer CV) to get a testing performance. The final reported performance was obtained from the average of the performances on the five outer testing folds. However, we now have five models, one for each outer CV fold. To get the final model, the most common hyperparameters from the five outer CV folds were used to re-train the model and calibrate on the whole dataset, from which we get the points for the VE-CAM-S model.
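
Choosing the most common hyperparameters across the outer folds can be sketched as follows. The five selected pairs are made-up values for illustration, and counting the two hyperparameters jointly as a pair (rather than separately) is an assumption.

```python
from collections import Counter

# Hypothetical (penalty strength, 0-point parameter) pairs selected
# in the five outer CV folds.
selected = [(0.01, 0.7), (0.1, 0.7), (0.01, 0.7), (0.01, 0.8), (0.01, 0.7)]

# The most frequent pair is used to re-train on the whole dataset.
final_hyperparameters, n_folds = Counter(selected).most_common(1)[0]

print(final_hyperparameters, n_folds)  # (0.01, 0.7) 3
```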

Measuring the Association of VE-CAM-S with Clinical Outcomes:     

To estimate the association of VE-CAM-S with clinical outcomes, we fit a generalized linear model with age, sex, and VE-CAM-S as inputs. We fit models separately for three outcomes: functional status at discharge, in-hospital mortality, and mortality at 3 months post-discharge. Functional status at hospital discharge was scored with the Glasgow Outcome Scale (GOS; 1=death to 5=good recovery) (37), determined using a combination of physician documentation and physical / occupational therapy evaluations at discharge. The whole VE-CAM-S dataset was used to fit models for clinical outcomes without cross validation, as there were no hyperparameters in these models. For comparison, we also performed the same analysis with the clinically assessed CAM-S LF instead of the VE-CAM-S. We also conducted subset analyses, comparing associations of VE-CAM-S with each clinical outcome in the following groups: young (<40y) vs. middle-aged (40-59y) vs. old (≥60y), male vs. female, white vs. black race, ICU vs. non-ICU patients, and non-comatose patients. All demographic information (age, sex, race/ethnicity) was obtained from the electronic health record.

Performance Metrics:

We used three metrics to measure associations of VE-CAM-S with outcomes: Spearman’s R correlation; area under the receiver operating characteristic curve (AUROC; AUC) to assess the ability of VE-CAM-S to discriminate between levels of delirium severity (CAM-S = 0 vs. CAM-S ≥ X); and the calibration curve to assess the consistency of the predicted probability with the observed frequency for CAM-S LF ≤4 vs. ≥5 (10).
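
For readers who want to see what the first two metrics compute, here is a minimal standard-library sketch. It is not the study code (which is linked under Statistical Analysis); the function names and toy values are illustrative, and in practice an established statistics library would normally be used instead.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def auroc(scores, labels):
    """Probability that a positive case scores higher than a negative case,
    counting ties as 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab]
    neg = [s for s, lab in zip(scores, labels) if not lab]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example: point scores and CAM-S LF scores for five recordings.
ve = [0, 1, 3, 6, 6]
cam = [0, 4, 9, 15, 12]
rho = spearman(ve, cam)

# CAM-S = 0 vs. CAM-S >= X with X = 5: recordings in between are left out.
X_THRESHOLD = 5
kept = [(v, c >= X_THRESHOLD) for v, c in zip(ve, cam) if c == 0 or c >= X_THRESHOLD]
auc = auroc([v for v, _ in kept], [lab for _, lab in kept])
print(auc)  # 1.0
```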

Statistical Analysis:

Quantitative data are reported as medians (interquartile range, IQR) and compared using Kruskal-Wallis ANOVA tests, followed by Dunn’s post-hoc comparison. Categorical data are reported as n = counts (percent) and compared using Chi-square tests, followed by pairwise comparisons with Bonferroni correction. The significance level for all tests was set at p <0.05. Confidence intervals were generated by bootstrapping 1000 times, taking the 2.5% and 97.5% percentiles as the lower and upper bounds respectively. Analyses were planned prior to conducting all statistical tests. Code used to develop the model and generate figures and tables is available at: https://github.com/mghcdac/VE-CAM-S.
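
The percentile bootstrap described above can be sketched in a few lines. This is an illustration, not the study code: the function name, the fixed seed, and the nearest-rank rule for picking the 2.5% and 97.5% percentiles are assumptions.

```python
import random


def bootstrap_ci(data, statistic, n_boot=1000, seed=0):
    """Percentile bootstrap: resample with replacement n_boot times and
    return the 2.5% and 97.5% percentiles of the statistic (nearest rank)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[(25 * n_boot) // 1000]
    upper = estimates[(975 * n_boot) // 1000 - 1]
    return lower, upper


def mean(values):
    return sum(values) / len(values)


low, high = bootstrap_ci(list(range(100)), mean)
print(low, high)
```

For a paired statistic such as a correlation, each element of `data` can be a tuple so that the pairs are resampled together.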

 


Data Description

In all, 406 subjects were assessed for delirium; two were subsequently excluded due to technical difficulties with the EEG that precluded interpretation. Among the remaining 404 subjects, three were evaluated more than once, for a total of 407 timepoints of paired EEG and delirium assessments. Of the 404 subjects analyzed, 132 did not have delirium or coma (32.7%), 143 had delirium (35.4%), and 129 had coma (31.9%). Subjects with delirium or coma were older, had longer hospital stays, higher Charlson Comorbidity scores, more severe CAM-S scores, lower RASS, and lower GOS scores at discharge. Subject characteristics are shown in Table 1.

The majority of evaluations (59.7%) occurred during EEG recordings; 85.7% of evaluations were conducted during the active clinical EEG recording or within 1 hour, and all evaluations were conducted within 5 hours (Table 1). Most EEGs involved long-term monitoring (LTM; 237/407, 58.2%), with a mean reported epoch duration, used for scoring, of 11.0 hours (SD 7.4 hours). Differences in the prevalence of EEG findings between nondelirious, delirious, and coma subjects are shown in Table 2.

Modeling results leading to the published version of VE-CAM-S are described here.

 

 


Usage Notes

Data and code to generate all results and figures from the publication are provided here.


Ethics

This study of human subjects was approved by the Mass General Brigham Institutional Review Board (IRB approval # 2012P001929), including review of EEG and other clinical data. The Partners Healthcare Human Research Committee provided a waiver of written consent for this study. All data is deidentified. 


Acknowledgements

The authors wish to acknowledge and express their appreciation to the patients who were a part of this study, as well as the EEG technicians and clinical neurophysiologists of the MGH Division of Clinical Neurophysiology who participated in their clinical care. We gratefully acknowledge support from the AWS Open Data Sponsorship Program, which allows us to share this data with the research community.


Conflicts of Interest

M.B.W. is a co-founder of Beacon Biosignals. Beacon Biosignals did not contribute funding or play any role in the study.


References

  1. Tesh RA, Sun H, Jing J, Westmeijer M, Neelagiri A, Rajan S, Krishnamurthy PV, Sikka P, Quadri SA, Leone MJ, Paixao L, Panneerselvam E, Eckhardt C, Struck AF, Kaplan PW, Akeju O, Jones D, Kimchi EY, Westover MB. VE-CAM-S: Visual EEG-Based Grading of Delirium Severity and Associations With Clinical Outcomes. Crit Care Explor. 2022 Jan 18;4(1):e0611. doi: 10.1097/CCE.0000000000000611. PMID: 35072078; PMCID: PMC8769081.

Access

Access Policy:
Anyone can access the files, as long as they conform to the terms of the specified license.

License (for files):
Open Data Commons Open Database License v1.0

