Longitudinal Structural Brain Changes in Bipolar Disorder: A Multicenter Neuroimaging Study of 1232 Individuals by the ENIGMA Bipolar Disorder Working Group

BACKGROUND
Bipolar disorder (BD) is associated with cortical and subcortical structural brain abnormalities. It is unclear whether such alterations progressively change over time, and how this is related to the number of mood episodes. To address this question, we analyzed a large and diverse international sample with longitudinal magnetic resonance imaging (MRI) and clinical data to examine structural brain changes over time in BD.


METHODS
Longitudinal structural MRI and clinical data from the ENIGMA (Enhancing Neuro Imaging Genetics through Meta Analysis) BD Working Group, including 307 patients with BD and 925 healthy control subjects, were collected from 14 sites worldwide. Male and female participants, aged 40 ± 17 years, underwent MRI at 2 time points. Cortical thickness, surface area, and subcortical volumes were estimated using FreeSurfer. Annualized change rates for each imaging phenotype were compared between patients with BD and healthy control subjects. Within patients, we related brain change rates to the number of mood episodes between time points and tested for effects of demographic and clinical variables.


RESULTS
Compared with healthy control subjects, patients with BD showed faster enlargement of ventricular volumes and slower thinning of the fusiform and parahippocampal cortex (0.18 < d < 0.22). More (hypo)manic episodes were associated with faster cortical thinning, primarily in the prefrontal cortex.


CONCLUSIONS
In the hitherto largest longitudinal MRI study on BD, we did not detect accelerated cortical thinning but noted faster ventricular enlargements in BD. However, abnormal frontocortical thinning was observed in association with frequent manic episodes. Our study yields insights into disease progression in BD and highlights the importance of mania prevention in BD treatment.

In prior cross-sectional studies from the ENIGMA (Enhancing Neuro Imaging Genetics through Meta Analysis) BD Working Group, including 6503 individuals, we found the most pronounced cortical thickness alterations in the pars opercularis, rostral middle frontal cortex, and fusiform cortex, albeit with small effect sizes, and no abnormalities in cortical surface area (5). We also reported smaller amygdala, hippocampus, and thalamus volumes and larger ventricular volumes in patients with BD compared with healthy control (HC) subjects (6). However, the extent and heterogeneity of brain abnormalities among patients are substantial (17)(18)(19), and cross-sectional studies cannot determine whether the observed brain alterations arise from progressive changes over time.
The term neuroprogression refers to the progressive symptomatic and functional decline observed in some patients with BD that may be associated with progressive neuroanatomical changes (20)(21)(22)(23). However, few studies have used a longitudinal design to assess brain changes during the course of BD (24)(25)(26)(27)(28). These single-center studies and recent reviews (29) suggest progressive features in the prefrontal and temporal cortices associated with BD. These brain changes could be part of the natural course of BD but could also reflect cortical changes influenced by medication (30,31), genetic factors (25), and the occurrence of mood episodes (24)(25)(26)(27). The potential relationship to manic episodes, specifically, is supported by studies demonstrating associations between frontotemporal cortical decline and the occurrence of (hypo)manic episodes (25,26) as well as first-episode mania (32). It has also been suggested that no cortical changes or even cortical thickness increases, potentially reflecting normalization processes, occur during periods without manic (24,25) or other mood (28) episodes. However, longitudinal brain imaging studies are scarce, and many of them are hampered by limitations such as small samples, short follow-up times, lack of control groups, or lack of statistical control for potential confounders such as psychiatric comorbidity and medication use.
The primary aim of this multicenter longitudinal brain imaging study was to overcome the limitations of prior studies and to elucidate whether progressive changes in cortical thickness, surface area, and subcortical volumes occur in BD, beyond those expected with normal aging. While cortical thickness and surface area seem to be genetically distinct measures (33,34), cortical thickness is increasingly being used as a marker for cortical integrity (35)(36)(37)(38), also within BD (5,7,8,25,39). In view of our prior findings of abnormalities in cortical thickness but not surface area in BD, we used cortical thickness as a primary cortical measure and cortical surface area as a secondary measure. We thus significantly extend our previous cross-sectional ENIGMA-BD analyses (5,6) by investigating longitudinal changes in the same measures for patients with BD and HC subjects.
Through ENIGMA-BD, we combined data collected from 14 independent studies, including 2464 structural brain magnetic resonance imaging (MRI) scans from 1232 individuals scanned at 2 time points (0.5-9 years apart) and tested for differences in regional annualized change rates between patients with BD (n = 307) and HC subjects (n = 925). Based on the literature reviewed above, we hypothesized that patients with BD at the group level would show greater frontotemporal cortical thinning over time; greater volume decline in the amygdala, thalamus, and hippocampus; and greater ventricular volume enlargements relative to HC subjects. Given the growing evidence for the potential involvement of mania, the second aim was to investigate whether the number of manic episodes between imaging investigations was associated with annual change rates in patients. Similar associations with hypomanic, mixed, and depressive episodes were explored. We also tested the effects of demographic and clinical variables, such as psychiatric comorbidity, bipolar subtype, and medication use.

Participating Sites and Cohort Characteristics
Fourteen international sites from 9 countries from the ENIGMA-BD Working Group contributed individual subject longitudinal MRI and clinical data of 325 patients with BD and 978 HC subjects (mean age: 40 ± 17 years) collected at baseline (time point 1 [TP1]) and follow-up (TP2). ENIGMA-BD applies standardized processing, quality control, and analysis techniques to independently collected data samples. Further details about our standardized methods and protocols can be found in our recent review (40). Demographic and clinical data consisted of sex, age, body mass index, educational level, ethnicity, smoking status, alcohol use, substance use, age of onset, number of mood episodes, mood state, bipolar subtype, psychiatric comorbidity, history of psychotic symptoms, and medication use at the time of scan (see Supplement 1 for more details and how these variables were coded). Tables S1-S3 in Supplement 1 list the demographic and clinical details, the diagnostic instruments used to obtain diagnoses and clinical information, and the exclusion criteria for each center.
In the main analysis, we included centers that provided both patient and control data to reliably correct for imaging site and account for potential scanner drifts, yielding a final sample size of 1232 participants (307 patients with BD and 925 HC subjects). Data from the center that provided only HC data (n = 53) and the center that provided only BD data (n = 18) were included in secondary within-group analyses. All sites received approval from their local ethics committees, and all participants provided written informed consent.

MRI Acquisition and Processing
T1-weighted anatomical brain images were acquired at each site (see Table S4 in Supplement 1 for acquisition parameters). Participants underwent baseline and follow-up investigation using the same protocol and scanner. ENIGMA-standardized image processing, quality control, and data extraction tools were applied to each of the 14 independently collected ENIGMA-BD samples. Methodological details are provided in Supplement 1. In brief, FreeSurfer (41)(42)(43)(44)(45) was used on-site to perform cortical reconstructions and subcortical segmentations at each imaging time point. Images were first processed cross-sectionally and then with the longitudinal stream implemented in FreeSurfer (version 5.3 or higher) (46). We investigated 68 cortical thickness (and surface area) regions of interest (ROIs), as defined by the Desikan-Killiany atlas (47). Annualized change rates were then computed, which yielded a time-independent relative change measure (percent per year) for each participant and each ROI, where negative values reflected a decrease and positive values an increase over time. This approach was chosen because the majority of sites provided data from 2 time points, to be consistent with previous ENIGMA projects, and to enable comparison within and across disorders (5,6,48). If participants provided data from more than 2 time points, the first and last scans were used for change rate computations.
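The exact change rate formula is not given in this section, so the following is only a minimal sketch of one common way to obtain a percent-per-year measure: the difference between the two time points is expressed relative to the baseline (TP1) value and divided by the interscan interval. The function name, the baseline normalization, and the example values are assumptions made for illustration, not the authors' implementation.

```python
def annualized_change_rate(tp1_value, tp2_value, interval_years):
    """Percent change per year, relative to the baseline (TP1) value.

    Assumption: normalization by the TP1 value; other conventions
    (e.g., normalizing by the mean of both time points) also exist.
    """
    if tp1_value <= 0 or interval_years <= 0:
        raise ValueError("baseline value and interval must be positive")
    percent_change = (tp2_value - tp1_value) / tp1_value * 100.0
    return percent_change / interval_years


# Hypothetical example: thickness falls from 2.50 mm to 2.45 mm over 2 years.
rate = annualized_change_rate(2.50, 2.45, 2.0)
print(round(rate, 3))  # -1.0 (a decrease of 1% per year)
```

Dividing by the interval is what makes the measure comparable across participants with different follow-up times (0.5-9 years in this sample).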

Statistical Analyses
Cohort Characteristics. Differences in demographic and clinical variables between groups at each time point (Table S6 in Supplement 1) were tested with t tests or Fisher's exact or χ² tests.
Group Differences in Yearly Change Rates (Main Analysis). To determine differences in yearly change rates between patients with BD and HC subjects, we used linear mixed modeling with change rates in brain phenotypes as dependent variable, group (BD vs. HC; variable of interest) as fixed factor, age and sex as covariates, and imaging site as random factor, as in our previous study (5). For each ROI, effect size (Cohen's d) and significance (p values) of group comparisons were mapped into brain space using the ENIGMA viewer (http://enigma.ini.usc.edu/research/enigma-viewer) (Figures 1 and 2).
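As a reading aid, the model described above can be written out for one ROI as a random-intercept model. The notation is an assumption introduced here (it does not appear in the original text): y_ij is the annualized change rate of participant i at site j, and u_j is the site-specific random intercept.

```latex
y_{ij} = \beta_0 + \beta_1\,\mathrm{group}_{ij} + \beta_2\,\mathrm{age}_{ij} + \beta_3\,\mathrm{sex}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Under this formulation, the coefficient of interest is \beta_1, the adjusted BD versus HC difference in yearly change rate.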
As in our prior work (5,6), we treated the investigation of subcortical and cortical phenotypes as independent studies but here report findings of both analyses in the same manuscript. Within each phenotype, multiple-comparison correction was performed using Bonferroni's Dubey/Armitage-Parmar/Sidak's adjustment of the alpha level, considering the number of tests (68 for cortical thickness, 8 for subcortical volumes) and the intercorrelation between the dependent variables (r_thickness = 0.2778, α = 0.0024; r_subcortical = 0.11463, α = 0.0047) (49). Changes in surface area (r_area = 0.2339, α = 0.0020) were not part of the main hypothesis but are reported for completeness.
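For orientation, the sketch below shows the Dubey/Armitage-Parmar adjustment in its usual Šidák-type form, in which the number of tests M is shrunk to an effective number M^(1 - r) by the mean intercorrelation r. That this is the variant used here is an assumption, as is the function name. With 68 tests it reproduces the thresholds reported above for cortical thickness and surface area; the subcortical threshold is not re-derived here, since it depends on how the tests were counted.

```python
def dap_adjusted_alpha(n_tests, mean_r, alpha=0.05):
    """Dubey/Armitage-Parmar adjusted alpha (Sidak-type form).

    n_tests: number of tests M
    mean_r:  mean correlation between the dependent variables
    """
    effective_tests = n_tests ** (1.0 - mean_r)
    return 1.0 - (1.0 - alpha) ** (1.0 / effective_tests)


print(round(dap_adjusted_alpha(68, 0.2778), 4))  # 0.0024 (cortical thickness)
print(round(dap_adjusted_alpha(68, 0.2339), 4))  # 0.002 (surface area)
```

When the variables are uncorrelated (r = 0) the expression reduces to the ordinary Šidák correction, and when they are perfectly correlated (r = 1) no adjustment is applied.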

Sensitivity Analysis Testing for Potential Confounders. We tested whether the observed group differences in yearly change rates were affected by demographic or clinical variables (listed in Table S6 in Supplement 1). Methodological details are provided in Supplement 1. Corresponding results are provided in Supplemental Data DS_2.

Correlations Between Change Rates and Manic Episodes Between Time Points. Within patients with BD, correlations between change rates and the number of mood episodes between time points were calculated using nonparametric Spearman's rank correlations in SPSS (version 26; IBM Corp.), given the non-normal distributions of mood episodes (Figures S10 and S11 in Supplement 1). In addition, we constructed a second measure of interest defined as the combined number of manic, hypomanic, and mixed episodes between time points. This measure reflects the total number of elated mood episodes, as investigated in Abé et al. (25). Although our hypothesis focused on the effects of manic episodes, we present results for depressive episodes for completeness (Supplemental Data DS_2). The same correction methods as described for the main analysis were performed to account for the number of correlations tested. We performed sensitivity tests adjusting for demographic and clinical variables, including the number of depressive episodes. We also repeated the analyses when excluding the Stockholm cohort, which previously showed associations between cortical decline and manic episodes (25), and the STOP-EM cohort, which was a first-episode mania cohort. See Supplemental Data DS_2 and Supplement 1 for details on sensitivity tests and for results of exploratory analyses within BD subtypes.
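The analyses themselves were run in SPSS; the following is only an illustrative sketch of what a Spearman rank correlation computes, written with the Python standard library. Tied values (frequent for episode counts, where many patients have zero episodes) receive their average rank, and the coefficient is the Pearson correlation of the two rank vectors. All function names and data values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: manic episodes between scans and thickness change (%/year).
episodes = [0, 0, 1, 2, 5]
change_rates = [0.4, 0.1, -0.3, -0.6, -1.2]
print(round(spearman_rho(episodes, change_rates), 3))  # -0.975
```

Because only ranks enter the calculation, a single patient with an unusually high episode count does not dominate the estimate, which is the reason given above for preferring a nonparametric measure.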
Intercorrelations Between Brain Phenotypes (Post Hoc Analyses). To test whether the observed cortical thickness increases related to surface area decreases in the same ROI, we calculated Pearson correlations between the corresponding phenotypes. Given the widespread albeit weak effects demonstrated in Figure 1, we also correlated global thickness with global area changes. Given the observed increases in ventricular volumes and indications for subcortical decline in patients with BD, we tested for relationships between ventricle and subcortical volume change rates. Moreover, because we observed intercorrelations between such brain phenotypes, we tested whether multivariate classification methods (partial least squares and random forest) could distinguish between patients with BD and HC subjects based on regional change rate data. Corresponding methods and results of these exploratory tests are shown in Supplement 1.

Cohort Characteristics
A total of 2464 brain MRI scans from 1232 individuals (307 patients and 925 HC subjects) were included in the main analysis. Table S6 in Supplement 1 displays group characteristics. Patients with BD and HC subjects did not differ statistically in male/female ratios. Patients with BD were on average 6 years younger than HC subjects. The interscan interval was 0.9 years shorter in the HC group. Although the BD group contained fewer participants with White ethnic background, it was the most reported in both groups (83% and 90%). The BD group differed from the HC group in educational level, had higher body mass index, and was more likely to smoke than the HC group. Up to 58% of patients experienced mood episodes between time points. Lithium and antipsychotic drugs were the most frequently used medication types. Patients with BD had comorbid psychiatric diagnoses ranging from 1% (eating disorders) to 9% (attention-deficit/hyperactivity disorder). A few HC subjects (4%) reported alcohol abuse, 1 control subject had generalized anxiety disorder, and 1 control subject reported a history of psychotic symptoms. These were included in the main analysis but were excluded in tests for potential confounders. Sex, age, and interscan interval were accounted for in the main analysis. Effects of other demographic and clinical variables were tested for in additional follow-up analyses (Supplemental Data DS_1).

Case-Control Differences in Yearly Change Rates (Main Analysis)
Effect sizes and significance of group comparisons are shown in Figures 1 and 2.

Effects of Demographic and Clinical Variables (Sensitivity Tests)
Overall, the sensitivity tests did not indicate that the group differences were affected by demographic and clinical variables. While adjusting for first-generation antipsychotic (FGA) use did not affect group differences, FGA use at TP1 was associated with larger increases in bilateral ventricular volume (p_left = .004, p_right = .014) and faster decrease in right fusiform thickness (p < .001) in patients. Note that only 14 patients used FGA; hence, these results should be treated with caution. Similarly, history of psychosis (at TP1) was related to faster decline in right parahippocampal thickness (p = .035). There were no differences associated with the use of other medication types. Patients with bipolar I disorder showed a decline in right parahippocampal thickness, whereas patients with bipolar II disorder showed thickness increases in the same region (mean difference: p = .010). The observed effects of FGA, history of psychosis, and bipolar subtype within patients are shown in Figures S5-S9 in Supplement 1. Age was not related to changes in cortical measures but correlated positively with change rates of ventricle volumes in HC subjects (Supplement 1 and Supplemental Data DS_1). No significant effects of sex, age × age, or group × age were observed.

Correlations Between Change Rates and Manic Episodes Between Time Points
Overall, we found negative correlations between the number of mood episodes between time points and cortical change rates, indicating faster rate of cortical thinning in patients with more mood episodes. After correction for multiple tests, we found significant negative correlations between the number of manic episodes and yearly change rates of left lingual thickness and frontal pole. The combined number of (hypo)manic and mixed episodes inversely correlated with thickness changes in several (pre)frontal and temporal ROIs (Table S7 in Supplement 1; Figure 4; Supplemental Data DS_2). There were no correlations with surface area or subcortical volume change rates. In complementary tests, we found correlations with depressive episodes (Supplemental Data DS_2).
Post hoc tests for interpretational purposes revealed that those with no manic episodes (n = 138) between time points (or no (hypo)manic and mixed episodes) showed either no changes or increased cortical thickness, whereas patients who had one or more manic episodes (n = 55) showed cortical thinning over time (Supplemental Data DS_2). The observed correlations remained robust when adjusting for age, sex, and imaging site, when excluding outliers or the Stockholm and/or STOP-EM cohort, and when adjusting for the number of depressive episodes between time points (Supplemental Data DS_2). The results also remained when controlling for FGA use. The correlations between (hypo)manic and mixed episodes and thickness changes in lingual, pars orbitalis, pars opercularis, caudal anterior cingulate, and caudal middle frontal ROIs remained when controlling for history of psychosis (Supplemental Data DS_2).

Intercorrelations Between Brain Phenotypes (Post Hoc Analysis)
Changes in cortical thickness and surface area were not correlated. In patients, yearly change rates of ventricular volume correlated negatively with changes in all investigated subcortical regions except the right pallidum and left accumbens (Figure 5; Table S5 in Supplement 1). Multivariate case-control classification analyses did not provide sufficient classification accuracy (Supplement 1).

DISCUSSION
To our knowledge, the present ENIGMA-BD Working Group study is the largest longitudinal neuroimaging study of BD to date. On average, patients with BD did not show accelerated decline in any cortical phenotype investigated. Instead, patients with BD showed less cortical thinning than HC subjects in some areas. We did, however, find significantly larger change rates of ventricular volumes in patients with BD than HC subjects. Importantly, more manic episodes between imaging time points were associated with a higher degree of thinning in prefrontal cortex in patients.

Cortical Changes in BD
While HC subjects showed cortical thinning over time across the whole brain (Figure S13 in Supplement 1), patients with BD showed less or no thinning over time. With respect to surface area, patients showed both higher and lower change rates than HC subjects, indicating that surface area decreases faster in some and slower in other brain areas compared with HC subjects. However, most findings did not withstand correction for multiple comparisons. After correction, case-control differences were observed in fusiform and parahippocampal thickness change rates, where patients with BD displayed less decline compared with HC subjects.
As greater cortical thickness in adults is commonly interpreted as reflecting better cortical integrity (36,38,50-59), it is tempting to speculate that increases in cortical thickness (or a lack of thinning) reflect structural improvement processes. For example, lithium use has been linked to gray matter volume increases (5,6,60-63) and putative neuroprotective effects (31,64-66). A recent review also suggested that lithium has normalizing effects on brain structure (31). Although we did not find any relationship between lithium use and changes in cortical thickness, given our limited information on medication use, we cannot exclude that lithium use prior to baseline scan had an effect on brain change rates. Potential normalizing effects of lithium could also be one possible explanation for why we did not detect group differences in prefrontal brain areas. However, medication effects remain an area of focused investigation in future ENIGMA-BD studies with more detailed medication information such as dosage and history of use. It should be noted, however, that size increases of cortical structures do not necessarily reflect beneficial effects but may be related to neuroinflammatory processes previously suggested to occur in BD (67).
Furthermore, the observed group differences were not affected by the use of lithium, antiepileptics, antipsychotics, or antidepressants, and, except for FGA, change rates for patients with BD on medication at the time of scan did not differ from those not on such medications.

[Figure: ROIs from Table S7 in Supplement 1 (purple) in which significant negative correlations between thickness change rates and the number of (hypo)manic episodes were observed.]

Subcortical Changes in BD
The overall pattern revealed lower subcortical volume change rates and larger ventricular change rates in patients with BD compared with HC subjects, but only the ventricular findings survived correction for multiple comparisons. Given that both groups showed ventricular increases over time (positive change rates), this indicates faster bilateral ventricular enlargements in BD. However, ventricular change rates correlated negatively with those for subcortical volumes, indicating that those patients with BD who display greater ventricle enlargement also display greater subcortical decline over time.
These results lend support to the notion that neuroprogression may occur in BD (29), predominantly characterized by ventricle enlargements. Thus, larger ventricle volumes as observed in cross-sectional studies of BD (5,7-16) may partly result from abnormal rates of enlargement during the course of illness. Overall, the reported cortical and subcortical findings remained significant after correcting for potential confounds, including medication use, psychiatric comorbidity, and demographic variables. The robustness of our findings was further supported by the results from leave-one-site-out analyses (Supplement 1). Multivariate classification analyses did not provide reliable accuracy for case-control classifications. While this may indicate that ROI-based structural change rates may not follow multivariate patterns, such methods may have potential utility in future studies of other brain measures.

Cortical Thinning in Relation to Manic Episodes
Prior studies have proposed that the occurrence of manic episodes is associated with cortical decline (24)(25)(26). In this study, the number of manic episodes and the total number of elevated mood episodes (mixed and (hypo)manic episodes) between time points correlated negatively with cortical change rates, predominantly in prefrontal cortex. Effects were small (r < 0.25) but significant. These results were consistent when adjusting for the number of depressive episodes between time points, indicating that the greater the number of manic episodes, the faster the rate of prefrontal cortical thinning. Similar associations were observed in the lingual (visual) cortex. The effects of manic episodes on cortical changes were observed in the combined patient cohort but may differ regionally between BD subtypes, as indicated by our exploratory analyses (Supplement 1).
Mechanisms underlying pathological gray matter loss may include increased neurodegeneration, neuronal apoptosis, neurotoxic susceptibility, and altered neuroplasticity influenced by neuroinflammatory processes and/or oxidative stress during mood episodes (24,29,68). Although our results are in line with these theories, the mechanisms underlying accelerated cortical thinning cannot be derived from this study. It also remains unclear if manic episodes precede gray matter loss or vice versa, or if there is another causative factor promoting both manic episodes and gray matter changes.
Moreover, our results indicate that patients experiencing mania between time points displayed prefrontal cortical thinning, whereas those who did not experience manic episodes showed no significant cortical changes or thickness increases. While this may suggest cortical normalization processes when mania is prevented, future studies are warranted. Efforts are underway to collect more detailed clinical information from ENIGMA-BD samples including behavioral, cognitive, and functional measures to empower future investigations (40). Although frontocortical abnormalities observed in cross-sectional studies of BD may in part reflect a static trait, our study suggests that some of these abnormalities could arise from progressive changes over time, which may be associated with the experience of manic symptoms. This and the commonly observed heterogeneity of patient groups (17)(18)(19) stresses the importance of identifying additional risk factors and subgroups at risk for pathological brain changes.

Limitations
A detailed discussion of the study limitations is provided in Supplement 1. In brief, the imaging method we used cannot reveal what biological mechanisms underlie the observed brain changes (69). Patients with BD were younger than HC subjects. However, age did not correlate with cortical change rates (only with ventricular volumes in HC subjects) and was used as a covariate, accounting for individual age-related variation in change rates. In addition, results obtained from sensitivity analyses in age-range-matched adults did not change our conclusions (Supplemental Data DS_1). Furthermore, because age-related brain changes are commonly of larger magnitude in older people (70,71), we would expect group differences in ventricular volume changes to be even more pronounced if the groups were of the same age. However, whether and how longitudinal brain changes in BD depend on age remains to be investigated in future studies. Moreover, how cortical changes or the number of mood episodes relate to medication effects can be better addressed using refined data on cumulative medication use between time points and in randomized clinical trials. It is challenging to accurately assess the number of mood episodes, especially in cases that did not require hospitalizations or rely on self-report. Moreover, how the number of hospitalizations as well as the duration of mood episodes relate to longitudinal brain changes in BD remains to be investigated.
Although the ROI approach provides better comparability with previous studies that used the same brain parcellation method, analyses with higher regional resolution, e.g., voxelwise or surface-based vertex-wise analyses, could potentially reveal focal cortical variations that remained undetected at the ROI level. Although we attempted to parse patient groups with potential differential brain trajectories, such as those who experienced frequent manic episodes, refined data-driven analyses aimed at the identification of other potential subpopulations in even larger samples are warranted.
Finally, our findings do not allow conclusions about brain changes that occur in the natural course of BD if untreated.

Conclusions
Our findings suggest that patients with BD show less cortical decline but greater ventricular enlargements over time than HC subjects. Faster frontocortical thinning was associated with more frequent manic episodes. Although it remains to be clarified whether differential change rates in BD reflect beneficial effects from mood-stabilizing treatment, structural improvements when manic symptoms are prevented, or detrimental effects of manic episodes, our findings highlight the importance of preventing manic episodes and provide evidence for a neuroprogressive course of illness in BD.