Shared and Anxiety-Specific Pediatric Psychopathology Dimensions Manifest Distributed Neural Correlates

BACKGROUND
Imaging research has not yet delivered reliable psychiatric biomarkers. One challenge, particularly among youth, is high comorbidity. This challenge might be met through canonical correlation analysis designed to model mutual dependencies between symptom dimensions and neural measures. We mapped the multivariate associations that intrinsic functional connectivity manifests with pediatric symptoms of anxiety, irritability, and attention-deficit/hyperactivity disorder (ADHD) as common, impactful, co-occurring problems. We evaluated the replicability of such latent dimensions in an independent sample.


METHODS
We obtained ratings of anxiety, irritability, and ADHD, and 10 minutes of resting-state functional magnetic resonance imaging data, from two independent cohorts. Both cohorts (discovery: n = 182; replication: n = 326) included treatment-seeking youth with anxiety disorders, with disruptive mood dysregulation disorder, with ADHD, or without psychopathology. Functional connectivity was modeled as partial correlations among 216 brain areas. Using canonical correlation analysis and independent component analysis jointly, we sought maximally correlated, maximally interpretable latent dimensions of brain connectivity and clinical symptoms.


RESULTS
We identified seven canonical variates in the discovery and five in the replication cohort. Of these canonical variates, three exhibited similarities across datasets: two variates consistently captured shared aspects of irritability, ADHD, and anxiety, while the third was specific to anxiety. Across cohorts, canonical variates did not relate to specific resting-state networks but comprised edges interconnecting established networks within and across both hemispheres.


CONCLUSIONS
Findings revealed two replicable types of clinical variates, one related to multiple symptom dimensions and a second relatively specific to anxiety. Both types involved a multitude of broadly distributed, weak brain connections as opposed to strong connections encompassing known resting-state networks.

behavioral problems. Here, we focus on three more closely related and often comorbid domains: irritability, ADHD, and anxiety. We test a hypothesis consistent with previous work using other latent variable approaches combined with task-based fMRI (1,2): CCA yields latent phenotypes that capture both unique and shared aspects of irritability, ADHD, and anxiety. However, unlike past work, brain connectivity is not evoked by highly controlled tasks in the current study. Thus, more broadly distributed neural circuitry correlates are expected in the current study than in previous studies.
Finally, as a third extension of past work, we evaluate the latent variables' replicability using novel sampling and analytic techniques. Prior CCA studies find replicable associations when discovery and replication cohorts represent subsets of the same sample (9,10) but not when they arise from independent cohorts (11). Robustness against sampling variability is essential for clinical applications of CCA, which possesses exploratory components that can make replication difficult. Thus, the current study used data from 2 independent cohorts of treatment-seeking youth assessed with similar methods. We treated the smaller sample (n = 182) as the discovery dataset, because it was assessed with homogeneous imaging parameters. The larger cohort (n = 326), assessed with heterogeneous imaging parameters, served as a replication dataset (12). Moreover, we employed analytic techniques that leverage independent component analysis (ICA) to improve interpretability of the canonical variates (13). Finally, we used a novel, stepwise permutation scheme (14) that addresses limitations in other CCA studies concerning the handling of nuisance variables and possible inflation of type I errors.

METHODS AND MATERIALS

Participants
Both samples comprised healthy volunteers and youth diagnosed with an anxiety disorder, disruptive mood dysregulation disorder, or ADHD by licensed clinicians using the Kiddie Schedule for Affective Disorders and Schizophrenia (K-SADS) (15). Exclusion criteria were neurological disorders, autism and bipolar spectrum disorders, psychosis, substance use, MRI contraindications, and Full Scale IQ < 70. Anxiety was assessed using the parent- and youth-reported ratings of the five subscales of the Screen for Child Anxiety Related Disorders (16). Irritability was assessed with the first six items of the parent- and youth-reported Affective Reactivity Index (17). Parents quantified ADHD symptoms such as inattention and disruptive behavior through seven items assessed with the ADHD subscale of the Conners (18) in the discovery sample and the Child Behavior Checklist (19) in the replication sample. These 29 ratings of anxiety, irritability, and disruptive behavior (18 parent-reported, 11 self-reported) were used as input for the joint CCA+ICA.
Samples were similar in terms of sex ratios, proportions of anxiety disorders, oppositional defiant disorder, medication-free-to-medication-use ratios, and levels of parent-reported symptoms of irritability, ADHD, and anxiety. However, the discovery sample was older and had a higher IQ, a lower proportion of ADHD cases, a higher proportion of diagnosis-free and disruptive mood dysregulation disorder cases, and lower self-reported irritability and anxiety. Both cohorts were ethnically diverse and were recruited from urban, semi-rural, and rural areas (Figure 1, Supplement).

Acquisition and Preprocessing of Imaging Data
Discovery sample data were acquired at one site with two identical 3T General Electric Signa scanners (GE Healthcare, Chicago, IL). The replication sample comprised data from a 1.5T Siemens Avanto scanner (Siemens Healthineers, Erlangen, Germany), a 3T Siemens Tim Trio scanner, and a 3T Siemens Prisma scanner. A high-resolution T1-weighted structural image and 10 minutes of blood oxygen level-dependent changes during rest were collected from all participants, although sequences varied between samples and within the replication dataset (Supplement). Quality of the imaging data was assessed using MRIQC (20). The automated pipeline FMRIPREP (21) was used for preprocessing. We refrained from motion scrubbing and used instead ICA-AROMA, which reduces motion-related artifacts at least as well (22,23).

Functional Connectivity
The rsfMRI-connectivity network comprised 216 nodes derived from a 200-region parcellation scheme (24), augmented by eight subcortical regions per hemisphere obtained using FreeSurfer segmentation (nucleus accumbens, nucleus caudatus, pallidum, putamen, amygdala, hippocampus, thalamus, and ventral diencephalon) (25). Framewise displacement and spatial standard deviation of the temporal difference data (26), but not global signal, were regressed out from the time series. Functional connectivity was quantified using partial correlations, which offer an estimate of direct (as opposed to indirect or shared) connectivity between each pair of nodes (edges). Because the resulting network matrices are symmetric, only half of the edges (i.e., 23,220) were analyzed.
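As an illustration of the connectivity step, the following minimal sketch estimates partial correlations from the inverse covariance matrix and extracts the unique edges of a 216-node network. It runs on synthetic time series, and the `ridge` term and both function names are assumptions introduced for this example; the source does not specify the estimator or any regularization.

```python
import numpy as np

def partial_correlation(ts, ridge=1e-3):
    """Partial correlations from a (timepoints x nodes) array.

    `ridge` is a hypothetical stabilizer added to the covariance diagonal;
    the source does not state whether or how regularization was applied.
    """
    cov = np.cov(ts, rowvar=False)
    precision = np.linalg.inv(cov + ridge * np.eye(cov.shape[0]))
    scale = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

def unique_edges(matrix):
    """Return the strictly upper-triangular entries of a symmetric matrix."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]

rng = np.random.default_rng(0)
timeseries = rng.standard_normal((300, 216))  # synthetic data for one participant
pcorr = partial_correlation(timeseries)
edges = unique_edges(pcorr)
print(edges.size)  # 216 * 215 / 2 = 23220
```

The edge count follows from the symmetry of the matrix: 216 nodes yield 216 × 215 / 2 = 23,220 distinct node pairs.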

CCA, ICA, and Permutation Testing
Covariates (age, sex, race, IQ, psychotropic medication, and scanner for the discovery sample; additionally, site and sequence type for the replication sample) were regressed out from both imaging and clinical variables before dimensionality reduction. All 29 symptom ratings were included; dimensionality of rsfMRI was reduced using principal component analysis (PCA) before CCA (7). Residuals were projected to a lower dimensional space where data are exchangeable, thus mitigating spurious dependencies among observations introduced by residualization (14).
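The residualization and dimensionality reduction described above can be sketched as follows. This is a simplified illustration on synthetic data: the function names, the number of synthetic edges, and the random covariates are assumptions, and the sketch omits the subsequent projection to a lower dimensional space where the data are exchangeable (14).

```python
import numpy as np

def residualize(data, covariates):
    """Remove linear covariate effects (plus an intercept) by least squares."""
    design = np.column_stack([np.ones(len(covariates)), covariates])
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ beta

def pca_reduce(data, n_components):
    """Project column-centered data onto its leading principal components."""
    centered = data - data.mean(axis=0)
    left, sing, _ = np.linalg.svd(centered, full_matrices=False)
    scores = left[:, :n_components] * sing[:n_components]
    explained = (sing[:n_components] ** 2).sum() / (sing ** 2).sum()
    return scores, explained

rng = np.random.default_rng(1)
n_subjects = 182
covs = rng.standard_normal((n_subjects, 6))    # synthetic nuisance covariates
mixing = rng.standard_normal((6, 500))
edges = rng.standard_normal((n_subjects, 500)) + covs @ mixing  # synthetic edges

edges_resid = residualize(edges, covs)
scores, explained = pca_reduce(edges_resid, n_components=64)
print(scores.shape, round(float(explained), 2))
```

After residualization, the data are orthogonal to every covariate column, which is what the regression step is meant to guarantee before PCA and CCA.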
Given two sets of variables (here, imaging data [Y] and symptom ratings [X]), CCA seeks linear mixtures within each set (i.e., canonical variables [CVs]; U = Y × A and V = X × B), such that each resulting mixture (U) from one set is maximally correlated with a corresponding mixture (V) from the other set, but uncorrelated with all other mixtures in either set. We use upper-case letters U and V to represent the whole set of canonical variables on the imaging and clinical side, respectively, and lower-case letters with subscripts, u_k and v_k, to denote specific latent variables, where k indicates the order of the canonical correlations from highest to lowest.
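To make the definition concrete, here is a minimal sketch of classical CCA computed through QR decompositions and a singular value decomposition. It is not the joint CCA+ICA procedure used in the study; the function name `cca` and the synthetic data are assumptions for illustration, and the approach presumes more participants than variables and full column rank in both sets.

```python
import numpy as np

def cca(Y, X):
    """Classical CCA via QR + SVD.

    Returns weights A and B, canonical correlations r (descending),
    and canonical variables U = Yc @ A and V = Xc @ B.
    """
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    Qy, Ry = np.linalg.qr(Yc)
    Qx, Rx = np.linalg.qr(Xc)
    L, r, Mt = np.linalg.svd(Qy.T @ Qx, full_matrices=False)
    A = np.linalg.solve(Ry, L)      # imaging-side weights
    B = np.linalg.solve(Rx, Mt.T)   # clinical-side weights
    return A, B, r, Yc @ A, Xc @ B

rng = np.random.default_rng(2)
n = 200
latent = rng.standard_normal(n)     # shared synthetic signal
Y = rng.standard_normal((n, 10))
Y[:, 0] += 2 * latent
X = rng.standard_normal((n, 5))
X[:, 0] += 2 * latent

A, B, r, U, V = cca(Y, X)
print(len(r))  # number of CVs equals the size of the smaller set: 5
```

In this sketch, U and V have unit-norm, mutually orthogonal columns, and the cross-product of U and V is diagonal with the canonical correlations on the diagonal, matching the "maximally correlated but otherwise uncorrelated" property stated above.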
Small perturbations in the original data could possibly lead to arbitrary rotations of the CCA solutions. To mitigate this problem and aid interpretability, we subjected the stacked CVs to ICA, seeking CVs that were not only orthogonal but also statistically independent. The joint CCA+ICA procedure was performed using a modification of a recently proposed algorithm for permutation inference for CCA (14), thus allowing not only characterization and better disambiguation of the resulting CVs but also valid statistical inference (details in the Supplement). Below, where we refer to results of CCA, these are to be understood as results of the joint inference using CCA+ICA.
Because the number of CVs is determined by the smaller input dataset, we obtained 29 CVs. Statistical significance was determined using 10,000 permutations. In the permutation test, for each estimated CV (post-ICA), variance already explained by CVs with stronger, significant canonical correlations was removed in an iterative procedure (14). Canonical correlations were considered significant at alpha = .05 after familywise error rate (FWER) correction using a closed testing procedure. The nonsymmetric redundancy index (27), which gives the mean variance of the clinical data explained by imaging data, and vice versa, is reported in the Supplement.
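The general logic of permutation inference for canonical correlations can be sketched as below. This is a deliberately simplified, single-step variant that compares each observed canonical correlation with the permutation distribution of the largest one; it is not the stepwise scheme with closed testing and ICA used in the study (14). The function names, the synthetic data, and the reduced number of permutations are assumptions for illustration.

```python
import numpy as np

def canonical_correlations(Y, X):
    """Canonical correlations (descending) between two centered data sets."""
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    return np.linalg.svd(Qy.T @ Qx, compute_uv=False)

def permutation_pvalues(Y, X, n_perm=500, seed=0):
    """FWER-adjusted p-values from the null distribution of the largest r.

    Simplification: rows of X are shuffled freely, and no variance explained
    by earlier CVs is removed, unlike the stepwise procedure in the source.
    """
    rng = np.random.default_rng(seed)
    observed = canonical_correlations(Y, X)
    max_null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = X[rng.permutation(len(X))]
        max_null[i] = canonical_correlations(Y, shuffled)[0]
    exceed = (max_null[None, :] >= observed[:, None]).sum(axis=1)
    return observed, (1 + exceed) / (1 + n_perm)

rng = np.random.default_rng(3)
n = 150
latent = rng.standard_normal(n)
Y = rng.standard_normal((n, 8))
Y[:, 0] = latent + 0.3 * rng.standard_normal(n)
X = rng.standard_normal((n, 4))
X[:, 0] = latent + 0.3 * rng.standard_normal(n)

observed, p_fwer = permutation_pvalues(Y, X)
print(observed.round(2), p_fwer.round(3))
```

Because later canonical correlations are tested against the null distribution of the first, this variant is conservative for them; the stepwise removal of already explained variance described above is what avoids that conservativeness.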

Replicability
Replicability of the CVs was determined based on three criteria: 1) stability within the same dataset across variations in the number of PCA components that entered CCA relative to the sample size (input-to-participant ratio), 2) similarities of latent clinical patterns, and 3) similarities of latent connectivity patterns identified independently in the two samples. A prior CCA study in youth reported replicability only for clinical but not for rsfMRI patterns (9); thus, we decided to evaluate the replicability of the clinical and the connectivity patterns as separate criteria.
To evaluate the first criterion, we performed three analyses that varied the input-to-participant ratios. The primary analyses used an input-to-participant ratio of 1:2, which translated into 64 rsfMRI components, explaining 75% of the between-subject variance in rsfMRI connectivity in the discovery sample. In the replication cohort, 134 rsfMRI components were used; these explained only 57% of the variance, possibly owing to more unstructured noise in this dataset. This primary analysis was supplemented by two secondary analyses using input-to-participant ratios of 1:3 and 1:4, thereby reducing risks of overfitting, at the expense of explaining less variance. This was accomplished by using fewer imaging principal components as input to the CCA. Results were compared across the three ratios by examining cross-correlations among CCA components (e.g., corr[v_{1|1:2}, Y_D × a_{1|1:3}] and corr[u_{1|1:2}, X_D × b_{1|1:3}]). Statistical significance was determined using 10,000 permutations, with a threshold of p_FWER < .05 within each set of comparisons. Because psychiatric symptoms might relate to components that explain relatively little variance in the imaging data, we also discuss CVs that solely replicated at the 1:3 ratio but could be found in the replication cohort.
To test the second and third criteria, we used joint CCA+ICA in the replication dataset. Canonical weights from each dataset were applied to the input data from the other dataset; these products were then correlated with the CVs identified in that dataset (e.g., corr[v_{1|D}, Y_D × a_{1|R}] and corr[u_{1|D}, X_D × b_{1|R}]). Clinical and connectivity patterns were considered replicable when both the application of weights from the discovery to the replication dataset and the application of weights from the replication to discovery dataset yielded statistically significant associations. We used 10,000 permutations to establish significance. However, thresholds differed for clinical and connectivity patterns. We used a stringent threshold of p_FWER < .05 to determine replicability of the imaging and clinical patterns; additionally, we investigated a more lenient threshold of p_uncorr < .05 for replicability of the connectivity pattern. This decision was motivated by two factors. First, a prior CCA study finding replicable clinical patterns did not report replicable connectivity patterns across two subsets of a single sample (9). This raises questions as to whether any evidence of replicability can be detected with even liberal statistical thresholds. Second, in the current study, significant differences exist between cohorts in all metrics quantifying the quality of the imaging data (Figure S1); this contrasts with the broadly similar profiles for symptom ratings (Table S1 and Figure S1).
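The cross-cohort check can be illustrated with a small sketch in which weights estimated in one cohort are applied to the input data of the other cohort and the result is correlated with that cohort's own CV. The example uses plain CCA on synthetic data rather than joint CCA+ICA, compares same-side variates, and reports absolute correlations because of sign indeterminacy; the helper names, the simulation, and these choices are assumptions for illustration.

```python
import numpy as np

def cca_weights(Y, X):
    """Plain CCA via QR + SVD; returns imaging weights, clinical weights, r."""
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    Qy, Ry = np.linalg.qr(Yc)
    Qx, Rx = np.linalg.qr(Xc)
    L, r, Mt = np.linalg.svd(Qy.T @ Qx, full_matrices=False)
    return np.linalg.solve(Ry, L), np.linalg.solve(Rx, Mt.T), r

def simulate_cohort(n, rng):
    """Synthetic cohort with one shared brain-symptom dimension."""
    latent = rng.standard_normal(n)
    Y = rng.standard_normal((n, 6))
    Y[:, 0] += 2 * latent
    X = rng.standard_normal((n, 4))
    X[:, 0] += 2 * latent
    return Y, X

def transfer_corr(data, own_weights, other_weights):
    """Correlate a cohort's own CV with the variate built from foreign weights."""
    centered = data - data.mean(axis=0)
    return np.corrcoef(centered @ own_weights, centered @ other_weights)[0, 1]

rng = np.random.default_rng(4)
Y_D, X_D = simulate_cohort(182, rng)   # discovery-sized synthetic cohort
Y_R, X_R = simulate_cohort(326, rng)   # replication-sized synthetic cohort
a_D, b_D, _ = cca_weights(Y_D, X_D)
a_R, b_R, _ = cca_weights(Y_R, X_R)

r_clinical = abs(transfer_corr(X_D, b_D[:, 0], b_R[:, 0]))
r_connectivity = abs(transfer_corr(Y_D, a_D[:, 0], a_R[:, 0]))
print(round(float(r_clinical), 2), round(float(r_connectivity), 2))
```

In practice the same computation would be run in both directions (discovery weights on replication data and the reverse), with significance assessed by permutation as described above.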

Interpretation of CVs
To interpret the significant CVs, we investigated their correlations with the residualized input data. These correlations between latent variables derived from the CCA and input data (i.e., symptom ratings, connectivity matrices) are henceforth referred to as canonical loadings. Consistent with prior studies (7,28), we focused on clinical items with loadings |r| > 0.2, resembling a small to moderate effect, to interpret and label key CVs. However, we extended this approach by limiting our focus to replicating clinical loadings, i.e., loadings |r| > 0.2 that could be observed across samples. Similarly, we emphasized edge loadings that replicated across samples. However, given the differences in the quality of the imaging data across samples, we applied a more lenient threshold of |r| > 0.15 to the replication cohort.
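The loading computation and the replication filter can be sketched as follows. Canonical loadings are taken here as Pearson correlations between each input variable and each CV, and a loading counts as replicating when it exceeds the threshold in both samples. The function names and the example loading values are hypothetical, and the sketch does not check that signs agree across samples.

```python
import numpy as np

def canonical_loadings(inputs, cv_scores):
    """Pearson correlations of input columns with CV columns (inputs x CVs)."""
    zi = (inputs - inputs.mean(axis=0)) / inputs.std(axis=0)
    zc = (cv_scores - cv_scores.mean(axis=0)) / cv_scores.std(axis=0)
    return zi.T @ zc / inputs.shape[0]

def replicating_mask(load_disc, load_rep, thr_disc=0.2, thr_rep=0.2):
    """True where |loading| exceeds the threshold in both samples."""
    return (np.abs(load_disc) > thr_disc) & (np.abs(load_rep) > thr_rep)

# Loadings of four synthetic items on one synthetic CV.
rng = np.random.default_rng(5)
items = rng.standard_normal((182, 4))
cv = items[:, [0]] + 0.5 * rng.standard_normal((182, 1))
loadings = canonical_loadings(items, cv)   # shape (4, 1)

# Hypothetical edge loadings in the two cohorts, using the lenient
# |r| > 0.15 threshold for the replication cohort.
load_disc = np.array([0.35, 0.10, -0.25, 0.22])
load_rep = np.array([0.30, 0.25, -0.16, 0.05])
mask = replicating_mask(load_disc, load_rep, thr_disc=0.2, thr_rep=0.15)
print(mask.tolist())  # [True, False, True, False]
```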
Our key hypothesis concerned identifying both cross-dimension and specific variates. Consistent with this hypothesis, latent clinical phenotypes (v_{1-7|D} and v_{1-5|R}) could be differentiated in terms of specificity of associated symptoms. In both datasets, we observed latent variables that loaded  Table S3. Canonical correlation analysis results are characterized by sign indeterminacy, meaning that it is valid to flip the sign for an entire latent dimension, which will affect the directions of the correlations.
We highlight three CVs passing all three replicability criteria described in the Methods and Materials section and two latent dimensions passing only the first two replicability criteria (stability within the same dataset and replicability of clinical patterns). Full results concerning replicability appear in the Supplement (Tables S8-S11, S13-S20). We describe replicable CVs based on the specificity of the clinical patterns, ranging from shared between all three clinical domains to anxiety-specific.
Both v_{3|D} and v_{5|R} loaded > .20 on the irritability and anxiety domains, where close inspection suggested informant effects; v_{3|D} captured youth-reported whereas v_{5|R} loaded on parent-reported irritability and anxiety (Figure 3, Tables S7 and S12). Yet, connectivity patterns correlated across samples (corr[u_{5|R}, Y_R × a_{3|D}]: r = .12, p_uncorr = .0375, p_FWER = .7425; corr[u_{3|D}, Y_D × a_{5|R}]: r = .32, p_uncorr = .0001, p_FWER = .0007). Inspection of the connectivity loadings showed that in both samples this transdimensional phenotype was associated with edges interconnecting established resting-state networks within and across both hemispheres (Figure S18).

(C) Edges in red load strongly positively on u_{3|D} and u_{5|R}. Edges that load strongly negatively on u_{3|D} and u_{5|R} are depicted in blue. Given baseline differences in the strength of the connectivity patterns, connectivity maps were thresholded at |r| > 0.2 for the discovery sample and at |r| > 0.15 for the replication sample. Only edges that loaded highly positively or negatively in both datasets were retained for this figure. ADHD, attention-deficit/hyperactivity disorder; P, parent; Y, youth.

Interestingly, v_{2|D} was also robust against variations in input-to-participant ratios and showed substantial positive loadings > .20 on the same three parent-report items from the irritability domain ("Often loses temper," "Angry for a long time," "Loses temper easily") and one from the ADHD domain as v_{5|R} ("Talks excessively") (Figure 2 and Figure S12, Tables S7 and S12). Moreover, clinical loadings for CV_{2|D} and CV_{5|R} significantly correlated across cohorts (corr[v_{5|R}, X_R × b_{2|D}]: r = .25, p_uncorr = .0001, p_FWER = .0004; corr[v_{2|D}, X_D × b_{5|R}]: r = .45, p_uncorr = .0001, p_FWER = .0001). However, within each sample, connectivity patterns associated with the two latent phenotypes were different, although brain connectivity data informed latent clinical dimensions.
Across samples, CV_{4|D} and CV_{4|R} loaded > .20 on three items characterizing disruptive behavior from the ADHD domain ("Can't sit still," "Impulsive," "Loud") and one item from the domain of irritability ("Loses temper easily"). Furthermore, both CV_{4|D} and CV_{4|R} loaded negatively on one irritability item ("Angry most of the time") (Figure 4, Tables S7 and S12). Inspection of substantial edge loadings in both samples indicated strong representations in the variate of connections among nodes in motor, attention, default mode, and temporal-parietal networks (Figure S19).

DISCUSSION
Three key findings emerge from this study. First, analyses found seven CVs in a discovery dataset; four showed stability within the discovery dataset and replicability of clinical patterns in an independent sample; three CVs demonstrated at least weak signs of replicability for the associated rsfMRI connectivity patterns. This suggests the presence of meaningful relations between patterns of intrinsic brain connectivity and psychiatric symptom dimensions in youth. Second, the three most strongly replicable CVs from the discovery dataset varied in clinical specificity; one loaded on all three domains, the second captured shared aspects of irritability and ADHD, and the third loaded specifically on anxiety. Finally, CVs showed weak to modest associations, with multiple edges spanning widely distributed brain areas.
Pediatric psychopathology involves broadly correlated symptom dimensions (1-6). Dimensions of irritability, ADHD, and anxiety are particularly closely interrelated. Understanding of these cross-dimension relations may follow from research on shared and unique neural correlates. Past work in this area assessed symptom covariation independent of imaging data before then relating symptoms to task-based imaging patterns (1). CCA connects clinical and neural measures simultaneously to identify more complex relations (7,8). We used rating scales employed in the previous task-based fMRI research examining unique and shared dimensions of pediatric psychopathology (1). Using these measures, the current rsfMRI study identified two variates loading strongly on multiple clinical dimensions and a third loading strongly only on anxiety items. Thus, consistent with our hypotheses based on past studies, current findings demonstrate coexisting cross-dimensional and domain-specific neural correlates in treatment-seeking youth.
The detection of only anxiety-specific but not irritability- or ADHD-specific neural correlates in the current study could reflect many factors. These include differences between task-based and rsfMRI methods, differential sensitivity in CCA to particular domain-specific features, or biological features of anxiety that generate specific rsfMRI signatures. Additional imaging research might seek to refine clinical groupings based on replicable cross-study patterns for these and other interrelated dimensions.
Findings in the current and past CCA studies exhibited both similarities and differences. Cross-sample correlations for clinical loadings in the current study were notably similar in magnitude to those for variables involving emotion symptoms in the only other study of cross-domain pediatric psychopathology (9). Given differences across the two studies, such consistency speaks to the robust nature of pediatric emotional problem manifestations. The previous study also found strong cross-sample replicability for a pure externalizing factor, which did not emerge in the current study. Failure to detect this factor might reflect lesser diversity in targeted symptoms or larger proportions of treatment-seeking cases in the current study. Finally, unlike past research in treatment-seeking adults, the current study showed cross-sample replicability of latent clinical and connectivity patterns, a finding that might reflect age-related differences or distinct analytic approaches.
Interesting rsfMRI patterns manifested. Connectivity related to clinical dimensions was broadly distributed, involving hundreds of relatively weakly loading interhemispheric and within-hemisphere connections spanning distinct networks. Moreover, while within-sample stability was acceptable in the discovery sample, rsfMRI patterns minimally correlated across datasets. Interestingly, such weak replicability manifested alongside stronger replicability for clinical patterns, themselves defined by relations with rsfMRI. Replicable clinical patterns defined by less replicable rsfMRI patterns raise important questions for future studies. First, greater cross-sample differences existed for the fMRI than clinical assessments. Thus, whether homogeneous cross-sample imaging methods could generate improved rsfMRI replicability remains unclear. Second, replicable clinical patterns defined by minimally replicable fMRI patterns could arise from "many-to-one" mappings between neural and clinical variables. Such configurations commonly underlie brain-behavior relationships at many spatial scales. Thus, whether such "many-to-one" patterns also represent a common motif for mental disorders remains unclear.
From the clinical perspective, broadly distributed connectivity disturbances might require a diverse set of approaches to identify targets for novel interventions. Currently, therapies such as cognitive training or neural stimulation target functions in specific networks (29-31). However, at least for pediatric anxiety, irritability, and disruptive behavior, broadly distributed patterns may better represent the nature of connectivity disturbances during rest than patterns limited to particular networks. The focus on broad connectivity disturbances as opposed to particular networks might increase effect sizes of studies relating clinical domains to intrinsic brain connectivity.
Findings inform analytic decisions in future CCA studies. Different analyses within and across samples used different rsfMRI data, accounted for different amounts of overall rsfMRI variance, and yielded differences in CV structure. That input affects output is not unique to CCA. However, no ground truth informs selection of PCA-based or other input components for CCA. Thus, risk of overfitting is balanced against risk of omitting relevant variance through dimensionality reduction. Overfitting is reduced by ensuring proportionally more research participants than variables (32,33). However, particularly in moderately sized datasets, dimensionality reduction can exclude rsfMRI variance components that, even if small, powerfully relate clinical dimensions to connectivity patterns. Such factors create challenges that likely impact findings. The presence of modestly replicable clinical loadings across analyses in the current study suggests the promise of continued iterative work targeting these challenges.
One major limitation of the current study is the medium sample size of both cohorts. Also, differences in scanners, imaging-acquisition parameters, and data-quality indices introduced noise that decreased the probability of fully replicating findings across datasets. In effect, larger proportions of rsfMRI connectivity variance are explained in the smaller but homogeneous discovery sample, as evidenced by PCA. Furthermore, we did not include youth ratings of ADHD symptoms or ratings of depressive symptoms, another highly prevalent symptom dimension in youth.
Our findings implicate co-occurring transdimensional and anxiety-specific neural features in pediatric psychopathology. Results further suggest that pediatric clinical dimensions reflect widely distributed brain connectivity patterns. Thus, as with genetic correlates, neural correlates of some pediatric psychopathology dimensions may reflect hundreds of individually small associations.