Integrated Lipidomics and Proteomics Point to Early Blood-Based Changes in Childhood Preceding Later Development of Psychotic Experiences: Evidence From the Avon Longitudinal Study of Parents and Children

Background: The identification of early biomarkers of psychotic experiences (PEs) is of interest because early diagnosis and treatment of those at risk of future disorder is associated with improved outcomes. The current study investigated early lipidomic and coagulation pathway protein signatures of later PEs in subjects from the Avon Longitudinal Study of Parents and Children cohort.

Methods: Plasma samples from 115 children at 12 years of age (48 cases who were first identified as experiencing PEs at 18 years of age and 67 controls) were assessed through integrated and targeted lipidomics and semitargeted proteomics approaches. We assessed the lipids, lysophosphatidylcholines (n = 11) and phosphatidylcholines (n = 61), and the protein members of the coagulation pathway (n = 22) and integrated these data with complement pathway protein data already available on these subjects.

Results: Twelve phosphatidylcholines, four lysophosphatidylcholines, and the coagulation protein plasminogen were altered between the control and PEs groups after correction for multiple comparisons. Lipidomic and proteomic datasets were integrated into a multivariate network displaying a strong relationship between plasminogen and most of the lipids that were significantly associated with PEs. Finally, an unsupervised clustering approach identified four different clusters, one of which had the highest case-control ratio (p < .01) and was associated with a higher concentration of smaller low-density lipoprotein cholesterol particles.

Conclusions: Our findings indicate that the lipidome and proteome of subjects who report PEs at 18 years of age are already altered at 12 years of age, indicating that metabolic dysregulation may contribute to an early vulnerability to PEs and suggesting crosstalk between these lysophosphatidylcholines, phosphatidylcholines, and coagulation and complement proteins.


Supplemental Information

Lipidomic Analysis and Data Preprocessing
Lipidomic data were first processed using MZmine 2 (1), then normalized by lipid-class-specific internal standards and quantified using the inverse-weighted linear model. Signals of internal standards were identified from the standards runs. The internal MS library of retention times was adapted to the study by a linear correction based on the retention times observed in the standards runs, using the open-source R software (2).
The dataset was filtered so that a signal could be missing in at most 50% of the samples in each of the eight batches. Missing values that remained were then imputed feature-wise with half the minimum observed value. All lipids that were present in more than 75% of samples were considered for statistical analyses.
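The batch-wise missingness filter and half-the-minimum imputation described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the lipid labels, and the representation of missing signals as `None` are assumptions made purely for illustration.

```python
from typing import Dict, Hashable, List, Optional

# Assumed layout: lipid name -> one value per sample, None where the signal is missing.
Matrix = Dict[str, List[Optional[float]]]


def filter_by_batch_missingness(
    data: Matrix, batches: List[Hashable], max_missing: float = 0.5
) -> Matrix:
    """Keep lipids whose missing fraction is <= max_missing in every batch."""
    batch_ids = set(batches)
    kept: Matrix = {}
    for lipid, values in data.items():
        keep = True
        for batch in batch_ids:
            in_batch = [v for v, b in zip(values, batches) if b == batch]
            missing = sum(v is None for v in in_batch) / len(in_batch)
            if missing > max_missing:
                keep = False
                break
        if keep:
            kept[lipid] = values
    return kept


def impute_half_minimum(data: Matrix) -> Dict[str, List[float]]:
    """Replace each missing value with half of that lipid's minimum observed value."""
    imputed: Dict[str, List[float]] = {}
    for lipid, values in data.items():
        observed = [v for v in values if v is not None]
        if not observed:
            continue  # nothing to impute from; such a lipid cannot pass the filter anyway
        fill = min(observed) / 2.0
        imputed[lipid] = [fill if v is None else v for v in values]
    return imputed


# Hypothetical toy data: four samples split across two batches.
example = {
    "lipid_A": [10.0, None, 8.0, 12.0],
    "lipid_B": [None, None, 5.0, 6.0],
}
example_batches = [1, 1, 2, 2]
cleaned = impute_half_minimum(filter_by_batch_missingness(example, example_batches))
```

In this toy case `lipid_B` is dropped because it is entirely missing in batch 1, while the single gap in `lipid_A` is filled with half of its smallest observed value.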

High-Abundance Protein Depletion of Human Plasma Samples
To improve the dynamic range for proteomic analysis, 40 µL of plasma from each case was immunodepleted to remove the 14 most abundant proteins (Alpha-

Sample Preparation for Mass Spectrometry
Protein digestion and peptide purification were performed as previously described (5

Targeted Confirmation of Protein Biomarkers Using Data Independent Acquisition (DIA)
The DIA isolation scheme and multiplexing strategy were based on those of Egertson et al. (2013), in which five 4-m/z isolation windows are analysed per scan (6,7). DIA overcomes many of the limitations of untargeted proteomics, such as missing values (8,9). Samples were run on a Thermo Scientific Q Exactive mass spectrometer in DIA mode. Each DIA cycle contained one full MS-SIM scan and 20 DIA scans covering a mass range of 490-910 Th with the following settings: SIM full scan at a resolution of 35,000, AGC target 1e6, Max IT 55 ms, profile mode; DIA scans at a resolution of 17,000, AGC target 1e5, Max IT 20 ms, loop count 10, MSX count 5, 4.0 m/z isolation windows, centroid mode (6). The cycle time was 2 s, which resulted in at least ten scans across the precursor peak. For DIA library generation, QC samples were injected in DDA mode (10) at the beginning of the run and after every ten injections throughout the run. The relative fragment-ion intensities, peptide-precursor isotope peaks, and retention times of the extracted ion chromatograms from the DIA files were used to confirm the identity of the target molecular species (6,7).
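The relationship between cycle time and chromatographic sampling stated above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text (2 s cycle time, ten scans per peak, 20 DIA scans, MSX count 5); the function names and the 30 s example peak width are illustrative assumptions, not instrument or vendor parameters.

```python
import math


def scans_across_peak(peak_width_s: float, cycle_time_s: float) -> int:
    """Number of complete acquisition cycles that fit within one chromatographic peak."""
    return math.floor(peak_width_s / cycle_time_s)


def min_peak_width_s(min_scans: int, cycle_time_s: float) -> float:
    """Narrowest peak (in seconds) that is still sampled min_scans times."""
    return min_scans * cycle_time_s


def isolation_windows_per_cycle(dia_scans: int, msx_count: int) -> int:
    """Isolation windows analysed per cycle when each DIA scan multiplexes msx_count windows."""
    return dia_scans * msx_count


# Figures taken from the text: 2 s cycle, ten scans per peak, 20 DIA scans, MSX count 5.
narrowest_peak = min_peak_width_s(10, 2.0)
windows = isolation_windows_per_cycle(20, 5)
# Hypothetical 30 s wide peak, for illustration only.
samples_on_30s_peak = scans_across_peak(30.0, 2.0)
```

Under these assumptions, ten scans per peak at a 2 s cycle time corresponds to peaks at least 20 s wide at the base.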

Preprocessing
For DDA label-free quantification (LFQ), the human FASTA sequence database was searched with MaxQuant (v1.5.2.8) (11,12), as described (5). False discovery rates (FDRs) were set to 1% at the peptide and protein level, and only proteins with at least two peptides (one uniquely assignable to the protein) were considered reliably identified. LFQ intensity values were used for protein quantification between groups. Only proteins present in >80% of samples in at least one group were taken forward for quantification, and the filtered data were normalised by subtracting the median intensity for each protein. (7). For our dataset, the m/z tolerance was < 10 ppm and the average retention time window was 2 minutes. All parent- and fragment-level data were visually confirmed across the samples run, and peak editing was undertaken where necessary, using the peptide retention time (RT), dot product (idotp), mass accuracy (< 10 ppm), and a confirmed library match to reliably identify and quantify peptides across the DIA runs. For statistical analysis, fragment-level peak areas were filtered from the Skyline document grid for analysis in mapDIA, an open-source bioinformatics tool for pre-processing and quantitative analysis of DIA data (13). A Total Ion Sum (TIS) intensity normalisation procedure was applied, followed by peptide fragment selection using a two-standard-deviation threshold for outlier detection, in the independent sample setup.

Supplementary Table S1. Clusters D and A were composed of 70.6% and 32.6% PE cases, respectively, while clusters B and C were composed of 28.5% and 19.23% PE cases, respectively (chi-square p = 0.007).
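The DDA filtering and normalisation step described under Preprocessing (proteins present in >80% of samples in at least one group, followed by subtraction of each protein's median intensity) can be sketched as follows. This is an illustrative sketch rather than the authors' code: it assumes intensities are already on a log scale, represents missing values as `None`, and uses hypothetical protein labels and function names.

```python
from statistics import median
from typing import Dict, List, Optional

Values = List[Optional[float]]  # one (assumed log-scale) LFQ intensity per sample


def passes_presence_filter(
    values: Values, groups: List[str], min_fraction: float = 0.8
) -> bool:
    """True if the protein is quantified in more than min_fraction of samples in any group."""
    for group in set(groups):
        in_group = [v for v, g in zip(values, groups) if g == group]
        present = sum(v is not None for v in in_group) / len(in_group)
        if present > min_fraction:
            return True
    return False


def median_centre(values: Values) -> Values:
    """Subtract the protein's median observed intensity; missing values stay missing."""
    observed = [v for v in values if v is not None]
    centre = median(observed)
    return [None if v is None else v - centre for v in values]


def preprocess(
    data: Dict[str, Values], groups: List[str], min_fraction: float = 0.8
) -> Dict[str, Values]:
    """Apply the presence filter, then median-centre each retained protein."""
    return {
        protein: median_centre(values)
        for protein, values in data.items()
        if passes_presence_filter(values, groups, min_fraction)
    }


# Hypothetical toy data: five cases followed by five controls.
toy_groups = ["case"] * 5 + ["control"] * 5
toy_data = {
    "protein_1": [20.0, 21.0, 22.0, 23.0, 24.0, None, None, 19.0, 20.0, None],
    "protein_2": [1.0, None, None, 3.0, 4.0, None, 2.0, 3.0, 4.0, 5.0],
}
normalised = preprocess(toy_data, toy_groups)
```

Here `protein_1` is retained because it is quantified in all cases, whereas `protein_2` is dropped because neither group exceeds the 80% presence threshold.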