Address correspondence to: Ariel F. Martinez, M.S., National Human Genome Research Institute, National Institutes of Health, 35 Convent Drive, Room 1B209, Bethesda, MD 20892-3717.
Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland; Genome Biology Department, John Curtin School of Medical Research, The Australian National University, Canberra, Australia
Genetic factors predispose individuals to attention-deficit/hyperactivity disorder (ADHD). Previous studies have reported linkage and association to ADHD of gene variants within ADGRL3. In this study, we functionally analyzed noncoding variants in this gene as likely pathological contributors.
Methods
In silico, in vitro, and in vivo approaches were used to identify and characterize evolutionary conserved elements within the ADGRL3 linkage region (~207 Kb). Family-based genetic analyses of 838 individuals (372 affected and 466 unaffected) identified ADHD-associated single nucleotide polymorphisms harbored in some of these conserved elements. Luciferase assays and zebrafish green fluorescent protein transgenesis tested conserved elements for transcriptional enhancer activity. Electromobility shift assays were used to verify transcription factor–binding disruption by ADHD risk alleles.
Results
An ultraconserved element was discovered (evolutionary conserved region 47) that functions as a transcriptional enhancer. A three-variant ADHD risk haplotype in evolutionary conserved region 47, formed by rs17226398, rs56038622, and rs2271338, reduced enhancer activity by 40% in neuroblastoma and astrocytoma cells (pBonferroni < .0001). This enhancer also drove green fluorescent protein expression in the zebrafish brain in a tissue-specific manner, sharing aspects of endogenous ADGRL3 expression. The rs2271338 risk allele disrupts binding of YY1 transcription factor, an important factor in the development and function of the central nervous system. Expression quantitative trait loci analysis of postmortem human brain tissues revealed an association between rs2271338 and reduced ADGRL3 expression in the thalamus.
Conclusions
These results uncover the first functional evidence of common noncoding variants with potential implications for the pathology of ADHD.
Attention-deficit/hyperactivity disorder (ADHD) is a complex heritable trait that affects more children and adolescents than any other psychiatric disorder. Approximately 5.3% of the world’s population is estimated to be affected. ADHD increases the risk of disruptive symptoms, substance use, legal problems, and underemployment, which reduces the quality of life of ADHD sufferers and their families. Candidate gene and genome-wide studies of single nucleotide polymorphism (SNP) and copy number variants have identified a number of ADHD susceptibility loci, but few molecular studies have attempted to elucidate the functional effects of risk variants.
Common genetic variants within the adhesion G protein–coupled receptor L3 gene (ADGRL3, also known as latrophilin 3 or LPHN3) in 4q13.2 are strongly linked and associated with ADHD. The ADHD-linked region in ADGRL3 (hereafter referred to as the minimal critical region [MCR]) spans exons 4 through 19 of the gene (~207 Kb). Searching for variants that might affect ADGRL3 protein function, Domene et al. sequenced the entire coding region of the gene in a small cohort of subjects with ADHD, but no missense coding changes or canonical splice site alterations were found to be associated with the disorder. This suggests that intronic noncoding variants are the likely pathological contributors.
In the present study, we interrogated the ADGRL3 MCR, aiming to identify transcriptional enhancer elements with potential functional implications for ADHD.
Methods and Materials
Subjects
Individuals with and without ADHD were ascertained from the metropolitan area of Medellin (Antioquia, Colombia). The Paisa community is considered a genetic isolate of Caucasian descent with low admixture with Amerindian and Negroid ethnicities. The cohort consisted of 14 multigenerational families and 125 nuclear families for a total of 838 individuals (372 affected and 466 unaffected individuals; 335 children and adolescents [3–16 years of age] and 503 adults [≥17 years of age]). The multigenerational families had an average size of 28 members (range, 9–57 family members) and an average of 2.93 generations (range, 2–4 generations). Full details of the clinical, demographic, and genetic ascertainment features and the methodology of neuropsychological evaluation have been published elsewhere. The study was reviewed and approved by the Institutional Review Board of the National Human Genome Research Institute (Protocol 00-HG-0058) and the University of Antioquia Ethics Committee (Protocol 11-13-342).
Genetic Statistical Analyses
SNP genotyping methods are presented in Supplement 1. Genotype data were imported into Golden Helix SVS software (version 8.3.1; Golden Helix, Bozeman, MT) for family-based association testing. Genotype and allelic frequencies were estimated by maximum likelihood. Family-based association testing analyses using ADHD status as a categorical variable were applied to the entire set of markers that passed quality control. We also performed family-based haplotype analyses to compare with the results at the marker level. Comorbid conditions (e.g., substance use and disruptive symptoms) and neuropsychological endophenotypes (e.g., Wechsler Intelligence Scale for Children [WISC] block design, WISC performance intelligence quotient [PIQ], WISC full-scale intelligence quotient [FSIQ], “A”-cancellation and vigilance test correct responses [ACVTCR] and omissions [ACVTO], and the Rey–Osterrieth Complex Figure Test [ROCFT copy]) were used as ADHD-interacting variables. We used endophenotype data only for 255 children and adolescents (170 affected and 85 unaffected individuals) with an FSIQ ≥ 81 and regular school grades adequate for their age in order to exclude participants with potential learning disabilities.
Because the ADGRL3 genomic region under study is known to be linked to ADHD, the null hypothesis of linkage and no association was tested. Individual genotypes inconsistent with Mendelian transmission were excluded from the analyses. All markers were tested for deviations from the Hardy-Weinberg equilibrium. Allelic tests of association were applied using a dominant model of inheritance.
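The Hardy-Weinberg check mentioned above can be illustrated with a minimal sketch. It assumes a simple one-degree-of-freedom chi-square test on genotype counts from unrelated individuals; it is not the procedure implemented in Golden Helix SVS, and the function name and counts are hypothetical.

```python
import math

def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium at a biallelic SNP.

    Arguments are observed genotype counts (AA, AB, BB). Relatedness between
    individuals is ignored, which is an assumption of this sketch.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: a marker whose genotypes match HWE expectations exactly.
chi2, p_value = hardy_weinberg_chi2(25, 50, 25)
```

A large p value, as in this hypothetical example, gives no evidence of departure from equilibrium.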
Identification of Evolutionary Conserved Elements Within the ADGRL3 MCR and Prediction of Regulatory Function
We used the Evolutionary Conserved Region (ECR) Browser to identify highly conserved elements within the ADGRL3 MCR (accession no. NG_033950.2, GRCh37.p13/hg19 assembly; coordinates, chr4:62,688,419-62,895,842). We looked at DNA sequence conservation across several vertebrate species, including human, mouse, chicken, and zebrafish, using a sliding window >150 base pairs long and >70% identity. To predict transcriptional regulatory function, we gathered annotation data from three regulation tracks of the University of California, Santa Cruz (UCSC) table browser: chromatin state segmentation by a hidden Markov model (Broad ChromHMM). Additional information on these annotation tracks can be found in Supplement 1.
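The sliding-window conservation criterion can be sketched as follows. This is an illustration under simplifying assumptions (a pre-computed pairwise alignment of equal-length strings, a fixed window, and a hypothetical function name), not the algorithm used by the ECR Browser; the default window length and identity cutoff only echo the thresholds quoted above.

```python
def conserved_windows(seq_a, seq_b, window=150, min_identity=0.70):
    """Return merged (start, end) half-open intervals of alignment columns
    covered by at least one window whose identity is >= min_identity.

    seq_a and seq_b are two rows of a pairwise alignment ('-' marks a gap).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where both rows carry the same base, 0 for mismatches and gaps.
    matches = [1 if a == b and a != "-" else 0
               for a, b in zip(seq_a.upper(), seq_b.upper())]
    n = len(matches)
    if n < window:
        return []
    regions = []
    count = sum(matches[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one column: add the entering column, drop the leaving one.
            count += matches[start + window - 1] - matches[start - 1]
        if count / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)   # extend the open region
            else:
                regions.append((start, end))
    return regions

# Toy alignment: 20 identical columns followed by 20 mismatched columns.
toy = conserved_windows("A" * 20 + "C" * 20, "A" * 20 + "G" * 20,
                        window=10, min_identity=0.70)
```

Overlapping windows that pass the cutoff are merged, so each reported interval corresponds to one candidate conserved element.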
Luciferase Assays
The generation of ECR-luciferase constructs, cell culture conditions, and transient transfections are described in Supplement 1. After 24 hours of transfection, culture supernatants containing secreted luciferases were collected. Luciferase assays were performed using the Pierce Luciferase Flash Assay kits (Thermo Fisher Scientific, Waltham, MA). Each ECR-luciferase construct was assayed in triplicate, and each transfection experiment was repeated at least three times (n = 9). Luciferase activity data were analyzed using a one-way analysis of variance with Bonferroni correction, as implemented in Prism 6 software (GraphPad Software, La Jolla, CA).
Electromobility Shift Assays
Electromobility shift assays were performed using standard procedures. ECR47 DNA probes were incubated with Myc-DDK-tagged recombinant TFs or whole-cell expression lysates (Origene, Rockville, MD). For supershift reactions, an anti-DDK (FLAG) monoclonal antibody (Origene) was added.
Zebrafish Bioassays
Zebrafish stocks and manipulations followed the animal care and use protocols used in our Zebrafish Core facility (National Institutes of Health/National Human Genome Research Institute). Zebrafish transgenesis and in situ hybridization were performed following standard procedures (see Supplement 1 for details).
Results
Identification of Potential Enhancer Elements Within ADGRL3
Using the ECR Browser, we identified highly conserved elements harbored in the ADGRL3 MCR. Although it is currently well established that the intergenome comparisons of distant species (e.g., humans and fish) are powerful in identifying critical distant regulatory elements, only 5% of the genes in the human genome contain a human/fugu noncoding ECR in their genomic neighborhood. For that reason, an analysis with species more closely related to humans than fish is required to identify regulatory elements for many human genes. We therefore used the human/chicken alignment to select candidate sequences. The human/opossum alignment (an evolutionarily closer species) was arbitrarily used to give an identity to the conserved elements, thereby naming 51 regions ECR1 through ECR51 (Figure S1 in Supplement 1 and Table S1 in Supplement 2).
Given that enhancer activity has been correlated with certain properties of chromatin, we gathered p300 binding site, histone mark, and DNase I hypersensitivity site annotation data to predict ECR enhancer function (Table S1 in Supplement 2). From the list of elements conserved in chicken, we excluded those ECRs not predicted to be functional by any of the annotation tracks above and that did not contain SNPs with a minor allele frequency > 1%, because genetic studies predominantly suggest that ADHD risk is primarily explained by common variants. Following this approach, the number of candidate ECRs was reduced to 10 (i.e., ECR1, ECR2, ECR4, ECR9, ECR20, ECR22, ECR26, ECR37, ECR46, and ECR47). Because coding regions can overlap enhancer sequences, we did not exclude those elements containing ADGRL3 exons (i.e., ECR20, ECR26, ECR37, and ECR47).
Genetic Statistical Analyses of ECR Variants
Table S2 in Supplement 1 shows a list of the common variants harbored in ECR sequences that are predicted to be functional. We performed family-based association tests in 838 individuals from the same Colombian multigenerational and nuclear families that previously showed linkage and association of ADHD to 4q13.2. Genotype proportions did not significantly depart from the Hardy-Weinberg equilibrium. SNPs in 6 of the 10 ECRs showed significant association with ADHD, comorbid disorders, or endophenotypes after correction by false discovery rate (FDR) (Table 1). ECR46, ECR47, and ECR4 showed the highest association with ADHD (all five variants, pFDR = .00219), followed by ECR26 (both pFDR = .00342), ECR2 (variant rs1868790, pFDR = .00988), and ECR37 (variant rs1397548, pFDR = .03173). Interestingly, only variants in ECR46 (rs11131352) and ECR47 (rs17226398, rs56038622, and rs2271338) consistently showed association with disruptive symptoms (i.e., oppositional defiant disorder [all pFDR = .00005] and conduct disorder [all pFDR = .00076]; substance use of alcohol [all pFDR = .00199] and nicotine [all pFDR = .00203]; and endophenotypes ACVTCR [all pFDR = .00098], ACVTO [all pFDR = .00098], and ROCFT [all pFDR = .02775]). Variants in ECR1, ECR9, and ECR22 did not show associations with ADHD, comorbid conditions, or endophenotypes and therefore were not considered for functional analysis. None of the variants showed association with WISC block design, WISC PIQ, or WISC FSIQ. Haplotype analyses for ADHD affection status are presented in Table S3 in Supplement 1.
Table 1. Association Between Evolutionary Conserved Region Common Variants and Attention-Deficit/Hyperactivity Disorder, Disruptive Symptoms, Substance Use, and Neuropsychological Endophenotypes in the Paisa Dataset

| ECR ID | Variant | Allele (Frequency) | ADHD p (Raw) | ADHD p (FDR) | ODD p (Raw) | ODD p (FDR) | CD p (Raw) | CD p (FDR) | Nicotine p (Raw) | Nicotine p (FDR) | Alcohol p (Raw) | Alcohol p (FDR) | ACVTCR p (Raw) | ACVTCR p (FDR) | ACVTO p (Raw) | ACVTO p (FDR) | ROCFT p (Raw) | ROCFT p (FDR) | Function |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ECR46 | rs11131352 | T (0.132) | .00073 | .00219 | .00002 | .00005 | .00020 | .00076 | .00054 | .00203 | .00066 | .00199 | .00046 | .00098 | .00046 | .00098 | .01295 | .02775 | Intronic |
| ECR47 | rs17226398 | C (0.132) | .00073 | .00219 | .00002 | .00005 | .00020 | .00076 | .00054 | .00203 | .00066 | .00199 | .00046 | .00098 | .00046 | .00098 | .01295 | .02775 | Intronic |
| | rs56038622 | T (0.132) | .00073 | .00219 | .00002 | .00005 | .00020 | .00076 | .00054 | .00203 | .00066 | .00199 | .00046 | .00098 | .00046 | .00098 | .01295 | .02775 | Intronic |
| | rs2271338 | A (0.132) | .00073 | .00219 | .00002 | .00005 | .00020 | .00076 | .00054 | .00203 | .00066 | .00199 | .00046 | .00098 | .00046 | .00098 | .01295 | .02775 | Intronic |
| ECR4 | rs10021694 | T (0.313) | .00063 | .00219 | .03921 | NS | .03274 | NS | .03740 | NS | .03092 | NS | .00003 | .00016 | .00003 | .00017 | .00015 | .00228 | Intronic |
| ECR26 | rs734644 | T (0.281) | .00155 | .00342 | .000003 | .00003 | .02510 | NS | .02262 | .04241 | .00119 | .00255 | .00201 | .00376 | .00199 | .00373 | NS | NS | Coding-syn |
| | rs2305339 | G (0.278) | .00160 | .00342 | .000003 | .00003 | .02641 | NS | .02223 | .04241 | .00082 | .00204 | .00552 | .00920 | .00548 | .00913 | NS | NS | Intronic |
| ECR2 | rs1868790 | A (0.427) | .00527 | .00988 | .02291 | .03819 | .01442 | .04327 | .02062 | .04241 | .02062 | .03866 | NS | NS | NS | NS | NS | NS | Intronic |
| | rs73823249 | G (0.064) | NS | NS | NS | NS | NS | NS | .02191 | .04241 | .000001 | .00001 | NS | NS | NS | NS | .00186 | .00928 | Intronic |
| ECR37 | rs1397548 | A (0.308) | .01904 | .03173 | .00272 | .00582 | NS | NS | NS | NS | NS | NS | .00002 | .00013 | .00002 | .00013 | .02180 | .04088 | Coding-syn |
| | rs1397547 | G (0.059) | NS | NS | .01799 | .03372 | .02892 | NS | .04169 | NS | .04748 | NS | .000004 | .00006 | .000004 | .00006 | .00035 | .00261 | Coding-syn |

ACVTCR, “A”-cancellation and vigilance test correct responses; ACVTO, “A”-cancellation and vigilance test omissions; ADHD, attention-deficit/hyperactivity disorder; CD, conduct disorder; ECR, evolutionary conserved region; FDR, Benjamini–Hochberg false discovery rate; NS, not significant; ODD, oppositional defiant disorder; ROCFT, Rey–Osterrieth complex figure test; syn, synonymous.
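The FDR values in Table 1 are Benjamini–Hochberg adjusted p values. The sketch below shows the standard step-up adjustment for a list of raw p values. It is an illustration only: the function name and input values are hypothetical, and because the full set of tests corrected together is not given here, it is not expected to reproduce the table entries.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p values for four markers.
example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each adjusted value is the smallest FDR level at which the corresponding test would be declared significant.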
Given the identical p values for some single-variant associations and the physical proximity of variants, we calculated the linkage disequilibrium (LD) between SNPs. We used phased genotype data from the 1000 Genomes Project for the Northern Europeans from Utah population. For our dataset, we performed an LD pairwise analysis using Golden Helix SVS software. Variants rs2305339 and rs734644 in ECR26 were in complete LD (r2 = .99, D′ = 1.00, haplotypes AC/GT [protective/risk]) as were variants rs11131352 (i.e., ECR46), rs17226398, rs56038622, and rs2271338 (i.e., ECR47) (r2 = 1.00, D′ = 1.00, haplotypes AGAG/TCTA [protective/risk]) (Table S4 in Supplement 1). For initial evaluation in this study, risk alleles in complete LD were tested in luciferase assays as a haplotype rather than individually and were compared to the protective haplotype.
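For readers unfamiliar with the two LD statistics reported here, the following minimal sketch computes r² and D′ for a pair of biallelic SNPs from haplotype and allele frequencies. It assumes phased data and uses the textbook definitions; it is not the Golden Helix SVS implementation, and the function name and example frequencies are illustrative.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (r2, d_prime) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b                       # raw disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    r2 = d * d / denom
    # D' scales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime

# Illustrative case: two risk alleles, each at frequency 0.132, that always
# occur on the same haplotype.
r2, d_prime = pairwise_ld(0.132, 0.132, 0.132)
```

When two alleles of equal frequency always travel together, both statistics reach their maximum of 1, which is the pattern described for the ECR46/ECR47 variants.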
ECR46 and ECR47 appear as two independent conserved elements in chickens, but they are part of a single “core” ECR (≥350 base pairs long, ≥77% identity) in species that are evolutionarily closer to humans, such as opossum, mouse, and chimpanzee. For that reason, these two sequences were evaluated in luciferase assays independently and together.
ECR47 Functions as a Brain-Specific Enhancer
We tested the ADHD-associated ECRs for enhancer activity using a dual secreted luciferase reporter assay in four different cell lines (Figure 1). We compared luciferase activity of ECRs containing the protective versus the risk allele/haplotype. As described above, ECR26 and ECR47 contain markers in complete LD; we therefore compared haplotypes rather than pairwise combinations of alleles. ECR37 and ECR47 were the only elements to stimulate luciferase expression. ECR37 showed weak enhancer activity in all cell lines; however, its ADHD-associated variant rs1397548 did not affect enhancer function. ECR47 also showed weak enhancer activity, but only in B35 neuroblastoma and U87 astrocytoma cells, suggesting tissue-specific activity. Unlike ECR37, the ECR47 risk haplotype (CTA) reduced luciferase activity by approximately 40% in both cell lines (pBonferroni < .0001). ECR2, ECR4, ECR26, and ECR46 showed no stimulation of luciferase activity in any of the cell lines. The activity of core element ECR46/47 was similar to that of ECR47 alone (Figure S2 in Supplement 1), indicating that regulatory activity resides on the ECR47 moiety.
Figure 1. Secreted luciferase assays testing attention-deficit/hyperactivity disorder–associated evolutionary conserved region (ECR) sequences for enhancer activity. Four different cell lines were transfected with the ECR-luciferase constructs for 24 hours. Luciferase activity was normalized against a constitutive Gaussia luciferase plasmid and expressed as relative luciferase activity (Cypridina/Gaussia ratio). Results are presented as the stimulation of luciferase activity above the basal, enhancer-less vector containing only the minimal promoter. Letters in parentheses indicate the alleles or haplotypes being tested. ECR37 and ECR47 showed weak enhancer activity compared to the strong control enhancers Simian vacuolating virus 40 (SV40) and human Bicore1. ECR47 risk haplotype CTA decreased enhancer activity in B35 neuroblastoma and U87 astrocytoma cells (pBonferroni < .0001). Statistical differences were defined using one-way analysis of variance with correction for multiple comparisons. CHO-K1, Chinese hamster ovary K1 cells; P19, mouse embryonic carcinoma cells.
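The normalization described in the Figure 1 legend can be sketched in a few lines. This is an illustration with hypothetical readings and function names; how replicates were pooled and how the percentage reduction was derived in the study is an assumption here.

```python
from statistics import mean

def relative_activity(cypridina, gaussia):
    """Per-well Cypridina/Gaussia ratio (reporter signal over constitutive control)."""
    return [c / g for c, g in zip(cypridina, gaussia)]

def fold_over_basal(construct_ratios, basal_ratios):
    """Mean construct ratio divided by the mean ratio of the enhancer-less vector."""
    return mean(construct_ratios) / mean(basal_ratios)

def percent_reduction(protective_fold, risk_fold):
    """Reduction of the risk haplotype relative to the protective haplotype, in percent."""
    return 100.0 * (protective_fold - risk_fold) / protective_fold

# Hypothetical luminescence readings (arbitrary units), three wells per construct.
basal = relative_activity([100, 110, 90], [100, 100, 100])
protective = relative_activity([250, 260, 240], [100, 100, 100])
risk = relative_activity([150, 160, 140], [100, 100, 100])

protective_fold = fold_over_basal(protective, basal)
risk_fold = fold_over_basal(risk, basal)
reduction = percent_reduction(protective_fold, risk_fold)
```

Dividing by the co-transfected Gaussia signal corrects for well-to-well differences in transfection efficiency before constructs are compared.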
Subsequently, we evaluated the ability of the human ECR47 sequence to drive green fluorescent protein (GFP) reporter expression in stable transgenic zebrafish lines. A clear GFP signal was detected in the zebrafish brain that was restricted to the ventral forebrain, the midbrain, and the hindbrain of embryos 28 to 30 hours postfertilization (Figure 2A). In contrast with the weak enhancer activity detected in luciferase assays, a strong GFP signal was observed in zebrafish. This may be explained by the fact that multicopy integration events can occur during transgenesis or that ECR47 may function as a developmental enhancer, therefore behaving differently during embryogenesis compared to differentiated cells in culture. Interestingly, the ECR47-driven GFP expression pattern shared specific aspects of endogenous adgrl3.1 expression in forebrain, midbrain, and hindbrain, but not in telencephalon and retina, as evaluated by in situ hybridization (Figure 2B, C). Because the zebrafish enhancer detection vector system is not robust enough to allow for quantitative measurement of enhancer activity in vivo, we did not investigate the effect of the ECR47 risk haplotype (CTA) on enhancer function in the zebrafish. A number of variables make it challenging to compare ECR47 risk and protective haplotypes quantitatively using the zebrafish enhancer detection system, including 1) different copy number integration during zebrafish transgenesis; 2) the structural complexity of transgene integration loci in the genome (heterochromatin vs. euchromatin); 3) DNA methylation effects; and 4) zebrafish tissue autofluorescence and nonspecific, background GFP expression.
Figure 2. In vivo enhancer testing and correlation with adgrl3.1 expression in the zebrafish. Evolutionary conserved region 47–driven green fluorescent protein (GFP) expression was monitored in the brain of transgenic embryos at 28 to 30 hours postfertilization (hpf). (A) Stable transgenic F2 embryo showing GFP expression restricted to the central nervous system (i.e., the forebrain, midbrain, and hindbrain). (B, C) Expression of adgrl3.1 was detected by in situ hybridization of embryos at (B) 36 hpf and (C) 48 hpf (top row, left midsagittal and right retina in focus). Evolutionary conserved region 47–driven GFP expression represents several specific aspects of endogenous adgrl3.1, consistent with the location of neuronal tissue in the developing brain. Endogenous adgrl3.1 messenger RNA expression is detected in the telencephalon, the ventral forebrain, the midbrain, the hindbrain, and the retina throughout the analyzed developmental stages. The anterior is to the left in all images, and the dorsal side is up in all lateral views [scale bar = 100 µm; all images in part (C) are in the same scale]. FB, forebrain; HB, hindbrain; MB, midbrain; MHB, midbrain–hindbrain boundary; r, rostral; v, ventral.
It is important to highlight that ECR47 is also conserved in zebrafish (Figure S3 in Supplement 1), which makes it an ultraconserved element that is likely to have an important biological function.
TFs Preferentially Associated With Brain Function Are Overrepresented in ECR47
Analysis of the TF–binding profiles of ECR47 and 144 in vivo tested brain enhancers from the VISTA Enhancer Browser (Table S5 in Supplement 3) revealed a significant overrepresentation of developmental TF families preferentially associated with brain tissue, such as distal-less homeodomain (V$DLXF), NK6 homeobox (V$NKX6), paired box (PAX) homeodomain (V$PAXH), and Brn-5 POU (V$BRN5) TFs, as suggested by the high Z scores (Table S6 in Supplement 1). Homeobox TFs (V$HBOX), Brn POU domain (V$BRNF), cocaine- and amphetamine-regulated transcript 1 (V$CART), and Lim homeodomain (V$LHXF) factors were also overrepresented, but while they may participate in neural development and function, they are not preferentially associated with the central nervous system (CNS).
The potential effects of ECR47 SNP allele substitutions on TF binding were examined using SNPInspector software (Genomatix, Munich, Germany). All three ADHD risk alleles (CTA) were predicted to produce loss or gain of binding sites for important neurodevelopmental TFs, such as grainyhead-like transcription factor 1 (GRHL1); PAX2; PAX3; YY1 transcription factor; hypoxia responsive elements; Tax-1/cyclic AMP-responsive element-binding protein 1 (CREB) complex; LIM homeobox 3; POU domain, class 3, transcription factor 2 (POU3F2); POU domain, class 4, transcription factor 3 (POU4F3); POU domain, class 6, transcription factor 2 (POU6F2); pituitary-specific positive transcription factor 1 (PIT1); and cone–rod homeobox protein (CRX) (Table 2).
Table 2. Transcription Factor–Binding Sites Predicted to be Affected by the Attention-Deficit/Hyperactivity Disorder–Associated Haplotype in Evolutionary Conserved Region 47

| SNP | Protective Allele | Risk Allele | Change | TF Site | Matrix | Description | Optimized Threshold (a) | Strand | Core Similarity (b) | Matrix Similarity (c) |
|---|---|---|---|---|---|---|---|---|---|---|
| rs17226398 | G | C | G > C | Gained | V$PAX3/PAX3.02 | Pax-3 paired domain protein | 0.85 | + | 1 | 0.893 |
| | | | | Gained | V$HIFF/HRE.02 | Hypoxia-responsive element | 0.97 | – | 1 | 0.978 |
| | | | | Lost | V$GRHL/GRHL1.01 | Grainyhead-like 1 | 0.86 | + | 1 | 0.864 |
| | | | | Lost | V$CP2F/TCFCP2L1.01 | Transcription factor CP2-like 1 | 0.87 | + | 0.815 | 0.894 |
| rs56038622 | A | T | A > T | Gained | V$CREB/TAXCREB.02 | Tax/CREB complex | 0.71 | + | 0.75 | 0.739 |
| rs2271338 | G | A | G > A | Gained | V$LHXF/LHX3.02 | LIM-homeodomain 3 | 0.82 | + | 1 | 0.84 |
| | | | | Gained | V$BRNF/BRN3.01 | Brn-3, POU-IV protein class | 0.78 | – | 0.75 | 0.815 |
| | | | | Gained | V$BRN5/POU6F2.01 | Retina-derived POU-domain factor 1, dimeric | 0.76 | + | 0.81 | 0.779 |
| | | | | Gained | V$BRNF/BRN2.04 | POU class 3 homeobox 2 (POU3F2) | 0.82 | + | 1 | 0.827 |
| | | | | Gained | V$OCT1/OCT1.03 | Octamer-binding protein 1 (POU2F1) | 0.85 | + | 0.767 | 0.854 |
| | | | | Gained | V$PIT1/PIT1.02 | Pituitary transcription factor 1 (POU1F1) | 0.81 | + | 1 | 0.821 |
| | | | | Gained | V$HNF6/OC2.01 | One CUT-homeodomain protein | 0.82 | + | 1 | 0.876 |
| | | | | Lost | V$PAX2/PAX2.01 | Zebrafish PAX2 paired domain protein | 0.78 | + | 0.789 | 0.784 |
| | | | | Lost | V$YY1F/YY1.02 | Yin and Yang 1 repressor | 0.94 | – | 1 | 0.979 |
| | | | | Lost | V$BCDF/CRX.01 | Cone–rod homeobox protein | 0.94 | + | 1 | 0.946 |

Searches were performed using the SNPInspector function within the Genomatix Software Suite (Genomatix, Munich, Germany). The analyses are based on the MatInspector and Genomatix libraries of matrix descriptions for transcription factor–binding sites (MatBase).
SNP, single nucleotide polymorphism; TF, transcription factor.
a At the matrix similarity and optimized threshold, a minimum number of matches is found in nonregulatory test sequences (i.e., with this matrix similarity, the number of false positive matches is minimized).
b Core similarity refers to the degree of matching to the “core sequence” of a matrix (i.e., the highest conserved positions of the matrix, usually 4 bases). Maximum core similarity (1.0) is reached when the highest conserved bases of a matrix match exactly in the sequence.
c Matrix similarity refers to the degree of matching to each sequence position in the matrix. A perfect match to the highest conserved nucleotide at each position gets a score of 1.00. A “good” match usually has a similarity of > 0.80. Mismatches in highly conserved positions of the matrix decrease the matrix similarity more than mismatches in less conserved regions.
YY1 Binding to ECR47 Is Disrupted by the rs2271338 ADHD Risk Allele
The binding sites of three important neurodevelopmental TFs were predicted to be disrupted by ECR47 risk allele substitutions. PAX2 and YY1 were predicted to be disrupted by rs2271338 G>A and GRHL1 was predicted to be disrupted by rs17226398 G>C (Table 2). While GRHL1 and PAX2 did not bind to their predicted sites (Figure 3B, C), YY1 produced a clear gel shift when added to a 27–base pair fragment containing the rs2271338 protective allele. YY1 binding was abrogated by addition of excess unlabeled protective sequence, but not by the sequence containing the risk allele (Figure 3A).
Figure 3. YY1 transcription factor binding to evolutionary conserved region 47 (ECR47) enhancer is disrupted by the rs2271338 risk allele. (A) A biotin-labeled DNA fragment containing a rs2271338 attention-deficit/hyperactivity disorder protective allele was incubated with a human embryonic kidney 293 (HEK293) whole cell lysate (WCL) expressing DDK (FLAG)-tagged human YY1 (lanes 2–5 from left to right). A mobility shift in lane 2 indicates protein binding to the probe, which was abrogated by incubation with molar excess of the unlabeled fragment (lane 3), but not of an unlabeled fragment containing the rs2271338 risk allele (lane 4). Higher molecular weight shift with the addition of anti-DDK antibody identifies YY1 as the binding factor (lane 5). Lanes 6 to 10 correspond to a known YY1 binding sequence used as positive control. Lanes 7 to 10 show protein binding to the control probe, with the anti-DDK antibody producing a similar supershift, thereby confirming YY1 identity (lane 8). Molar excess of the unlabeled ECR47-protective fragment was capable of reducing protein binding (lane 9), but the risk fragment was not (lane 10). No binding was detected for grainyhead-like transcription factor 1 (GRHL1) (B) or paired box 2 (PAX2) (C) transcription factors. (B) Lanes 1 to 5, biotin-labeled fragment containing the rs17226398 protective allele; lanes 6 to 10, GRHL1-positive control sequence. Interestingly, no shift was observed in the positive control except when the antibody was added (lane 8). Apparently, the addition of the antibody stabilizes the GRHL1–DNA interaction, which otherwise is labile under the experimental conditions used. (C) Left panel, lanes 1 to 5, same DNA fragment used for YY1 in (A); lanes 6 to 10, PAX2-positive control sequence. Neither the protective nor the risk ECR47 fragments affected PAX2 binding to the positive control probe (lanes 9 and 10). The right panel shows the results for a longer ECR47 probe (Protective 2) extending to the 3ʹ end of the sequence. PAX2 still did not bind to DNA. Position weight matrices were taken from MotifMap for human hg19 (e.g., YY1 and PAX2) and Drosophila dm3 (GRHL1) assemblies.
We next examined whether YY1 affected endogenous ADGRL3 expression. Real-time polymerase chain reaction analyses revealed a strong expression of endogenous ADGRL3 messenger RNA (mRNA) in SH-SY5Y neuroblastoma but not in U87 astrocytoma cells (Figure S4A in Supplement 1), suggesting that expression of this gene may be specific to the neuronal lineage in the CNS. After YY1 small interfering RNA transfection, we could not detect any significant changes in ADGRL3 mRNA expression in these cell lines (Figure S4B in Supplement 1). This result might suggest that the ECR47–YY1 pair is functional only during development or that its spatiotemporal regulation of ADGRL3 expression is complex and cannot be modeled properly outside the native regulatory circuitry of the brain.
rs2271338 Is Associated With Decreased ADGRL3 mRNA Expression in the Thalamus
Unlike other behavioral disorders, such as schizophrenia and major depressive disorder, brain tissue from ADHD patients is not readily available. However, SNPs associated with complex diseases are likely to function as expression quantitative trait loci, and the tissues of unaffected individuals can be used for gene expression association analyses. Expression quantitative trait loci analysis of brain tissue from 137 neuropathologically confirmed controls (16–102 years of age) revealed a significant association between the rs2271338 AA risk genotype and reduced ADGRL3 expression in the thalamus (p < .01) (Figure S5 and Table S7 in Supplement 1). rs2271338 was either absent or not associated with ADGRL3 expression in the National Institutes of Health Common Fund’s Genotype-Tissue Expression (GTEx), Columbia University’s SNPExpress, and the National Heart, Lung, and Blood Institute’s GRASP databases (Supplement 1).
Discussion
Few molecular studies have attempted to explain the molecular effects of ADHD-associated genetic variants. Studies on dopamine transporter (DAT1) have examined the functional properties of rare missense mutations of moderate and large effects, but these findings fail to explain the higher incidence of ADHD and larger phenotypic variance observed in populations. Instead, the common disease/common variant hypothesis is better supported by a substantial number of genetic epidemiological studies, with common variants accounting for approximately 40% of ADHD heritability.
ADGRL3 is a strong ADHD candidate gene. ADGRL3 common variants predispose individuals to ADHD, modulate brain metabolism, and predict ADHD severity and comorbidity with disruptive symptoms and substance use disorder. When combined with other risk factors, ADGRL3 risk variants improve the prediction of ADHD severity, dysfunctional comorbidity, long-term outcome, and response to treatment with stimulant medication. Recent work from our group also showed the genetic linkage and association of ADGRL3 variants to neuropsychological endophenotypes, providing a more powerful framework for ADHD clinical classification and for the identification of causative genetic variation.
ADGRL3 encodes a member of the latrophilin subfamily of adhesion G protein–coupled receptors that is highly expressed in brain regions implicated in dopaminergic systems. The endogenous ligand of ADGRL3 has been identified as fibronectin leucine rich transmembrane protein 3, a postsynaptic membrane protein involved in axon guidance and neuronal cell migration during embryonic development. More importantly, the ADGRL3–fibronectin leucine rich transmembrane protein 3 synaptic pair regulates excitatory transmission both in vitro and in vivo.
Using a combination of evolutionary sequence conservation and regulatory annotation data, we identified candidate sequences within the ADGRL3 MCR with potential regulatory function. Several variants revealed significant association with ADHD, comorbid disorders, or neuropsychological endophenotypes, with a four-marker haplotype in the ECR46/ECR47 core element showing the highest level of association across the board. This result may suggest that ECR46/47 participates in a common neurobiological pathway to ADHD. Given the complexity of the ADHD phenotype, we must expect complex genetic interactions within the ADGRL3 locus and with other genomic regions, which may explain the different levels of association observed across ECRs (Table 1).
The genetic association of ADGRL3 with disruptive behaviors and substance use supports previous observations. Families with ADHD cluster oppositional defiant disorder, conduct disorder, and substance use disorder; children diagnosed with ADHD monitored during the transition into adolescence have higher rates of alcohol, tobacco, and psychoactive drug use than unaffected children (Adolescent substance use in the multimodal treatment study of attention-deficit/hyperactivity disorder (ADHD) (MTA) as a function of childhood ADHD, random assignment to childhood treatments, and subsequent medication. J Am Acad Child Adolesc Psychiatry. 2013; 52: 250-263). In agreement with their findings, we show a significant association of ECR variants with lower ACVTCR scores and higher ACVTO scores. Impaired response inhibition and poor sustained attention, as measured by these two tests, are fundamental components of the executive dysfunction present in patients with ADHD. In the same vein, the inattentive and hyperactive/impulsive motor phenotype associated with patients with ADHD is expected to affect the ability to perform the ROCFT copy test, which would account for its significant association with ECR risk variants. However, the complexity of the multiple cognitive domains assessed by ROCFT (i.e., visuospatial constructional ability, visual memory, and several components of executive function) can confound the interpretation of scores, which might explain the lower level of significance. We did not detect any association with the WISC measures (i.e., block design, PIQ, or FSIQ), which might indicate that ADGRL3 variation does not contribute to the cognitive deficits evaluated by these particular tests.
Functional testing in vitro identified ECR47 as a transcriptional enhancer. ECR47 was active in cultured neurons and astrocytes in a tissue-specific manner, and its function was disrupted by the ADHD risk haplotype defined by the variants rs17226398, rs56038622, and rs2271338 (CTA) in complete LD. The nonrandom association between these risk alleles may suggest evolutionary selective pressure to conserve an important biological function. Although neurons and astrocytes have distinct transcriptome profiles, they share a common neuroepithelial origin, and over the past two decades it has become clear that astrocytes participate in a wide variety of complex functions in the CNS, including crucial roles in synaptic transmission and information processing in neural circuits. However, additional studies are required to determine whether ECR47 activity in astrocytes is relevant to the pathology of ADHD, because ADGRL3 expression in astrocytoma cells was low compared with human neuroblastoma cells (Figure S4 in Supplement 1).
Strikingly, we also found that ECR47 functions as a brain enhancer in vivo. ECR47 was able to drive GFP expression in the zebrafish brain, which is consistent with various aspects of endogenous adgrl3.1 expression. These results are in concert with a previous report by Lange et al. showing wide adgrl3.1 expression in the zebrafish brain at 24, 48, and 72 hours postfertilization, which suggests that ECR47 activation may share brain-specific factors with the adgrl3.1 transcriptional machinery during zebrafish development.
Enhancer elements have signatures that define tissue specificity. Analysis of ECR47 and a large set of in vivo–tested brain enhancers revealed a significant overlap of binding sites for homeobox TF families that are preferentially associated with neurodevelopmental processes. Distal-less (i.e., DLX-1, DLX-2, and DLX-5), BRN5/POU6F1, PAX (i.e., PAX2, PAX3, PAX5, PAX6, PAX7, and PAX8), and NKX6 (NKX6.2) homeodomain factors are known to play prominent roles in the development and function of the CNS (
High affinity YY1 binding motifs: Identification of two core types (ACAT and CCAT) and distribution of potential binding sites within the human beta globin cluster.
). The potential function of YY1 in the developing nervous system was first suggested by the phenotypic analysis of Yy1+/– mice. Yy1-null mice show early embryonic lethality, but a subset of Yy1+/– mice (~20%) display growth retardation and neural tube defects. The brains of Yy1+/– mouse embryos show exencephaly, asymmetric structure, and the presence of pseudoventricles. While a significant reduction of XYY1 protein levels results in early embryonic lethality, a partial depletion results in anteroposterior patterning defects and reduction of head structures. At the molecular level, Xyy1 ablation in Xenopus embryos reveals downregulation of TFs involved in neural patterning (i.e., homeobox genes, engrailed2, otx2, and krox20) and neural crest cell specification and migration (i.e., slug and snai1). Studies using other systems have also shown dysregulation of neurotransmitter signaling and metabolism genes, such as dynamin-1 and dopamine β-hydroxylase in neurons, and Glast glutamate/aspartate transporter in astrocytes.
Analysis of the effect of the rs2271338 risk allele on ADGRL3 expression in cultured human cells and postmortem brain tissue showed apparently contradictory results. While YY1 downregulation did not affect ADGRL3 expression in neuroblastoma cells, the rs2271338 risk allele was associated with reduced expression in the adult thalamus. The thalamus was one of the earliest brain areas considered in the pathophysiology of ADHD (
) have been shown previously in children with ADHD. Various methods, including functional connectivity analysis, have uncovered thalamic abnormalities in patients with ADHD.
Although additional studies are required to establish the precise role of ECR47, the results presented here suggest an important neurological function. While brain expression data indicate that ADGRL3 expression peaks across fetal and infant stages (Figure S6 in Supplement 1), relatively high expression levels are maintained throughout life, suggesting that this gene is necessary for proper brain function at every stage. The lack of effect of YY1 knockdown on endogenous ADGRL3 expression in differentiated cells suggests that ECR47 may only be active during developmental stages; however, the association between rs2271338 and reduced ADGRL3 expression levels in the adult thalamus is difficult to reconcile with this interpretation and complicates our understanding of ADGRL3 spatiotemporal regulation by ECR47–YY1. Elucidation of this genetic interaction will help to decipher the molecular mechanisms underlying ADHD pathogenesis.
The experimental methodology used in this research could be a paradigm for the evaluation of noncoding risk variants associated with complex traits.
Limitations
We tested risk variants only for their effects on transcriptional enhancer activity. Other transcriptional regulatory functions (e.g., those mediated by silencer and insulator elements) were not investigated; functional testing of these types of sequence has proven difficult to design, and there are currently no robust in vivo assays in widespread use. We must also consider the possible effect of risk variants on ADGRL3 mRNA splicing (especially synonymous changes and noncoding variants within 50 base pairs of exon–intron junctions) and on the expression and function of overlapping noncoding RNAs.
Acknowledgments and Disclosures
This work was supported by intramural resources from the National Human Genome Research Institute of the U.S. National Institutes of Health.
We thank Paul Kruszka, M.D., for his detailed revision of the manuscript and helpful comments.
This study used the computational capabilities of a demo license to Genomatix Software (Munich, Germany).
The authors report no biomedical financial interests or potential conflicts of interest.