Extinction of Cocaine Memory Depends on a Feed-Forward Inhibition Circuit Within the Medial Prefrontal Cortex

BACKGROUND: Cocaine-associated environments (i.e., contexts) evoke persistent memories of cocaine reward and thereby contribute to the maintenance of addictive behavior in cocaine users. From a therapeutic perspective, enhancing inhibitory control over cocaine-conditioned responses is of pivotal importance but requires a more detailed understanding of the neural circuitry that can suppress context-evoked cocaine memories, e.g., through extinction learning. The ventral medial prefrontal cortex (vmPFC) and dorsal medial prefrontal cortex (dmPFC) are thought to bidirectionally regulate responding to cocaine cues through their projections to other brain regions. However, whether these mPFC subregions interact to enable adaptive responding to cocaine-associated contextual stimuli has remained elusive. METHODS: We used antero- and retrograde tracing combined with chemogenetic intervention to examine the role of vmPFC-to-dmPFC projections in extinction of cocaine-induced place preference in mice. In addition, electrophysiological recordings and optogenetics were used to determine whether parvalbumin-expressing inhibitory interneurons and pyramidal neurons in the dmPFC are innervated by vmPFC projections. RESULTS: We found that vmPFC-to-dmPFC projecting neurons are activated during unreinforced re-exposure to a cocaine-associated context, and selective suppression of these cells impairs extinction learning. Parvalbumin-expressing inhibitory interneurons in the dmPFC receive stronger monosynaptic excitatory input from vmPFC projections than local dmPFC pyramidal neurons, consequently resulting in disynaptic inhibition of pyramidal neurons. In line with this, we show that chemogenetic suppression of dmPFC parvalbumin-expressing inhibitory interneurons impairs extinction learning. CONCLUSIONS: Our data reveal that vmPFC projections mediate extinction of a cocaine-associated contextual memory through

Persistently recurring memories of cocaine reward interfere with the ability of chronic cocaine users to abstain from cocaine intake. It is well established that locations where cocaine is repeatedly used (e.g., drug houses or clubs) become strongly associated with the rewarding effects of cocaine, and re-exposure to these cocaine contexts during prolonged periods of abstinence triggers retrieval of cocaine reward memory and thereby promotes relapse (1). The salience of cocaine-associated contextual cues can be reduced by prolonged cue exposure in the absence of cocaine reward, a process called extinction learning (2). However, cue exposure therapies have been largely unsuccessful in the treatment of substance use disorders (3). Therefore, it is crucial to elucidate the neuronal circuitry that promotes suppression of behavioral responding to cocaine-contextual cues, with the ultimate goal of facilitating the design of therapeutic interventions that can strengthen inhibitory control over cocaine-seeking behavior.
The medial prefrontal cortex (mPFC) has been implicated in cue-induced craving in humans (4) and responding to stimuli that are associated with cocaine, other drugs of abuse, and natural rewards in animal models (5,6). In rodent operant cocaine self-administration paradigms, a conditioned response is typically referred to as cocaine seeking (7), whereas cocaine conditioned place (i.e., context) preference (CPP) can be assessed after cocaine is administered to an animal by the experimenter (8). Specifically, the dorsal mPFC (dmPFC) is thought to drive conditioned cocaine seeking via projections to the nucleus accumbens (NAc) core (9,10). The ventral mPFC (vmPFC) has a more complex role, because it is able to promote cocaine seeking (11,12) and inhibit conditioned responses (seeking and place preference) after extinction learning (9-11). Projections of the vmPFC to the NAc shell mediate the effect of extinction-induced inhibitory control (12,13). Thus, both the dmPFC and vmPFC have a critical role in responding to cocaine cues but have thus far been considered to function as separate hubs in parallel circuits that regulate this behavior. Recent studies point to reciprocal connectivity between the vmPFC and the dmPFC (14-17). However, how this intrinsic mPFC connectivity contributes to adaptive responding to cocaine-associated stimuli has remained poorly understood.
To this end, we anatomically and physiologically dissected vmPFC-to-dmPFC projections and investigated whether this circuit modulates extinction of responding to a cocaine-associated context using CPP. We found that vmPFC projections innervate parvalbumin-expressing interneurons (PV-INs) in the dmPFC, resulting in disynaptic inhibition of dmPFC pyramidal neurons (PNs). Furthermore, vmPFC-to-dmPFC neurons were preferentially activated during unreinforced re-exposure to a cocaine-associated context, and selective suppression of vmPFC-to-dmPFC projecting neurons or direct suppression of dmPFC PV-INs impaired extinction of cocaine place preference memory, confirming the critical contribution of this circuit to extinction learning.

METHODS AND MATERIALS

Animals
Male wild-type C57BL/6J and transgenic PV::Cre mice (The Jackson Laboratory, stock no. 017320, maintained on a C57BL/6J background) aged 6-8 weeks at the start of experiments were individually housed. Mice were kept on a 12-hour light/dark cycle with regular laboratory chow food and water available ad libitum. Behavioral experiments were conducted during the animals' light phase. All experimental procedures were approved by the Central Committee for Animal Experiments (Centrale Commissie Dierproeven) of The Netherlands and the Animal Ethical Care Committee (Instantie voor Dierenwelzijn) of the Vrije Universiteit Amsterdam.

Cocaine CPP
The conditioning apparatus consisted of 2 main compartments that differed in tactile and visual cues, connected by a small center compartment (19). On day 0, baseline preference for the main compartments was assessed by allowing animals to freely explore all compartments (pretest; 10 min). The cocaine-paired and saline-paired compartments were counterbalanced within all groups, such that on average, the groups did not have a baseline preference for one of the two main compartments, thereby allowing an unbiased procedure. Conditioning sessions (15 min) were conducted twice daily over 3 consecutive days. For this, mice received saline (intraperitoneal [i.p.]; morning) or cocaine (15 mg/kg in saline; i.p.; afternoon) before being confined to one of the main compartments. After 3 weeks of forced abstinence in the home cage, animals were subjected to extinction training and/or a postconditioning test (post-test) (19). For extinction training, mice were confined to the cocaine- and saline-paired compartments (15 min each) in the absence of cocaine or saline treatment. In a post-test, we determined preference scores by allowing animals to freely explore all compartments for 5 minutes (to avoid within-session extinction) under drug-free conditions. Time spent in each compartment was measured using a video camera and Ethovision video-tracking software (Noldus). A preference score for each animal was calculated as time spent in the cocaine-paired compartment minus time spent in the saline-paired compartment.
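The preference score described above is a simple difference of dwell times. A minimal Python sketch follows, assuming that per-animal compartment times (in seconds) have already been exported from the tracking software. The names `CompartmentTimes`, `preference_score`, and `group_mean_score` are hypothetical and are not part of the authors' analysis pipeline, and the numbers are made up for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CompartmentTimes:
    """Time (s) one animal spent in each compartment during a test (illustrative)."""
    cocaine_paired: float
    saline_paired: float
    center: float = 0.0


def preference_score(times: CompartmentTimes) -> float:
    """Time in the cocaine-paired compartment minus time in the saline-paired one."""
    return times.cocaine_paired - times.saline_paired


def group_mean_score(animals: list[CompartmentTimes]) -> float:
    """Average preference score across a group of animals."""
    return mean(preference_score(a) for a in animals)


# Made-up values for a 5-minute (300 s) post-test; not data from the study.
example_group = [
    CompartmentTimes(cocaine_paired=170.0, saline_paired=90.0, center=40.0),
    CompartmentTimes(cocaine_paired=150.0, saline_paired=110.0, center=40.0),
]
print(group_mean_score(example_group))
```

A positive score indicates more time spent in the cocaine-paired compartment; time in the center compartment does not enter the score.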

Immunohistochemistry
Mice were transcardially perfused using ice-cold PBS pH 7.4, followed by ice-cold 4% paraformaldehyde in PBS pH 7.4. Brains were removed, postfixed overnight in 4% paraformaldehyde solution, and then immersed in 30% sucrose in PBS with 0.02% NaN3. Brains were then sliced in 35-µm coronal sections using a cryostat and stored in PBS with 0.02% NaN3 at 4 °C until further use. Immunohistochemical stainings were performed using standard procedures (19) with the following antibodies: rabbit anti-Fos (1:500, sc52), mouse

Electrophysiological Recordings
See the Supplement.

Quantification and Statistical Analysis
Statistical details are presented in the figure legends. Number of animals is shown as n and number of cells as n_cells. All graphs show means ± SEM. SPSS software (version 25; IBM Corp.) and Prism (GraphPad Software, Inc) were used for statistical analysis. Comparisons between groups were made using two-tailed unpaired t tests or, in case of paired data, a two-tailed paired t test. When the data were not normally distributed, the nonparametric Mann-Whitney U test was used for between-group comparisons and the Wilcoxon signed-rank test for within-subject comparisons. To investigate differences in activation of labeled cells between groups (group × population [Fos+/mCherry− vs. Fos+/mCherry+]), a repeated measures analysis of variance was conducted, followed by post hoc Bonferroni tests. Significance was set at p < .05.
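As an illustration of the reporting conventions above, the following Python sketch computes a mean ± SEM and encodes the rule for picking a two-group test. It is only a sketch under stated assumptions: the authors used SPSS and Prism, the helper names (`mean_sem`, `choose_test`, `is_significant`) are hypothetical, and how normality was assessed is not stated in the text, so it is passed in as a flag.

```python
from math import sqrt
from statistics import mean, stdev

ALPHA = 0.05  # significance threshold stated in the text


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample standard deviation / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("SEM needs at least two observations")
    return mean(values), stdev(values) / sqrt(n)


def choose_test(paired: bool, normally_distributed: bool) -> str:
    """Name the two-group test that the text describes for each situation.

    The normality decision itself is an input here (assumption), because the
    text does not say which normality check was applied.
    """
    if normally_distributed:
        return "two-tailed paired t test" if paired else "two-tailed unpaired t test"
    return "Wilcoxon signed-rank test" if paired else "Mann-Whitney U test"


def is_significant(p_value: float) -> bool:
    """Apply the p < .05 criterion."""
    return p_value < ALPHA


m, sem = mean_sem([1.0, 2.0, 3.0])  # made-up values
print(f"{m:.2f} ± {sem:.2f}")
print(choose_test(paired=False, normally_distributed=False))
```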

Identification of vmPFC-to-dmPFC Projecting Neurons
We previously found that optogenetic stimulation of vmPFC PNs has no effect on expression of cocaine CPP memory during the first 2 days after conditioning but facilitates extinction 3 weeks after conditioning (19). We now aimed to identify the neuronal target of vmPFC PN projections that promote extinction learning. To anatomically trace projections of vmPFC PNs, we unilaterally expressed CaMKIIa promoter-driven ChR2 (channelrhodopsin-2) fused to EYFP in the vmPFC of mice and observed EYFP+ axonal fibers in both the ipsi- and contralateral dmPFC (Figure 1A). Layer 5/6 of the dmPFC exhibited the highest density of EYFP+ fibers. We then applied retrograde viral tracing by injecting retroAAV-hSyn::Cre in the dmPFC and Cre-dependent AAV-hSyn::DIO-mCherry in the contralateral vmPFC. This revealed a population of vmPFC-to-dmPFC projecting neurons (Figure 1B). Axonal fibers were observed in the corpus callosum, indicating that they ran via the ipsilateral forceps minor of the corpus callosum through the corpus callosum and then back through the contralateral forceps minor of the corpus callosum to terminate in the dmPFC (Figure 1C). The vmPFC heavily innervates the NAc shell (19) and projections to this region have been implicated in extinction of cocaine seeking (12,13). Therefore, we questioned whether vmPFC neurons that project to the dmPFC have collateral projections to the NAc shell. Whereas dense mCherry+ axonal fibers were observed in the dmPFC after retrograde labeling, very sparse mCherry+ fibers were observed in the NAc shell (Figure 1D), suggesting that vmPFC neurons that project to the dmPFC and NAc shell overlap to a small extent only. To confirm this, we retrogradely labeled vmPFC neurons by injection of CTB-488 or CTB-555 in the NAc shell and dmPFC, respectively (Figure 1E). We examined colocalization in the contralateral vmPFC to exclude the possibility that neurons in the ipsilateral vmPFC were labeled as a result of CTB injection in the adjacent dmPFC and/or NAc shell. Of the vmPFC-to-dmPFC projecting neurons (CTB-555+), 9.96 ± 4.8% (mean ± SEM) were also CTB-488+ (Figure 1F). Inversely, only 5.54 ± 2.1% of the vmPFC-to-NAc shell projecting neurons (CTB-488+) were CTB-555+. Hence, vmPFC neurons that project to the dmPFC and NAc shell represent largely distinct populations.
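The two overlap percentages reported above differ because each uses a different denominator (all CTB-555+ cells versus all CTB-488+ cells). A short Python sketch of that calculation is given below, assuming cell counts per animal are available. The function name `percent_colabeled` and the counts are hypothetical and are not taken from the study.

```python
def percent_colabeled(double_labeled: int, total_labeled: int) -> float:
    """Percentage of one tracer's labeled cells that also carry the other tracer."""
    if total_labeled <= 0:
        raise ValueError("total_labeled must be positive")
    if not 0 <= double_labeled <= total_labeled:
        raise ValueError("double_labeled must lie between 0 and total_labeled")
    return 100.0 * double_labeled / total_labeled


# Hypothetical counts for one animal (illustration only).
n_ctb555 = 120  # vmPFC-to-dmPFC projecting neurons
n_ctb488 = 200  # vmPFC-to-NAc shell projecting neurons
n_both = 12     # cells positive for both tracers

# Same numerator, different denominators, hence two different percentages.
print(percent_colabeled(n_both, n_ctb555))  # share of dmPFC projectors also labeled from NAc shell
print(percent_colabeled(n_both, n_ctb488))  # share of NAc shell projectors also labeled from dmPFC
```

Per-animal percentages computed this way would then be summarized across animals as mean ± SEM.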

vmPFC-to-dmPFC Projecting Neurons Mediate Extinction of Cocaine CPP Memory
We next assessed whether vmPFC-to-dmPFC projecting cells are activated on re-exposure to a cocaine-associated context. Mice were conditioned to associate one of two distinct contexts with cocaine reward, and after 3 weeks of forced abstinence, they showed a strong preference to explore the previously cocaine-paired context over the neutral (saline-paired) context (Figure 2A). Unreinforced re-exposure to the cocaine and neutral context on the day before the test reduced preference for the cocaine context (Figure 2A) (19), pointing to successful extinction learning. Next, independent groups underwent conditioning and were re-exposed to the cocaine-associated context in the presence (no extinction) or absence (extinction) of cocaine reinforcement (Figure 2B). Ninety minutes later, animals were sacrificed to examine colocalization of the neuronal activity marker Fos and retrogradely labeled mCherry+ vmPFC-to-dmPFC projecting neurons (Figure 2C). Both groups showed a similar percentage of Fos+ and mCherry+ neurons in the vmPFC, but mCherry and Fos preferentially colocalized in mice that did not receive cocaine before the last session (Figure 2D), suggesting that the vmPFC-to-dmPFC projecting neuronal population is activated during extinction learning. Additionally, we found that Fos colocalized more with vmPFC-to-dmPFC projecting neurons after unreinforced exposure to the cocaine context compared with a novel context (Figure S1). To determine whether this projection is necessary for extinction learning, we retrogradely expressed the inhibitory DREADD (designer receptors exclusively activated by designer drug) hM4Di fused to mCherry or mCherry alone (control) in vmPFC-to-dmPFC projecting neurons (Figure 2E). One day after chemogenetic suppression during extinction training, preference for the cocaine context was diminished in control mice, whereas hM4Di mice still showed a robust preference to explore the cocaine context (Figure 2F). Note that the CPP score of the mCherry control group is similar to the extinction group in Figure 2A, suggesting that CNO alone did not influence extinction learning. Hence, vmPFC-to-dmPFC projecting neurons are activated on unreinforced re-exposure to a cocaine-associated context, and accordingly, are required for extinction learning.

vmPFC Projections Evoke Strong Monosynaptic Excitation of dmPFC PV-INs and Disynaptic Inhibition of PNs
Global optogenetic stimulation of the vmPFC reduces firing of dmPFC PNs (20), suggesting that vmPFC PNs may target local GABAergic (gamma-aminobutyric acidergic) interneurons. Furthermore, dmPFC PV-INs facilitate extinction of natural reward seeking (21) and conditioned fear (22). We found that PV-INs comprise the majority (~66%) of GABAergic neurons in the dmPFC and are most abundant in dmPFC layers 5/6 (Figure S2), where we also observed the highest density of vmPFC axons (Figure 1A). Figure S3J). Evoked excitatory drive was strongest onto dmPFC PV-INs (Figure 3E; Figure S3K), whereas evoked inhibitory drive was strongest onto dmPFC PNs (Figure 3F; Figure S3L). In neurons exhibiting both excitatory and inhibitory responses (12/15 PNs; 8/9 PV-INs), the eEPSC/eIPSC (E/I) ratio robustly favored excitation of dmPFC PV-INs (Figure 3G; Figure S3M). The latency to onset of eEPSCs was shorter than of eIPSCs in PNs and PV-INs (Figure 3H; Figure S3N), suggestive of monosynaptic excitation and disynaptic inhibition. In support of this, application of the AMPA/kainate receptor antagonist CNQX abolished both eEPSCs and eIPSCs (Figure 3I, J), whereas GABAA receptor blockade by Gabazine only affected eIPSCs (Figure 3K), demonstrating that glutamate release initiated both evoked responses. Gabazine applied together with CNQX and the NMDA receptor antagonist D-AP5 prevented the residual eEPSC (Figure 3K). In line with this, both responses were also abolished by tetrodotoxin (Figure S3O, P) and coapplication of the potassium channel blocker 4-AP (23) recovered eEPSCs only (Figure S3Q), further confirming the monosynaptic and disynaptic nature of the excitatory and inhibitory response, respectively.
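The E/I ratio and the latency comparison used above can be expressed compactly. The Python sketch below is illustrative only: it assumes peak amplitudes in pA and onset latencies in ms have already been extracted per cell, it takes absolute amplitudes so that the sign convention of the recording does not matter, and the function names (`ei_ratio`, `inhibition_lags_excitation`) are hypothetical rather than the authors' analysis code.

```python
from typing import Optional


def ei_ratio(eepsc_pa: Optional[float], eipsc_pa: Optional[float]) -> Optional[float]:
    """eEPSC/eIPSC amplitude ratio for one cell.

    Returns None when a cell lacks either response, mirroring the text's
    restriction to neurons that showed both excitation and inhibition.
    """
    if eepsc_pa is None or eipsc_pa is None or eipsc_pa == 0:
        return None
    return abs(eepsc_pa) / abs(eipsc_pa)


def inhibition_lags_excitation(eepsc_latency_ms: float, eipsc_latency_ms: float) -> bool:
    """True when the eIPSC starts later than the eEPSC in the same cell.

    A longer eIPSC latency is consistent with, but does not by itself prove,
    a disynaptic inhibitory pathway.
    """
    return eipsc_latency_ms > eepsc_latency_ms


# Made-up example cells; amplitudes in pA, latencies in ms.
print(ei_ratio(-300.0, 150.0))               # cell with both responses
print(ei_ratio(-100.0, None))                # cell without an eIPSC
print(inhibition_lags_excitation(2.0, 4.5))  # inhibition arrives after excitation
```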

dmPFC PV-INs Mediate Feed-Forward Inhibition and Extinction of Cocaine CPP Memory
To determine whether the disynaptic inhibitory response in PNs is evoked by excitation of PV-INs, we expressed hM4Di-mCherry in dmPFC PV-INs and ChR2-EYFP in vmPFC PNs (Figure 4A). PN responses to optic stimulation of ChR2+ vmPFC terminals were recorded before and after CNO-mediated suppression of PV-INs (Figure 4B, C). CNO did not alter the eEPSC amplitude (Figure 4D) but reduced the eIPSC amplitude in dmPFC PNs (Figure 4E). Consequently, the E/I ratio shifted toward less inhibition (Figure 4F), confirming that disynaptic inhibition of dmPFC PNs is at least partially mediated by excitation of PV-INs. CNO similarly suppressed disynaptic inhibition of PV-INs (Figure S4). We next determined whether activity of dmPFC PV-INs is necessary to extinguish preference to explore a cocaine-associated context. In PV::Cre mice, hM4Di-mCherry or mCherry alone (control) was expressed in the majority of dmPFC PV-INs (Figure 4G-I). Following chemogenetic suppression of PV-INs during extinction training, hM4Di animals showed stronger preference for the cocaine context than control mice (Figure 4J), confirming that extinction learning requires activity of these neurons. Finally, we determined whether chemogenetic stimulation of dmPFC PNs also impairs extinction of cocaine place preference. For this, we bilaterally expressed a CaMKIIa promoter-driven excitatory DREADD hM3Dq fused to mCherry or mCherry alone in dmPFC PNs (Figure S5A). One day after CNO or saline treatment during extinction training, we did not observe a difference in preference scores between groups, and all groups showed a similar low preference for the cocaine context (Figure S5B). Thus, bulk activation of dmPFC PNs does not affect extinction learning, suggesting that vmPFC-mediated recruitment of feed-forward inhibition in the dmPFC does not result in global suppression of local PNs. Moreover, because the saline control groups showed similar preference scores as the CNO-treated groups, this further confirmed that CNO alone did not affect extinction learning.

DISCUSSION
Our data reveal that extinction of context-evoked cocaine memory depends on activation of an intrinsic mPFC circuit (Figure 4K). We show that vmPFC projections innervate the dmPFC, and vmPFC-to-dmPFC projecting neurons are activated on unreinforced re-exposure to a cocaine-associated context. In line with this, chemogenetic suppression of vmPFC-to-dmPFC projecting neurons prevented extinction of context-evoked cocaine memory. In the dmPFC, PV-INs receive strong monosynaptic excitatory input from vmPFC terminals and subsequently inhibit PNs, typical for feed-forward inhibition (23). Similar to manipulation of vmPFC-to-dmPFC neurons, chemogenetic suppression of dmPFC PV-INs impaired extinction of context-evoked cocaine reward memory. Hence, under extinction conditions, vmPFC projections recruit feed-forward GABAergic inhibition in the dmPFC to attenuate conditioned responding to a cocaine-associated context.
Whereas previous models propose that the dmPFC and vmPFC exert control over conditioned cocaine seeking via divergent projections to other brain regions (2,5,12,13), we now demonstrate that direct connectivity between the vmPFC and dmPFC provides critical adaptive control over responding to cocaine-associated contextual cues. Our findings do not rule out the involvement of vmPFC projections to other regions, such as the NAc shell (12,13), but reveal an additional mechanism for extinction-induced behavioral inhibition. Of relevance is that we found that vmPFC-to-dmPFC projections are required for acquisition of extinction, whereas vmPFC-to-NAc shell connectivity exerts inhibitory control over cocaine seeking after extinction learning (12,13) but not during a first extinction session (24). Global manipulation of vmPFC function, however, affects both the acquisition and expression of extinguished cocaine seeking (25). Because we found that the vmPFC-to-dmPFC projecting neuronal population has little overlap with the vmPFC-to-NAc shell projecting neurons and sends only sparse collateral projections to the NAc shell, this suggests that extinction learning requires vmPFC projections to the dmPFC, whereas vmPFC projections to the NAc shell control the retention of extinguished responding to cocaine-associated cues.
Distinct coexisting neuronal ensembles within the vmPFC exert opposing effects on conditioned behavior (12,26,27). Although the selective suppression of vmPFC-to-dmPFC projecting neurons using our retrograde tracing approach is a more refined intervention method than global vmPFC manipulation, we cannot rule out that the vmPFC population that projects to the dmPFC is composed of distinct neurons that can promote and inhibit cocaine-conditioned responses. However, vmPFC-to-dmPFC neurons were not preferentially activated on reinforced re-exposure to the cocaine-associated context, suggesting that the cocaine-context association is not allocated to this neuronal population. This leaves the possibility that selective suppression of vmPFC-to-dmPFC neurons that are activated during extinction learning could have a different effect on cocaine place preference. Unfortunately, it is technically not feasible to assess the effect of manipulation of the activated neuronal ensemble on extinction learning in our CPP paradigm, because it would require neuronal activity-dependent labeling and manipulation within the same extinction session.
Recruitment of feed-forward GABAergic inhibition in the dmPFC during extinction learning may result in global suppression of PN firing and a reduction of output from this region. However, we think that the cortical network effect of PV-IN activation might be more complex, potentially enabling a switch between dmPFC PNs that promote or suppress expression of cocaine memory. This hypothesis is supported by our finding that global chemogenetic stimulation of dmPFC PNs does not affect extinction learning. Furthermore, neurons within the prelimbic cortex fire during the initiation of reward seeking and under extinction conditions, both in a context-dependent manner (28). Therefore, distinct neuronal ensembles within the dmPFC may drive and inhibit cocaine-conditioned responses, similar to what has been reported for the vmPFC (12). If this is true, global chemogenetic stimulation of dmPFC PNs may have resulted in simultaneous activation of 2 ensembles with opposing effects on cocaine place preference, which could have canceled out effects on extinction learning. Alternatively, stimulation of dmPFC PNs may not have affected extinction learning, because bulk excitation using chemogenetics does not mimic endogenous firing patterns required to overrule extinction learning. Whether PV-INs in the dmPFC suppress dmPFC neurons that drive expression of cocaine memory and/or facilitate the recruitment of dmPFC neurons that mediate extinction-induced behavioral inhibition is an important topic for future research.

Figure 4 (legend, partial). During the test, hM4Di animals showed higher preference for the cocaine context than control mice (t14 = 3.33, *p = .005). (K) Model of vmPFC-to-dmPFC circuit. Unreinforced re-exposure to a cocaine-associated context triggers activation of vmPFC-to-dmPFC projecting neurons, which strongly excite dmPFC PV-INs, resulting in feed-forward inhibition of PNs. All graphs show mean ± SEM. aCSF, artificial cerebrospinal fluid; CNO, clozapine N-oxide; coc, cocaine; dmPFC, dorsal medial prefrontal cortex; eEPSC, evoked excitatory postsynaptic current; eIPSC, evoked inhibitory PSC; PN, pyramidal neuron; PV-IN, parvalbumin-expressing inhibitory interneuron; sal, saline; Vh, holding potential; vmPFC, ventral medial prefrontal cortex.
We implicate the vmPFC-to-dmPFC feed-forward inhibition circuit in extinction of cocaine-conditioned behavior, but this network may be involved in extinction learning in general. Pharmacological inactivation of the vmPFC or inhibition of protein kinase M zeta in this region results in a reemergence of opiate-induced place preference (29,30), and cue-induced heroin seeking is driven by acute AMPA receptor endocytosis and a reduction in synaptic strength in the vmPFC (31). Additionally, lesions of the vmPFC enhance spontaneous recovery, reinstatement, and contextual renewal of Pavlovian food seeking (32,33). Together, these studies indicate that following extinction learning, the vmPFC exerts behavioral inhibition over responding to reward-related cues. Note that this does not exclude the coexistence of vmPFC ensembles that promote conditioned responses before extinction learning (12,26). To our knowledge, the monosynaptic innervation of dmPFC PV-INs by vmPFC projections has not been previously reported, but independent studies have shown that vmPFC-to-dmPFC projections and PV-INs mediate extinction of conditioned fear (16,22) and extinction of natural reward seeking (21). In line with our observations, extinction of cue-evoked food seeking drives recruitment of GABAergic interneuron activity in the dmPFC (34). Furthermore, excitatory projections in the opposite direction, from dmPFC to vmPFC, also mediate extinction of conditioned fear (14). Together with our findings, this indicates that a reciprocal intrinsic mPFC circuit serves to provide important adaptive control over conditioned behavior, in particular when an originally learned association (context → reinforcer) does not match the conditions during re-exposure to the same context and an alternative association (e.g., context → no reinforcer) is learned.
To conclude, we discovered that monosynaptic interaction between the vmPFC and dmPFC mediates extinction of cocaine reward memory through activation of dmPFC PV-INs. This sheds new light on the architecture of the neuronal circuit that enables adaptive responding to cocaine-contextual cues and may provide a new therapeutic target for strengthening of behavioral inhibition on context-evoked retrieval of cocaine memories.