CHD8 suppression impacts on histone H3 lysine 36 trimethylation and alters RNA alternative splicing

Abstract Disruptive mutations in the chromodomain helicase DNA-binding protein 8 gene (CHD8) have been recurrently associated with autism spectrum disorders (ASDs). Here we investigated how chromatin reacts to CHD8 suppression by analyzing a panel of histone modifications in induced pluripotent stem cell-derived neural progenitors. CHD8 suppression led to significant reduction (47.82%) in histone H3K36me3 peaks at gene bodies, particularly impacting on transcriptional elongation chromatin states. H3K36me3 reduction specifically affects highly expressed, CHD8-bound genes and correlates with altered alternative splicing patterns of 462 genes implicated in ‘regulation of RNA splicing’ and ‘mRNA catabolic process’. Mass spectrometry analysis uncovered a novel interaction between CHD8 and the splicing regulator heterogeneous nuclear ribonucleoprotein L (hnRNPL), providing the first mechanistic insights to explain the CHD8 suppression-derived splicing phenotype, partly implicating SETD2, a H3K36me3 methyltransferase. In summary, our results point toward broad molecular consequences of CHD8 suppression, entailing altered histone deposition/maintenance and RNA processing regulation as important regulatory processes in ASD.

A direct effect can be proposed since CHD8 is able to bind DNA at promoters and enhancer regions in hN-PCs, mouse midfetal brain and embryonic cortex (11,16). However, an indirect mechanism can also be postulated since other genes, not bound by CHD8, appear to be transcriptionally dysregulated following CHD8 suppression (11). Chromatin structure is intimately related to transcription and gene expression (17). CHD8 was shown to co-purify with components of the MLL and CoR-EST, SWI/SNF and NuRD ATP-dependent remodeling complexes, supporting its possible role in transcriptional initiation (18). On the other hand, reduction of CHD1, another member of the CHD protein family, alters H3K4me3 and H3K36me3 patterns, suggesting its role in establishing/maintaining the boundaries of these mutually exclusive histone marks (19). The SETD2 and SETD5 histone H3, Lys36 methyltransferases normally associate with RNA polymerase II (RNAPII), and their activity results in increased H3K36me3 toward the 3 end of active genes (20,21). Based on its placement on the phylogenetic tree and the presence of an ATPase domain (18), CHD8 is most probably acting as an ATP-dependent chromatinremodeling factor; thus, similar to CHD1, CHD8 loss might cause increased nucleosome turnover and alterations in co-transcriptional processes, such as cryptic transcription within gene bodies and alternative splicing (AS) (22,23). In addition to the general splicing machinery, many RNAbinding proteins (RBPs), such as serine-rich proteins (SRproteins) and heterogeneous nuclear ribonucleoproteins (hnRNPs), function as splicing enhancers and silencers (24)(25)(26). Broad chromatin conformation and transcriptional kinetics also play a role: chromatin relaxation accelerates RNAPII processing and generally correlates with alternative exon skipping; conversely, packed nucleosomes slow down RNAPII progression, causing pausing of transcription and the inclusion of non-constitutive weak exons (27)(28)(29)(30)(31)(32)(33)(34). Aberrant splicing, in turn, might contribute to altered neuronal development and dysfunction (35)(36)(37)(38). Thus, as experimental evidence is pointing to dysregulated chromatin regulation as a key feature in the pathogenesis of ASD, it is tempting to hypothesize that chromatin function of CHD proteins, and CHD8 in particular, might act to regulate RNA transcription, elongation and processing, thereby being responsible for the characteristic neurodevelopmental effects observed in ASD.
In order to dissect this mechanism, we characterized the consequences of CHD8 suppression on the chromatin landscape, analyzing different histone modifications using chromatin immunoprecipitation and sequencing (ChIP-seq). Specifically, we interrogated histone marks characteristic of transcriptionally active (H3K4me2, H3K4me3, H3K27ac and H3K36me3) and repressed regions (H3K27me3) as well as active/poised enhancers (H3K4me1 and H3K27ac) in control induced pluripotent stem cell (iPSC)-derived neuronal progenitors and in previously characterized lines where an ∼50% reduction in CHD8 was obtained by lentiviral delivery of short hairpin RNAs (shRNAs) (11). We uncovered alterations affecting the H3K36me3 histone mark in the body of highly transcribed genes, which do not primarily affect RNA transcription, but rather alter AS of genes implicated in 'mRNA catabolic process', 'regulation of RNA splicing' and 'translation initiation'. Strikingly, by mass spectrometry (MS) analysis in human neuronal progenitors, we identified and validated a novel, direct protein-protein interaction between CHD8 and hnRNPL. This splicing regulator, reported to be part of the human KMT3a/SET2 complex required for H3 Lys36 trimethylation activity (39,40), represents a new link bridging chromatin to RNA processing regulation with possible crucial implications for ASD and other presently incurable brain disorders.

Chromatin immunoprecipitation and sequencing (ChIP-seq)
ChIP was performed using the protocol described by (42), with minor modifications. Briefly, ∼25 × 10 6 iPSCderived NPCs, controls and Shs-CHD8 were fixed with 1% formaldehyde and incubated for 10 min at room temperature with rotation. The cross-linking was quenched by adding 1.1 ml of 2.5 M glycine and incubation for 5 min at room temperature with rotation. The cells were pelleted Nucleic Acids Research, 2022, Vol. 50, No. 22 12811 at 1000 rpm, resuspended in ice-cold phosphate-buffered saline (PBS)/protease inhibitor (PI), spun for 5 min at 1000 rpm, washed with ice-cold PBS twice, harvested, pelleted and directly resuspended in 300 l of lysis buffer/PI [50 mM Tris-HCl (pH 8.1), 1% sodium dodecylsulfate (SDS), 10 mM EDTA], kept on ice for 10 min rotating occasionally and vortexed vigorously for 15 s every 3 min. Sonication of the 200-700 bp smear of the samples was accomplished using a Bioruptor sonicator (Diagenode), for a total of 45 min of sonication at full power and sonication cycles of 30 s on/30 s off. Samples were centrifuged at max speed for 10 min at 4 • C. Then, sheared chromatin was diluted 10- C overnight and treated with proteinase K. DNA isolation was performed using phenol:chloroform:isoamyl alcohol. DNA was precipitated with 200 mM NaCl, supplemented with 30 g of glycogen, washed with ethanol and then treated with RNase I (Invitrogen). Finally, DNA was purified with the MinElute Kit (Qiagen). Quantification of ChIP and INPUT DNA was accomplished using the Qubit 2.0 Fluorometer system (Invitrogen). ChIP-seq libraries were prepared starting from 5 ng of fragmented DNA using the NEBNext UltraII DNA Library preparation kit (Illumina) following the manufacturer's instructions with no modifications. In order to obtain enough material for sequencing, eight cycles of polymerase chain reaction (PCR) amplification were performed on adaptorligated fragments.

Analysis of histone marks by ChIP-seq
ChIP-seq reads were aligned on the human genome reference assembly GRCh38 using BWA (version 0.7.15) (43). Aligned reads were filtered to discard unmapped, multiply mapped, PCR duplicate reads (Picard tools MarkDuplicates version: 2.3.0, Picard Toolkit. 2019. Broad Institute, GitHub Repository http://broadinstitute.github.io/ picard/) along with low quality alignments (samtools view -q 1, samtools version 1.71.7) (44). Peak calling was performed with MACS2 (version 2.1.0) using a minimum FDR (false discovery rate; Benjamini-Hochberg adjusted P-value) threshold of 0.00001. The same settings with the addition of the -broad option were used for H3K27me3 and H3K36me3 marks (45). Peaks localized to blacklisted regions (http://mitra.stanford.edu/kundaje/akundaje/release/ blacklists/) or unplaced contigs were filtered out. Narrow peaks (for H3K27ac, H3K4me1, H3K4me2 and H3K4me3 histone marks) closer than 350 bp were merged into a single peak [BEDTools merge (version 2.25.0)]. Peaks were considered common between replicates if they overlapped by at least 50% of the length of the shortest peak [BEDTools intersect (version 2.25.0)], then extended coordinates were maintained and used in downstream analyses. CHD8 ChIPseq reads from (11) were realigned [BWA (version 0.7.15)] to the reference GRCh38, and peaks were called with the same procedure as used for the narrow histone marks, with a default FDR threshold of 0.05 as only peaks identified by all three antibodies were retained. GENCODE v.26 was used for peak annotation (46). Genes losing H3K4me1, H3K27ac and H3K36me3 peaks in CHD8 knockdown were tested for enrichment of Gene Ontology (GO) terms with the enricher function from clusterProfiler [version 3.10.1 (47)] with Benjamini-Hochberg adjusted P-value cut-off of 0.05. The full list of slimGO terms used was downloaded from Ensembl BioMart on September 9, 2019. Enrichment of the same gene list was assessed on custom gene sets by using one-tailed Fisher's exact test. Custom gene sets were derived from public databases and publications of interest (Supplementary Table S1). Only nonredundant gene lists enriched at P-value <0.01 are shown in Figure 1G; complete enrichment results are available in Supplementary Table S1. Combining the histone mark enrichment patterns over the genome, 10 chromatin states were identified for control (Sh-GFP and Sh-GFP2) using ChromHMM (version 1.14) (48). The states were manually annotated according to the literature. The number of peaks called in these regions in both replicates for each histone mark was counted using BEDTools intersect, version 2.25.0 (49), and the difference between control and CHD8 knockdown was calculated. The total number of peaks for each histone mark as a percentage was plotted as a heatmap in control cells ( Figure 1B, left); the difference ( Figure 1B, right) refers to the percentage of control peaks-CHD8 KD peaks. The number of peaks per mark called in each chromatin state was tested with a two-sided t-test (python 3.6 scipy.stats.ttest ind) considering two replicates in both control and CHD8 knockdown. Differential enrichment analysis for H3K36me3 was performed for three CHD8 knockdown samples (Sh1-CHD8, Sh2-CHD8 and Sh4-CHD8) and two controls (Sh-GFP and Sh-GFP2) with DiffBind [version 2.14, (50,51)] and DESeq2 for the differential analysis, using peaks previously called by MACS2, present in at least two samples. Peaks with FDR <0.05 were considered differentially enriched. The volcano plot in Figure 1 was plotted with ggplot2 [version 3.3.2 (52)].
To generate the metagene profiles with deepTools, version 3.2.1 (53), ChIP-seq samples were normalized to INPUT with the SES method (54). Enrichment was calculated in 10 bp bins over the gene body, scaled to 5 kbp and 2 kbp upand downstream of the gene. In order to minimize noise in the metagene profile plots, a stringent set of filters was applied to protein-coding genes before plotting: a minimum length of 2 kbp, a minimum distance of 4 kbp from other genes and absence of other features on the opposite strand, leading to a set of 9442 protein-coding genes. For each histone mark, only genes with enrichment were plotted (at least one non-zero bin). These genes were divided into three groups based on the CHD8 binding enrichment pattern via k-means with deepTools plotProfile command. To complement the statistical hypothesis testing, paired Cohen's d effect size statistics were calculated between groups along the entire region (55). The 99% simultaneous confidence intervals were constructed controlling FWER (Bonferroni correction). CHD8-binding site profiles were calculated with deepTools computeMatrix reference point and plotted with deepTools plotProfile. Visualization of enrichment tracks for chromatin was performed with the Integrative Genomic Viewer [IGV, version 2.4.9 (56)].

RNA-seq and AS data analysis
Raw reads obtained from Sugathan et al. (11) for corresponding samples (Sh-GFP, Sh-GFP2, Sh1-CHD8, Sh2-CHD8 and Sh4-CHD8) were used to calculate transcript abundance by kallisto (version 0.44.00) (57) on GEN-CODE v.26 transcripts. Per-transcript RSEM-normalized read counts from RNA-seq of sh-hnRNPL and control samples in the HepG2 and K562 human cell lines were obtained from ENCODE (IDs: ENCSR155BMF and ENCSR563YIS). Transcripts per million (TPM) were used to plot expression levels. RNA-seq libraries from hn-RNPL small interfering RNA (siRNA)-treated samples (si-hnRNPL-C) and si-scrambled control (si-SCR) were obtained from total RNA isolated using TRIZOL (Invitrogen) and treated with DNase (AMBION). RNA quality was assessed by Agilent bio-analyzer (RNA integrity number from 7.8 to 9.2). A stranded Illumina library was prepared according to the Illumina Stranded mRNA Prep manufacturer's protocol starting from 100 ng of purified total RNA. Samples was barcoded using the IDT for Illumina RNA UD Indexes, Ligation kit (Illumina). The library quality was checked by the DNA 1000 Kit and the 2100 Bioanalyzer instrument (Agilent Technologies). The 100 bp pairedend sequencing was performed by the Novaseq 6000 System (Illumina) at the Genomics Facility at the Italian Institute of Technology (IIT, Genova, Italy), obtaining ∼50-60 million reads per samples. SUPPA (version 2.3) was used to calculate the percentage spliced-in (PSI) value per splicing event with an empirical method (58). Significant splicing events were selected for a P-value <0.05 and PSI >0.2. The same raw reads from Sugathan et al. (11) were aligned with STAR (version 2.6) (59) with default parameters on the GRCh38 reference, to analyze the AS with rMATS [version 3.1.0 (60)]. As the two AS analysis methods are complementary (https://github.com/comprna/SUPPA/issues/47), both were included in the analysis. To check the overlap of splicing events with chromatin marks, the spliced in/out exon coordinates were intersected with the coordinates of the histone mark peaks. AS events were represented in sashimi plots via ggsashimi (version 0.4.0) (61) using GENCODE annotation v.33 as a reference (46).

RBP motif pattern matching
Matches to known motifs for human RBPs, derived from the CISBP-RNA database (64), were obtained with the Biopython package v1.78 (65), using 95% similarity and 0.001 false-positive rate as thresholds. Sequences on which the match was performed were obtained by considering the 100 nt upstream and downstream of exons involved in a splicing event, extracted via the BioMart biomaRt R package (66).

ChIP qPCR
Candidate genes for ChIP quantitative PCR (qPCR) experimental validation were chosen on genomic regions displaying both a significant H3K36me3 enrichment loss and at least one significant differential splicing event (resulting from SUPPA v2.3 RNA-seq analysis, 0.30 PSI threshold). Genes displaying the highest signal variation in both ChIP-seq and RNA-seq were prioritized. FASTA sequences of genomic regions displaying significant H3K36me3 peak loss were retrieved and used as a template for ChIP qPCR primer design with the NCBI Primer-BLAST tool (https://www.ncbi.nlm.nih.gov/tools/ primer-blast/), and an in silico specificity screen was performed with NCBI BLAST (https://blast.ncbi.nlm.nih.gov/ Blast.cgi) and the UCSC In-Silico PCR tool (https://www. genome.ucsc.edu/cgi-bin/hgPcr). Amplicon size was tested by an electrophoretic gel run, and sequence specificity was verified by Sanger sequencing (Eurofins). qPCR analysis was performed using 200 pg of ChIP DNA and an equal amount of unenriched INPUT DNA, reaction volume 10 l with iTaq™ Universal SYBR® Green Supermix (Biorad) using the recommended thermocycling parameters on Thermal Cycler C1000 CFX384 or CFX96 (Biorad). qPCR analysis was carried out with CFX Manager v3.1 Software. At least two technical replicates of qPCR were analyzed, with intra-assay variation <0.5 Cq. Two gene desert regions hGD12 (chr12:61273372-61273435) and hGD4 (chr4:187946873-187946954) were used as negative controls. Fold enrichment over INPUT was calculated using the 2 -Cq method. Depending on the condition, two or three biological replicates were analyzed for each cell clone. Error bars in the graphs represent the standard error of the mean (SEM).

RNA extraction, reverse transcription and semi-quantitative PCR
Candidate genes for differential splicing experimental validation were prioritized among ChIP qPCR targets, as described above. For exon skipped events (SEs), FASTA sequences of invariant upstream and downstream exons (relative to the exon involved in the differential event) were retrieved from GENCODE annotation v.33, employing IGV (version 2.4.9), and used as a template for forward and reverse primer design, respectively, in order to amplify 'spliced-in' and 'spliced-out' transcript isoforms. Total RNA was extracted from iPSC-derived NPC lines using TRIZOL reagent (Invitrogen) following the manufacturer's instruction. DNA contamination was removed by treating the samples with DNase I (Invitrogen) and RNase inhibitor SUPERase (Invitrogen) for 30 min at 37 • C followed by purification with the RNase Mini Kit (Qiagen). RNA quality was evaluated by an agarose gel electrophoretic run and Agilent 2100 Bioanalyzer. A 1 g aliquot of RNA was retro transcribed using the SensiFAST cDNA Synthesis Kit (Bioline). The resulting cDNA was used to perform semiquantitative PCR by employing the Phusion Green Hot Start II High-Fidelity PCR Master Mix (Thermo Scientific). TATA-binding brotein (TBP) was used as reference gene. PCR products were separated by an electrophoretic run on a 2% agarose gel. The PSI of target transcripts was evaluated by densitometric analysis using ImageJ software (version 1.46r) and relativized to the Sh-GFP line as the reference sample. Data are presented as the mean ± SEM of calculated PSI value per splicing event. One-tailed t-test was performed using Excel; the significance level was reported as not significant (NS) P >0.05, *P ≤0.05, **P ≤0.01, ***P ≤0.001, ****P ≤0.0001.

Statistical analysis
The analysis of target proteins differences in western blotting experiments was evaluated by performing an unpaired, one-tailed t-test. In all t-tests, the significance level was set to 0.05. Data were represented as mean ± standard deviation (SD). The significance level was reported as: NS P >0.05, *P ≤0.05, **P ≤0.01, ***P ≤0.001.

Immunoprecipitation and western blot analysis
For each immunoprecipitation, 1 mg of total protein in the nuclear fraction was incubated overnight rotating at 4 • C with 1 g of the primary antibody specific for the target protein. For CHD8 immunoprecipitation, NB100-60417 and NB100-60418 (Novus Biotechnology), for hnRNPL D-5 (sc-48391, Santacruz) and for SETD2 (38633, SAB) were used, while rabbit IgG isotype control (10500C, Life) or mouse IgG (10400C, Invitrogen) were used as controls. Protein A-Sepharose (CL-4B euroclone) or Protein G-Sepharose (GE-Healthcare) beads were pre-cleared by incubating them with the NF for 45 min rotating at 4 • C. Beads were then removed and the cleared NF (CNF) was treated for 15 min at 37 • C with 100 g/ml RNase A (Invitrogen) or 2 U/ml DNase I (Invitrogen), or otherwise used directly for the overnight incubation with the primary antibody (NF with antibody, NFA). On the other hand, CNF was incubated overnight with primary antibody in the absence (not treated condition NT) or presence of 50 g/ml ethidium bromide (EtBr; Sigma-Aldrich). All the different NFAs were finally incubated with beads for 45 min, rotating at 4 • C. Bound complexes were then washed three times with NLB buffer. For western blot analysis, dried Sepharose bead complexes were eluted with SDS-loading buffer LDS Sample Buffer (Invitrogen) containing 5% Bolt sample reducing agent (Life Technologies) at 92 • C for 10 min. Samples were loaded on 3-8% Tris-acetate protein gels (Thermo Fisher) with Protein Marker (Euroclone) and separated by electrophoresis. After electrophoresis, the proteins were transferred to nitrocellulose membranes (GE-Healthcare). Blotted membranes were incubated overnight with primary antibodies [CHD8 NB100-60417 (Novus Biotechnology), hnRNPL D-5 (sc-48391, Santacruz) and SETD2 (38633, SAB)], then washed three times with PBS with 0.1% Tween and incubated for 1 h at 4 • C with the specific secondary antibody.

Sample preparation and mass spectrometry
The co-immunoprecipitated samples were loaded on 10% SDS-polyacrylamide gel electrophoresis (PAGE) gel and run for ∼1 cm. Gels were then stained with Coomassie and the entire stained area was excised as one sample. Excised gel bands were cut into small pieces (∼1 mm 3 ) and subjected to reduction and alkylation with 10 mM DTT and 55 mM iodoacetamide, respectively. Gel pieces were then washed in water, dehydrated with acetonitrile (ACN) and dried in a speed-vac. Gel plugs were re-hydrated with 50 mM NH 4 CO 3 solution containing 12.5 ng/ml trypsin (Promega) on ice for 30 min. The digestion continued at 37 • C overnight. The supernatant was collected, and the peptides were sequentially extracted from the gels with 30% ACN with 3% trifluoroacetic acid (TFA) and 100% ACN. All the supernatants were combined and dried in a Speed-Vac. The tryptic peptides were then acidified with 1% TFA to a pH <2.5, desalted on C18 stage tips and resuspended in 20 l of 0.1% formic acid buffer for liquid chromatographytandem MS (LC-MS/MS analysis). Peptides were separated on an Easy-nLC 1200 high-performance liquid chromatography (HPLC; Thermo Fisher) system by 85 min gradients with a 400 nl/min flow rate on a 25 cm column with an inner diameter of 75 m packed in-house with ReproSil-Pur C18-AQ material (3 m particle size, Dr Maisch, GmbH). The gradient was set as follows: from 5% to 25% over 52 min, from 25% to 40% over 8 min and from 40% to 98% over 10 min at a flow rate of 400 nl/min. Buffers were 0.1% formic acid in water (A) and 0.1% formic acid in ACN (B). Peptides were analyzed in an Orbitrap Fusion Tribrid mass spectrometer (Thermo Fisher) in datadependent mode, with a full-scan in the Orbitrap performed at 120 000 FWHM resolving power (at 200 m/z), followed by a set of (higher energy collision dissociation) MS/MS scans over a 3 s cycle time. The full scans were performed with a mass range of 350-1100 m/z, a target value of 1 × 10 6 ions and a maximum injection time of 50 ms. The MS/MS scans were performed at a collision energy of 30%, 150 ms of maximum injection time (ion trap) and a target of 5 × 10 3 ions.

Mass spectrometry data analysis and processing
Raw files were searched using Proteome Discoverer software v.2.2.0 (Thermo Fisher Scientific). Peptide searches were performed against the in silico digested UniProt Human database (downloaded July 2019) and a database containing common contaminants. Trypsin/P was chosen as the enzyme with five missed cleavages. Static modification of carbamidomethyl (C) with variable modification of oxidation (M) and acetylation (protein N-term) were incorporated in the search. The MASCOT search engine [v.2.6.2 (MatrixScience)] was used to identify proteins, using a precursor mass tolerance of 10 ppm and a product mass tolerance of 0.6 Da. The FDR was set to 1% at both the peptide and protein level. The results were filtered to exclude potential contaminants. Peak intensities of peptides were normalized on the average of the specific protein abundance within each sample (68). Log2-normalized intensities of single peptides were averaged through replicates and by subtracting the averaged values of IgG to IP samples; the fold change (FC) was calculated for each peptide. Then, the FC of each protein was calculated by averaging the FC of all peptides assigned to each protein. CHD8-binding partners were detected by comparing fold enrichment of the IP samples versus IgG control (IP/IgG). Statistical significance was assessed using Student's t-test (two-tailed, two-sample unequal variance, Excel). In the case of a single replicate, the enrichment of each peptide was calculated by subtracting the log2-normalized intensities of the IgG to the IP sample. Statistical significance was calculated as previously described (Student's t-test).

Targeted siRNA knockdown for hnRNPL
A total of 3 ×10 6 hiNPCs were electroporated with 200 pM siRNA oligos using the Lonza Nucleofector® 2b device and a custom-made electroporation buffer (5 mM KCl, 15 mM MgCl 2 , 10 mM glucose and 120 mM K 2 HPO 4 /KH 2 PO 4 pH 7.2). Knockdown efficiency was determined after 48-72 h from electroporation. The siRNA oligos targeting the hnRNPL gene (three unique 27-mer and one universal scrambled negative control) were purchased from OriGene (SR302174). Proteins for western blotting were extracted using RIPA buffer (ThermoScientific), and sonicated using Q700 (Qsonica) prior to centrifugation, while total RNA was obtained by using TRIZOL (Invitrogen).

CHD8 suppression significantly affects transcriptional elongation chromatin states
To assess the functional consequences of CHD8 suppression on chromatin organization, we resorted to a previously characterized control iPSC-derived NPC line, GM8330-8, and its derivatives where ∼50% reduction in CHD8 was obtained by lentiviral-mediated delivery of shRNAs (11). In these model systems, we analyzed six different histone modifications using ChIP-seq, specifically interrogating transcriptionally active (H3K4me2, H3K4me3, H3K27ac and H3K36me3) and repressed regions (H3K27me3) as well as active/poised enhancers (H3K4me1 and H3K27ac). For each of the six histone marks, three independent shRNAs targeting the coding sequence of CHD8 (Sh1, Sh2 and Sh4) and two technical replicate controls against the GFP sequence were used ( Figure 1A). Sh1-CHD8, Sh2-CHD8 and Sh4-CHD8 presented nearly comparable levels of CHD8 at ∼50% of their physiological levels, thus precisely mimicking the human haploinsufficiency condition (see Supplementary Figure S1A and B for transcript and protein level) (11). Importantly, for CHD8-knockdown models as well as for GFP controls, genome-wide transcriptomic data and CHD8-binding sites were available ( Figure 1A) (11).
ChIP-seq experiments were conducted to obtain on average 40 million reads for narrow marks (H3K4me3/me2/me1 and H3K27ac) and 60 million for broad histone marks (H3K36me3/K27me3) and INPUT samples. After mapping and filtering, an average of 42 308 peaks per sample were identified (Supplementary Figure  S1C, D), and showed the expected enrichment pattern and metagene profiles at the transcriptional start site (TSS: H3K4me3/me2/me1 and H3K27ac), the gene body (H3K36me3) of actively transcribed genes or on larger genomic regions spanning transcriptionally silent gene units (H3K27me3), and the expected enrichment correlation with ENCODE public datasets (Supplementary Figure  S1E, F) (69,70).
Upon CHD8 suppression, H3K4me1, H3K27ac and H3K36me3 presented a substantial decrease in the number of peaks (37.44, 38.45 and 47.82%, respectively) compared with the control (intersection of Sh-GFP and Sh-GFP2) (Supplementary Figure S2A). However, some level of replicate heterogeneity was observed, especially for H3K4me1 and H3K27ac, while H3K36me3 presented a more consistent and considerable reduction (Supplementary Figures  S1D-F and S2A). Importantly, the third biological replicate Sh1-CHD8, presenting less efficient CHD8 transcription suppression (Supplementary Figure S1A) (11), but stronger CHD8 protein reduction (Supplementary Figure S1B), confirmed impaired H3K36me3 enrichment (Supplementary Figure S2A). In addition, Sh2-CHD8 and Sh-GFP independent ChIP-seq datasets (generated in a different laboratory from the previous set) again sustained the conclusion that CHD8 suppression was associated with a decreased H3K36me3 enrichment (Supplementary Figure S2B).
To further dissect these results, we compared the number of peaks for each histone mark in controls and CHD8 knockdowns within each chromatin state. While differences between biological replicates did not support a statistically significant variation at enhancer and promoter states (Supplementary Figure S3A, B), the number of peaks decorated by histone H3K36me3 at genomic regions involved in transcriptional elongation was significantly lower in Sh2-Sh4 CHD8 (P-value <0.05, t-test) compared with controls ( Figure 1C). Within the genomic regions involved in transcriptional elongation, H3K36me3 reduction affected ∼50% of all expressed protein-coding genes in human NPCs (out of the protein-coding genes with >2 TPM, 5447 lost H3K36me3 while 6451 were not affected). Additionally, genes that lost H3K36me3 following CHD8 haploinsufficiency were significantly longer (90290 versus 60902 bp) and composed of more exons (7.36 versus 6.20) (not shown) compared with the unaffected genes. Independent quantitative analysis using DiffBind and western blot further supported a critical reduction of H3K36me3bound regions in the three Sh1-Sh2-Sh4 CHD8 knockdown clones compared with control Sh-GFP ( Figure 1D-F). On the contrary, metagene profiles, effect size and DiffBind analyses reported no substantial difference for H3K27ac and H3K4me1 marks (Supplementary Figure  S3C-H). A composite heatmap of all genes presenting reduced H3K36me3 following CHD8 suppression versus unaltered genomic location is shown in Supplementary Figure S4. Genes with reduced H3K36me3 following CHD8 suppression were strongly enriched for 'constrained' genes [intolerant to loss-of-function mutations, gnomAD (71)], 'FMRP targets in brain' (72), SFARI ASD genes (https: //gene.sfari.org/about-gene-scoring/), 'essential genes' [required for a cell's survival (73)] and the 'M3 co-expression module' (6), whose expression peaks early during nervous system development ( Figure 1G; Supplementary Table S1).

CHD8 suppression-dependent reduction in histone H3 Lys36 trimethylation impacts on CHD8-bound genes
By overlaying human CHD8-binding sites on the previously established chromatin states ( Figure 1B), we confirmed that CHD8 was confined to active promoters (90.11%) and, less prominently, to enhancers [strong, weak/poised a and b    Figure S5D). Upon CHD8 suppression, stringently defined (see the Materials and Methods) CHD8-bound genes appeared to be more sensitive than CHD8-unbound genes (988 and 4205, respectively), presenting a significantly reduced H3K36me3 enrichment profile as confirmed by the effect size analysis (Figure 2A, B; Supplementary Figure S6A). Notably, the reduction in histone H3K36me3 elicited by CHD8 suppression was specific since histone H3K4me3, another histone modification enriched at TSSs of highly expressed genes (Supplementary Figure S7A Figure S6B). Functional enrichment of genes from clusters #1 and #2, i.e. CHD8-bound and losing H3K36me3 enrichment upon CHD8 suppression, highlighted GO biological process terms related to 'mRNA pro-cessing' and 'RNA splicing' (Figure 2H; Supplementary Table S1). Functional enrichment of genes from cluster #3 is also presented (Supplementary Figure S8).

Reduction in histone H3 Lys36 trimethylation elicited by CHD8 suppression alters RNA AS
To gauge the functional significance of H3K36me3 reduction observed following CHD8 suppression, we then leveraged RNA-seq data from controls and CHD8-knockdown clones. As histone H3K36me3 seems to be correlated with high levels of RNA expression (Supplementary Figure S7A) (75), we reasoned that a decline in H3K36me3 levels would be associated with reduced RNA expression levels. Unexpectedly reduced levels of CHD8, and impaired H3K36me3 enrichment, did not correspond to a global difference in transcription in either CHD8-bound or CHD8-unbound genes (Supplementary Figure S9A-C).
Importantly, a significant proportion of genes losing H3K36me3 upon CHD8 suppression [462 genes of which 176 (38.1%) had CHD8-binding sites in promoter or enhancer regions] presented altered AS profiles, as evidenced by two analysis approaches ( Figure 3A, B; Supplementary Figure S9D-F) (74,76). We noted an almost equal distribution of the H3K36me3 peaks lost following CHD8 suppression between exonic and intronic regions (data not shown).
The vast majority of altered AS events were categorized as 'alternative first exon' [994 (51.19%)] or 'exon skipping' [283 (15.71%)] ( Figure 3C; Supplementary Figure S9F). For the ∼1000 genes for which differential splicing events were detected by SUPPA ( Figure 3A Figure 3D). Among genes characterized by a reduction in H3K36me3 with concomitant splicing alterations, over-representation of GO terms and pathways related to 'protein deacetylation', 'histone deacetylation' and 'protein deacylation' (Figure 3E, top; Supplementary Table S1) was observed, while those bound by CHD8 and presenting altered splicing patterns following CHD8 suppression were enriched for 'mRNA catabolism', 'translational initiation' and 'regulation of RNA splicing' (Figure 3E, bottom; Supplementary Table S1). The direct comparison of specific H3K36me3 lost peaks and exons presenting a differential splicing event revealed modest intersection (Supplementary Figure S9G). Nevertheless, at eight genomic coordinates, concomitant reduction in H3K36me3 and AS variation with PSI >0.30 could be observed. Specifically, experimental validation at the ITSN1 locus, the Intersectin 1 gene involved in endocytic membrane traffic and synaptic transmission (77,78), complex learning and memory formation (79) as well as previously implicated in schizophrenia (80), confirmed a clear reduction in H3K36me3 enrichment and a corresponding increase in the PSI ratio (spliced in/spliced in + spliced out) of the splicing variants in hN-PCs presenting knockdown levels of CHD8 ( Figure 3F-I).
The aberrant AS phenotype was also mirrored in previously published murine datasets (14,62). Specifically, Chd8 genetic ablation in P5 cortical regions ( Supplementary Figure S10A, C, D, F) and mouse NPCs (mNPCs; Supplementary Figure 10B, C, E, F) elicited AS pattern alteration that closely resembled (numerically and in the AS subtype distribution) the one described in iPS-derived NPCs. For P5 cortical samples, 950 (59.90%) 'alternative first exon' and 191 (12.04%) 'exon skipping' events were detected. Similar values. i.e. 840 (52.01%) 'alternative first exon' and 240 (14.86%) 'exon skipping', were observed in mNPCs. The distribution of events with positive or negative PSI also remained similar (51.01% positive and 48.99% negative PSI in the P5 cortical samples, and 49.10% positive and 50.90% negative PSI in mNPC samples) (Supplementary Figure S10A, B), thus supporting the hypothesis that CHD8/Chd8 suppression generally correlates with AS defects, regardless of the species considered. Interestingly, although the genes presenting splicing alterations within each neuronal model system were generally different (GO terms/pathways presented in Supplementary Figure S10G-I), nevertheless significant overlaps for some of the intersections tested were identified (Fisher's test, Supplementary Figure S10C-F), suggestive of a core group of genes, half of which were bound by CHD8 (i.e. ITSN1, CNOT2 and MDM2), involved in 'neuronal differentiation', 'cell cycle' and 'DNA repair', were strongly sensitive to splicing alterations.

hnRNPL as a novel CHD8 interactor: bridging altered splicing to H3K36me3 enrichment
To shed light on the observed AS phenotype and to unveil how CHD8 controls H3K36me3 enrichment at the genes' body, we characterized the CHD8 interactome by MS/MS analysis using two independent, immunoprecipitationvalidated, antibodies (#17-N-and #18-C-terminal, see the Materials and Methods) on the hNPC nuclear-enriched fraction ( Figure 4A Table S1). CHD8 #17 antibody yielded a higher number of immunoprecipitated and sequenced peptides (n = 99), compared with CHD8 #18 (n = 26). Combining the statistically significant results from three independent biological replicates and two antibodies (Ab #17 A, B, C; Ab #18 A, B, C), a list of 18 stringently defined protein interactors (Figure 4D, E) was obtained. These proteins were strongly enriched for terms related to 'RNA binding', 'mRNA processing' and 'mRNA splicing, via spliceosome' (Supplementary Figure S11D). Specifically, 5 out of the 18 stringently defined, new CHD8-interacting candidates belonged to the hnRNP RBP family already foreseen by motif analysis of the ± 100 bp sequences adjacent to all differentially spliced exons ( Figure 3D), actively regulating AS, mRNA stabilization and transcriptional and translational processes (85) (Figure 4C-E). Importantly, independent Chd8 MS analysis on mouse embryonic stem cells (mESCs) identified a similar pattern of interacting proteins, with GO terms and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways associated with 'spliceosome' and 'mRNA processing' redundantly enriched between the two experimental models (Supplementary Figure S11E, F). Among CHD8interacting proteins, hnRNPL, sequenced in both hNPCs and mESCs, was validated as a direct CHD8 interactor, not sensitive to DNase and EtBr treatments, in independent co-immunoprecipitation experiments ( Figure 4F, G; Supplementary Figure S11G). However, RNase A treatment partially impacted the binding between CHD8 and hnRNPL ( Figure 4G; Supplementary Figure S11H A, B) Venn diagrams represent the overlap between genes losing H3K36me3 peaks following CHD8 knockdown (losing H3K36me3) and genes presenting altered AS events as detected by SUPPA ing or functional stabilization of this interaction. Although not directly implicated in H3K36 methyltransferase activity, hnRNPL was previously described as SETD2 interactor and regulator (40). In fact, the previously reported binding between SETD2 and hnRNPL was also confirmed in our hNPC model (Figure 4H), suggesting a novel, indirect role for CHD8 in modulating SETD2-dependent H3K36me3 via hnRNPL, bridging AS changes with chromatin regulation. HnRNPL was shown to play a direct role in splicing regulation (86), binding to CA-rich RNA elements (87) and repressing cryptic exon inclusion (88). Thus, we asked whether hnRNPL suppression could correlate with dysfunctional splicing events mirroring CHD8 suppression. To gain a comprehensive view of hnRNPLdependent AS events in hiNPCs, we electroporated either control or hnRNPL-targeting siRNAs [single oligos (A, B or C)] to our cells. Initial experiments to test siRNA efficiency showed a high hnRNPL suppression using 200 pM si-C at 72 h post-electroporation (Supplementary Figure  S12A, B). Both hnRNPL RNA and protein showed a significant reduction by ∼60% compared with control (Figure 5A-C). Thus, we then applied paired-end sequencing on the polyadenylated transcriptome, performing four biological replicates for each treatment (see the Materials and Methods). RNA-seq analysis confirmed proper clustering of the samples (principal component analysis; Supplementary Figure S12C) and a strong, specific reduction of hnRNPL transcript with other hnRNPs unaltered by the treatment (Supplementary Figure S12D). AS analysis [the same tool (SUPPA) and the same parameters (Pvalue <0.05) previously employed] of RNA-seq data from hnRNPL knockdown in human hiNPCs revealed 473 AS events, affecting 287 unique genes, partly overlapping those described following CHD8 suppression. Specifically, 47.7% of genes affected by at least one aberrant AS event following hnRNPL suppression overlapped with those presenting altered AS in CHD8 knockdown ( Figure 5D). Notably, once again, AF, the most affected AS subtype by CHD8 suppression, was revealed to be the most recurrent aberrant AS event type after hnRNPL reduction ( Figure 5E), while pathways and GO terms associated with 'nuclear-transcribed mRNA' and 'RNA catabolic process' were recurrently enriched in the lists of genes affected by dysfunctional splicing in both CHD8 and hnRNPL suppression data ( Figure 5F). Finally, although with a reduced percentage of gene overlap (20%), AS analysis of data from hnRNPL knockdown in unrelated cellular model systems (HepG2 and K562 cells, ENCODE; see the Materials and Methods) still correlated with the splicing defects described following CHD8 suppression (Supplementary Figure S12E).

DISCUSSION
Disruption of CHD8 from de novo protein truncating and structural variants is well established as a highly penetrant risk factor for ASD (3,4,7,76). CHD8, initially described as interacting with ␤-catenin as a negative regulator of WNT signaling (83,(89)(90)(91), has important roles during nervous system development (14,83). CHD8 directly binds DNA in ∼7000 genomic locations at H3K27ac-and H3K4me3enriched regions of highly transcriptionally active promoters and enhancers (11,16). However, with both direct and indirect transcriptional effects observed following CHD8 suppression, its molecular mode of action in ASD remains unclear (11). CHD8 recruits histone KMT2/MLL methyltransferase complexes (18,92), while a cross-talk with the core PRC2 methyltransferase, Ezh2 (12), cannot be excluded. However, as CHD8 associates with elongating RNAPII, its role in transcriptional elongation needs to be taken into account, especially for highly expressed genes that are densely decorated by histone H3K36me3 (84).
In our ASD-relevant, human neuronal progenitor model system, we observed that CHD8 suppression is prominently associated with a depletion, rather than a gain, in H3K36me3 histone modification. While an effect at enhancers and promoters could not be completely ruled out, stringent statistical criteria to focus on the strongest and most reliable epigenetic changes supported a drastic depletion of H3K36me3, a phenotype validated by independent quantitative approaches. Genes displaying reduction in this histone mark are enriched for 'constrained' genes [intolerant to loss-of-function mutations, gnomAD (71)], 'FMRP targets in brain' (72), SFARI ASD genes (https: //gene.sfari.org/about-gene-scoring/), 'essential genes' (73) and genes whose expression peaks early during nervous system development (M2, M3, post-conception weeks 10-12)      Figure 5. siRNA-mediated hnRNPL reduction causes AS changes that partly mirror those elicited by CHD8 suppression. (A) The bar graph reports the normalized expression levels (2 − CT ) of hnRNPL transcript following Si-C administration for 72 h in hiNPCs. Four different biological replicates per experimental condition are reported. Means ± SE are shown. One-tail t-test was performed, with ***P ≤0.001. (B) Representative western blot images illustrate total hnRNPL levels, comparing hiNPC exposed to siRNA against hnRNPL (Si-C) and control (Si-Scr). Two representative biological replicates per condition are presented. Comparable amounts of total protein were loaded as detected by the HSP90 loading control. (C) The bars in the chart represent (6). Thus, CHD8 could facilitate the methyltransferase activity (SETD2/SETD5) leading to H3K36 trimethylation or act as an inhibitor of KDM2B (93). Modulation of H3K36 methylation relates to cell cycle regulation during neuronal development and differentiation. In fact, mutations in SETD2 have been described in Sotos syndrome, a childhood overgrowth condition with macrocephaly (94), and in an ASD proband, also presenting macrocephaly (5,95), while disruptive mutations in SETD5--a newly described H3K36me3 methyltransferase (96)--are associated with ID/ASD (1,76-79) (97) and 3p25.3 microdeletion syndrome (98). Thus, it is possible that CHD8 suppression, through its ability to modulate H3K36me3 levels, might lead to aberrant development and proliferation as observed in animal models and in CHD8-autistic subjects (9,14). How can CHD8 modulate H3K36me3? From RNA-seq data, CHD8 suppression does not correlate with a direct reduction in SETD2/SETD5 levels or the up-regulation in H3K36me3 KDM2B methyltransferase (not shown). However, genes bound at their promoters/enhancers by CHD8 specifically present a significant depletion in H3K36me3, thus suggesting a possible direct interplay between the chromodomain and the H3K36 trimethylases. Indeed, SETD2 and CHD8 display similar temporal expression patterns during human brain development (data from the BrainSpan Atlas) (9).
Trimethylation of H3K36 demarcates body regions of actively transcribed genes, providing signals for modulating transcription fidelity, mRNA splicing and DNA damage repair (75). Aberrant H3K36me3 reduction in our data is not directly causative of transcriptional differences, and this is coherent with previous findings (99). Rather, reduced H3K36me3 correlates with altered AS. Previous association of H3K36me3 with AS has been reported (100,101). Splicing differences detected in this study as a consequence of CHD8 suppression are mirrored in different datasets, even in murine models of ASD [P5 cortices and mNPC (14,62)]. Splicing alterations are, in all cases, identified during nervous system development or in neuronal committed progenitors (14,62), with skipped event and alternative 5 , SE-AF, event types among the most sensitive, thus supporting previous finding of a neuronal splicing defect consequent to Chd8 haploinsufficiency (36). However, while a significant proportion of genes with reduced H3K36me3 enrichment present altered splicing patterns, a significant amount of other aberrantly spliced genes do not present reduced H3K36me3, but binding sites for CHD8. This observation, while supporting a functional link between H3K36me3, CHD8 and splicing regulation, also suggests a H3K36me3-indirect role for CHD8 in AS regulation (Figure 5G). Moreover, motif analysis of RNA sequences adjacent to the aberrant splicing events predicts a relevant contribution by SRSF, hnRNP and ELAV RBP families, well-established regulators of AS, thus suggesting a mode through which CHD8 could operate. Interestingly, genes correlated with 'RNA splicing' and 'mRNA processing' are among those bound by CHD8 and displaying aberrant AS. This, presently overlooked molecular mechanism warrants further investigations especially in neurodevelopmental syndromes and ASD in particular.
In order to fill this gap, we resorted to MS/MS analysis in hNPCs. Next to a redundant pool of previously identified CHD8 interactors (81)(82)(83)(84), our analysis revealed a significant group of nucleoproteins directly implicated in 'RNA binding', 'RNA processing' and 'spliceosome' regulation. Specifically, many hnRNPs (85), already foreseen by the RNA motif analysis of the sequences adjacent to aberrant spliced events, were independently identified with two N-terminal and C-terminal CHD8 antibodies, as well as in mESCs ( Figure 4D, E). Intriguingly, recent human genetics studies identified hnRNPs as candidate genes in neurodevelopmental conditions, strongly implicating disruption of this gene's family in altered brain development (102). Among the others, hnRNPL, preferentially bound to CA-rich elements (103), and a regulator of inducible exon skipping (104), was prioritized for validation. CHD8 robustly interacts with hnRNPL possibly through the N-terminal domain as judged by the more efficient pull-down obtained using the CHD8 C-terminal antibody, with no effect of DNase and EtBr treatments ( Figure 4F, G). However, because of the impact of RNase A treatment on this binding, a stabilizing contribution by RNA might be proposed for CHD8/hnRNPL association ( Figure 5G). Indeed, hnRNPL was previously reported to bind to chromatin--and enhancers in particular--in an RNA-dependent manner (105). Thus, CHD8/hnRNPL association, possibly also recruiting the Mediator complex normalized hnRNPL protein levels comparing hiNPCs exposed to siRNA against hnRNPL (Si-C) and control (Si-Scr). HnRNPL bands were normalized on HSP90 loading control. Mean values ± SE from four independent biological replicates are plotted. T-test for two mean populations was performed, *P ≤0.05. (D) Venn diagrams represent the overlap between genes presenting aberrant splicing following CHD8 suppression and genes presenting altered AS events after siRNA-mediated reduction of hnRNPL (72 h post-electroporation). The AS events are detected by SUPPA. The number of genes for each condition is indicated. The enrichment significance for the intersection is computed by Fisher's exact test and represented by colors. Color-coded key: -log10(P-value). The P-value and odds ratio are reported. (E) The stacked bar plot represents the 473 differential AS events detected by SUPPA in hiNPCs subjected to siRNA against hnRNPL distributed by event type. Event types: SE, skipped event; RI, retained intron; MX, mixed event; A3, alternative 3 ; A5, alternative 5 ; AF, alternative first exon; AL, alternative last exon. (F) The bar plot represents GO biological process and KEGG pathway terms significantly enriched in genes presenting altered AS events as detected by SUPPA following CHD8 and hnRNPL suppression (intersection in D). The bars are ordered according to adjusted P-values in -log10 scale; the x-axis represents the number of genes enriched for each term. (G) CHD8 functions at transcription initiation, elongation and regulation of AS. Schematic representation of the molecular mechanisms proposed to explain CHD8's roles at promoters/enhancers and in the modulation of AS (figure created in BioRender.com). CHD8 interacts with hnRNPL, possibly through stabilizing RNA bridges (red hairpins). Thus, CHD8/hnRNPL association, possibly also recruiting the Mediator complex (106), might stabilize enhancer/promoter looping and transcriptional initiation. However, hnRNPL also solidly interacts with SETD2 at elongating RNAPII, thus implicating CHD8 in the regulation of RNA processing and AS. By comparing CHD8 and SETD2 MS experiments [our data and (39)], a number of other hnRNP interactors emerge, providing a functional link between SETD2, hnRNPs and CHD8 with elongating RNAPII, with functional consequences for the regulation of the splicing machinery. Differential alternative splicing events can be modulated by the CHD8/hnRNPL/SETD2 complex. supporting transcription initiation. Indeed, hnRNPL was shown to associate with Med23 (107), while other members of the CHD family (CHD1) were previously reported to interact with MED1/23/6 and coordinate pre-initiation complex (PIC) assembly ( Figure 5G).
HnRNPL was also previously reported as a subunit of the human KMT3a/Set2 complex, assisting H3K36 trimethylation in vivo (40). This interaction occurs through the SETD2-hnRNP interaction domain, deletion of which leads to reduced H3K36me3 deposition (39). Indeed, hn-RNPL also robustly interacts with SETD2 in our hNPCs ( Figure 4H). Importantly, siRNA-mediated hnRNPL transient reduction in our ASD-relevant model system partially mirrored (47.7%) the dysfunctional AS observed following CHD8 suppression. These observations reinforce the hypothesis that CHD8-driven H3K36me3-dependent AS regulation is, at least in part, ascribable to hnRNPL-CHD8 association. However, by comparing CHD8 and SETD2 MS experiments [our data and (39)], a number of other hnRNP interactors emerge, thus providing additional functional links between SETD2, hnRNPs and CHD8 with elongating RNAPII to be explored and characterized.
In conclusion, thanks to the identification of hnRNPL--but also other hnRNPs--as new CHD8 interactors, here we uncover a new function for CHD8--and possibly extendable to other members of the CHD family--at the cross-talk between chromatin, transcription and splicing machinery regulation. Thus, dissecting the consequences of elongation-coupled H3K36 methylation dysfunction for transcription fidelity, RNA splicing and DNA damage repair (75) will represent the next challenge for understanding chromatin-linked alterations in neurodevelopmental disorders and ASD in particular.

DATA AVAILABILITY
The authors submitted all datasets to the GEO under the accession number GSE148057. The MS proteomics data have been deposited in the ProteomeXchange Consortium via the PRIDE (108) partner repository with the dataset identifier PXD025739.