SPACE exploration of chromatin proteome to reveal associated RNA-binding proteins

Chromatin is composed of many proteins that mediate intermolecular transactions with the genome. Comprehensive knowledge of these components and their interactions is necessary for insights into gene regulation and other activities; however, reliable identification of chromatin-associated proteins remains technically challenging. Here, we present SPACE (Silica Particle Assisted Chromatin Enrichment), a stringent and straightforward chromatin-purification method that helps identify direct DNA-binders separately from chromatin-associated proteins. We demonstrate SPACE’s unique strengths in three experimental setups: the sensitivity to detect novel chromatin-associated proteins, the quantitative nature to measure dynamic protein use across distinct cellular conditions, and the ability to handle 10-25 times less starting material than competing methods. In doing so, we reveal an unforeseen scale of association between over 500 nuclear RNA-binding proteins (RBPs) with chromatin

enabling stringent washing with denaturing reagents. DNA labelling after a pulse was successfully used to study nascent chromatin and DNA replication; however, prolonged treatment with these modified nucleotides is toxic to cells, potentially distorting measurements. Sensitive cells such as mouse embryonic stem cells (mESC) are particularly affected 8 . Finally, none of the previous purification methods is able to distinguish between direct DNA-binders and chromatin-associated proteins. Therefore, a stringent and straightforward protocol to purify chromatin without manipulating cells is potentially of great use.
Here, we present SPACE (Silica Particle Assisted Chromatin Enrichment), a reproducible and specific method that relies on silica magnetic beads for parallel processing of large numbers of samples at low cost. It is fast to perform and it is readily combined with multiple protocols including ChIP, SICAP, mass spectrometry and sequencing. To demonstrate the power of the method, we evaluated SPACE in three different experimental settings. First, we studied the global chromatin composition of mESCs: we successfully identified the expected DNA- and chromatin-binding proteins, as well as over 500 RBPs that were not known to be associated with chromatin. One of these, Dazl, which is best known for targeting the 3' untranslated regions (3' UTRs) of mRNAs, binds primarily to the transcription start sites (TSSs) of developmental genes. Second, to demonstrate its versatility and quantitative nature, we combined SPACE with ChIP (chromatin immunoprecipitation) to study the dynamic changes in chromatin composition at Nanog-bound enhancers upon transition of mESCs from serum to 2iL medium. Finally, we applied SPACE to human induced pluripotent stem cell (hiPSC)-derived neural precursors to study changes in the chromatin proteome of hiPSC lines containing ALS-causing point mutations in the valosin-containing protein (VCP), a chromatin-associated RBP 9,10 . Our approach uncovered mutation-specific deregulation in chromatin composition during neuronal specification, thus shedding light on the pathogenic mechanism linked to these mutations.

SPACE reveals many RNA-binding proteins that bind chromatin and DNA
Silica matrices (columns or beads) are widely used to purify DNA from diverse samples; however, they have not yet been applied to chromatin purification. We reasoned that regions of DNA are likely to remain accessible even after formaldehyde crosslinking of chromatin. SPACE therefore exploits the capacity of silica magnetic beads to purify formaldehyde-crosslinked chromatin in the presence of chaotropic salts, followed by chromatin digestion on the beads (Fig. 1, steps 1-4a). Unlike other methods, in SPACE chromatin is not precipitated and cells are not fed with modified nucleotides. Non-crosslinked negative controls are prepared in a similar way to routine DNA purification, which is normally free of contaminating proteins.
By applying SILAC-labelling to the crosslinked (heavy SILAC) and non-crosslinked (light SILAC) samples before adding silica magnetic beads, we are able to determine whether a protein is isolated due to crosslinking or due to non-specific association with the beads and other proteins. SPACE is fast to perform (~1 h from cell lysis to the start of protein digestion), yet stringent: denaturing reagents include 4M guanidinium isothiocyanate, 2% Sarkosyl, 80% ethanol and 100% acetonitrile, which efficiently remove contaminants. Chromatin fragments are also treated extensively with RNase to remove RNA contamination. Finally, the method is readily extended to identify direct DNA-binding proteins by a two-step protein digestion strategy (Fig. 1, steps 4b-6).
We first applied SPACE-MS to mESCs cultured in 2iL (2i+LIF) and we identified 1,965 significantly enriched proteins compared with the negative control (Fig. 2A, Table 1A). We  Fig. 2A-B). We identified 1,618 RBPs in our full proteome dataset. Of these 1,618 RBPs, 827 are localized to the nucleus, and 522 of the nuclear RBPs (63%) were also detected by SPACE. This demonstrates that RBPs and splicing factors are the most common previously unannotated chromatin-associated proteins.
Next, to increase the stringency of purification and to distinguish direct DNA binders, we combined SPACE with SICAP (Selective Isolation of Chromatin-Associated Proteins 11 ).
Here, DNA is biotinylated using a terminal deoxynucleotidyl transferase and captured by protease-resistant streptavidin magnetic beads (Fig. 1, steps 4b-6; SPACE-SICAP). In addition to the double stringent purification, SPACE-SICAP also allows identification of some of the direct DNA-binders using the same sample preparation. For the latter aim, we use streptavidin beads 12 with a two-step protein digestion strategy 13 . Following SPACE-SICAP purification, LysC is applied to break down proteins, leaving only small fragments of direct DNA-binding proteins crosslinked with DNA; we then reverse the crosslinks by heating, to identify the DNA-bound peptides. Thus, SPACE-SICAP yields two fractions: the supernatant of the LysC digestion, containing indirect DNA-binding proteins, and the DNA-bound fraction (referred to as SPACE-SICAP I and II, respectively). Both fractions are further digested by trypsin to improve protein identification (Supplementary Fig. 1).
We identified 1,567 enriched proteins in SPACE-SICAP I (~25% less than SPACE alone; Fig. 2C and Table 1C), and the identified proteins are distributed similarly across the four categories (Fig. 2D). The higher enrichment fold change compared with non-crosslinked controls reflects the increased stringency of SPACE-SICAP (compare Fig. 2A  Considering the large portions of the proteins that are removed during the LysC digestion, only 55 proteins were reproducibly identified (in at least 2 out of 4 replicates) in SPACE-SICAP II (Fig. 2F and Table 1D). These include 31 known DNA-binding proteins, 7 of which are histones, whereas most of the remaining enriched proteins are RBPs (14/24, Fig. 2G) that are primarily involved in mRNA splicing (10/14). Of the peptides identified using this approach, 52% lie inside or within 50 amino acids of annotated DNA/RNA-binding domains. H15 (the linker histone) was the most frequent domain identified among the enriched peptides (Supplementary Fig. 2D-E). Together, we show that SPACE successfully enriches for chromatin-associated proteins, with both SPACE and SPACE-SICAP identifying an unexpectedly large number of nuclear RBPs, some of which directly bind DNA.
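The domain-proximity criterion used in this section (a peptide lying inside, or within 50 amino acids of, an annotated DNA/RNA-binding domain) can be illustrated as a simple interval check. The following is a minimal sketch rather than the analysis code used here: the function names, the 1-based inclusive residue coordinates and the definition of the gap as the distance between the nearest peptide and domain boundaries are all assumptions made for illustration.

```python
def near_domain(pep_start, pep_end, dom_start, dom_end, max_gap=50):
    """Return True if a peptide overlaps a domain or lies within max_gap
    residues of it. Coordinates are assumed 1-based and inclusive."""
    if pep_end < dom_start:
        gap = dom_start - pep_end      # peptide lies N-terminal of the domain
    elif pep_start > dom_end:
        gap = pep_start - dom_end      # peptide lies C-terminal of the domain
    else:
        gap = 0                        # peptide overlaps the domain
    return gap <= max_gap


def fraction_near_domains(peptides, domains):
    """peptides: list of (protein_id, start, end) tuples.
    domains: dict mapping protein_id -> list of (start, end) tuples.
    Returns the fraction of peptides inside or near any annotated domain."""
    if not peptides:
        return 0.0
    hits = 0
    for protein_id, start, end in peptides:
        annotated = domains.get(protein_id, [])
        if any(near_domain(start, end, ds, de) for ds, de in annotated):
            hits += 1
    return hits / len(peptides)


# Hypothetical toy input: identifiers and coordinates are invented.
toy_domains = {"protA": [(60, 100)], "protB": [(10, 40)]}
toy_peptides = [("protA", 1, 10), ("protA", 1, 9), ("protB", 15, 25), ("protC", 5, 15)]
toy_fraction = fraction_near_domains(toy_peptides, toy_domains)
```

With annotated domain coordinates (for example from Uniprot), the same function would report the fraction of enriched peptides that satisfy the criterion.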

Dazl co-localizes with PRC2 and H3K27me3 in the ground-state of mESC
To demonstrate the quantitative capacity of SPACE, we next compared mESC grown in 2iL (heavy SILAC) and serum conditions (light SILAC), which correspond to the ground and primed states of pluripotency in the mouse, respectively. Although previous studies compared these two pluripotency states 14 , it is still not clear how pluripotency gene regulatory networks are reinforced by chromatin-associated deregulations in different states. We identified 1,879 proteins in total (Fig. 3A): 100 proteins are statistically significantly more abundant in 2iL and 87 in serum (fold-change > 2 and adj. p-value < 0.1, moderated t-test, Table 2). We also compared the SPACE results with the full proteome from the total cell lysate, which identified 6,007 proteins. Of these, 1,767 proteins overlap with the SPACE dataset, while 112 proteins were only identified using SPACE, including pluripotency transcription factors such as Prdm14 and Klf4 (Supplementary Fig. 3A-B). There is a strong correlation in log2 fold-change values between SPACE and the full proteome (Fig 3B; R² = 0.64), but SPACE displays a larger dynamic range (on average ~2-fold higher than full proteome ratios, Supplementary Fig. 3C), demonstrating that it is more sensitive to changes in the abundance of chromatin-associated proteins.
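The differential-abundance criterion stated above (fold-change > 2 and adj. p-value < 0.1) can be written as a small classification rule. This is a sketch only: it assumes that the log2 fold-change is expressed as 2iL/serum (heavy/light) and that adjusted p-values have already been obtained from the moderated t-test; the function name and the "unchanged" label are hypothetical.

```python
import math


def classify_state(log2_fc, adj_p, fc_cutoff=2.0, p_cutoff=0.1):
    """Assign a protein to the condition in which it is more abundant on chromatin.

    log2_fc is assumed to be log2(2iL / serum); adj_p is the adjusted p-value.
    """
    threshold = math.log2(fc_cutoff)   # fold-change > 2 equals |log2 FC| > 1
    if adj_p >= p_cutoff:
        return "unchanged"
    if log2_fc > threshold:
        return "2iL"
    if log2_fc < -threshold:
        return "serum"
    return "unchanged"


# Hypothetical example values, not taken from the dataset.
examples = [(1.5, 0.01), (-2.0, 0.05), (3.0, 0.5), (0.5, 0.001)]
labels = [classify_state(fc, p) for fc, p in examples]
```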
98 proteins with known functions in the maintenance of pluripotency, exit from it, or blastocyst development were identified in the 2iL and/or serum conditions (Fig. 3C). Among them are chromatin proteins that physically interact with the core pluripotency circuitry (Nanog, Oct4, Sox2): Tfcp2l1, Prdm14, Cbfa2t2, β-Catenin, Zfp42 (Rex1), Klf4, Trim24 and Esrrb bind to chromatin more abundantly in the 2iL condition, whereas Lin28a, Zscan10, Znf281 and Nr0b1 bind more abundantly in the serum condition (Fig. 3C). Although members of both groups are known for roles in preserving pluripotency, our results suggest that the core circuitry of pluripotency is supported by different sets of proteins in the 2iL and serum conditions. Among these proteins, Prdm14, Lin28a and Nr0b1 are known for their RNA-binding functions. Notably, we identified two additional RBPs that have not been previously implicated in pluripotency, but had highly differential chromatin-binding in 2iL and serum (moderated t-test adj. p-value < 0.1 and fold-change > 4, Supplementary Fig. 3E): Dazl and Lire1. When clustered with the 98 known pluripotency proteins according to their log2 fold-changes (2iL/Serum) and protein abundances (iBAQ), Dazl clusters with the 2iL-related factors such as Nanog, Tfcp2l1 and Prdm14 (cluster 6), whereas Lire1 clusters with serum-related factors such as Lin28a and Dnmt3l (cluster 5) (Fig. 3D).
We investigated the genome-wide locations of Dazl binding sites by chromatin immunoprecipitation and sequencing (ChIP-seq) using a validated antibody (Fig. 3E and S3F), revealing ~1,300 reproducible peaks (IDR < 0.01). Surprisingly, 75% of peaks are found close (< 1kb) to transcription start sites (TSS); many target genes are pluripotency regulators, including Esrrb, Sox2, several Wnt ligands and Frizzled receptors (Supplementary H3K27me3 profiles on the mESC genome (Fig. 3F). Interestingly, we observed a very similar pattern, demonstrating that Dazl co-localizes with PRC2 on chromatin, especially at the promoters of genes related to differentiation programs and exit from pluripotency.
We also performed individual-nucleotide crosslinking and immunoprecipitation (iCLIP) to identify the RNA binding sites of Dazl across the transcriptome 15 . We identified 2,550 peaks, 2,099 of which are located in 3' UTRs, and only 166 are located within 3,000 nucleotides of the 5' end of mRNAs (Supplementary Table 2E). Thus, the RNA binding sites were positioned at different locations in genes compared to DNA-binding sites, which were located mainly in promoters (Fig. 3G and Supplementary Fig. 3H). Moreover, most of the genes containing DNA-binding sites of Dazl in their promoter or gene body did not overlap with the genes containing RNA-binding sites of Dazl within their transcripts; only 61 out of 1,144 genes (5%) with a ChIP-seq-defined peak on their genes (gene body and 3kb upstream of the TSS) also have an iCLIP-defined peak on their transcript (Supplementary Fig. 3I). These results suggest that the DNA- and RNA-binding functions of Dazl are mechanistically independent.
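The gene-level comparison between ChIP-seq and iCLIP targets amounts to a set intersection. The sketch below shows the calculation on invented gene identifiers and then checks the arithmetic behind the reported overlap of 61 out of 1,144 genes; the function name and the toy gene lists are hypothetical.

```python
def overlap_summary(chip_genes, iclip_genes):
    """Return the shared genes and the fraction of ChIP-seq target genes
    that also carry an iCLIP peak on their transcript."""
    chip = set(chip_genes)
    iclip = set(iclip_genes)
    shared = chip & iclip
    fraction = len(shared) / len(chip) if chip else 0.0
    return shared, fraction


# Toy example with invented gene identifiers.
shared, fraction = overlap_summary(
    ["geneA", "geneB", "geneC", "geneD"],
    ["geneD", "geneE"],
)

# Arithmetic check of the reported overlap: 61 of 1,144 genes.
reported_percent = 100 * 61 / 1144   # about 5.3, reported as 5%
```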

Dynamics of proteins at Nanog-binding enhancers during the pluripotency transition
In addition to understanding the global dynamics of chromatin, we combined SPACE with ChIP to gain a detailed view of chromatin-associated proteins that co-localise with a target of interest. In doing so, we overcome the challenge faced by ChIP-MS, which is limited in the rigour of washing due to the sensitivity of the antibodies. We previously developed, and recently updated, ChIP-SICAP, which enables very stringent washes 11 and revealed proteins co-localized with the core circuitry of pluripotency (OSN). Here, we compare ChIP-SICAP and ChIP-SPACE by immunoprecipitating Nanog (Supplementary Fig. 4A-B). Both methods enriched for potential true positives (PTP: known Nanog interactors and other chromatin-binders) while depleting potential false positives (PFP: ribosomal proteins and other cytoplasmic proteins). Although ChIP-SICAP gives the largest relative difference between the positive and negative controls (PTP:PFP ratio of 28:1), ChIP-SPACE specificity is still much higher than that of ChIP-MS (7:1 compared with 2:1; Supplementary Fig. 4B). Moreover, ChIP-SPACE has the advantage over ChIP-SICAP that it requires neither DNA end-labelling by biotinylated nucleotides nor streptavidin purification of chromatin. Thus, ChIP-SPACE is considerably more convenient for studies in which many samples are processed simultaneously, such as chromatin dynamics across multiple time points.
To understand the protein composition bound to the cis-regulatory elements (CREs) during the transition from the primed to the ground-state pluripotency condition, we applied ChIP-SPACE to study Nanog-bound CREs upon switching the medium from serum to 2iL (0h, 12h, 24h and 48h). We immunoprecipitated Nanog followed by H3K27ac to enrich for Nanog-bound CREs. We chose H3K27ac as a marker for active enhancers 16

VCP mutations globally affect chromatin composition in neural precursor cells
In our final experimental set-up, we tested SPACE on a biologically challenging system, using <2 million cells. VCP or p97 is a hexameric protein that is conserved across all eukaryotes. VCP is a member of the type II AAA+ ATPase (ATPases Associated with diverse cellular Activities) family of proteins, involved in multiple cellular processes, including protein degradation, intracellular trafficking, DNA repair and replication, and the regulation of the cell cycle 17 . Moreover, multiple proteomic methods have reported it to bind RNA 9, 10 and chromatin, with multiple chromatin-related functions 18 . Mutations in VCP cause ALS (amyotrophic lateral sclerosis) 19,20 ; however, the disease-causing mechanism is unknown. We compared chromatin compositions in neural differentiated (day 14) human induced pluripotent stem cells (hiPSCs) between three cell lines (M1, M2 and M3) with VCP mutations (one R191Q and two R155C; hereafter VCPmut) and control lines from three healthy donors (C1, C2 and C3; Fig. 5A). hiPSCs were differentiated into ventral spinal motor neuron precursors (NPs 21 ). We identified 1,639 proteins in total, with 1,540 proteins quantified in at least four samples (Supplementary Table 4 We also observe decreased chromatin-association of many proteins related to DNA damage response in VCPmut cells, including TP53BP1. TP53BP1 promotes non-homologous end joining repair of damage sites 24 , and its decrease in VCPmut cells agrees with the finding that VCP directly promotes the recruitment of TP53BP1 to DNA double-strand breaks 25 . SMC1A (cohesin complex) and MED1 (RNA polymerase II mediator complex) are also decreased in VCPmut cells, in agreement with their direct interactions with TP53BP1 and roles in DNA repair 26,27 (Fig. 5C). Thus, our data suggest that the decreased capacity of mutant VCP to bind chromatin decreases the chromatin binding of TP53BP1 and associated proteins involved in DNA repair.
Among the up- or down-regulated chromatin-associated proteins are several RBPs involved in RNA-processing and splicing, nonsense-mediated decay (NMD), RNA transport, transcriptional control and neuronal processes (Fig. 5C and Supplementary Fig. 5D).
Remarkably, UPF1 is significantly higher in VCPmut cells. UPF1 is an RNA helicase that remodels RNPs 28 , contributes to release of mRNA from DNA, shuttles between nucleus and cytoplasm 29 , and plays key roles in NMD 30 . Interestingly, UPF1 also has neuroprotective effects against the accumulation of mutant RBPs, such as TDP43 and FUS, and improves survival in a neuronal model of ALS 31,32 . Thus, the observed upregulation of RNA transport and NMD components on chromatin may be part of a rescue mechanism in VCPmut cells.
Altogether, chromatin composition analysis sheds light on how mutations in VCP might make the cells vulnerable to DNA damage, and reveals changes in recruitment of many RBPs. This opens the opportunity to understand if these RBPs contribute to the vulnerability of cells to degeneration, or if they might, along with UPF1, help to protect them.

Discussion
This study presents SPACE, a robust method for purification of chromatin-associated proteins by silica magnetic beads for proteomic analysis. Recent ChIP-seq studies revealed dozens of RBPs that can bind to chromatin 33 , and strikingly, our study now finds that 63% of the nuclear RBPs detected in the full proteome are associated with chromatin. In addition, by combining SPACE with SICAP, we developed a high-throughput approach to identify direct DNA-binders and showed that many RBPs, specifically mRNA splicing factors, are capable of directly binding to DNA. Our results are in line with the concept of co-transcriptional RNA processing 34 , and indicate that proteins binding both to RNA and DNA may play a role in physically integrating transcription and RNA processing. The straightforward and cost-effective nature of SPACE makes it well-suited to study chromatin remodelling, which we evaluated in three experiments. First, we compared the global chromatin composition in 2iL and serum conditions of mESCs, which showed that the differential chromatin binding detected by SPACE upon altered pluripotency states was generally twice as large as the changes detected at the total protein level. For instance, transcription factors involved in pluripotency such as β-Catenin, Nanog, Cbfa2t2, Zfp42, Znf281 and Zscan10 (12/15 proteins, Fig 3B) showed differential changes using SPACE (adj. p-value < 0.1 and fold-change >2), whereas full proteome changes were not significant (fold-change <2). Thus, SPACE provides a very sensitive method to detect chromatin composition changes in response to the pluripotency state.
One of the unexpected differential proteins was the RBP Dazl, which is highly abundant on chromatin specifically in the 2iL condition. Dazl has been primarily studied in the context of germ cells due to its substantial roles in controlling the mRNA translation and stability, which is necessary for germ cell survival 35,36 . Moreover, cytoplasmic Dazl was reported to regulate Tet1 translation and hence DNA demethylation in 2i condition in mESCs 37 . Dazl was previously reported to also localise in the nucleus 38 , but its nuclear function has not yet been explored. Therefore we used ChIP-seq, which found that Dazl associates with the same chromatin sites as PRC2, including TSS of many developmental genes such as Hox genes, Wnt ligands, Wnt receptors and some pluripotency genes (Esrrb and Sox2). This is in line with the finding that RBPs often interact with enhancers, promoters and transcriptionally active regions on chromatin (31251911). Considering that we apply RNase treatment during the SPACE procedure, the co-location of Dazl and PRC2 on chromatin is not indirect via RNA. Nevertheless, it is likely that Dazl binds to and regulates the non-coding RNAs that can associate with the PRC2-bound chromatin sites, and we speculate that Dazl might influence the reported capacity of RNA to displace PRC2 from the chromatin 39 .
In the second experiment, we used SPACE to assess local chromatin dynamics on the Nanog-binding CRMs during the transition from serum to 2iL condition. ChIP-seq analyses showed that roughly 50% of the binding sites of many well-known TFs do not contain the cognate motifs 40 . Our results are in line with the cooperative model of TF binding to the hotspots or clusters that are built through a complex network of protein-protein, in addition to protein-DNA, interactions 41, 42 . Specifically, our data suggest that shortly after induction of naive pluripotency, proteins with both activating and inhibiting functions bind to the Nanog-regulated CRMs, and then the activators take over the CRMs 48h after the induction, indicating how the CRMs are remodelled during this transition.
Finally, to explore the capacity of SPACE to provide insight into disease-related chromatin remodelling, we examined the impact of VCP mutations on NPs. We observed a significant decrease in chromatin abundance of mutant VCP, as well as several proteins involved in DNA repair, such as TP53BP1, which relates to the well-recognised role of DNA damage in neurodegeneration 43-45 . Moreover, VCP contributes to genome stability, and mutant VCP neurons are highly sensitive to DNA damage-induced transcriptional stress 46,47 . Interestingly, VCPmut cells have increased chromatin abundance of RBPs involved in RNA clearance from chromatin, such as NMD and RNA export from the nucleus (Fig. 5D). It will be interesting to test whether the presence of these RBPs affects the role that nascent RNA plays as a competitor for many chromatin-binding proteins 39

Competing interests
The authors declare no competing interests.

Mass spectrometry and proteomics data analysis
The details of sample preparation using SPACE, SPACE-SICAP and ChIP-SPACE procedures are provided at the end. Following sample preparation, peptides were separated on a 50 cm, 75 µm I.D. Pepmap column over a 120 min gradient for SPACE and SPACE-SICAP, or a 70 min gradient for ChIP-SPACE. Peptides were then injected into the mass spectrometer (Orbitrap Fusion Lumos) running with a universal Thermo Scientific HCD-IT method. Xcalibur software was used to control the data acquisition. The instrument was run in data-dependent acquisition mode with the most abundant peptides selected for MS/MS by HCD fragmentation. RAW data were processed with MaxQuant (1.6.2.6) using default settings. MS/MS spectra were searched against the Uniprot (Swissprot) database (Mus musculus and Homo sapiens) and a database of contaminants. Trypsin/P and LysC were chosen as enzyme specificity, allowing a maximum of two missed cleavages. Cysteine carbamidomethylation was chosen as the fixed modification, and methionine oxidation and protein N-terminal acetylation were used as variable modifications. The global false discovery rate for both proteins and peptides was set to 1%. The match-from-and-to and re-quantify options were enabled, and intensity-based absolute quantification (iBAQ) values were calculated.

Quantitative proteomics, statistical and computational analysis
The protein groups were processed in RStudio using R version 4.0.0. After filtering out Reverse, potential contaminants and proteins only identified by site, protein groups were quantified by their SILAC or iBAQ intensities. In the experiments related to mESC chromatin composition (Fig 2), if a SILAC ratio of CL/nCL was NA then iBAQ intensities were used to calculate the ratios. Differential proteins were determined by the limma R package. Proteins with log2-fold change > 1 and adj. p-value < 0.1 were considered as "Enriched ++", and proteins with log2-fold change > 1 were considered as "Enriched +". Uniprot and DAVID were used as the source of Gene Ontology and protein domain information. To categorize the proteins, the DNA-binders were labeled as DB, then chromatin and chromosome binders were labeled as CB. This means that if a protein is both a DNA-binder and a chromatin-binder, it is labeled as DB. Then proteins present in the nucleus were labeled as PN, and the rest of the proteins were labeled as "unexpected" or UE. In the specific case of SPACE-SICAP, to determine direct DNA-binders we retained only proteins > 20 kD, thereby removing small proteins. The clusterProfiler R package was used for gene enrichment analysis. In the experiments related to chromatin composition in 2iL and serum states of mESC (Fig 3), physical protein-protein interaction data were imported from the STRING database (11.0) and visualized by Cytoscape (3.8.0). The proteins were clustered by the k-means clustering algorithm. The protein clusters were visualized by the factoextra R package. In the experiments related to the dynamics of proteins associated with Nanog-binding CREs (Fig 4), the iBAQ intensities were quantile normalized by the preprocessCore package, and the mean of two replicates per time-point was used as the average protein intensity to cluster the proteins by the k-means clustering algorithm.
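The ratio fallback, the enrichment labels and the hierarchical protein categories described above can be summarised in a few lines. The analysis itself was performed in R; the Python sketch below only illustrates the logic. The function names, the "Not enriched" label and the decision to return None when iBAQ intensities are missing or zero are assumptions, since the handling of such cases is not specified here.

```python
import math


def log2_cl_ncl(silac_ratio, ibaq_cl, ibaq_ncl):
    """log2 ratio of crosslinked (CL) over non-crosslinked (nCL) signal.
    The SILAC ratio is used when available; otherwise the ratio is
    computed from iBAQ intensities."""
    if silac_ratio is not None and not math.isnan(silac_ratio):
        return math.log2(silac_ratio)
    # Fallback to iBAQ. Returning None for missing or zero intensities is an
    # assumption of this sketch, not a documented step of the analysis.
    if not ibaq_cl or not ibaq_ncl or ibaq_cl <= 0 or ibaq_ncl <= 0:
        return None
    return math.log2(ibaq_cl / ibaq_ncl)


def enrichment_label(log2_fc, adj_p):
    """Apply the 'Enriched ++' / 'Enriched +' thresholds from the text."""
    if log2_fc is None or log2_fc <= 1:
        return "Not enriched"          # hypothetical label for all other proteins
    if adj_p is not None and adj_p < 0.1:
        return "Enriched ++"
    return "Enriched +"


def protein_category(is_dna_binder, is_chromatin_binder, is_nuclear):
    """Hierarchical labels: DB takes priority over CB, CB over PN, then UE."""
    if is_dna_binder:
        return "DB"
    if is_chromatin_binder:
        return "CB"
    if is_nuclear:
        return "PN"
    return "UE"
```

Because the category labels are assigned in a fixed order, a protein annotated as both a DNA-binder and a chromatin-binder receives the label DB, as stated in the text.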
In the experiments related to the chromatin composition in VCPmut and control NPs (Fig 5), the iBAQ intensities were quantile normalized by the preprocessCore package, and differential proteins were determined by limma package. Protein information was downloaded from Uniprot and DAVID Gene Ontology database. The DOSE R package was used for disease ontology.
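For readers unfamiliar with quantile normalization, the idea can be sketched as follows: each sample's intensities are replaced by the mean of the values that share the same rank across all samples, so that every sample ends up with an identical distribution. This is a simplified pure-Python illustration and not the preprocessCore implementation; it assumes complete data (no missing values) and breaks ties by input order, and the function name is hypothetical.

```python
def quantile_normalize(columns):
    """columns: list of equal-length lists, one list of intensities per sample.
    Returns a new list of columns with identical value distributions."""
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must contain the same number of proteins")
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [sum(sc[k] for sc in sorted_cols) / len(columns) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])   # indices by ascending value
        new_col = [0.0] * n
        for rank, idx in enumerate(order):
            new_col[idx] = reference[rank]
        normalized.append(new_col)
    return normalized


# Toy example: two samples, three proteins (invented intensities).
toy = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

In the toy example, the reference distribution is [1.5, 3.5, 5.5], and each sample receives these values according to the rank of its original intensities.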

Dazl ChIP-seq experiment and data analysis
The ChIP procedure and analysis were carried out essentially as described previously 11 . Briefly, mESCs were grown in 2iL medium. The cells were detached, and fixed by 1.5% formaldehyde in PBS for 15 min. Chromatin was solubilized by sonication, and sheared to < 500 bp fragments. Dazl immunoprecipitation was carried out using CST antibody #8042 overnight at 4 °C. Following washing steps, chromatin was eluted, and DNA was purified by SPRI beads. The library was prepared for the Illumina platform as described previously. Sequencing was carried out using 75nt reads in paired-end mode on a HiSeq4000. Reads were trimmed and aligned to the mouse genome (mm10) using Bowtie2, and duplicated reads were removed with samtools. Peak calling was performed using macs2. Narrow peaks called by macs2 were extended by 250bp around their middle (to a total width of 500bp). Annotation of Dazl peaks into genomic features was done using the ChIPseeker R package, with 3kb around the TSS set as the promoter region window. The ChIP-seq profiles of Dazl, Suz12, Aebp2 and H3K27me3 were compared by deepTools2.

Dazl iCLIP and data analysis
iCLIP was carried out as previously described 49 . Briefly, mESCs were grown in 2iL medium. Cells were UV-crosslinked and lysed, and the IP was performed using DAZL antibody (CST #8042) at a 1:70 dilution. RNaseI was used at 0.4U/mg cell lysate per IP. Finalised libraries were sequenced as single-end 100bp reads on an Illumina HiSeq 4000. Processing of DAZL iCLIP raw data was carried out using iMaps (https://imaps.genialis.com/). The demultiplexed and quality-controlled data were mapped to the mm10 genome assembly using STAR (2.6.0) with default settings. Both PCR duplicates and reads that did not map uniquely to the genome were discarded.
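The peak-resizing step (extending each narrow peak by 250bp on either side of its middle, to a total width of 500bp) can be sketched as below. This is an illustration rather than the code used in the analysis: it assumes BED-style 0-based, half-open coordinates, clips only at the chromosome start, and uses a hypothetical function name.

```python
def resize_peak(start, end, half_width=250):
    """Recentre a peak on its midpoint and give it a fixed width.
    Coordinates are assumed 0-based and half-open, as in BED files."""
    midpoint = (start + end) // 2
    new_start = max(0, midpoint - half_width)   # do not run past the chromosome start
    new_end = midpoint + half_width             # chromosome-end clipping is not handled
    return new_start, new_end


# Invented example peak: a 100 bp peak becomes a 500 bp window.
window = resize_peak(1000, 1100)
width = window[1] - window[0]
```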

ChIP-SICAP
Nanog IP was carried out using CST antibody #8822. ChIP-SICAP was carried out essentially as described previously 11,12 . The step by step protocol of ChIP-SICAP is available in the following database: dx.doi.org/10.17504/protocols.io.bcrriv56

Data availability
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD020037. The accession numbers for the Dazl ChIP-seq and iCLIP reported in this paper are ArrayExpress: E-MTAB-9302 and E-MTAB-9332, respectively.

43. Clean up the peptides using stage-tips or ZipTips.

SPACE-SICAP procedure
Additional required material:  °C for 2min in a Thermomixer with 1000 RPM agitation.
4. Remove the beads on the magnet, and transfer the supernatant to a new 2-ml tube.
5. Add 300 µl 2-propanol, and vortex.
6. Add 30 µl of DNA-binding beads, vortex, and spin.
7. Wait 10 min.
8. Separate the beads on the magnet, and discard the supernatant.
9. Wash the beads with 500 µl Wash buffer 1, separate the beads on the magnet, and discard the supernatant.
10. Wash the beads with 500 µl Wash buffer 2, separate the beads on the magnet, and discard the supernatant.
11. Repeat the last step once again.
12. Wash the beads with 500 µl Acetonitrile 100%, separate the beads on the magnet, and discard the supernatant.
13. Resuspend the beads in 80 µl of Acetonitrile 40%, and transfer them to PCR tubes.
Note: You may also need to repeat this step once again to transfer all the beads to a PCR tube.
14. Put the cells on a magnet, and discard the supernatant.
15      Nanog IP or normal IgG