Haberman, N;
(2017)
Insights into protein-RNA complexes from computational analyses of iCLIP experiments.
Doctoral thesis, UCL (University College London).
Text: University_College_London_thesis-FINAL-oneside-Appendix.pdf - Accepted Version (21MB)
Abstract
RNA-binding proteins (RBPs) are the primary regulators of all aspects of posttranscriptional gene regulation. In order to understand how RBPs perform their function, it is important to identify their binding sites. Recently, new techniques have been developed to employ high-throughput sequencing to study protein-RNA interactions in vivo, including the individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP). iCLIP identifies sites of protein-RNA crosslinking with nucleotide resolution in a transcriptome-wide manner. It is composed of over 60 steps, which can be modified, but it is not clear how variations in the method affect the assignment of RNA binding sites. This is even more pertinent given that several variants of iCLIP have been developed. A central question of my research is how to correctly assign binding sites to RBPs using the data produced by iCLIP and similar techniques. I first focused on the technical analyses and solutions for the iCLIP method. I examined cDNA deletions and crosslink-associated motifs to show that the starts of cDNAs are appropriate to assign the crosslink sites in all variants of CLIP, including iCLIP, eCLIP and irCLIP. I also showed that the non-coinciding cDNA-starts are caused by technical conditions in the iCLIP protocol that may lead to sequence constraints at cDNA-ends in the final cDNA library. I also demonstrated the importance of fully optimizing the RNase and purification conditions in iCLIP to avoid these cDNA-end constraints. Next, I developed CLIPo, a computational framework that assesses various features of iCLIP data to provide quality control standards and reveal how technical variations between experiments affect the specificity of assigned binding sites. I used CLIPo to compare multiple PTBP1 experiments produced by iCLIP, eCLIP and irCLIP, to reveal major effects of sequence constraints at cDNA-ends or starts, cDNA length distribution and non-specific contaminants.
Moreover, I assessed how the variations between these methods influence the mechanistic conclusions. Thus, CLIPo presents the quality control standards for transcriptome-wide assignment of protein-RNA binding sites. I continued with analyses of RBP complexes by using data from spliceosome-iCLIP. This method simultaneously detects crosslink sites of small nuclear ribonucleoproteins (snRNPs) and auxiliary splicing factors on pre-mRNAs. I demonstrated that the high resolution of spliceosome-iCLIP allows for distinction between multiple proximal RNA binding sites, which can be valuable for transcriptome-wide studies of large ribonucleoprotein complexes. Moreover, I showed that spliceosome-iCLIP can experimentally identify over 50,000 human branch points. In summary, I detected technical biases from iCLIP data, and demonstrated how such biases can be avoided, so that cDNA-starts appropriately assign the RNA binding sites. CLIPo analysis proved a useful quality control tool that evaluates data specificity across different methods, and I applied it to iCLIP, irCLIP and ENCODE eCLIP datasets. I presented how spliceosome-iCLIP data can be used to study the splicing machinery on pre-mRNAs and how to use constrained cDNAs from spliceosome-iCLIP data to identify branch points on a genome-wide scale. Taken together, these studies provide new insights into the field of RNA biology and can be used for future studies of iCLIP and related methods.
Type: | Thesis (Doctoral) |
---|---|
Title: | Insights into protein-RNA complexes from computational analyses of iCLIP experiments |
Event: | UCL (University College London) |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Keywords: | Protein–RNA interactions, iCLIP, CLIP, irCLIP, eCLIP, PAR-CLIP, spliceosome-iCLIP, Binding site assignment, High-throughput sequencing, Polypyrimidine tract binding protein 1 (PTBP1), Eukaryotic initiation factor 4A-III (eIF4A3), Exon-junction complex, U2AF65, RBP, splicing, Spliceosome, computational biology, CLIPo |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment |
URI: | https://discovery.ucl.ac.uk/id/eprint/1568450 |