Turkes, Emir
(2025)
Development of Gene Set Enrichment and Imputation Methods for Transcriptomics and Proteomics: Application in the Study of Neurofibrillary Tangle-bearing Neurons in Alzheimer’s Disease.
Doctoral thesis (Ph.D), UCL (University College London).
Abstract
Transcriptomics and proteomics are high-throughput methods that assay gene expression and protein abundance in a biological sample at a given point in time. These datasets feature high dimensionality and technical noise, which are routinely addressed using various computational methods. In particular, gene set enrichment is commonly used to measure how enriched expression is for functionally defined subsets of the gene/protein profile, whereas imputation replaces missing data with predicted values. In this thesis, I review the prior art of these methods and identify pitfalls that warrant investigation. I then introduce novel methods, GeneFunnel and ImputeFinder, with freely available software implementations (https://github.com/eturkes/genefunnel and https://github.com/eturkes/imputefinder) that attempt to address these pitfalls without imposing performance bottlenecks, stringent assumptions, or unintuitive reasoning. Although ImputeFinder had no comparable equivalent, GeneFunnel was benchmarked against leading methods on both synthetic and real-world data, showing superior analytic and computational performance across all metrics. An interactive web viewer of these benchmarks is available at https://data.duff-lab.org/app/genefunnelbenchmarks-viewer. I deploy the methods in a real-world context by developing a pipeline for characterising neurofibrillary tangle-bearing neurons in Alzheimer’s Disease. A previously available dataset of human post-mortem tissue, in which tangle-bearing neurons were isolated from non-tangle-bearing neurons and subjected to transcriptomic profiling, was reanalysed alongside a similarly designed in-house dataset that profiled proteins. The integrated analysis is complemented by interactive network visualisations and a web-based viewer, allowing in-depth exploration of the results at https://data.duff-lab.org/app/tangle-bearingneurons-viewer.
The analysis focused on uncovering major drivers of biological pathways upregulated in tangle-bearing neurons in both the transcriptomic and proteomic datasets, identifying the pathway hubs NEFM, APP, SQSTM1, HSP90AA1, YWHAE, WASF1, CNTNAP1, and GOT2. Using informatics and literature review, I investigate the contribution of these hubs to distinct functional domains, laying the groundwork for a unified model of the pathophysiology of tangle-bearing neurons in Alzheimer’s Disease.
| Type: | Thesis (Doctoral) |
|---|---|
| Qualification: | Ph.D |
| Title: | Development of Gene Set Enrichment and Imputation Methods for Transcriptomics and Proteomics: Application in the Study of Neurofibrillary Tangle-bearing Neurons in Alzheimer’s Disease |
| Open access status: | An open access version is available from UCL Discovery |
| Language: | English |
| Additional information: | Copyright © The Author 2025. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
| UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > UCL Queen Square Institute of Neurology |
| URI: | https://discovery.ucl.ac.uk/id/eprint/10215006 |