UCL Discovery

Penalised regression improves imputation of cell-type specific expression using RNA-seq data from mixed cell populations compared to domain-specific methods

Lin, Wei-Yu; Kartawinata, Melissa; Jebson, Bethany R; Restuadi, Restuadi; Peckham, Hannah; Radziszewska, Anna; Deakin, Claire; ... Wallace, Chris (2025) Penalised regression improves imputation of cell-type specific expression using RNA-seq data from mixed cell populations compared to domain-specific methods. PLOS Computational Biology, 21 (3), Article e1012859. 10.1371/journal.pcbi.1012859. Green open access

Text
LinEtAl2025.pdf - Published Version


Abstract

Gene expression studies often use bulk RNA sequencing of mixed cell populations because single-cell or sorted-cell sequencing may be prohibitively expensive. However, mixed cell studies may miss expression patterns that are restricted to specific cell populations. Computational deconvolution can be used to estimate cell fractions from bulk expression data and infer average cell-type expression in a set of samples (e.g., cases or controls), but imputing sample-level cell-type expression is required for more detailed analyses, such as relating expression to quantitative traits, and is less commonly addressed. Here, we assessed the accuracy of imputing sample-level cell-type expression using a real dataset where mixed peripheral blood mononuclear cell (PBMC) and sorted-cell (CD4, CD8, CD14, CD19) RNA sequencing data were generated from the same subjects (N=158), and pseudobulk datasets synthesised from eQTLgen single-cell RNA-seq data. We compared three domain-specific methods, CIBERSORTx, bMIND and debCAM/swCAM, and two cross-domain machine learning methods, multiple response LASSO and ridge, that had not been used for this task before. We also assessed the methods according to their ability to recover differential gene expression (DGE) results. LASSO/ridge showed higher sensitivity but lower specificity than deconvolution methods for recovering DGE signals seen in observed data, although LASSO/ridge had higher areas under the curve than deconvolution methods. Machine learning methods have the potential to outperform domain-specific methods when suitable training data are available.
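
As a rough illustration of the multiple response ridge idea mentioned in the abstract, the sketch below fits one penalised linear model that maps bulk expression to the expression of several genes in a single sorted cell type, then imputes that cell type's expression for held-out subjects. It is not the authors' pipeline. The data are synthetic, the function names (fit_multiresponse_ridge, predict), the penalty value, the gene counts and the 120/38 split of 158 subjects are all assumptions made for the example. The LASSO variant would replace the squared penalty with a sparsity-inducing one and needs an iterative solver, which is not shown.

```python
import numpy as np


def fit_multiresponse_ridge(X, Y, lam):
    """Closed-form ridge regression with several response columns.

    X: (n_samples, n_bulk_genes) bulk expression.
    Y: (n_samples, n_cell_genes) expression in one sorted cell type.
    lam: ridge penalty (hypothetical value; would normally be tuned
         by cross-validation).
    Returns (coef, intercept); the intercept is left unpenalised by
    centring X and Y before solving.
    """
    x_mean = X.mean(axis=0)
    y_mean = Y.mean(axis=0)
    Xc = X - x_mean
    Yc = Y - y_mean
    n_features = X.shape[1]
    # Solve (Xc'Xc + lam*I) B = Xc'Yc for all responses at once.
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_features), Xc.T @ Yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def predict(X, coef, intercept):
    """Impute cell-type expression from bulk expression."""
    return X @ coef + intercept


# Synthetic stand-in data (assumption: not the study's data).
rng = np.random.default_rng(0)
n_train, n_test = 120, 38          # hypothetical split of 158 subjects
n_bulk_genes, n_cell_genes = 50, 20

true_weights = rng.normal(size=(n_bulk_genes, n_cell_genes)) / np.sqrt(n_bulk_genes)
X = rng.normal(size=(n_train + n_test, n_bulk_genes))
Y = X @ true_weights + 0.5 * rng.normal(size=(n_train + n_test, n_cell_genes))

X_train, X_test = X[:n_train], X[n_train:]
Y_train, Y_test = Y[:n_train], Y[n_train:]

coef, intercept = fit_multiresponse_ridge(X_train, Y_train, lam=10.0)
Y_hat = predict(X_test, coef, intercept)

# Per-gene accuracy: correlation between observed and imputed expression
# across the held-out subjects.
per_gene_r = np.array(
    [np.corrcoef(Y_test[:, j], Y_hat[:, j])[0, 1] for j in range(n_cell_genes)]
)
print("median per-gene correlation:", float(np.median(per_gene_r)))
```

The key requirement, as the abstract notes, is suitable training data: the model can only be fitted where bulk and sorted-cell expression are both available for the same subjects.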

Type: Article
Title: Penalised regression improves imputation of cell-type specific expression using RNA-seq data from mixed cell populations compared to domain-specific methods
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1371/journal.pcbi.1012859
Publisher version: https://doi.org/10.1371/journal.pcbi.1012859
Language: English
Additional information: Copyright © 2025 Lin et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords: Gene expression, RNA sequencing, Gene prediction, Cytotoxic T cells, Machine learning, Flow cytometry, Immune cells, White blood cells
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health > Infection, Immunity and Inflammation Dept
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health > Population, Policy and Practice Dept
URI: https://discovery.ucl.ac.uk/id/eprint/10209729