Early human lung immune cell development and its role in epithelial cell fate

Studies of human lung development have focused on epithelial and mesenchymal cell types and function, but much less is known about the developing lung immune cells, although the airways are a major site of mucosal immunity after birth. An unanswered question is whether tissue-resident immune cells play a role in shaping the tissue as it develops in utero. Here, we profiled human embryonic and fetal lung immune cells using scRNA-seq, smFISH and immunohistochemistry. At the embryonic stage, we observed an early wave of innate immune cells, including innate lymphoid cells, natural killer cells, myeloid cells and lineage progenitors. By the canalicular stage, we detected naive T lymphocytes expressing high levels of cytotoxicity genes, and the presence of mature B lymphocytes, including B1 cells. Our analysis suggests that fetal lungs provide a niche for full B cell maturation. Given the presence and diversity of immune cells during development, we investigated their possible effect on epithelial maturation. We found that IL-1β drives epithelial progenitor exit from self-renewal and differentiation to basal cells in vitro. In vivo, IL-1β-producing myeloid cells were found throughout the lung and adjacent to epithelial tips, suggesting that immune cells may direct human lung epithelial development.


RNA isolation and qPCR of fetal lung tissue
Fresh embryonic-, pseudoglandular-and canalicular-stage lung tissue was collected in Hibernate E medium or RNAlater Stabilization reagent (ThermoFisher Scientific, AM7020).A maximum of 30 mg of tissue was disrupted using a micro-pestle in 600 μl of RLT buffer containing β-mercaptoethanol (10 μl/ml).Lysate was homogenized using QIAshredder spin columns (Qiagen, 79656) and total RNA was extracted with the RNeasy Mini Kit (Qiagen, 74104), according to the manufacturer's instructions.RNA purity and concentration was determined using the Nanodrop 1000 spectrometer (ThermoFisher Scientific).0.5 μg RNA per sample was reverse transcribed using qScript cDNA Supermix (Quantabio, 733-1177), according to the manufacturer's protocol.cDNA was diluted 1:2 with RNAse-free water.For each qPCR reaction, 1 μl diluted cDNA was added to 5 μl of Power SYBR Green Master Mix (Life Technologies Ltd), 0.25 -2 μl of 10 mM forward and reverse oligonucleotide primer mix concentration optimized previously (Table S4), in a final volume of 10 μl in RNAse-free water.For each gene, the reaction was run in triplicate and a 'no template control' (water) was included.After 10 min at 95°C, 40 x PCR cycles were run, consisting of 15 s denaturation at 95°C, followed by an annealing and extension step at 60°C.Data was collected using Quant Studio Real-Time PCR Software v1.1.Relative transcript quantities were calculated using the ΔΔCt method, normalized to the expression of GAPDH.Fold-change was calculated relative to the embryonic stage samples, with p-values obtained by one-way ANOVA, followed by Tukey's post-hoc test.

Genotyping
Fetal skin and matched maternal saliva samples were used for DNA extraction, according to manufacturers' protocols (Qiagen,DNeasy Blood and Tissue Kit,69504,and QIAamp DNA Investigator Kit,56504 respectively).Genotyping was performed with the Affymetrix UK Biobank Axiom™ Array kit by Cambridge Genomic Services (CGS).

Cryosections
Lung samples were fixed at 4°C in 4% paraformaldehyde (PFA) overnight and processed as described previously (4).Samples were cut (if needed) to fit 15x15x5 mm molds.Post-fixation, samples were washed in PBS at 4°C, then sucrose-protected by incubation in 15%, 20%, then 30% sucrose in PBS for 1 hr each at room temperature.Samples were incubated 1:1 in 30% sucrose: optimal cutting temperature compound (OCT, Sakura) overnight at 4°C (small tissue fragments had an additional 100% OCT wash for 2 hrs at room temperature), then embedded in OCT and frozen on dry ice in an isopentane bath.7 µm cryosections were cut, mounted onto SuperFrost® slides (VWR®) and stored at -80°C.
Sections were dried at room temperature for 2 hrs prior to staining.Tissue was permeabilized using 0.3% Triton-X100 in PBS for 10 min at room temperature, and slides were then washed in PBS.Where required, antigen retrieval was performed, by heating slides in 10 mM sodium citrate buffer pH6 in a full power microwave for 5 min, followed by 1 hr cooling at room temperature.Sections were blocked in 5% serum (same species as that of the secondary antibody) plus 1% BSA (bovine serum albumin), 0.1% Triton-X100 in PBS for 1 hr at room temperature.Primary antibodies (Table S1) were diluted in block solution and added to sections, with overnight incubation at 4°C.Appropriate primary isotype control antibodies were used on tissue sections in all experiments (to rule out non-specific binding of the primary antibody) and secondary antibody-only controls were also performed (ie.no primary antibody, to rule out non-specific binding of the secondary antibody).After PBS washes, sections were incubated in secondary antibodies (1:1000; all ThermoFisherScientific, Table S2a) diluted in 5% serum (same as block), 0.1% Triton-X100 in PBS for 3 hrs at room temperature.Sections were incubated with DAPI for 20 min, followed by PBS washes, then mounted in Fluoromount™ Aqueous Mounting Medium (Merck).Images were obtained with a Zeiss Axiophot, Leica DMi8 or Leica SP8vis confocal microscope.

Paraffin sections
Lung samples were fixed as above.Post-fixation, samples were washed in PBS at 4°C, then incubated in 25% ethanol (EtOH) in 0.1% Tween-20 in PBS (PBS-T) for 30 min at 4°C.Further incubations in 50% EtOH/PBS-T, then 70% EtOH/PBS-T for 30 min each were performed, prior to transfer of the sample into an automated vacuum tissue processor for further dehydration and impregnation of the tissue with paraffin (Leica TP1050).Lungs were then embedded in paraffin wax. 3 µm sections were cut for IHC and mounted onto SuperFrost® slides.
Sections were dewaxed and rehydrated using an automatic slide stainer (DRS 2000, Sakura) and then the same permeabilization and staining procedure was used as for cryosections.

smFISH/RNAscope
Fetal lung tissue was prepared as FFPE or fixed frozen blocks as described above.To determine the best samples for analysis, morphology was assessed using hematoxylin and eosin staining and RNAscope 3plex positive and negative control probes run.For RNAscope, 5-(FFPE) or 10-(fixed frozen) µm thick sections were cut on to SuperFrost® Plus slides and processed using the RNAscope 2.5 LS multiplex fluorescent assay (ACD, Bio-Techne) on the Leica BOND RX system (Leica).FFPE slides were baked and dewaxed (BOND protocol), incubated with protease III for 15 min, ER2 for 15 minutes at 95°C, whereas fixed frozen slides were first fixed for 15 min with 4°C 4% PFA, baked for 30 min at 60°C, dehydrated through a standard ethanol series and run on the Leica BOND RX with 15 min protease III at room temperature and epitope retrieval for 5 min with ER2.All slides received opal 520, 570 and 650 fluorophores (Akoya Biosciences) at 1:1000 concentration.Probes used can be found in Table S5 (ACD, Bio-Techne).Slides were counterstained with DAPI (ThermoFisher D1306, 300nM working dilution), and imaged on a Leica SP8vis confocal microscope or Perkin Elmer Opera Phenix.For some probe mixes, EpCAM antibody staining (abcam ab71916, 1:1000) was run following RNAScope, using TSA-biotin (1:400) and goat anti-rabbit IgG-HRP (1:1000).

scRNA-seq and CITE-seq staining
For scRNA-seq, FACS-sorted CD45 + fetal lung cells were collected at 8-9, 12 and 20 pcw and frozen, due to logistics.Cell suspensions were thawed rapidly at 37°C in a water bath.Warm RPMI1640 medium (ThermoFisher Scientific) (20 to 30 ml) containing 10% FBS (RPMI-FBS) was added slowly to the cells before centrifuging at 300 g for 5 min.This was followed by a wash in 5 ml RPMI-FBS.Cell number and viability were determined using Trypan Blue.Cells from different donors were pooled in equal numbers where possible.At this point, samples were either loaded directly onto a 10X chip, as described below, or stained for CITE-seq.
For CITE-seq staining, reagent volumes were adjusted according to pooled cell number.For 5x10 5 cells, samples were resuspended in 25 µl cell staining buffer (BioLegend, 420201).2.5 µl Human TruStain FcX block (BioLegend, 422301) was added and cells were incubated on ice for 10 min, to block non-specific binding.The cells were then stained with TotalSeq-C antibodies (BioLegend, 99814; full antibody list available in Yoshida et al ( 26)) according to the manufacturer's instructions.After incubating with 0.5 vials of TotalSeq-C for 30 min at 4 °C, cells were washed three times by centrifugation at 500 g for 5 min at 4 °C.
Cells were counted again and processed immediately for 10X 5′ single-cell capture (Chromium Next GEM Single Cell V(D)J Reagent Kit v1.1 with Feature Barcoding technology for cell Surface Protein-Rev D protocol).One lane of 4,000 to 25,000 cells were loaded per pool onto a 10X chip.

Single-cell library construction and sequencing
Single-cell gene expression libraries (GEX) and V(D)J libraries were built using 10X Chromium Single Cell V(D)J Kits (v1).The manufacturer's protocol (10X Genomics) was followed using individual Chromium i7 Samples Indices.γδ TCR V(D)J libraries were prepared as previously described (74).GEX and V(D)J libraries were pooled in a 10:1 ratio and sequenced on a NovaSeq 600 S4 Flowcell aiming for a minimum of 50,000 paired-end (PE) reads per cell for GEX libraries and 5,000 PE reads per cell for V(D)J libraries.

CITE-seq data analysis
Background and non-specific staining by the antibodies used in CITE-seq was estimated using SoupX (v.1.4.8), which models the background signal on near-empty droplets.The soupQuantile and tfidfMin parameters were set to 0.25 and 0.2, respectively, and lowered by decrements of 0.05 until the contamination fraction was calculated using the autoEstCont function.Antibody-derived tag counts were normalized with the centered log-ratio transformation.

Cell type composition analysis
The number of cells for each sample and cell type composition was modeled with a generalized linear mixed model with a Poisson outcome using the 'glmer' function in the lme4 R package.Donor ID, sequencing batch and fetal age were fitted as random effects to overcome colinearity among the factors.
The effect of each factor on cell type composition was estimated by the interaction term with the cell type.
The standard error of variance parameter for each factor was estimated using the numDeriv package.The conditional distribution of the fold change estimate of a level of each factor was obtained by the 'ranef' function in the lme4 package.The statistical significance of the fold change estimate was measured by the local true sign rate (LTSR), which is the probability that the estimated direction of the effect is true; that is, the probability that the true log-transformed fold change is greater than 0 if the estimated mean is positive (or less than 0 if the estimated mean is negative).It is calculated on the basis of the estimated mean and s.d. of the distribution of the effect (log-transformed fold change), which is to an extent similar to performing a (one-sided) one-sample Z-test and showing (1 − P).

B cell and myeloid cell trajectory analyses
The soup-corrected single-cell transcriptomics data was preprocessed using Scanpy (86) version 1.8.1.The cell cycle effect and fraction of mitochondrial reads were regressed out using sc.pp.regress_out() and batch correction was performed using BBKNN (26).PAGA (87) was applied with sc.tl.paga() to examine the connectivities between cell types, before final UMAPs were computed using the results of PAGA for initialisation as described in (87).Data and UMAPs were exported to R for running monocle3 (88, 89) to find a principal graph and define pseudotime.Differentially expressed genes along pseudotime were then computed using a graph-based test (morans' I)(89, 90) over the principal graph.This allows identification of genes that are upregulated at any point in pseudotime.The results were visualized using complexHeatmap (91) and seriation (92) packages in R, after smoothing gene expression with smoothing splines (smooth.spline(),df=12).In addition, for the myeloid compartment, velocity analysis was performed using the scvelo (93) package in the "dynamical" mode for computing velocities.

Visium spatial transcriptomic data analysis
Our previously published Visium spatial transcriptomic lung data from 12, 14, 16, 17, 19 and 20  We then imported previously calculated cell type abundance scores (produced by cell2location (75)) and scaled the scores across voxels per cell type.The scaled scores were used for one-way (rows) hierarchical clustering by Seaborn (76).The columns are arranged based on the annotated spatial regions.

Maternal cell inference
Maternally-derived cells were inferred using two methods.The libraries were fed into the souporcell (73) pipeline both with and without known genotypes specified, in order to cluster cells into fetal and maternal cells based on genetic background.Additionally, we picked male samples (high expression of DDX3Y) and evaluated the expression levels of XIST cell type by cell type.Neither method suggested evidence of prominent maternal cell presence.

Age-dependent T-cell differential gene expression analysis
To characterize maturation of T cells through fetal and early pediatric life, we tested for differential gene expression in naive T cells from healthy pediatric donors' PBMC samples from Yoshida et al., 2022 (26).
In nonproductive TCR analysis, all_contig_igblast_db-all.tsv from dandelion was used as input for αβTCR; and hiconf_contig_igblast_db-all.tsv was used for γδTCR.scRNA-seq data was integrated with αβTCR and γδTCR data with ddl.pp.check_contigs(productive_only = False).The majority of 'nonproductive' contigs identified were contigs with J and C gene segments, without any V segment.
For differential cell abundance analysis on neighborhoods we used the Milo framework(79), as implemented in the python package milopy v0.1.0.For this analysis, we excluded lung samples older than 17 pcw, to match the age range in the cross-tissue dataset, and tested for significant increase or decrease in cell numbers in lung samples compared to all other tissues.The effect of FACS sorting protocol on cell abundance was accounted for in the differential abundance model.For differential expression analysis on lung-enriched ILC and NK neighborhoods, we label cells belonging to ILC or NK lung-enriched neighborhoods and compare them to cells belonging to other ILC and NK neighborhoods.We performed differential expression analysis pseudo-bulking by sample using the R package glmGamPoi.When selecting DE genes, we excluded genes that were found to be differentially expressed between lungenriched and other neighborhoods in a set of control cell types where we don't expect to see lung-specific signatures (T progenitors, HSC/MPP, immature B cells, Treg).

Organoid maintenance
Human embryonic lung organoids were derived and maintained as previously described (4).Briefly, lung lobes were incubated in Advanced DMEM/F12 medium supplemented with 8 U/ml dispase (ThermoFisher Scientific, Gibco) for 2 min at room temperature.Mesenchyme was carefully dissected away using tungsten needles.Epithelial tips were micro-dissected by cutting the very end of branching epithelial tubes under a dissection microscope.Tips were transferred into 25 μl Matrigel (Corning, 356231) and plated in a 48 well low-attachment plate (Greiner).The plate was incubated for 5 min at 37°C to solidify the Matrigel before adding 250 μl of self-renewing (SR) medium (Table S6) per well.The plate was incubated under standard tissue culture conditions (37°C, 5% CO2).Organoids were cultured in Matrigel (Corning, 356231) with self-renewing medium.Medium was changed twice a week, and organoids were passaged every 10-14 days.

Recovering organoids from Matrigel for RNA/protein extraction or immunostaining
After removing culture medium from organoids and washing with PBS, 1 ml ice-cold Matrigel Cell Recovery Solution (Corning, 354253) was added per well and incubated for 45 min at 4°C.Organoids were washed with ice-cold PBS, centrifuged at 200 g, 4°C, then utilized in subsequent experiments.For RNA extraction, organoids were lysed in 350 μl RLT buffer and total RNA was extracted using the RNeasy Mini Kit (Qiagen), according to manufacturer's instructions.RNA yields were measured using Nanodrop 1000.

Organoid qPCR
0.5 μg organoid RNA was reverse transcribed using qScript cDNA Supermix (Quantabio) and cDNA was used for qPCR using Power SYBR Green Master Mix (Life Technologies Ltd), following the manufacturers' protocols (primers in Table S4).Relative gene expression was calculated using the ΔΔCT method relative to GAPDH control.P-values were obtained using an unpaired two-tailed student's t-test or one-way ANOVA.

Organoid dissociation and colony forming assay
Organoids were transferred to 15 ml tubes, resuspended in 1 ml TrypLE Express (Gibco, 1265036) and incubated at 37°C for 10 min, triturating gently every 2 min.9 ml 10% FBS was added, followed by centrifugation at 4°C (300g, 5 min).The pellet was resuspended in 1% BSA in HBSS and filtered through 40μm cell strainers into 50 ml tubes.The single-cell suspension was centrifuged as before and resuspended in HBSS.Cells were counted using Trypan Blue, to check single-cell efficiency and viability, and were used for scRNA-seq or colony forming assay.Dissociated organoids were seeded into a 96-well plate (500 cells/well)(Greiner) in a 50 μl Matrigel droplet.100 μl SR medium supplemented with 10 nM Rho kinase inhibitor Y-27632 (Merck, 688000) was added for 2 days, and then changed to SR medium supplemented with either 10 ng/ml IL-1β or 100 ng/ml IL-1Ra and cultured for a further 7 days before counting and calculating organoid size and organoid forming efficiency using ImageJ.

Whole mount staining of organoids or embryonic lung tissue
Organoids or embryonic lung tissue were fixed with 4% PFA at 4°C for 30 min or 2 hours respectively.
After fixation and washing in PBS, organoids or tissue were transferred to a 48-well plate for staining.Permeabilization was performed in 0.5% (v/v) Triton-X100 in PBS for 30 min at room temperature, followed by blocking in 1% BSA, 5% normal donkey serum, 0.2% Triton-X100 in PBS (blocking solution) for 1 hr at room temperature.Organoids or tissue were incubated with primary antibodies in blocking solution overnight at 4°C.After washing, secondary antibodies (1:1000 in blocking solution) were added and incubated overnight at 4°C.After washing, DAPI was added for 30 min at 4°C, followed by clearing/mounting using 2'-2'-thio-diethanol (Sigma, 166782), as previously described (4).
pcw human fetuses (Fig S4A)(11) were imported and combined using Scanpy.The feature genes mentioned in He et al., 2022 were used for dimension reduction and clustering.Voxels with fewer than 10 feature genes detected were discarded.BBKNN was used (batch_key='library_id', neighbors_within_batch = 3, trim = 200, approx=False, metric='euclidean', n_pcs=100) to minimize the contribution of batch effects of Visium datasets to the downstream analyses.UMAP embedding and Leiden clustering with curated subclustering were performed, followed by manual annotations of the clusters into spatial regions.
Differential expression across age within these naive CD4 or CD8 T cells was tested by fitting a gamma poisson generalized linear model on log2 transformed age and by creating pseudo-bulks of each donor with the glmGamPoi package.Fetal and pediatric expression of the top 25 most significant upregulated and downregulated genes in pediatric PBMC and lung T cells are shown in Fig S6F.For validation, we also included CD4 and CD8 T cells extracted from postnatal human lung samples (27) in this figure, showing the same genes.The CD4 and CD8 T cells were defined by label transfer using celltypist (77) and our previous human fetal lung annotations (11).

Figure S6 .
Figure S6.Comparison of fetal lung immune cells with published data.(A,B) CellTypist-predicted