Detection and Characterization of a De Novo Alu Retrotransposition Event Causing NKX2‐1‐Related Disorder

Heterozygous NKX2‐1 loss‐of‐function variants cause combinations of hyperkinetic movement disorders (MDs, particularly childhood‐onset chorea), pulmonary dysfunction, and hypothyroidism. Mobile element insertions (MEIs) are potential disease‐causing structural variants whose detection in routine diagnostics remains challenging.


Introduction
NKX2-1 is a homeobox gene encoding thyroid transcription factor-1 (TTF1), a key regulator of tissue-specific gene expression mainly involved in thyroid, lung, and ventral forebrain morphogenesis. 1 Heterozygous NKX2-1 loss-of-function variants account for a clinical spectrum that includes childhood-onset chorea, hypothyroidism, and pulmonary dysfunction, in isolation or in any combination, with brain-lung-thyroid (BLT) syndrome being the most severe phenotype. 2,3 Other possible manifestations include motor delay, dystonia, ataxia, urinary tract abnormalities, and malignancies. 4,5 Pituitary abnormalities on magnetic resonance imaging (MRI) have been recognized in 13% of NKX2-1-variant carriers and 26% of BLT syndrome patients reported thus far. 6 NKX2-1 point and copy number variants were detected in only 26.7% of patients in a large series of benign hereditary chorea/BLT syndrome. 2
Mobile elements are discrete DNA segments constituting about two-thirds of the human genome. 7 Evolutionarily, they have been able to mobilize between genomic regions via a direct cut-and-paste mechanism (transposons) or target-primed reverse transcription of an RNA intermediate (retrotransposons). 8 Among retrotransposons, a small fraction of long (LINEs) and short interspersed nuclear elements (SINEs, which include Alu elements), as well as SINE-variable number of tandem repeat (VNTR)-Alu (SVA) elements, retain intragenomic transpositional competence and can cause human genetic diseases because of their activity. 8-15 Detection of mobile element insertions (MEIs) in routine diagnostics remains challenging because of their intrinsic characteristics (eg, size, highly repetitive sequence) and requires running dedicated variant-calling algorithms on short-read sequencing (SRS) data. 16-20
We identified and characterized the first Alu-SINE retrotransposition event in NKX2-1 associated with infancy-onset levodopa-responsive choreo-dystonia in two first-degree relatives (Fig. 1A) for whom no molecular diagnosis had previously been established by Sanger sequencing (SS) of full-length NKX2-1 and canonical pipelines for whole-exome (WES) and short-read whole-genome sequencing (SR-WGS) data analysis.

Proband (II:2)
This 46-year-old white British woman was born full-term from extended traumatic breech delivery. She was diagnosed with respiratory distress syndrome and required neonatal intensive care. She experienced frequent respiratory infections during infancy. Her motor milestones were delayed (age sitting: 12 months, standing with support: 2 years, walking with rollator: 2.5-3 years). She manifested limb and truncal twitching movements from age 1. She was diagnosed with athetoid cerebral palsy at age 2. During adolescence, a trial of levodopa (100 mg thrice per day) resulted in gait improvement and allowed her to walk using crutches. Her past medical history included kyphoscoliosis, urinary tract infections, mixed urinary incontinence and mild urinary hesitancy since childhood, dyspnea on exertion requiring inhalers since her 30s, and one episode of acute urinary retention at about age 35. Subclinical hypothyroidism was diagnosed at age 45, which led to initiation of levothyroxine. She was born to non-consanguineous parents. Her father (I:1), mother (I:2), and two brothers were healthy, whereas her only daughter was similarly affected. On examination (Video S1), she had generalized chorea with dystonic posturing of her fingers and legs, particularly when walking, hyperreflexia, and impaired postural reflexes. Brain MRI revealed a cystic lesion of the neurohypophysis (Fig. 1B). Peripheral smear for acanthocytes, phenylalanine loading test, and cerebrospinal fluid analysis of pterins and monoamine metabolites were normal. Electroencephalogram and nerve conduction study/electromyogram were unremarkable. Muscle biopsy revealed slight type I fiber predominance, one ragged red fiber with increased lipid staining, one cytochrome oxidase-deficient fiber, and occasional endomysial T cells. Neuropsychometry was within normal limits. SS of NKX2-1 was negative.

Proband's Daughter (III:1)
This 15-year-old white British girl was born full-term from uncomplicated elective caesarean section and was diagnosed with congenital talipes equinovarus at birth. She was noted to be fidgety and have poor control of her posture during early childhood. She manifested delayed motor milestones (age sitting: 16 months, walking with Kaye walker: 2 years). Her gait improved after levodopa initiation at age 2. She walked with crutches at age 6, started walking unsupported at age 10, and became fully independent with walking at age 13. Brain MRI showed prominent pituitary fossa and gland, with an isointense lesion located behind the anterior pituitary gland (Fig. 1B). Her past medical history included scoliosis, recurrent bronchiolitis during childhood, stress incontinence requiring toilet training at age 3, and oxybutynin since early childhood. Her thyroid function was normal at age 14. On examination (Video S1), she had generalized chorea with dystonic posturing of her neck, fingers, and legs, particularly when walking, hyperreflexia, and impaired postural reflexes.

Methods
All subjects provided written informed consent. Blood for genomic DNA and total RNA extraction was obtained from subjects I:1, I:2, II:2, and III:1. Skin biopsies were collected from subjects I:2, II:2, and III:1. WES was performed in subject II:2 and SR-WGS in all subjects (Illumina Technologies, San Diego, California, USA). Canonical pipelines, including de novo analysis, were applied to WES and SR-WGS data. The proband's biological parenthood was ascertained as described elsewhere. 21 The germline MEI detection tool SCRAMble (https://github.com/GeneDx/scramble) was run on the proband's WES BAM files with default settings. 18,19 The Solve-RD WES cohort was used to estimate allele frequency, which showed that this event was ultra-rare, occurring in 1 of 21,780 alleles. 22 Neither this nor other structural variants (SVs) in NKX2-1 were detected in gnomAD SV 2.1 (accessed on October 10, 2022). All subjects' SR-WGS BAM files were inspected for the presence of the candidate MEI using Integrative Genomics Viewer (IGV). 23 Polymerase chain reaction (PCR) primers were designed to amplify the NKX2-1 region carrying the identified MEI (Fig. 2A). PCR and agarose gel electrophoresis of PCR products were performed. DNA fragments were excised from the agarose gel and purified using the Monarch DNA Gel Extraction Kit (New England Biolabs, Ipswich, Massachusetts, USA). Purified DNA was then Sanger sequenced. DNA purified from gel bands predicted to carry the MEI was cloned in competent Escherichia coli using a TOPO TA Cloning Kit (Invitrogen, Waltham, Massachusetts, USA), and SS of recombinant plasmids was performed using the same primers. Annotation was carried out by performing a BLAST comparison between the rebuilt sequence and RepeatMasker data. 24 Expasy Translate (https://web.expasy.org/translate/) allowed prediction of the NKX2-1 amino acid sequence generated by the Alu insertion. 25 Formalin-fixed paraffin-embedded skin biopsies underwent p62 and NKX2-1 immunostaining. Real-time PCR (RT-PCR) of NKX2-1 transcripts from whole blood, agarose gel electrophoresis of RT-PCR products, and immunoblotting and mass spectrometry (MS/MS) of proteins from lysate of whole-blood mononuclear cells isolated using Ficoll gradient centrifugation 26 were also performed (see Supplementary Data).
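The read-level signal exploited by soft-clip-based MEI detection can be illustrated with a minimal Python sketch. This is not SCRAMble's implementation: the function names (`clip_breakpoints`, `clip_clusters`), the `min_reads` threshold, and the toy reads are assumptions made for illustration only. The sketch parses SAM-style POS and CIGAR fields and counts how many reads are soft-clipped at the same reference coordinate, which is the kind of cluster that a dedicated caller would then compare against mobile element consensus sequences.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clip_breakpoints(pos, cigar):
    """Return the 1-based reference coordinates at which a read is soft-clipped.

    pos   -- SAM POS field (1-based leftmost aligned reference base)
    cigar -- SAM CIGAR string
    Hard clips (H) are ignored because their bases are absent from the read.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return []
    points = []
    if ops[0][1] == "S":
        # Left clip: the clipped bases sit immediately before POS.
        points.append(pos)
    if len(ops) > 1 and ops[-1][1] == "S":
        # Right clip: first reference base after the aligned block.
        ref_len = sum(n for n, op in ops if op in "MDN=X")
        points.append(pos + ref_len)
    return points


def clip_clusters(reads, min_reads=2):
    """Count soft-clip breakpoints and keep those supported by >= min_reads reads.

    reads -- iterable of (pos, cigar) tuples (hypothetical toy input here).
    """
    counts = Counter()
    for pos, cigar in reads:
        for breakpoint in clip_breakpoints(pos, cigar):
            counts[breakpoint] += 1
    return {bp: n for bp, n in counts.items() if n >= min_reads}


# Toy reads (invented coordinates): three reads are clipped at position 150.
toy_reads = [(100, "50M25S"), (120, "30M45S"), (150, "20S55M"), (90, "75M")]
print(clip_clusters(toy_reads))  # {150: 3}
```

In real data the left-clipped and right-clipped clusters of an insertion are typically offset from each other by the length of the target site duplication, so a production tool would pair nearby clusters rather than require identical coordinates.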

Results
Standard analysis of WES/SR-WGS data did not reveal candidate variants in genes associated with movement disorders (MDs). 22 Genetic kinship analysis inferred a parent-offspring relationship between the proband and the known parents (kinship coefficient proband-father: 0.246, proband-mother: 0.247). 21 SCRAMble identified a 46-bp Alu sequence inserted on the antisense strand of NKX2-1 at genomic position chr14:36987132 (GRCh37). Twenty-seven reads covering the insertion site contained soft-clipped parts, forming a cluster with 100% consensus alignment quality to the Alu sequence. 19 The candidate MEI was visually identified through IGV in NKX2-1 exon 3 in subjects II:2 and III:1 only (Fig. 1C,D). 23 Gel electrophoresis of PCR products revealed two bands in affected family members and one band in unaffected family members and in one healthy control (Fig. 2B). SS of the ~700-bp DNA fragment shared by all subjects showed the wild-type NKX2-1 sequence (Fig. 2C). Because direct SS failed, the ~1000-bp DNA fragment detected in II:2 and III:1 was sequenced after plasmid cloning, which revealed a 347-bp Alu element insertion with a 65-bp poly-A tail and 16-bp direct repeats on both sides of the element, corresponding to target site duplications (Fig. 2D). 27,28 This Alu sequence corresponds to an AluYa5 element with 0.3% divergence from the respective consensus sequence based on RepeatMasker. 24 The predicted mutant amino acid sequence resulting from the Alu insertion revealed a premature stop codon 31 amino acids downstream of the insertion. 25 Functional analyses are detailed in the Supplementary Data.
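The kinship coefficients reported above can be read against commonly used inference ranges, as in the following minimal Python sketch. The cut-offs are those documented for the KING software (first-degree: 0.177 to 0.354, with an expected value of 0.25 for parent-offspring pairs); whether the method cited in the text (reference 21) applies these exact thresholds is an assumption, and the function name `classify_kinship` is hypothetical.

```python
def classify_kinship(phi):
    """Map a kinship coefficient to a relationship class.

    Thresholds follow the inference ranges documented for KING; they are
    an assumption here, not necessarily those of the cited method.
    """
    if phi > 0.354:
        return "duplicate/monozygotic twin"
    if phi >= 0.177:
        return "first-degree"
    if phi >= 0.0884:
        return "second-degree"
    if phi >= 0.0442:
        return "third-degree"
    return "more distant or unrelated"


# Coefficients reported for the proband-father and proband-mother pairs.
for label, phi in [("proband-father", 0.246), ("proband-mother", 0.247)]:
    print(label, phi, classify_kinship(phi))
```

A first-degree coefficient alone does not separate parent-offspring pairs from full siblings; tools usually make that distinction with an additional statistic such as the proportion of sites sharing no alleles, which this sketch does not model.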

Discussion
We identified a de novo Alu insertion in exon 3 of NKX2-1 segregating with infancy-onset levodopa-responsive choreo-dystonia, respiratory and urinary dysfunctions, and MRI abnormalities of the pituitary in two first-degree relatives and with subclinical hypothyroidism in one of them. Deep phenotyping fueled firm clinical suspicion of NKX2-1-related disorder and prompted additional genetic analyses despite previous extensive candidate-gene and hypothesis-free testing being unrevealing. The novel variant ENST00000354822.7 (NKX2-1):c.556_557insAlu541_556dup p.(Leu186Argfs*32) is pathogenic per the ACMG/AMP guidelines (PVS1; PS2; PM2; PP1), 29 with the Alu sequence identifying an AluYa5 element. 24 SINE-Alu elements are primate-specific retrotransposons originating from a 5′-to-3′ fusion of the 7SL RNA gene and amplified throughout the human genome up to >1 million copies over ~65 million years. 27,30,31 Being nonautonomous, they require the enzymatic machinery of the autonomous LINE-1 retroelements for their intragenomic mobilization. 32 Although most Alu elements map to noncoding regions, mounting evidence indicates that they can induce insertional mutagenesis and modulate gene expression posttranscriptionally, thus representing a wide source of potential disease-causing SVs and regulatory functions. 8,13-15,33 In particular, AluY retrotransposons are among the evolutionarily youngest Alu subfamilies and include some mobilization-competent elements, 11 as proven by the new retrotransposition event characterized here.
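The arithmetic encoded in the variant description can be checked with a short Python sketch. It assumes the standard HGVS convention that, in a frameshift such as fs*32, counting starts at the first changed amino acid (position 1), so the new stop codon lies 31 codons further on. The helper names (`frameshift_stop`, `dup_length`) and the regular expressions are illustrative assumptions, not a full HGVS parser.

```python
import re


def frameshift_stop(protein_hgvs):
    """Extract the position of the first changed residue and the new stop codon.

    Under HGVS, 'fs*N' counts the first changed amino acid as position 1,
    so the stop codon sits at first_changed + N - 1.
    """
    match = re.search(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})fs\*\s*(\d+)", protein_hgvs)
    if match is None:
        raise ValueError(f"no frameshift found in {protein_hgvs!r}")
    first_changed = int(match.group(2))
    frame_length = int(match.group(4))
    return {
        "first_changed": first_changed,
        "stop_position": first_changed + frame_length - 1,
        "codons_after_first_change": frame_length - 1,
    }


def dup_length(cdna_hgvs):
    """Length in bp of a duplicated interval written as 'START_ENDdup'."""
    match = re.search(r"(\d+)_(\d+)dup", cdna_hgvs)
    if match is None:
        raise ValueError(f"no duplication found in {cdna_hgvs!r}")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1


print(frameshift_stop("p.(Leu186Argfs*32)"))
print(dup_length("c.556_557insAlu541_556dup"))
```

Applied to the variant reported here, the duplicated interval c.541_556 spans 16 bp, matching the 16-bp target site duplications seen on sequencing, and the frameshift places the stop codon 31 triplets after the first changed codon (position 217 in the mutant protein numbering).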
Our study confirmed the clinical relevance of integrating dedicated algorithms for MEI detection in routine pipelines for SRS analysis. 17-20 Interestingly, SS of full-length NKX2-1 had previously failed to detect the MEI, for two reasons. First, the Alu insertion caused preferential amplification of the shorter wild-type allele, resulting in mutant allele dropout, as shown by our multiple attempts to optimize PCR conditions by increasing the amount of starting DNA, minimizing the cycle number, and extending the elongation time. 34 Second, SS of the Alu allele PCR product succeeded only after it was cloned in plasmids, suggesting that the Alu sequence generated secondary structures.
The mechanism by which this Alu insertion caused NKX2-1 haploinsufficiency remains undetermined even after functional tests, most likely because of low-to-absent NKX2-1 expression in whole-blood cells and skin fibroblasts, as documented by public repositories (GTEx, The Human Protein Atlas). 35,36 NKX2-1 is in fact highly and selectively expressed in the thyroid, lung, and pituitary gland, and, to a lesser extent, in restricted brain regions, including the hypothalamus and basal ganglia, 35,36 which is in keeping with the pleiotropic functions of TTF1 and reflects the phenotypic spectrum of NKX2-1-related disorder. 37 In rodent and human thyroid and lung, expression of NKX2-1 is consistent throughout life stages from embryonic development to adult tissues. On the contrary, NKX2-1 expression is found in both diencephalic and telencephalic domains during brain development but not in adult neurons of the basal ganglia. 38 Based on its rebuilt sequence, the Alu insertion is predicted to introduce a premature stop codon 31 nucleotide triplets after the first codon change. 25 Because the Alu exonic insertion is ultimately analogous to a frameshift variant and is located in the last coding exon of NKX2-1, the mutant mRNA is likely to evade nonsense-mediated decay and result in a nonfunctional truncated protein. 39 NKX2-1 was detected on immunoblotting with both NKX2-1 antibodies only when Femto chemiluminescence was used, which confirms extremely low protein concentrations in mononuclear cell lysate. This could explain why semiquantitative analysis of immunoblotting did not show statistically significant differences between affected and unaffected individuals and why MS/MS failed to detect NKX2-1, which is expected to localize in the same spectrum as highly expressed housekeeping proteins (eg, actins) because of its molecular weight (42 kDa) and could therefore be masked by their peaks.
In conclusion, our study highlights the importance of deep phenotyping and clinicogenetic correlation to red-flag false negatives due to intrinsic limitations of genetic testing. 37 It supports the inclusion of dedicated MEI detection pipelines in routine SRS data analysis or the reanalysis of selected MD cases with targeted long-read sequencing to improve molecular diagnostic yield.