Enrichment of SARS-CoV-2 sequence from nasopharyngeal swabs whilst identifying the nasal microbiome

Respiratory disease is often multifactorial and can be caused by different pathogens interacting synergistically and disrupting microbial ecology [1,2]. The microbiome sampled from a nasopharyngeal swab may reflect the respiratory microbiome and can be characterised using different approaches, including sequencing, culture and arrays [3]. During the first wave of the COVID-19 pandemic, respiratory viruses other than SARS-CoV-2 were often excluded from diagnostics because identification of COVID-19 in patients was prioritised [3]. Whether during a coronavirus pandemic, when analysing legacy samples, or in ‘peacetime’, being able to characterise the totality of a coronavirus infection from a single sample would be advantageous. Such an approach would simultaneously derive the coronavirus genome sequence and allow investigation of how the nasal microbiome or co-infection may influence disease and outcome, especially where sample sets or amounts are limited.

SARS-CoV-2 has a positive-sense, single-stranded RNA genome of approximately 29,900 nucleotides and replicates in the cytoplasm of an infected cell. The 5’ two thirds of the genome is immediately translated into two polyproteins that are proteolytically cleaved to generate a variety of proteins, including those involved in replication [4,5]. The remaining one third of the genome is expressed through the transcription of a nested set of subgenomic RNAs (sgmRNAs) that share a 5’ leader sequence and a polyA tail with the genome [4]. Control of sgmRNA transcription is in part due to the transcription regulatory sequence (TRS) that precedes each gene along the genome [6]. The general architecture of a sgmRNA, 5’ to 3’, is the leader sequence followed by the TRS (together called the leader-TRS), then the gene to be translated, then other genes (depending on the sgmRNA), a non-coding region and a polyA tail. The leader sequence is also found at the 5’ end of the genomic RNA. Detection of SARS-CoV-2 and viral load information from a clinical specimen can involve identification and quantification of the genome and sgmRNAs [7]. The leader-TRS gene junction is also a unique sequence feature that can be used to identify and quantify subgenomic and genomic RNA, particularly using sequencing information (e.g. [8], [9], [10]). The leader-TRS nucleoprotein gene junction is normally the most abundant because, during active infection of a cell by a coronavirus, the sgmRNA encoding the nucleoprotein is the most abundant [4,6].
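The idea of using the leader-TRS gene junction to tell genomic from subgenomic reads can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in this study or in the cited work: the `LEADER`, `TRS` and `DOWNSTREAM` sequences are short toy placeholders rather than real SARS-CoV-2 sequence, the function names are hypothetical, and exact string matching is used where real Oxford Nanopore reads would need error-tolerant alignment.

```python
from collections import Counter

# Toy sequences for illustration only; none of these are real SARS-CoV-2 sequence.
LEADER = "GGTTTATACCTTCC"   # stand-in for the 5' leader shared by genome and sgmRNAs
TRS = "ACGAAC"              # stand-in for the transcription regulatory sequence
DOWNSTREAM = {              # first bases expected immediately after the leader-TRS
    "genomic": "TTTAAAATCTGT",  # leader-TRS continues into the 5' end of the genome
    "S": "AATGTTTGTTTT",        # leader-TRS joined to a (toy) spike gene start
    "N": "AAACTAAAATGT",        # leader-TRS joined to a (toy) nucleoprotein gene start
}


def classify_read(read):
    """Return the junction label for a read, or None if no leader-TRS is found."""
    junction = LEADER + TRS
    start = read.find(junction)
    if start == -1:
        return None  # read does not span a leader-TRS junction
    after = read[start + len(junction):]
    for label, prefix in DOWNSTREAM.items():
        if after.startswith(prefix):
            return label
    return "unassigned"  # leader-TRS present, but the downstream gene is unknown


def count_junctions(reads):
    """Tally leader-TRS junction labels across an iterable of read sequences."""
    counts = Counter()
    for read in reads:
        label = classify_read(read)
        if label is not None:
            counts[label] += 1
    return counts


if __name__ == "__main__":
    example_reads = [
        LEADER + TRS + DOWNSTREAM["N"] + "ACGT",
        "TT" + LEADER + TRS + DOWNSTREAM["N"],
        LEADER + TRS + DOWNSTREAM["genomic"],
        "ACGTACGTACGT",  # no leader-TRS, so this read is ignored
    ]
    print(count_junctions(example_reads))
```

In a tally of this kind, the relative counts per label would give a rough picture of which junctions dominate a sample, with the nucleoprotein junction expected to be the most frequent during active infection.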

In healthy individuals, microbial communities exist in the upper and lower respiratory tract and can consist of the phyla Bacteroidetes, Firmicutes, Proteobacteria, and Actinobacteria (reviewed in [11]). Information on these communities can be inferred by characterising the nasal microbiome. Disruption of the respiratory microbiome can be associated with disease; for example, translocation of gut bacteria to the lung has been associated with acute respiratory distress syndrome (ARDS) [12]. This imbalance or disruption of the respiratory microbiome (or any microbiome) and its association with disease is called dysbiosis. The respiratory microbiome may be perturbed during coronavirus infection, and other co-infections requiring clinical management can be present [13]. During the first wave of the COVID-19 pandemic (at least in the UK experience), respiratory viruses other than SARS-CoV-2 were often excluded from diagnostics because identification of COVID-19 in patients was prioritised [3]. Several studies have characterised the nasal microbiome in patients with SARS-CoV-2 with inconsistent results (e.g. [14], [15], [16]), although reduced abundance of Corynebacterium has been associated with anosmia in patients with COVID-19 [17] and a pattern of dysbiosis has been reported (e.g. [18], [19], [20], [21]).

A broad range of pathogens (viruses, bacteria, fungi, and parasites) can be identified within a clinical sample using random amplification and shotgun sequencing [13,22-25]. Detection of RNA transcripts through metatranscriptomics also gives an indication of the biological activity of the pathogen that is present [22]. One of the most prominent approaches for random amplification is sequence-independent single primer amplification (SISPA). This approach can be effective as an investigative tool for identifying multiple infectious diseases [23,24] and elucidating complex microbiomes [22]. The analysis of legacy samples from the COVID-19 pandemic continues to provide new insights into the evolution and spread of SARS-CoV-2, as well as the disease profile of different variants. Maximising the information obtained from a single sample would be advantageous for characterising infection. In this study, a metatranscriptomic approach based around SISPA was optimised with Oxford Nanopore sequencing to provide detailed sequence and lineage information on SARS-CoV-2 as well as data on the underlying active nasal microbiome, and was validated for use in other human coronavirus infections.
