Clinical metagenomic sequencing


Clinical metagenomic next-generation sequencing (mNGS) is the comprehensive analysis of microbial and host genetic material in samples from patients. It allows identification and genomic characterization of bacteria, fungi, parasites, and viruses directly from clinical specimens, without a priori knowledge of a specific pathogen. The capacity to detect all the potential pathogens in a sample makes mNGS a potent tool in the diagnosis of infectious disease, especially when more directed assays such as PCR fail. However, several limitations remain to be addressed, including clinical utility, laboratory validity, sensitivity, cost and regulatory considerations.

Laboratory workflow

A typical mNGS workflow consists of the following steps:
  1. Negative selection targets and eliminates the host and microbiome genomic background while aiming to preserve the nucleic acid derived from the pathogens of interest. The genomic background can be degraded through broad-spectrum digestion with nucleases, such as DNase I for DNA background, or by removing abundant RNA species using sequence-specific RNA depletion kits. CRISPR-Cas9-based approaches can also be used, for example to target and deplete human mitochondrial RNA. In general, however, subtraction approaches lead to some loss of the targeted pathogen genome, as poor recovery may occur during cleanup.
  2. Positive enrichment is used to increase the pathogen signal rather than to reduce background noise. This is commonly done through hybridization-based target capture, in which probes pull out the nucleic acid of interest for downstream amplification and sequencing. Panviral probes have been shown to identify diverse types of pathogens in different clinical fluid and respiratory samples, and have been used for sequencing and characterization of novel viruses. However, the probe approach adds hybridization and cleanup steps, which require higher sample input, increase the risk of losing the target, and increase cost and hands-on time.
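The trade-off described in the two steps above can be illustrated with a small calculation. The following is a minimal sketch, not a validated model: the function name, its parameters and the example numbers are all hypothetical, and it assumes that depletion and enrichment act as simple multiplicative factors on the amounts of host and pathogen nucleic acid.

```python
def pathogen_read_fraction(pathogen, host, host_removed=0.0,
                           pathogen_lost=0.0, enrichment_fold=1.0):
    """Return the fraction of nucleic acid that is pathogen-derived.

    pathogen, host   -- starting amounts (any consistent unit, e.g. molecules)
    host_removed     -- fraction of host background removed by negative selection
    pathogen_lost    -- fraction of pathogen material lost during cleanup
    enrichment_fold  -- fold increase of pathogen material from positive enrichment
    All values are illustrative assumptions, not measured performance figures.
    """
    if not 0.0 <= host_removed <= 1.0 or not 0.0 <= pathogen_lost <= 1.0:
        raise ValueError("fractions must lie between 0 and 1")
    if enrichment_fold < 0:
        raise ValueError("enrichment_fold must be non-negative")
    remaining_pathogen = pathogen * (1.0 - pathogen_lost) * enrichment_fold
    remaining_host = host * (1.0 - host_removed)
    total = remaining_pathogen + remaining_host
    return remaining_pathogen / total if total > 0 else 0.0


# Hypothetical sample: 1 pathogen molecule per 999,999 host molecules.
baseline = pathogen_read_fraction(1, 999_999)
# Removing 99% of host background while losing half of the pathogen material
# still raises the pathogen fraction, because the background shrinks far more.
depleted = pathogen_read_fraction(1, 999_999, host_removed=0.99, pathogen_lost=0.5)
```

Under these assumptions the pathogen fraction rises after depletion even though half of the target is lost, which is why subtraction can be worthwhile despite imperfect recovery.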
Metagenomic approaches have been used to identify infections in ancient remains, to discover novel viral pathogens, to characterize the human virome in both healthy and diseased states, and for forensic applications.
Specifically, applications of clinical metagenomics to date have included infectious disease diagnostics for a variety of syndromes and sample types, microbiome analyses in both diseased and healthy states, characterization of the human host response to infection by transcriptomics, and the identification of tumour-associated viruses and their genomic integration sites.
Aside from infectious disease diagnostics, adoption of mNGS in clinical laboratories has been slow, and most applications have yet to be incorporated into routine clinical practice. Nonetheless, the breadth and potential clinical utility of these applications are likely to transform the field of diagnostic microbiology in the near future.

Infectious diseases diagnosis

The capacity to detect all potential pathogens in a sample and simultaneously interrogate the host response has great potential utility in the diagnosis of infectious disease. One way to detect these pathogens is to detect part of their genome by metagenomic sequencing, which can be targeted or untargeted.
Targeted
Targeted analysis is done by enriching individual genes or genomic regions. This approach significantly increases the number of pathogen reads in the sequence data. As a result, the sensitivity for detecting the targeted microorganisms usually increases, but at the cost of limiting the breadth of potential pathogens that can be identified.
Untargeted
Untargeted analysis is a metagenomic "shotgun" approach, in which the whole DNA and/or RNA is sequenced using universal primers. The resulting mNGS reads can be assembled into partial or complete genomes. These genome sequences can be used to monitor hospital outbreaks, to facilitate infection control and public health surveillance, and for subtyping.
Untargeted mNGS is the most promising approach for analysing clinical samples and providing a comprehensive diagnosis of infections. Various groups have validated mNGS in Clinical Laboratory Improvement Amendments (CLIA)-certified laboratories for the diagnosis of infections such as meningitis or encephalitis, sepsis and pneumonia.
The traditional method consists of formulating a differential diagnosis on the basis of the patient's history, clinical presentation, imaging findings and laboratory testing. Metagenomic next-generation sequencing offers a different route to diagnosis and is a promising method because a comprehensive spectrum of potential causes can be identified by a single assay.
Below are some examples of the application of metagenomic sequencing in infectious disease diagnosis.

Examples

Diagnosis of meningitis and encephalitis
The traditional approach to diagnosing infectious diseases is challenged in some settings: neuroinflammatory diseases, the lack of diagnostic tests for rare pathogens, and the limited availability and volume of central nervous system samples, which require invasive procedures to obtain. Owing to these problems, metagenomic next-generation sequencing has been proposed as an alternative; it is a promising method because a broad spectrum of potential causes can be identified in a single test.
Some studies have evaluated the clinical usefulness of metagenomic NGS for diagnosing neurologic infections in parallel with conventional microbiologic testing. The highest diagnostic yield resulted from a combination of metagenomic NGS of cerebrospinal fluid (CSF) and conventional testing, including serologic testing and testing of sample types other than CSF.
Moreover, findings from different studies show that neurologic infections remain undiagnosed in a proportion of patients despite conventional testing, and demonstrate the potential usefulness of clinical metagenomic NGS testing in these patients.
The results of metagenomic NGS can also be valuable when concordant with the results of conventional testing, not only providing reassurance that the conventionally obtained diagnosis is correct but also potentially detecting or ruling out coinfections, especially in immunocompromised patients.
Metagenomic NGS is fundamentally a direct-detection method and relies on the presence of nucleic acid from the causative pathogen in the CSF sample.
Study of antimicrobial resistance
Antimicrobial resistance is a health problem that needs to be resolved.
Resistance in microbes is currently detected by antibiotic sensitivity testing, but several studies have found that bacterial resistance is encoded in the genome and can be transferred horizontally, so sequencing methods are being developed to ease the identification and characterization of these genomes and metagenomes. Both genomic and metagenomic sequencing methods are used to detect antimicrobial resistance.
Metagenomic sequencing methods have been reported to give better results than genomic ones because they produce fewer false negatives. Within metagenomic sequencing, functional metagenomics is a powerful approach for characterizing resistomes: a metagenomic library is generated by cloning the total community DNA extracted from a sample into an expression vector, and this library is assayed for antimicrobial resistance by plating on selective media that are lethal to the wild-type host. The selected inserts from the surviving recombinant, antimicrobial-resistant host cells are then sequenced, and the resulting sequences are assembled and annotated.
Functional metagenomics has enabled the discovery of several new antimicrobial resistance mechanisms and their related genes. One example is the recently discovered mechanism of tetracycline resistance by tetracycline destructases.
In conclusion, it is important to record not only the antimicrobial resistance gene sequence and mechanism but also the genomic context, the host bacterial species and the geographic location.

Clinical microbiome analyses

Recently there has been a shift from targeted sequencing of the 16S rRNA gene to the use of mNGS to characterize the microbiome of an individual. This goes hand in hand with rising awareness of the important role of the microbiome in different disease states. Even so, no microbiome-based tests have been approved for the diagnosis or treatment of a disease. This is mainly due to an incomplete understanding of the complexity of the microbiome and how it relates to certain diseases.
The use of mNGS to characterize the microbiome has made possible the development of bacterial probiotics that can be administered as pills, for example as a treatment for Clostridium difficile-associated diseases.
The analysis of bacterial diversity is another application of microbiome sequencing. It can give information on whether a patient's illness is infectious or not. mNGS studies show that patients with culture-proven infection have less diversity in their respiratory microbiome. This is due to dysbiosis, an abnormal alteration of the microbiome, which can also be related to obesity, diabetes mellitus or inflammatory bowel disease. Identifying dysbiosis with mNGS may open a path to treating these conditions by manipulating the microbiome.
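Diversity of this kind is commonly summarized with an index computed from per-taxon read counts. The sketch below uses the Shannon index as one widely used choice; the text does not say which measure the cited studies used, so this is an assumption, and the function name and example counts are hypothetical.

```python
import math


def shannon_diversity(taxon_read_counts):
    """Shannon diversity index H = -sum(p_i * ln p_i) over taxa with reads.

    taxon_read_counts -- iterable of non-negative read counts, one per taxon.
    Higher values indicate a more even, more diverse community.
    """
    counts = [c for c in taxon_read_counts if c > 0]
    if any(c < 0 for c in taxon_read_counts):
        raise ValueError("read counts must be non-negative")
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one taxon must have reads")
    return -sum((c / total) * math.log(c / total) for c in counts)


# Hypothetical respiratory samples: an even community versus one dominated
# by a single organism, as might be seen in a culture-proven infection.
even_community = shannon_diversity([250, 250, 250, 250])
dominated_community = shannon_diversity([970, 10, 10, 10])
```

The dominated sample yields a much lower index than the even one. No diagnostic cut-off is implied here; any threshold would have to come from validated clinical data.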

Human host response analyses

Knowledge of the host response has important potential utility in the diagnosis of infectious diseases. For this reason, one of the most important possible applications of clinical metagenomics is the characterization of the human host response to infection by transcriptomics, together with the identification of tumor-associated viruses and their genomic integration sites.
Studying gene expression makes it possible to characterize many infections, for example infections due to Staphylococcus aureus, Lyme disease, candidiasis, tuberculosis and influenza. This approach can also be used for cancer classification.
mNGS of RNA libraries used for the detection of pathogens such as RNA viruses in clinical samples incidentally produces host gene expression data for transcriptome analyses. No RNAseq-based assays have been clinically validated to date for use in patients, but RNAseq analysis is promising because RNAseq assays can detect active microbial gene expression and might enable discrimination between infection and colonization and between live and dead organisms.
RNAseq analysis has many other applications, such as identifying novel or underappreciated host–microbial interactions directly from clinical samples, making indirect diagnoses on the basis of a pathogen-specific human host response, and discriminating infectious from noninfectious causes of acute illness.

Applications in oncology

In oncology, whole-genome or directed NGS approaches to identify mutated genes can be used to simultaneously detect viruses associated with cancer and/or to collect information about the interaction between the virus and its host.
To date, the US Food and Drug Administration has approved the clinical use of two NGS panels testing for actionable genomic aberrations in tumor samples. The addition of specific viral probes allows detection of reads corresponding to both integrated and exogenous viruses. The data gathered about integrated or active viral infections in cancers are important for preventive or therapeutic treatment with targeted antiviral and chemotherapeutic drugs.
In the future, mNGS of cell-free DNA from liquid biopsy samples, such as plasma, might be useful for the early identification of cancer and the diagnosis of infection in immunocompromised patients.

Challenges

Despite the potential and recent successes of metagenomics, clinical diagnostic applications have lagged behind research advances owing to a number of factors. A complex interplay of microbial and host factors influences human health, as exemplified by the role of the microbiome in modulating host immune responses, and it is often unclear whether a detected microorganism is a contaminant, a colonizer or a pathogen. Additionally, universal reference standards and proven approaches to demonstrate test validation, reproducibility and quality assurance for clinical metagenomic assays are lacking. Cost, reimbursement, turnaround time, regulatory considerations and, perhaps most importantly, clinical utility also remain major hurdles for the routine implementation of clinical mNGS in patient care settings.

Clinical utility

Molecular diagnostic assays provide a fairly cost-effective and rapid means to diagnose the most common infections. However, nearly all conventional microbiological tests in current use detect only one or a limited panel of pathogens at a time, or require that a microorganism be successfully cultured from a clinical sample. By contrast, while NGS assays in current use cannot compete with conventional tests on speed, mNGS could enable a broad range of pathogens to be identified from culture or directly from clinical samples on the basis of uniquely identifiable DNA and/or RNA sequences.
To date, several studies have provided a glimpse into the promise of NGS in clinical and public health settings. Even so, most of the metagenomics outcomes data generated consist of case reports, which belie the increasing interest in diagnostic metagenomics. For example, NGS was used to reach a clinical diagnosis in a 14-year-old critically ill boy with meningoencephalitis; this case was the first to demonstrate the possible utility of metagenomic NGS in providing clinically actionable information, as the successful diagnosis prompted appropriate targeted antibiotic treatment and the eventual recovery of the patient.
Thus, the argument for the clinical utility of metagenomics ultimately rests only on the most difficult-to-diagnose cases or on immunocompromised patients, in whom the spectrum of potential pathogens is greater. Accordingly, this approach has made little headway into the clinical microbiology laboratory, as a metagenomic diagnosis is still essentially useful only in the context of a case report and not for routine daily diagnostics.
Recent cost-effectiveness modelling of metagenomics in the diagnosis of fever of unknown origin concluded that, even after limiting the cost of diagnostic metagenomics to $100–1000 per test, it would require 2.5-4 times the diagnostic yield of in order to be cost neutral and cautioned against ‘widespread rush’ to deploy metagenomic testing. Eventually, mNGS may become cost competitive with multiplexed assays or used as an upfront ‘rule out’ assay to exclude infectious aetiologies. But, for the moment, there is a lack of actionability in clinical metagenomics.
Furthermore, in the discovery of potential novel infectious agents, usually only the positive results are published even though the vast majority of sequenced cases are negative, which results in highly biased information. In addition, most of the metagenomics-based discovery work that preceded the current diagnostic-based work never even mentioned the known agents detected while screening unsolved cases for completely novel causes.
Of course, detection of nucleic acids, either by NGS or multiplexed assays, does not by itself prove that an identified microorganism is the cause of the illness, and findings have to be interpreted in the clinical context. In particular, discovery of an atypical or novel infectious agent in clinical samples should be followed up with confirmatory investigations such as orthogonal testing of tissue biopsy samples and demonstration of seroconversion or via the use of cell culture or animal models, as appropriate, to ascertain its true pathogenic potential.
For all these reasons, the field suffers from a lack of understanding of true clinical utility.

Laboratory validity

To date, most published testing has been run in an unvalidated, unreportable manner. The ‘standard microbiological testing’ that samples are subjected to prior to metagenomics is variable and has not included reverse transcription-polymerase chain reaction testing for common respiratory viruses or, routinely, 16S/ITS PCR testing.
Given the relative costs of validating and performing metagenomic versus 16S/ITS PCR testing, the latter is considered an easier and more efficient option. A potential exception to 16S/ITS testing is blood, given the huge amount of 16S sequence available, which makes clean cutoffs for diagnostic purposes problematic.
Furthermore, almost all of the organisms detected by metagenomics for which there is an associated treatment, and which would thus be truly actionable, are also detectable by 16S/ITS testing. This calls into question the utility of metagenomics in many diagnostic cases.
A key requirement for laboratory validity is the use of reference standards and controls when performing mNGS assays. They are needed to ensure the quality and stability of the technique over time.
Most available metagenomic reference materials are dedicated to specific applications and/or focused on a more limited spectrum of organisms. Thus, these materials may not be applicable to untargeted mNGS analyses.
Custom mixtures consisting of a pool of microorganisms or their nucleic acids can be developed as external controls to establish limits of detection for mNGS testing. Internal spike-in control standards are available for other Next Generation Sequencing applications such as transcriptome analysis by RNA-seq.
Nonetheless, the lack of universally accepted reference standards for mNGS makes it difficult to compare assay performances between different laboratories. There is a critical need for standardized reference organisms and genomic materials to facilitate such comparisons and to define optimal analysis methods.
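As an illustration of how external control mixtures could be used to establish a limit of detection, the sketch below takes the detection results from a dilution series and reports the lowest concentration at which the organism is still detected in at least 95% of replicates. The 95% criterion, the function name and the data layout are assumptions made for this example and are not taken from any specific validation guideline.

```python
def limit_of_detection(results, min_detection_rate=0.95):
    """Estimate a limit of detection from a control dilution series.

    results -- dict mapping concentration (e.g. copies/mL) to a list of
               booleans, one per replicate, True if the organism was detected.
    Returns the lowest concentration such that it and every higher tested
    concentration meet min_detection_rate, or None if none qualifies.
    """
    lod = None
    for concentration in sorted(results, reverse=True):
        calls = results[concentration]
        if not calls:
            raise ValueError(f"no replicates at concentration {concentration}")
        rate = sum(calls) / len(calls)
        if rate < min_detection_rate:
            break  # stop at the first dilution that fails the criterion
        lod = concentration
    return lod


# Hypothetical dilution series with 20 replicates per level.
dilution_series = {
    1000: [True] * 20,
    100: [True] * 20,
    10: [True] * 20,
    1: [True] * 12 + [False] * 8,
}
estimated_lod = limit_of_detection(dilution_series)
```

In this made-up series the estimate is 10, because detection drops to 60% of replicates at the next dilution. Walking down from the highest concentration avoids reporting a low level that passes by chance when an intermediate level has failed.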

Sense and sensitivity

In clinical microbiology labs, the quantitation of microbial burden is considered a routine function, as it is associated with the severity and progression of disease. Good quantitation requires a highly sensitive technique.
A key limitation of metagenomic next-generation sequencing is its decreased sensitivity in samples with high background, which is clinically relevant because the pathogen load in infections can be very low. Whereas interfering substances are a common problem in clinical chemistry and PCR diagnostics, the degree of interference from host or nonpathogen nucleic acids in metagenomics is a new twist. In addition, because the human genome is so much larger than microbial genomes, interference can occur even at low levels of contaminating material.
Another sensitivity challenge for clinical metagenomics is the diagnosis of coinfections: high-titer pathogens can bias results because they may disproportionately soak up reads and make less abundant pathogens difficult to distinguish.
In addition to issues with interfering substances, accurate quantitation and sensitivity are essential in diagnostics because an erroneous result directly affects the patient. For this reason, practitioners currently have to be keenly aware of the index-swapping issues associated with Illumina sequencing, which can cause trace numbers of reads to be assigned to the wrong barcoded sample.
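One simple way to act on this awareness is to compare each low-level detection against what could be expected from swapping alone. The sketch below is a hypothetical heuristic, not an established method: the swap rate and safety factor are illustrative placeholders that a laboratory would need to measure for its own platform and library chemistry.

```python
def flag_possible_index_swaps(read_counts, swap_rate=0.001, safety_factor=2.0):
    """Flag samples whose reads for one organism may stem from index swapping.

    read_counts   -- dict mapping sample name to the number of reads assigned
                     to a single organism, for samples pooled in the same run.
    swap_rate     -- ASSUMED fraction of reads mis-assigned to another sample.
    safety_factor -- ASSUMED margin applied to the expected swapped reads.
    Returns the set of sample names with a non-zero count that is no larger
    than safety_factor times the reads expected to leak in from other samples.
    """
    flagged = set()
    for sample, count in read_counts.items():
        if count == 0:
            continue
        reads_in_other_samples = sum(
            other_count for other, other_count in read_counts.items()
            if other != sample
        )
        expected_swapped = swap_rate * reads_in_other_samples
        if count <= safety_factor * expected_swapped:
            flagged.add(sample)
    return flagged


# Hypothetical run: sample A has a high-titer infection, B shows trace reads.
run_counts = {"A": 100_000, "B": 5, "C": 0}
suspicious = flag_possible_index_swaps(run_counts)
```

Here only sample B is flagged, since its five reads are well within what swapping from sample A could explain under the assumed rate. A flagged result is a prompt for confirmatory testing, not proof of contamination.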
Since metagenomics has typically been used on patients for whom every other test to date has been negative, questions surrounding analytical sensitivity have been less germane. However, because ruling out infectious causes is one of the more important roles for clinical metagenomics, sequencing must be deep enough to achieve adequate sensitivity. One way forward could be the development of novel library preparation techniques.
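The link between sequencing depth and sensitivity can be made concrete with a simple sampling model. The sketch below assumes that reads are drawn independently and that the number of pathogen reads follows a Poisson distribution with mean equal to the total reads times the pathogen fraction; this is a common simplification that ignores library bias and losses, and the function names and numbers are hypothetical.

```python
import math


def detection_probability(total_reads, pathogen_fraction, min_reads=1):
    """Probability of observing at least min_reads pathogen reads.

    Uses a Poisson approximation with mean total_reads * pathogen_fraction.
    """
    if not 0.0 < pathogen_fraction <= 1.0:
        raise ValueError("pathogen_fraction must be in (0, 1]")
    if total_reads < 0 or min_reads < 1:
        raise ValueError("total_reads must be >= 0 and min_reads >= 1")
    mean = total_reads * pathogen_fraction
    # Sum the probabilities of seeing fewer than min_reads pathogen reads.
    p_below = sum(
        math.exp(-mean) * mean ** i / math.factorial(i) for i in range(min_reads)
    )
    return 1.0 - p_below


def reads_needed(pathogen_fraction, target_probability=0.95):
    """Total reads needed to see at least one pathogen read with the given
    probability, under the same Poisson assumption."""
    if not 0.0 < pathogen_fraction <= 1.0:
        raise ValueError("pathogen_fraction must be in (0, 1]")
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must be in (0, 1)")
    return math.ceil(-math.log(1.0 - target_probability) / pathogen_fraction)


# Hypothetical sample in which one read per million derives from the pathogen.
p_at_one_million = detection_probability(1_000_000, 1e-6)
depth_for_95_percent = reads_needed(1e-6, 0.95)
```

Under this model, one million reads give only about a 63% chance of seeing a single pathogen read at that fraction, and requiring several reads before calling a detection raises the depth needed further. This is one reason a negative result from shallow sequencing is weak evidence for ruling out infection.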

Cost considerations

Although there have been substantial cost reductions in the generation of sequence data, the overall per-sample reagent cost for sequencing remains fairly high. In fact, the Illumina monopoly on high-quality next-generation sequencing reagents and the need for accurate and deep sequencing in metagenomics mean that the sequencing reagents alone cost more than FDA-approved syndromic testing panels. Additional direct costs of metagenomics, such as extraction, library preparation and computational analysis, also have to be considered.
This leads to an overall cost of several hundreds to thousands of dollars per sample analysed, which is higher than that for many other clinical tests.
In general, metagenomic sequencing is most useful and cost efficient for pathogen discovery when at least one of the following criteria is met:
  1. the identification of the organism is not sufficient,
  2. a coinfection is suspected,
  3. other simpler assays are ineffective or will take an inordinate amount of time,
  4. screening of environmental samples for previously undescribed or divergent pathogens.

Regulatory considerations

Clinical laboratories are highly regulated, and general laboratory and testing requirements apply to all molecular diagnostic assays reported for patient care. Ongoing monitoring is essential, especially for mNGS assays, in order to verify acceptable performance over time and to investigate atypical findings. Examples of important quality steps are initial sample quality checks, library parameters, sequence data generation, recovery of internal controls and performance of external controls. Validation data from assay development and implementation should be recorded and made available to laboratory inspectors or submitted to regulatory agencies, such as the FDA in the USA or the European Medicines Agency in Europe, for approval.
During mNGS assays the monitoring is accomplished using sample internal controls, intrarun control samples, swipe tests for contamination and periodic proficiency testing. Further studies on unexpected or unusual results are always needed and the identification of microorganisms that have not been identified before in the laboratory should be independently confirmed, usually through clinical reference or public health laboratory testing. Furthermore, it is important to determine the clinical significance of these novel or atypical organisms and these findings should be reported and discussed with health care providers, with consideration for their potential pathogenicity and for further testing and treatment options.

Future perspectives

Technological advances in library preparation methods, sequence generation and computational bioinformatics are leading to quicker and more comprehensive metagenomic analyses at lower cost. Current limitations, such as decreased sensitivity for pathogen detection in clinical samples with a high nucleic acid background or exceedingly low pathogen titres, suggest that mNGS is unlikely to replace conventional diagnostics in the short term, but it could serve as a complementary or even essential test in certain clinical situations.
Although the use of mNGS for informing clinical care has been demonstrated in multiple small case series, nearly all studies have been retrospective, and clinical utility has yet to be established in a large-scale prospective clinical trial. Prospective clinical studies will therefore be critical for understanding when to perform mNGS and how its diagnostic yield compares with that of other methods. It is possible that, over the next 5 years, prospective clinical trial data evaluating the clinical utility and cost effectiveness of mNGS will become available and that overall costs and turnaround time for mNGS will continue to drop.
Furthermore, in a world with constantly emerging pathogens, it is likely that mNGS-based testing will have an essential role in monitoring and tracking new disease outbreaks. As surveillance networks and rapid diagnostic platforms such as nanopore sequencing are deployed globally, it will be possible to detect and contain infectious outbreaks at a much earlier stage, saving lives and lowering costs.