Congresso Brasileiro de Microbiologia 2023
Abstract: 1475-1

Varsmetagen: An online platform for microbial NGS data analyses, enabling precision medicine in the diagnosis of infectious diseases

Authors:
Raquel Riyuzo (HIAE - Hospital Israelita Albert Einstein) ; Deyvid Amgarten (HIAE - Hospital Israelita Albert Einstein) ; Tania Mangolini (HIAE - Hospital Israelita Albert Einstein) ; Erick Dorlass (HIAE - Hospital Israelita Albert Einstein) ; Ana Carolina Soares (HIAE - Hospital Israelita Albert Einstein) ; Vitor Riyuzo (HIAE - Hospital Israelita Albert Einstein) ; Veronica Silva (HIAE - Hospital Israelita Albert Einstein) ; Michel Chieregato (HIAE - Hospital Israelita Albert Einstein) ; Taina Floriano (HIAE - Hospital Israelita Albert Einstein) ; Rogerio Conolly (HIAE - Hospital Israelita Albert Einstein) ; Mikely Saraiva (HIAE - Hospital Israelita Albert Einstein) ; Fernanda Malta (HIAE - Hospital Israelita Albert Einstein) ; Rodrigo Reis (HIAE - Hospital Israelita Albert Einstein) ; Marcel Caraciolo (HIAE - Hospital Israelita Albert Einstein) ; Murilo Cervato (HIAE - Hospital Israelita Albert Einstein)

Abstract:
Sequencing technologies have evolved at an astonishing rate over the past 20 years. As a consequence, the cost of sequencing one million base pairs (Mbp) has decreased to the point where sequencing is accessible to the general public, and important applications in disease diagnostics are now routine. Next-Generation Sequencing (NGS) is now widely used to diagnose human hereditary and somatic diseases (cancer), with plenty of open-source and commercial platforms available to simplify analyses for test providers. However, the scenario is quite different for platforms that implement microbial bioinformatics pipelines and provide user-friendly results and clinical reports. In this work, we present the Varsmetagen platform for the analysis and interpretation of microbial NGS data, which allows users to execute bioinformatics pipelines on cloud infrastructure, interpret results, and easily generate reports. A user-friendly online interface guides users through the following steps: 1 - Create runs, upload files (FASTQ, FASTQ.gz), and assign patient information to individual samples. 2 - Select one of three base bioinformatics pipelines (targeted 16S/ITS microbiome, virome metatranscriptomics, or shotgun metagenomics) to run on the samples. 3 - Inspect quality control metrics and approve or reject samples. 4 - Inspect charts and tables with diversity information and abundance estimates. 5 - Interpret results to select and validate potential pathogens linked to patient symptoms. 6 - Generate PDF reports, which may be used for clinical or publication purposes. On the processing back end, we have implemented the three base pipelines in Workflow Description Language (WDL), orchestrated by a Cromwell server on AWS cloud infrastructure. To validate the platform and exemplify its use in clinical diagnosis, we present data and results from the clinical validation of a routine Virome test following College of American Pathologists (CAP) guidelines.
This test was developed for unbiased identification of RNA viruses in plasma samples, addressing existing gaps in the identification of emerging viruses and the diagnosis of syndromic diseases. Briefly, the pipeline takes raw reads from FASTQ or FASTQ.gz files, performs quality control to remove poor-quality sequences, removes host contamination by mapping reads to the human reference genome, performs taxonomic identification of the remaining sequences using Kraken2 with a customized database, and normalizes the Kraken2 counts by applying the Bayesian method implemented in Bracken. Additionally, short sequences are assembled into contigs using SPAdes for a second round of taxonomic identification, and several metrics are generated to help the analyst interpret and evaluate findings. Thirty-seven samples with previous RT-PCR results were sequenced and submitted to the platform. Most pathogens were correctly identified, yielding an accuracy of 96.6%. Reproducibility for samples sequenced in triplicate runs yielded 100% agreement (inter- and intra-assay). Finally, the limit of detection of the test was found to be 1,000 molecules per mL, at which sensitivity was determined to be 100%. Varsmetagen is provided in a Software as a Service (SaaS) format for commercial use and is free of charge for academic and non-profit users (https://varsomics.com/en/varsmetagen/).
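
As an illustration of the taxonomic identification step described above, the following Python sketch parses a report in the standard six-column, tab-separated Kraken2 format and lists the species whose clade read count passes a minimum threshold. It is not the Varsmetagen implementation: the function names, the read threshold, and the example report lines are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class TaxonCount:
    percent: float     # percentage of reads in the clade rooted at this taxon
    clade_reads: int   # reads assigned to this taxon or any of its descendants
    direct_reads: int  # reads assigned directly to this taxon
    rank: str          # rank code, e.g. "D" for domain, "S" for species
    taxid: int         # NCBI taxonomy identifier
    name: str          # scientific name with report indentation stripped


def parse_kraken2_report(lines: Iterable[str]) -> List[TaxonCount]:
    """Parse lines of a standard six-column Kraken2 report."""
    taxa = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        pct, clade, direct, rank, taxid, name = fields
        taxa.append(
            TaxonCount(
                percent=float(pct),
                clade_reads=int(clade),
                direct_reads=int(direct),
                rank=rank.strip(),
                taxid=int(taxid),
                name=name.strip(),
            )
        )
    return taxa


def species_above(taxa: List[TaxonCount], min_reads: int) -> List[TaxonCount]:
    """Return species-level taxa with at least min_reads clade reads, most abundant first."""
    hits = [t for t in taxa if t.rank == "S" and t.clade_reads >= min_reads]
    return sorted(hits, key=lambda t: t.clade_reads, reverse=True)


# Invented report lines for illustration only; these are not data from the study.
EXAMPLE_REPORT = [
    " 90.00\t9000\t9000\tU\t0\tunclassified",
    " 10.00\t1000\t0\tR\t1\troot",
    "  9.00\t900\t0\tD\t10239\t  Viruses",
    "  8.00\t800\t800\tS\t12637\t        Dengue virus",
    "  0.02\t2\t2\tS\t64320\t        Zika virus",
]

# Hypothetical threshold of 10 reads; a real assay would set this during validation.
candidates = species_above(parse_kraken2_report(EXAMPLE_REPORT), min_reads=10)
```

With a real run, the same functions could be applied to the lines of the file written by Kraken2's `--report` option. Candidates found this way would still need review by an analyst, as in step 5 of the platform workflow.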

Keywords:
clinical metagenomics, microbiome, bioinformatics, NGS, disease diagnostics