Development of workflow for picornavirus genome sequence analysis

Picornaviruses are small, non-enveloped, icosahedral, positive-stranded RNA viruses and are among the most common human pathogens. Clinically important genera for humans include Enterovirus, Hepatovirus, Parechovirus and Cardiovirus. Picornaviral infections range from asymptomatic or mild to fatal disease. The threat these viruses pose to human health is evident in the recurring outbreaks of enteroviruses and parechoviruses in different parts of the world. Next-generation sequencing (NGS) provides an efficient way to detect and identify known or novel micro-organisms. The advantages of NGS are rapid sequencing methods, a high-throughput process and affordable costs. On the other hand, NGS requires advanced technical and computational skills, and the need to standardize bioinformatic tools creates a bottleneck. It is therefore imperative to determine and optimize parameters that provide accuracy at every stage of the NGS workflow.

The aim of this thesis was to develop a rapid, straightforward and user-friendly workflow for the assembly and analysis of picornaviral genomes. The Chipster platform was chosen as the primary test platform. The workflow involved the use of automated analysis pipelines (VirusDetect and the A5 assembly pipeline) and alternative approaches, which included pre-processing of raw data and reference mapping or de novo assembly (Velvet and SPAdes) of picornavirus sequences. Except for de novo assembly and the validation and quality assessment of final outputs, all steps were performed in Chipster. Of these approaches, VirusDetect and reference mapping were not successful. The A5 pipeline for microbial genome assembly was found to be well suited for picornavirus identification. Velvet and SPAdes also performed well, but the Velvet assembler was found to be more computationally demanding and time-consuming. Quality assessment suggested that SPAdes performed relatively better than A5 or Velvet. As the A5 pipeline does not require any parameter settings, it can be used as an initial identification and contig/scaffold generation method for picornaviral sequences. Together with the implementation of de novo assembler(s) on the Chipster platform, a novel, user-friendly NGS workflow for picornavirus sequence assembly can be established.