Optimizing Splicing Junction Detection in Next Generation Sequencing Data on a Virtual-GRID Infrastructure / Terzo, Olivier; Mossucca, L; Acquaviva, Andrea; Abate, Francesco; Ficarra, Elisa; Provenzano, R. - (2012). (Paper presented at The 6th International Conference on Bioinformatics and Biomedical Engineering, held in Shanghai, 17-20 May 2012).

Optimizing Splicing Junction Detection in Next Generation Sequencing Data on a Virtual-GRID Infrastructure

TERZO, OLIVIER; ACQUAVIVA, ANDREA; ABATE, FRANCESCO; FICARRA, ELISA
2012

Abstract

The new protocol for sequencing the messenger RNA in a cell, named RNA-seq, produces millions of short sequence fragments. Next Generation Sequencing technology allows more accurate analysis but increases the need for computational resources. This paper describes the optimization of an RNA-seq analysis pipeline devoted to splicing variant detection, aimed at reducing computation time and providing a multi-user/multi-sample environment. This work makes two main contributions. First, we optimized a well-known algorithm called TopHat by parallelizing some sequential mapping steps. Second, we designed and implemented a hybrid virtual GRID infrastructure that efficiently executes multiple instances of TopHat running on different samples or on behalf of different users, thus optimizing the overall execution time and enabling a flexible multi-user environment.
Files in this item:
iCBBE2012_BIO.pdf (open access)
Type: 1. Preprint / submitted version [pre-review]
License: Public - All rights reserved
Size: 138.45 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2503460