Common integration sites of published datasets identified using a graph-based framework / Vasciaveo, Alessandro; Velevska, Ivana; Politano, Gianfranco Michele Maria; Savino, Alessandro; Schmidt, Manfred; Fronza, Raffaele. - In: COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL. - ISSN 2001-0370. - ELECTRONIC. - 14:(2016), pp. 87-90. [10.1016/j.csbj.2015.11.004]

Common integration sites of published datasets identified using a graph-based framework

Vasciaveo, Alessandro; Velevska, Ivana; Politano, Gianfranco Michele Maria; Savino, Alessandro
2016

Abstract

Next-generation sequencing has dramatically increased the genomic data available for characterizing integration sites (IS). A single experiment can now investigate several thousand viral integration targets in the genome to define genomic hot spots. In a previous article, we reworked the formal analysis of common integration sites (CIS), replacing its rigid fixed-window demarcation with a more flexible definition grounded on graphs. Here, we present a selection of supporting data related to the graph-based framework (GBF) from that article, in which a collection of CIS was identified in six published datasets. In this work, we focus on two previously discussed datasets, ISRTCGD and ISHIV. Moreover, we show in more detail the workflow design from which the datasets originate.
Files in this record:
File: 1-s2.0-S2001037015000513-main (1).pdf
Access: open access
Description: Main paper
Type: 2. Post-print / Author's Accepted Manuscript
License: Creative Commons
Size: 806.06 kB
Format: Adobe PDF
Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2624363