EGAsubmitter: A software to automate submission of nucleic acid sequencing data to the European Genome-phenome Archive / Viviani, Marco; Montemurro, Marilisa; Trusolino, Livio; Bertotti, Andrea; Urgese, Gianvito; Grassi, Elena. - In: FRONTIERS IN BIOINFORMATICS. - ISSN 2673-7647. - 3:(2023), pp. 1-5. [10.3389/fbinf.2023.1143014]

EGAsubmitter: A software to automate submission of nucleic acid sequencing data to the European Genome-phenome Archive

Montemurro, Marilisa;Urgese, Gianvito;Grassi, Elena
2023

Abstract

Making raw data available to the research community is one of the pillars of Findability, Accessibility, Interoperability, and Reuse (FAIR) research. However, the submission of raw data to public databases still involves many manual procedures that are intrinsically time-consuming and error-prone, which raises potential reliability issues for both the data themselves and the accompanying metadata. For example, submitting sequencing data to the European Genome-phenome Archive (EGA) is estimated to take 1 month overall, and mainly relies on a web interface for metadata management that requires manual completion of forms and the upload of several comma-separated values (CSV) files, which have no formally defined structure. To tackle these limitations, here we present EGAsubmitter, a Snakemake-based pipeline that guides the user through all the submission steps, from file encryption and upload to metadata submission. EGAsubmitter is expected to streamline the automated submission of sequencing data to EGA, minimizing user errors and ensuring higher end-product fidelity.
Files in this record:
fbinf-03-1143014.pdf
Access: open access
Description: Main file
Type: 2a Post-print, publisher's version / Version of Record
License: Creative Commons
Size: 716.87 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2978575