Combining Homolog and Motif Similarity Data with Gene Ontology Relationships for Protein Function Prediction / Ur Rehman, Hafeez; Benso, Alfredo; Di Carlo, Stefano; Politano, Gianfranco Michele Maria; Savino, Alessandro; Suravajhala, P. - Print. - (2012), pp. 441-444. (Paper presented at the IEEE International Conference on Bioinformatics and Biomedicine (BIBM), held in Philadelphia (PA), USA, 4-7 Oct. 2012) [10.1109/BIBM.2012.6392719].

Combining Homolog and Motif Similarity Data with Gene Ontology Relationships for Protein Function Prediction

Ur Rehman, Hafeez; Benso, Alfredo; Di Carlo, Stefano; Politano, Gianfranco Michele Maria; Savino, Alessandro
2012

Abstract

Uncharacterized proteins pose a challenge not just to functional genomics but to biology in general. Knowledge of the biochemical functions of such proteins is critical for designing efficient therapeutic techniques. The bottleneck in annotating hypothetical proteins is the difficulty of collecting and aggregating enough biological information about the protein itself. In this paper, we propose and evaluate a protein annotation technique that aggregates different kinds of biological information conserved across many hypothetical proteins. To increase prediction accuracy, we incorporate term-specific relationships based on the Gene Ontology (GO). Our method combines protein-protein interaction (PPI) data, protein motif information, protein sequence similarity, and protein homology data with a context similarity measure based on the Gene Ontology to infer functional information for unannotated proteins. We apply our method to Saccharomyces cerevisiae proteins. Aggregating different sources of evidence with GO relationships increases the precision and accuracy of prediction compared to other methods reported in the literature. We predicted with a precision and accuracy of 100% for more than half of the proteins in the input set, and with an overall precision of 81.35% and accuracy of 80.04%.
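
The idea of aggregating several evidence sources and weighting them by GO term similarity can be illustrated with a minimal sketch. This is not the method of the paper: the scoring rule (each source contributes its best similarity-weighted hit, averaged over sources), the `evidence` and `go_similarity` tables, and every numeric value are assumptions made up for illustration. The GO identifiers serve only as placeholders.

```python
# Hypothetical evidence for one unannotated protein: each source proposes
# (GO term, score in [0, 1]) pairs. All values are invented for illustration.
evidence = {
    "ppi": [("GO:0006412", 0.9), ("GO:0006457", 0.4)],
    "motif": [("GO:0006412", 0.7)],
    "sequence_similarity": [("GO:0006414", 0.6)],
    "homology": [("GO:0006412", 0.8), ("GO:0006414", 0.5)],
}

# Hypothetical symmetric similarity between GO terms, in [0, 1].
# Pairs not listed are treated as unrelated (similarity 0).
go_similarity = {
    frozenset(["GO:0006412", "GO:0006414"]): 0.8,
}


def term_similarity(a, b):
    """Return the assumed similarity of two GO terms (1.0 for identical terms)."""
    if a == b:
        return 1.0
    return go_similarity.get(frozenset([a, b]), 0.0)


def rank_terms(evidence):
    """Score every candidate term by its similarity-weighted support.

    Each source contributes the best value of score * similarity among its
    hits; the contributions are averaged over all sources.
    """
    candidates = {term for hits in evidence.values() for term, _ in hits}
    scores = {}
    for candidate in candidates:
        total = 0.0
        for hits in evidence.values():
            total += max(
                (score * term_similarity(candidate, term) for term, score in hits),
                default=0.0,
            )
        scores[candidate] = total / len(evidence)
    # Highest score first; ties broken by term identifier for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranked = rank_terms(evidence)
for term, score in ranked:
    print(f"{term}\t{score:.2f}")
```

In this toy setup a term supported directly by three sources, and indirectly by a related term from the fourth, ranks above a term supported by a single source. That is the intuition behind combining evidence with ontology relationships, whatever concrete weighting is used.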
Files in this item:

File: 2012-BIBM-Protein_AuthorVersion.pdf
Access: open access
Description: Author version
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 6.44 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2502184