Proteins are large molecules that living cells make by stringing together building blocks called amino acids or peptides, following their blue-prints in the DNA. Freshly made proteins are typically long, structure-less chains of peptides, but shortly afterwards most of them fold into characteristic structures. Proteins execute many functions in the cell, for which they need to have the right structure, which is therefore very important in determining what the proteins can do. The structure of a protein can be determined by X-ray diffraction and other experimental approaches which are all, to this day, somewhat labor-intensive and difficult. On the other hand, the order of the peptides in a protein can be read off from the DNA blue-print, and such protein sequences are today routinely produced in large numbers. In this paper we show that many similar protein sequences can be used to find information about the structure. The basic approach is to construct a probabilistic model for sequence variability, and then to use the parameters of that model to predict structure in three-dimensional space. The main technical novelty compared to previous contributions in the same general direction is that we use models more directly matched to the data.
Improving Contact Prediction along Three Dimensions / Feinauer, Christoph; Skwark, M. J.; Pagnani, Andrea; Aurell, E.. - In: PLOS COMPUTATIONAL BIOLOGY. - ISSN 1553-734X. - ELETTRONICO. - 10:(2014), pp. e1003847-1-e1003847-13. [10.1371/journal.pcbi.1003847]
Improving Contact Prediction along Three Dimensions
FEINAUER, CHRISTOPH;PAGNANI, ANDREA;
2014
Abstract
Proteins are large molecules that living cells make by stringing together building blocks called amino acids or peptides, following their blue-prints in the DNA. Freshly made proteins are typically long, structure-less chains of peptides, but shortly afterwards most of them fold into characteristic structures. Proteins execute many functions in the cell, for which they need to have the right structure, which is therefore very important in determining what the proteins can do. The structure of a protein can be determined by X-ray diffraction and other experimental approaches which are all, to this day, somewhat labor-intensive and difficult. On the other hand, the order of the peptides in a protein can be read off from the DNA blue-print, and such protein sequences are today routinely produced in large numbers. In this paper we show that many similar protein sequences can be used to find information about the structure. The basic approach is to construct a probabilistic model for sequence variability, and then to use the parameters of that model to predict structure in three-dimensional space. The main technical novelty compared to previous contributions in the same general direction is that we use models more directly matched to the data.File | Dimensione | Formato | |
---|---|---|---|
journal.pcbi.1003847 (1).pdf
accesso aperto
Descrizione: pdf
Tipologia:
2. Post-print / Author's Accepted Manuscript
Licenza:
Creative commons
Dimensione
1.57 MB
Formato
Adobe PDF
|
1.57 MB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11583/2568336
Attenzione
Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo