Medical Science Monitor Basic Research


eISSN: 1643-3750

A tabular approach to the sequence-to-structure relation in proteins (tetrapeptide representation) for de novo protein design.

Jan Meus, Michał Brylinski, Monika Piwowar, Monika Piwowar, Zdzisław Wiśniowski, Justyna Stefaniak, Leszek Konieczny, Grzegorz Surówka, Irena Roterman

Med Sci Monit 2006; 12(6): BR208-214

ID: 451246

BACKGROUND: Experimental observations classify the protein-folding process as a multi-step event. The backbone conformation has been experimentally recognized as responsible for the early-stage structural forms of a polypeptide. The sequence-to-structure and structure-to-sequence relation is critical for predicting protein structure. A contingency table representing this relation for tetrapeptides in their early-stage structural form is presented. The correlation between sequence and structure appears to be essential for protein-folding simulation.

MATERIAL/METHODS: The polypeptide chains of all the proteins in the Protein Data Bank were transformed into their early-stage structural forms. The tetrapeptide was selected as the structural unit. Tetrapeptide sequences and structures were expressed by letter codes. A contingency table of any size (here 160,000 × 2,401) was transformed into a 2 × 2 table for each non-zero cell of the original table, which allowed calculation of the rho-coefficient, a measure of the strength of the relation.

RESULTS: High values of the rho-coefficient identified sequences with strong structural determinability and structures with high sequence selectivity. A web-based program that calculates the rho-coefficient ranking list was built so that the method can be applied to any contingency-table analysis problem.

CONCLUSIONS: The results revealed a sequence-to-structure (and vice versa) correlation in early-stage folding. Surprisingly, the irregular structural forms of loops and bends appeared to be highly determined. Comparison with results from another method, based on information entropy, showed close agreement. The method, designed for interpreting large contingency tables, appears particularly useful for large-scale microarray analysis, a very popular technique in the post-genomic era.
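The reduction of a large contingency table to one 2 × 2 table per non-zero cell can be illustrated with a minimal Python sketch. The abstract does not give the formula for the rho-coefficient, so the sketch substitutes the standard phi coefficient for 2 × 2 tables as a stand-in; that substitution, and the names `collapse_to_2x2`, `phi_coefficient`, and `rank_cells`, are assumptions made for illustration and are not taken from the paper. Rows are taken to represent sequences and columns to represent structure codes.

```python
from math import sqrt


def collapse_to_2x2(cell, row_total, col_total, grand_total):
    """Collapse an R x C count table around one cell.

    a: count in the chosen cell (this sequence, this structure)
    b: rest of the chosen row (this sequence, other structures)
    c: rest of the chosen column (other sequences, this structure)
    d: everything else
    """
    a = cell
    b = row_total - cell
    c = col_total - cell
    d = grand_total - row_total - col_total + cell
    return a, b, c, d


def phi_coefficient(a, b, c, d):
    """Phi coefficient of a 2 x 2 table.

    ASSUMPTION: used here as a stand-in association measure; the paper's
    rho-coefficient may be defined differently.
    """
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # a marginal is empty, so association is undefined
    return (a * d - b * c) / denom


def rank_cells(table):
    """Return (score, row, col) for every non-zero cell, strongest first."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    ranked = []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:  # only non-zero cells are scored
                abcd = collapse_to_2x2(count, row_totals[i], col_totals[j], grand_total)
                ranked.append((phi_coefficient(*abcd), i, j))
    ranked.sort(reverse=True)
    return ranked


if __name__ == "__main__":
    # Hypothetical toy counts: 3 sequences (rows) x 2 structure codes (columns).
    toy = [[8, 1], [2, 9], [0, 4]]
    for score, i, j in rank_cells(toy):
        print(f"row {i}, col {j}: {score:.3f}")
```

Precomputing the row and column totals keeps the cost of each cell constant. For a table of the size quoted above, a sparse representation that stores only the non-zero cells would likely be needed in place of the nested lists used here.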

This paper has been published under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license, which allows readers to download articles and share them with others as long as they credit the authors and the publisher, but does not permit changing them in any way or using them commercially.