Direct and reversed sequence site analysis: implications for non-consensus sites identification and protein sequence comparison
Ivan Y. Torshin
Med Sci Monit 1999; 5(6): BR1039-1048
Functional analysis of protein sequences relies primarily on similarity to known sequences and yields data on various aspects of protein function. The analysis is based on established (consensus) patterns. Algorithms that check for occurrences of established sequence patterns scan in only one direction, from the N- to the C-terminus. The second, reversed direction (from the C- to the N-terminus), which may correspond to a similar conformation and therefore to a similar function of the consensus peptide, is overlooked. Using an example sequence set (human glycolytic proteins), it was shown that direct and reversed sequences of sites of different types (mainly kinase phosphorylation sites) have similar degrees of conservation (50-60%). In proteins with known three-dimensional structures, up to 90% of the most frequent sites are located on the outer protein surface and are thus accessible for potential modification. Analysis of overlaps of reversed kinase sites allowed identification of 'mirror repeats' in protein sequences. Implications for protein sequence analysis and sequence alignment, as well as possible origins of the reversed sites in protein sequences, are considered.
Keywords: inverted repeats, direct and reversed site analysis, non-consensus phosphorylation, multiple sequence alignment
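
The two scanning directions described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes a PROSITE-style protein kinase C phosphorylation pattern, [ST]-x-[RK], written as a regular expression; the function names and the test sequence are hypothetical and chosen only for illustration. The reversed scan applies the same pattern to the sequence read from the C- to the N-terminus and maps each hit back to N-terminal numbering.

```python
import re

# Assumed example pattern: PROSITE-style protein kinase C site, [ST]-x-[RK].
PKC_PATTERN = "[ST].[RK]"


def scan_direct(sequence, pattern):
    """Return (1-based start, matched residues) for every site read N->C.

    A lookahead with a capture group is used so that overlapping
    occurrences are all reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


def scan_reversed(sequence, pattern):
    """Return sites that match the pattern when the sequence is read C->N.

    Positions and residues are reported in the usual N->C numbering,
    so a reversed [ST]-x-[RK] site appears as [RK]-x-[ST].
    """
    n = len(sequence)
    hits = []
    for pos, site in scan_direct(sequence[::-1], pattern):
        # Map the start in the reversed string back to the original sequence.
        start = n - (pos - 1) - len(site) + 1
        hits.append((start, site[::-1]))
    return sorted(hits)


if __name__ == "__main__":
    seq = "AASGRAAKLT"  # hypothetical sequence, not taken from the paper
    print("direct:  ", scan_direct(seq, PKC_PATTERN))
    print("reversed:", scan_reversed(seq, PKC_PATTERN))
```

In this toy sequence the direct scan finds SGR at position 3, and the reversed scan finds KLT at position 8, which reads TLK from the C-terminal side. Reversing the string rather than rewriting the pattern keeps the sketch short; it works for simple patterns like this one, where every element matches a single residue.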