Size dependent complexity of sequences in protein families
The size dependent complexity of protein sequences in various families in the FSSP database is characterized by sequence entropy, sequence similarity and sequence identity. As the average length L_f of sequences in the family increases, an increasing trend of the sequence entropy and a decreasing trend of the sequence similarity and sequence identity are found. As L_f increases beyond 250, a saturation of the sequence entropy, the sequence similarity and the sequence identity is observed. Such a saturated behavior of complexity is attributed to the saturation of the probability P_g of global (long-range) interactions in protein structures when L_f > 250. It is also found that the alphabet size of residue types describing the sequence diversity depends on the value of L_f, and becomes saturated at 12.
Copyright EDP Sciences/Società Italiana di Fisica/Springer-Verlag 2005
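As an illustration of the kind of measures the abstract names, the following is a minimal sketch of a per-column Shannon entropy and a pairwise sequence identity for a small aligned family. The abstract does not give the paper's exact definitions, so the formulas, the gap handling, and the function names (`column_entropy`, `mean_sequence_entropy`, `pairwise_identity`, `mean_identity`) are assumptions chosen for illustration, not the authors' method.

```python
import math
from collections import Counter
from itertools import combinations

GAP = "-"  # assumed gap symbol in the aligned sequences


def column_entropy(column):
    """Shannon entropy (in bits) of the residue types in one alignment column.

    Gap characters are ignored (an assumption of this sketch).
    """
    residues = [r for r in column if r != GAP]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mean_sequence_entropy(alignment):
    """Average column entropy over an alignment of equal-length sequences."""
    entropies = [column_entropy(col) for col in zip(*alignment)]
    return sum(entropies) / len(entropies)


def pairwise_identity(a, b):
    """Fraction of identical residues over positions where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def mean_identity(alignment):
    """Average pairwise identity over all sequence pairs in the family."""
    pairs = list(combinations(alignment, 2))
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy, made-up alignment; not data from the FSSP database.
    family = ["ACDE", "ACDF", "AC-E"]
    print(mean_sequence_entropy(family))
    print(mean_identity(family))
```

Under these assumptions, a more diverse family gives a higher mean entropy and a lower mean identity, which is the direction of the trends the abstract reports as L_f increases.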
Year of publication: | 2005 |
---|---|
Authors: | Li, J. ; Wang, J. ; Wang, W. |
Published in: | The European Physical Journal B - Condensed Matter and Complex Systems. - Springer. - Vol. 47.2005, 3, p. 431-436 |
Publisher: | Springer |
Online Resource
Similar items by person
- A loan default discrimination model using cost-sensitive support vector machine improved by PSO
  Cao, Jie, (2013)
- Hedonic prices and quality adjusted price indices powered by AI
  Bajari, Patrick L., (2021)
- Li, Jinkai, (2023)