The amino acid sequence of a protein is the key to understanding its structure and ultimately its function in the cell. This paper addresses the fundamental issue of encoding amino acids so that visualizing a protein sequence facilitates decoding its information content. We show that a feature-based representation in a three-dimensional (3D) space derived from substitution matrices is adequate for predicting the domain content of a protein. In addition, we show that each dimension of the feature space can be related to a physical property of the amino acids.
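To make the idea of a 3D feature space derived from a substitution matrix concrete, the following is a minimal sketch, not the procedure used in the paper, which the abstract does not specify. It assumes classical multidimensional scaling (MDS) as a stand-in technique: similarity scores are converted to squared distances, and the three leading eigenvectors give one 3D point per residue. The function names `embed_3d` and `encode_sequence`, the five-letter alphabet, and the randomly generated toy similarity matrix are all hypothetical and serve illustration only; a real substitution matrix would take the place of `toy_similarity`.

```python
import numpy as np


def embed_3d(similarity, dim=3):
    """Embed items in `dim` dimensions from a symmetric similarity matrix.

    Assumption: classical MDS is used here for illustration only.
    """
    s = np.asarray(similarity, dtype=float)
    diag = np.diag(s)
    # Treat similarities as inner products: d_ij^2 = s_ii + s_jj - 2 s_ij.
    d2 = diag[:, None] + diag[None, :] - 2.0 * s
    n = s.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centering turns squared distances into a Gram matrix.
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]
    # A real substitution matrix need not be positive semi-definite,
    # so negative eigenvalues are clipped to zero.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def encode_sequence(seq, alphabet, coords):
    """Map a sequence to the ordered list of 3D points of its residues."""
    index = {aa: i for i, aa in enumerate(alphabet)}
    return np.array([coords[index[aa]] for aa in seq])


# Hypothetical toy data: five residue symbols with made-up similarities
# built from random 3D points, so the embedding can be checked exactly.
alphabet = "ACDEF"
rng = np.random.default_rng(0)
toy_points = rng.normal(size=(5, 3))
toy_similarity = toy_points @ toy_points.T

coords = embed_3d(toy_similarity)
path = encode_sequence("ACDFA", alphabet, coords)
```

Under these assumptions, `coords` holds one 3D point per symbol and `path` traces a sequence as a curve through that space. MDS recovers coordinates only up to rotation and reflection, so relating an axis to a physical property would require a separate analysis.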
Citation:
Shengyin Gu, Olivier Poch, Bernd Hamann, Patrice Koehl, "A Geometric Representation of Protein Sequences," 2007 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2007), pp. 135-142, 2007.