
A numerical characterization of protein sequences based on k-word and its application

  

  1. (College of Mathematics and Physics, Bohai University, Jinzhou 121013, China)
  • Online:2014-11-25 Published:2014-12-02

Abstract: Based on a 5-letter classification model of amino acids, a protein sequence is transformed into a 5-letter sequence. Using the frequencies of 1-words and 2-words, this sequence is then transformed into a 30-D vector. The evolutionary distance between two species is obtained by calculating the Euclidean distance between their vectors. Phylogenetic analysis of two groups of protein sequences showed that the method is efficient.
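
The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not list the five amino acid classes, so the grouping in `GROUPS` is a hypothetical placeholder; the class labels A to E, the use of overlapping 2-words, and the normalization of each word count by the number of words of that length are likewise assumptions. The 30 dimensions arise as 5 one-letter words plus 25 two-letter words.

```python
from itertools import product
from math import sqrt

# ASSUMPTION: hypothetical 5-class grouping of the 20 amino acids.
# The abstract does not specify the classes; substitute the real model here.
GROUPS = {
    "A": "AVLIMC",
    "B": "FWYH",
    "C": "STNQ",
    "D": "KR",
    "E": "DEGP",
}
LETTERS = "ABCDE"
RESIDUE_TO_CLASS = {aa: c for c, members in GROUPS.items() for aa in members}

# 5 one-letter words followed by 25 two-letter words: 30 components in total.
WORDS = list(LETTERS) + ["".join(p) for p in product(LETTERS, repeat=2)]


def reduce_sequence(protein):
    """Map a protein sequence to its 5-letter sequence, skipping unknown symbols."""
    return "".join(RESIDUE_TO_CLASS[aa] for aa in protein.upper()
                   if aa in RESIDUE_TO_CLASS)


def feature_vector(protein):
    """Return the 30-D vector of 1-word and 2-word frequencies."""
    s = reduce_sequence(protein)
    vec = []
    for k in (1, 2):
        total = len(s) - k + 1  # number of overlapping k-words (assumption)
        counts = {}
        for i in range(max(total, 0)):
            word = s[i:i + k]
            counts[word] = counts.get(word, 0) + 1
        for w in WORDS:
            if len(w) == k:
                vec.append(counts.get(w, 0) / total if total > 0 else 0.0)
    return vec


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Calling `euclidean(feature_vector(seq1), feature_vector(seq2))` for every pair of sequences gives a distance matrix that a standard tree-building method could take as input; the abstract does not say which method the authors used.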

Key words: 5-letter classification, 30-D vector, phylogenetic analysis