A Protein Sequence Prediction Method by Mining Sequence Data

KIPS Transactions on Computer and Communication Systems, Vol. 10, No. 2, pp. 261-266, Apr. 2003
DOI: 10.3745/KIPSTD.2003.10.2.261


A protein, a linear polymer of amino acids, is one of the most important biomolecules: proteins compose biological structures and regulate biochemical reactions. Because the characteristics and functions of a protein are in principle determined by its amino acid sequence, determining that sequence is the starting point for studying protein function. This paper proposes a protein sequence prediction method based on data mining techniques that can overcome the limitations of previous biochemical sequencing methods. After multiple proteases are applied to obtain overlapping protein fragments, candidate sequences for each fragment are identified by comparing the fragment mass values against peptide databases. We propose a method that constructs a multipartite graph and searches for maximal paths, determining the protein sequence by assembling compatible candidate sequences. Experimental results based on the SWISS-PROT database are presented to show the validity of the proposed method.
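
The pipeline outlined in the abstract (digest with several proteases, look up candidate sequences by fragment mass, then assemble candidates through their overlaps) can be illustrated with a small sketch. This is not the authors' implementation: the cleavage rules, the 0.01 Da tolerance, the two-residue minimum overlap, the toy peptide list, and all function names (`digest`, `candidates`, `merge`, `assemble`) are assumptions made for illustration. Each observed fragment forms one part of the graph, candidates within a part are never joined to each other, and a depth-first search keeps the longest assembled sequences.

```python
# Monoisotopic residue masses (Da) for the residues used in this toy example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "K": 128.09496, "E": 129.04259, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333,
}
WATER = 18.01056


def peptide_mass(seq):
    """Mass of a free peptide: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[r] for r in seq) + WATER


def digest(protein, cleave_after):
    """Cut after every residue in cleave_after (a simplified protease rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        if residue in cleave_after:
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments


def candidates(mass, peptide_db, tol=0.01):
    """Database peptides whose mass lies within tol (assumed, in Da) of mass."""
    return sorted(p for p in peptide_db if abs(peptide_mass(p) - mass) <= tol)


def merge(left, right, min_overlap=2):
    """Join left and right on their longest suffix/prefix overlap, else None."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None


def assemble(parts, min_overlap=2):
    """parts holds one candidate list per observed fragment.

    Depth-first search over paths that use at most one candidate per part;
    returns the longest sequences reached by maximal paths.
    """
    maximal = set()

    def extend(seq, used):
        grew = False
        for j, part in enumerate(parts):
            if j in used:
                continue
            for cand in part:
                merged = merge(seq, cand, min_overlap)
                if merged is not None and len(merged) > len(seq):
                    grew = True
                    extend(merged, used | {j})
        if not grew:
            maximal.add(seq)  # no unused part can extend this path

    for i, part in enumerate(parts):
        for cand in part:
            extend(cand, {i})
    longest = max(map(len, maximal), default=0)
    return sorted(s for s in maximal if len(s) == longest)


# Toy run: a hypothetical protein and a hypothetical peptide list with
# same-mass decoys (permuted sequences) that mass matching cannot rule out.
protein = "GAKVFSELRAY"
fragments = digest(protein, "KR") + digest(protein, "FY")
observed = [peptide_mass(f) for f in fragments]
peptide_db = ["GAK", "AGK", "VFSELR", "FVSELR", "AY", "YA",
              "GAKVF", "SELRAY", "LESRAY"]
parts = [candidates(m, peptide_db) for m in observed]
print(assemble(parts))
```

In this toy run mass matching alone leaves several fragments with two candidates each, and only the overlaps between fragments from different digests single out one assembly. A realistic setting would need real protease specificities, measurement error models, and a much larger peptide database.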

Cite this article
[IEEE Style]
S. I. Cho, D. H. Lee, K. H. Cho, Y. G. Won and B. K. Kim, "A Protein Sequence Prediction Method by Mining Sequence Data," KIPS Journal D (2001 ~ 2012), vol. 10, no. 2, pp. 261-266, 2003. DOI: 10.3745/KIPSTD.2003.10.2.261.

[ACM Style]
Sun I Cho, Do Heon Lee, Kwang Hwi Cho, Yong Gwan Won, and Byoung Ki Kim. 2003. A Protein Sequence Prediction Method by Mining Sequence Data. KIPS Journal D (2001 ~ 2012), 10, 2, (2003), 261-266. DOI: 10.3745/KIPSTD.2003.10.2.261.