Identifying Variable-Length Palindromic Pairs in DNA Sequences


The KIPS Transactions:PartB, Vol. 14, No. 6, pp. 461-472, Oct. 2007
10.3745/KIPSTB.2007.14.6.461

Abstract

The emphasis in genome projects has been moving towards sequence analysis in order to extract biological "meaning" (e.g., the evolutionary history of particular molecules or their functions) from the sequence. In particular, palindromic or direct repeats that appear in a sequence have a biophysical meaning, and the problem is to recognize interesting patterns and configurations of words (strings of characters) over complementary alphabets. In this paper, we propose an algorithm to identify variable-length palindromic pairs (longer than a threshold), allowing gaps (the distance between the two words of a pair). The algorithm is called the palindrome algorithm (PA) and has O(N) time complexity. A palindromic pair forms a hairpin structure. By composing the collected palindromic pairs, we build n-pair palindromic patterns. In addition, we plot some of the longest pairs on a circle to represent the structure of a DNA sequence. We ran the algorithm over several selected genomes and present the results for E. coli K12. The genomes contain very long palindromic pair patterns, which would rarely occur in a random sequence.
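
To make the notion of a gapped palindromic pair concrete, the following is a minimal Python sketch. It is not the PA algorithm of the paper (whose details are not given in the abstract) and it does not run in O(N) time; it is a naive scan that, for each allowed gap, extends a pair of complementary arms outwards. The function name `find_palindromic_pairs` and the parameters `min_len` and `max_gap` are hypothetical names chosen for illustration, and a "pair" is assumed here to mean a word followed, after a gap, by its reverse complement.

```python
# Watson-Crick complements; any other character (e.g. N) never matches.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def find_palindromic_pairs(seq, min_len=4, max_gap=3):
    """Naive scan for gapped palindromic (inverted-repeat) pairs.

    Returns a list of (left_start, arm_length, gap) tuples, where the left
    arm is seq[left_start : left_start + arm_length] and the right arm is
    its reverse complement, starting `gap` bases after the left arm ends.
    Illustrative only: this is an assumption-laden sketch, not the paper's PA.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for gap in range(max_gap + 1):
        # i: last base of the left arm, j: first base of the right arm.
        for i in range(n - gap - 1):
            j = i + gap + 1
            # Skip if the pair could be extended inwards; that longer pair
            # is reported under a smaller gap instead.
            if gap >= 2 and COMPLEMENT.get(seq[i + 1]) == seq[j - 1]:
                continue
            length = 0
            # Extend both arms outwards while the bases stay complementary.
            while (i - length >= 0 and j + length < n
                   and COMPLEMENT.get(seq[i - length]) == seq[j + length]):
                length += 1
            if length >= min_len:
                hits.append((i - length + 1, length, gap))
    return hits


# Arms GGGC and GCCC separated by a 4-base loop AAAA (a hairpin-like pair).
print(find_palindromic_pairs("GGGCAAAAGCCC", min_len=4, max_gap=4))
```

Each reported tuple corresponds to one candidate hairpin: the two arms would form the stem and the gap the loop. The threshold `min_len` plays the role of the length threshold mentioned in the abstract.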



Cite this article
[IEEE Style]
H. R. Kim, K. H. Jeong, D. H. Jeon, "Identifying Variable-Length Palindromic Pairs in DNA Sequences," The KIPS Transactions:PartB, vol. 14, no. 6, pp. 461-472, 2007. DOI: 10.3745/KIPSTB.2007.14.6.461.

[ACM Style]
Hyoung Rae Kim, Kyoung Hee Jeong, and Do Hong Jeon. 2007. Identifying Variable-Length Palindromic Pairs in DNA Sequences. The KIPS Transactions:PartB, 14, 6, (2007), 461-472. DOI: 10.3745/KIPSTB.2007.14.6.461.