Mining Approximate Sequential Patterns in a Large Sequence Database

Hye Chung Kum; Joong Hyuk Chang

The KIPS Transactions:PartD, Vol. 13, No. 2, pp. 199-206, Apr. 2006

DOI: 10.3745/KIPSTD.2006.13.2.199

Abstract

Sequential pattern mining is an important data mining task with broad applications. However, conventional methods face inherent difficulties when mining databases that contain long sequences and noise: they may generate a huge number of short, trivial patterns yet fail to find interesting patterns shared by many sequences. To overcome these problems, this paper proposes approximate sequential pattern mining, roughly defined as identifying patterns approximately shared by many sequences. The proposed method works in two steps. First, it clusters the target sequences by similarity. Second, it finds, directly through multiple alignment, consensus patterns that are similar to the sequences in each cluster. For this purpose, a novel structure called a weighted sequence is presented to compress the alignment result, and the longest consensus pattern that represents each cluster is generated from its weighted sequence. Finally, the effectiveness of the proposed method is verified by a set of experiments.
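The second step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance measure, gap cost, or cutoff, so the Jaccard-based replacement cost, the unit gap cost, the `min_strength` threshold, and all function names below are assumptions made for illustration. The sketch takes one already-formed cluster (the clustering step is omitted), folds its sequences one by one into a weighted sequence of per-position item counts, and reads off a consensus pattern from the items that occur often enough.

```python
from collections import Counter

def jaccard_distance(a, b):
    """Assumed replacement cost between two itemsets (0 = identical, 1 = disjoint)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def align(seq_a, seq_b, gap=1.0):
    """Global alignment of two itemset sequences by dynamic programming.

    Returns a list of (i, j) index pairs; None marks a gap on that side.
    """
    n, m = len(seq_a), len(seq_b)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + gap
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + jaccard_distance(seq_a[i - 1], seq_b[j - 1]),
                cost[i - 1][j] + gap,
                cost[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to recover the alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and cost[i][j] ==
                cost[i - 1][j - 1] + jaccard_distance(seq_a[i - 1], seq_b[j - 1])):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + gap:
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs

def build_weighted_sequence(cluster):
    """Fold each sequence of a cluster into per-position item counts."""
    weighted = [Counter(itemset) for itemset in cluster[0]]
    for seq in cluster[1:]:
        # Simplification: align against the set of items seen at each position.
        profile = [set(counts) for counts in weighted]
        merged = []
        for i, j in align(profile, seq):
            counts = Counter(weighted[i]) if i is not None else Counter()
            if j is not None:
                counts.update(seq[j])
            merged.append(counts)
        weighted = merged
    return weighted

def consensus_pattern(weighted, n_sequences, min_strength=0.5):
    """Keep items present in at least min_strength of the cluster's sequences."""
    pattern = []
    for counts in weighted:
        items = sorted(x for x, c in counts.items() if c / n_sequences >= min_strength)
        if items:
            pattern.append(items)
    return pattern

# Hypothetical cluster of three noisy variants of <(a)(b c)(d)>.
cluster = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a"}, {"b"}, {"d"}],
    [{"a"}, {"x"}, {"b", "c"}, {"d"}],
]
weighted = build_weighted_sequence(cluster)
print(consensus_pattern(weighted, len(cluster)))  # [['a'], ['b', 'c'], ['d']]
```

In this toy cluster the noise item `x` appears in only one of three sequences and falls below the assumed 50% cutoff, while `c` appears in two of three and is kept, so the consensus recovers the underlying pattern even though no single item sequence matches all three inputs exactly.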

Cite this article

[IEEE Style]

H. C. Kum and J. H. Chang, "Mining Approximate Sequential Patterns in a Large Sequence Database," The KIPS Transactions:PartD, vol. 13, no. 2, pp. 199-206, 2006. DOI: 10.3745/KIPSTD.2006.13.2.199.

[ACM Style]

Hye Chung Kum and Joong Hyuk Chang. 2006. Mining Approximate Sequential Patterns in a Large Sequence Database. The KIPS Transactions:PartD, 13, 2, (2006), 199-206. DOI: 10.3745/KIPSTD.2006.13.2.199.