An Effective Incremental Text Clustering Method for the Large Document Database


The KIPS Transactions:PartD, Vol. 10, No. 1, pp. 57-66, Feb. 2003
DOI: 10.3745/KIPSTD.2003.10.1.57

Abstract

With the spread of the internet and computers, the amount of information available online is increasing rapidly, and most of it is managed in document form. Research into methods for managing large document collections effectively is therefore necessary. Document clustering groups a set of documents by subject according to the similarity among them. It can therefore support browsing and searching a document collection, and it can increase search accuracy. This paper proposes an efficient incremental clustering method for a document set that grows gradually. The incremental document clustering algorithm assigns a set of new documents to the legacy clusters that have been identified in advance. In addition, to improve clustering accuracy, the paper proposes removing stop words and calculating the weight of each word with the proposed TF × NIDF function.
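The general idea of assigning new documents to previously identified clusters can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not define the NIDF term, so plain term frequency (TF) is used as a stand-in weight, and the stop-word list, the cosine similarity measure, the centroid representation, the `threshold` value, and all function and class names are assumptions made for illustration only.

```python
import math
from collections import Counter

# Illustrative stop-word list (assumption; the paper's list is not given).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}


def tf_vector(text):
    """Return a term-frequency vector after stop-word removal.

    Plain TF is a stand-in for the paper's TF x NIDF weight,
    which the abstract does not define.
    """
    tokens = [w for w in text.lower().split() if w not in STOP_WORDS]
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


class Cluster:
    """Keeps a running sum of member vectors so the centroid updates cheaply."""

    def __init__(self):
        self.total = {}
        self.size = 0

    def add(self, vec):
        for w, x in vec.items():
            self.total[w] = self.total.get(w, 0.0) + x
        self.size += 1

    def centroid(self):
        return {w: x / self.size for w, x in self.total.items()}


def assign(clusters, text, threshold=0.3):
    """Assign one new document to the most similar existing cluster.

    If no centroid reaches the (hypothetical) threshold, a new cluster
    is opened. Returns the index of the chosen cluster.
    """
    vec = tf_vector(text)
    best, best_sim = None, threshold
    for i, cluster in enumerate(clusters):
        sim = cosine(vec, cluster.centroid())
        if sim >= best_sim:
            best, best_sim = i, sim
    if best is None:
        clusters.append(Cluster())
        best = len(clusters) - 1
    clusters[best].add(vec)
    return best


clusters = []
first = assign(clusters, "clustering of text documents")
second = assign(clusters, "incremental clustering of documents")
third = assign(clusters, "stock market prices fall")
```

In this sketch the existing clusters are never recomputed from scratch: each new document touches only the centroid of the cluster it joins, which is the property that makes an incremental scheme attractive for a growing collection.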



Cite this article
[IEEE Style]
D. H. Kang, K. H. Joo, W. S. Lee, "An Effective Incremental Text Clustering Method for the Large Document Database," The KIPS Transactions:PartD, vol. 10, no. 1, pp. 57-66, 2003. DOI: 10.3745/KIPSTD.2003.10.1.57.

[ACM Style]
Dang Hyuk Kang, Kil Hong Joo, and Won Suk Lee. 2003. An Effective Incremental Text Clustering Method for the Large Document Database. The KIPS Transactions:PartD, 10, 1, (2003), 57-66. DOI: 10.3745/KIPSTD.2003.10.1.57.