An Automatic Classification System of Korean Documents Using Weight for Keywords of Document and Word Cluster


The KIPS Transactions:PartB, Vol. 8, No. 5, pp. 447-454, Oct. 2001
DOI: 10.3745/KIPSTB.2001.8.5.447

Abstract

Automatic document classification assigns unlabeled documents to predefined classes. It can be applied to classifying newsgroup articles and web documents, and to producing more precise information retrieval results by learning users' interests. In this paper, we use a weighted Bayesian classifier that assigns weights to the keywords of a document to improve classification accuracy. When the system cannot classify a document properly because the document contains too few words to serve as features, it uses relevant word clusters to supplement the document's features. The word clusters are built from a corpus by automatic word clustering. As a result, the proposed system outperformed existing classification systems in classification accuracy on Korean documents.
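
The two ideas in the abstract, weighting keywords in a Bayesian classifier and supplementing short documents from word clusters, can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the names `WeightedNaiveBayes` and `expand_with_clusters`, the multiplicative `keyword_weight`, the `min_words` threshold, Laplace smoothing, and hand-written clusters standing in for automatically learned ones. English tokens are used for readability.

```python
import math
from collections import Counter, defaultdict


class WeightedNaiveBayes:
    """Multinomial naive Bayes in which keyword evidence counts extra.

    Illustrative only: the weighting scheme is an assumption, not the
    formula used in the paper.
    """

    def __init__(self, keyword_weight=2.0, alpha=1.0):
        self.keyword_weight = keyword_weight     # assumed multiplier for keywords
        self.alpha = alpha                       # Laplace smoothing constant
        self.class_docs = Counter()              # number of documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()

    def fit(self, docs, labels):
        for words, label in zip(docs, labels):
            self.class_docs[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def log_scores(self, words, keywords=()):
        keywords = set(keywords)
        total_docs = sum(self.class_docs.values())
        scores = {}
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for w in words:
                if w not in self.vocab:
                    continue  # words never seen in training carry no evidence
                weight = self.keyword_weight if w in keywords else 1.0
                score += weight * math.log((counts[w] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, words, keywords=()):
        scores = self.log_scores(words, keywords)
        return max(scores, key=scores.get)


def expand_with_clusters(words, clusters, min_words=3):
    """Add cluster neighbours of each word when a document has too few words.

    `clusters` is a list of sets of related words (hypothetical input; the
    paper derives its clusters automatically from a corpus).
    """
    if len(words) >= min_words:
        return list(words)
    expanded = list(words)
    for w in words:
        for cluster in clusters:
            if w in cluster:
                expanded.extend(sorted(cluster - {w}))
    return expanded
```

In this sketch, a word marked as a keyword contributes its log-likelihood `keyword_weight` times, so it can break a tie between classes. A document whose only word is missing from the training vocabulary can still be classified after `expand_with_clusters` adds related words that the classifier has seen.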



Cite this article
[IEEE Style]
J. H. Hur, J. H. Choi, J. H. Lee, J. B. Kim, K. W. Rim, "An Automatic Classification System of Korean Documents Using Weight for Keywords of Document and Word Cluster," The KIPS Transactions:PartB, vol. 8, no. 5, pp. 447-454, 2001. DOI: 10.3745/KIPSTB.2001.8.5.447.

[ACM Style]
Jun Hui Hur, Jun Hyeog Choi, Jung Hyun Lee, Joong Bae Kim, and Kee Wook Rim. 2001. An Automatic Classification System of Korean Documents Using Weight for Keywords of Document and Word Cluster. The KIPS Transactions:PartB, 8, 5, (2001), 447-454. DOI: 10.3745/KIPSTB.2001.8.5.447.