Automatic Classification of Documents Using Word Correlation


The Transactions of the Korea Information Processing Society (1994 ~ 2000), Vol. 6, No. 9, pp. 2422-2430, Sep. 1999
10.3745/KIPSTE.1999.6.9.2422

Abstract

In this paper, we propose a new method for automatically classifying web documents using the degree of correlation between words. First, we select keywords by term frequency and inverse document frequency (TF*IDF) and compute the degree of relevance between the keywords across the whole document collection, using the probability model proposed in this paper. Second, starting from the two most closely related words, we extract the set of words closely connected to them and create a profile that characterizes a class. Finally, we repeat this process until the degree of relevance falls below a threshold value, producing several profiles that reflect the users' interests. We then classified each document using the profiles and compared the results with those of other automatic classification methods.
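The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the probability model, so a Jaccard co-occurrence score stands in for the degree of relevance, and the rule for growing a profile around the seed pair is likewise an assumption. All function names (`select_keywords`, `correlation`, `build_profiles`, `classify`) and the parameters `top_k` and `threshold` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def select_keywords(docs, top_k):
    """Rank terms by their highest TF*IDF score in any document; keep top_k."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    best = {}
    for doc in docs:
        for term, count in Counter(doc).items():
            score = (count / len(doc)) * math.log(n_docs / doc_freq[term])
            best[term] = max(best.get(term, 0.0), score)
    ranked = sorted(best, key=lambda t: (-best[t], t))
    return ranked[:top_k]


def correlation(docs, keywords):
    """Assumed stand-in for the paper's probability model: Jaccard
    co-occurrence (documents with both words / documents with either)."""
    doc_sets = [set(doc) for doc in docs]
    corr = {}
    for a, b in combinations(sorted(keywords), 2):
        both = sum(1 for s in doc_sets if a in s and b in s)
        either = sum(1 for s in doc_sets if a in s or b in s)
        corr[frozenset((a, b))] = both / either if either else 0.0
    return corr


def build_profiles(keywords, corr, threshold):
    """Repeatedly seed a profile with the most correlated remaining pair,
    attach words correlated with either seed word, and stop once the best
    pair scores below the threshold (assumed stopping rule)."""
    remaining = set(keywords)
    profiles = []
    while len(remaining) >= 2:
        pairs = [(corr[frozenset(p)], p)
                 for p in combinations(sorted(remaining), 2)]
        score, (a, b) = max(pairs, key=lambda x: x[0])
        if score < threshold:
            break
        profile = {a, b}
        for w in remaining - profile:
            if max(corr[frozenset((w, a))], corr[frozenset((w, b))]) >= threshold:
                profile.add(w)
        profiles.append(profile)
        remaining -= profile
    return profiles


def classify(doc, profiles):
    """Return the index of the profile sharing the most terms with doc,
    or None when no profile shares any term."""
    overlaps = [len(profile & set(doc)) for profile in profiles]
    if not overlaps or max(overlaps) == 0:
        return None
    return overlaps.index(max(overlaps))


# Toy usage: each document is a list of already tokenized words.
docs = [
    ["stock", "market", "price"],
    ["stock", "market", "trade"],
    ["soccer", "goal", "match"],
    ["soccer", "goal", "league"],
]
keywords = select_keywords(docs, top_k=8)
corr = correlation(docs, keywords)
profiles = build_profiles(keywords, corr, threshold=0.5)
label = classify(["stock", "price", "rise"], profiles)
```

Each profile here is a plain set of words, and a document is assigned to the profile it overlaps most. A weighted profile or a different similarity measure would fit the same outline; the choice above is only for brevity.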



Cite this article
[IEEE Style]
S. J. Seob and L. C. Hoon, "Automatic Classification of Documents Using Word Correlation," The Transactions of the Korea Information Processing Society (1994 ~ 2000), vol. 6, no. 9, pp. 2422-2430, 1999. DOI: 10.3745/KIPSTE.1999.6.9.2422.

[ACM Style]
Shin Jin Seob and Lee Chang Hoon. 1999. Automatic Classification of Documents Using Word Correlation. The Transactions of the Korea Information Processing Society (1994 ~ 2000), 6, 9, (1999), 2422-2430. DOI: 10.3745/KIPSTE.1999.6.9.2422.