Korean Document Classification Using Extended Vector Space Model


The KIPS Transactions:PartB, Vol. 18, No. 2, pp. 93-108, Apr. 2011
DOI: 10.3745/KIPSTB.2011.18.2.93

Abstract

We propose an extended vector space model that uses ambiguous words and disambiguating words to improve the results of Korean document classification. In this paper we study how to improve the precision of the vector space model, and we propose a new axis that represents a weight value. Conventional classification methods, which lack this weight value, have problems when comparing vectors. After calculating the mutual information between a term and its classification field, we define a word that has the same weight value on this axis as an ambiguous word. We define a word that resolves the ambiguity of an ambiguous word as a disambiguating word. Among the words that co-occur with an ambiguous word in the same document, we determine the strength of each disambiguating word. Finally, we propose a new classification method based on extending the vector dimensions with ambiguous and disambiguating words.
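
As a rough illustration of the mutual information step, the following Python sketch computes pointwise mutual information between each term and each classification field from document counts, then flags terms whose scores are nearly identical across fields. It is not the paper's implementation: the function names `pmi_table` and `ambiguous_terms`, the tolerance `tol`, the toy documents, and the reading of "same weight value" as "nearly equal scores across fields" are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def pmi_table(docs):
    """Return {term: {label: PMI}} from (tokens, label) pairs, using document counts."""
    n = len(docs)
    label_count = Counter(label for _, label in docs)
    term_count = Counter()
    joint = defaultdict(Counter)
    for tokens, label in docs:
        for term in set(tokens):  # count each term once per document
            term_count[term] += 1
            joint[term][label] += 1
    table = {}
    for term, per_label in joint.items():
        # PMI(t, c) = log( P(t, c) / (P(t) * P(c)) ), with probabilities as count / n
        table[term] = {
            label: math.log((c * n) / (term_count[term] * label_count[label]))
            for label, c in per_label.items()
        }
    return table


def ambiguous_terms(table, tol=0.1):
    """Assumed criterion: a term seen in 2+ fields whose PMI values differ by at most tol."""
    result = set()
    for term, scores in table.items():
        values = list(scores.values())
        if len(values) >= 2 and max(values) - min(values) <= tol:
            result.add(term)
    return result


# Hypothetical toy corpus: (tokens, classification field)
docs = [
    (["bank", "loan", "interest"], "finance"),
    (["bank", "river", "water"], "nature"),
    (["loan", "credit"], "finance"),
    (["river", "fish"], "nature"),
]
table = pmi_table(docs)
ambiguous = ambiguous_terms(table)
```

In this toy corpus, "bank" appears equally often in both fields, so its scores coincide and it is flagged as ambiguous, while "loan" appears in only one field and is not. Words that co-occur with "bank" in the same document, such as "river" or "loan", would be the candidates for disambiguating words under the abstract's description.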



Cite this article
[IEEE Style]
S. K. Lee, "Korean Document Classification Using Extended Vector Space Model," The KIPS Transactions:PartB, vol. 18, no. 2, pp. 93-108, 2011. DOI: 10.3745/KIPSTB.2011.18.2.93.

[ACM Style]
Sang Kon Lee. 2011. Korean Document Classification Using Extended Vector Space Model. The KIPS Transactions:PartB, 18, 2, (2011), 93-108. DOI: 10.3745/KIPSTB.2011.18.2.93.