Improving the Performance of Statistical Automatic Text Categorization by using Phrasal Patterns and Keyword Sets


The Transactions of the Korea Information Processing Society (1994 ~ 2000), Vol. 7, No. 4, pp. 1150-1159, Apr. 2000
DOI: 10.3745/KIPSTE.2000.7.4.1150

Abstract

This paper presents an automatic text categorization model that improves accuracy by combining statistical and knowledge-based categorization methods. The model applies the knowledge-based method first, and then applies the statistical method to the texts that the knowledge-based method could not categorize. This combination improves categorization accuracy while still assigning a category to every text. For statistical categorization, the vector model with Inverted Category Frequency (ICF) weighting is used. For knowledge-based categorization, Phrasal Patterns and Keyword Sets are introduced to represent sentence patterns, and categorization is performed by pattern matching. Experimental results on new articles show that combining the two categorization methods improves accuracy.
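
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the abstract does not give the ICF formula, so the sketch assumes icf(t) = log(C / cf(t)) + 1 by analogy with inverse document frequency (C is the number of categories, cf(t) the number of categories containing term t); phrasal pattern matching is reduced to checking that every word of a keyword set occurs in the text; and all function names and sample data are hypothetical.

```python
import math
from collections import Counter


def train_icf(docs_by_category):
    """Build one ICF-weighted term vector per category.

    Assumption: icf(t) = log(C / cf(t)) + 1, by analogy with IDF.
    The abstract does not state the exact formula.
    """
    term_counts = {
        category: Counter(w for doc in docs for w in doc.lower().split())
        for category, docs in docs_by_category.items()
    }
    n_categories = len(term_counts)
    # cf(t): number of categories in which term t occurs at least once.
    cat_freq = Counter(t for counts in term_counts.values() for t in counts)
    icf = {t: math.log(n_categories / cf) + 1.0 for t, cf in cat_freq.items()}
    vectors = {
        category: {t: tf * icf[t] for t, tf in counts.items()}
        for category, counts in term_counts.items()
    }
    return vectors, icf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def statistical_category(text, vectors, icf):
    """Statistical stage: always returns the most similar category."""
    counts = Counter(text.lower().split())
    query = {t: tf * icf.get(t, 0.0) for t, tf in counts.items()}
    return max(vectors, key=lambda c: cosine(query, vectors[c]))


def knowledge_category(text, keyword_sets):
    """Knowledge-based stage (simplified): return the first category whose
    keyword set is fully contained in the text, or None if nothing matches."""
    words = set(text.lower().split())
    for category, keywords in keyword_sets:
        if keywords <= words:
            return category
    return None


def categorize(text, keyword_sets, vectors, icf):
    """Apply the knowledge-based stage first, then fall back to statistics."""
    category = knowledge_category(text, keyword_sets)
    if category is not None:
        return category
    return statistical_category(text, vectors, icf)


# Hypothetical toy data, for illustration only.
docs_by_category = {
    "sports": ["the team won the match", "the player scored a goal"],
    "economy": ["the bank raised interest rates", "the market fell sharply"],
}
keyword_sets = [
    ("economy", {"interest", "rates"}),
    ("sports", {"scored", "goal"}),
]
vectors, icf = train_icf(docs_by_category)

print(categorize("central bank keeps interest rates unchanged",
                 keyword_sets, vectors, icf))
print(categorize("the team played a great match",
                 keyword_sets, vectors, icf))
```

Because the statistical stage always returns the best-scoring category, every text receives a label even when no pattern matches, which corresponds to the "categorize all texts" property claimed in the abstract.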



Cite this article
[IEEE Style]
J. G. Han, M. G. Park, K. J. Cho, J. T. Kim, "Improving the Performance of Statistical Automatic Text Categorization by using Phrasal Patterns and Keyword Sets," The Transactions of the Korea Information Processing Society (1994 ~ 2000), vol. 7, no. 4, pp. 1150-1159, 2000. DOI: 10.3745/KIPSTE.2000.7.4.1150.

[ACM Style]
Jung Gi Han, Min Gyu Park, Kwang Je Cho, and Jun Tae Kim. 2000. Improving the Performance of Statistical Automatic Text Categorization by using Phrasal Patterns and Keyword Sets. The Transactions of the Korea Information Processing Society (1994 ~ 2000), 7, 4, (2000), 1150-1159. DOI: 10.3745/KIPSTE.2000.7.4.1150.