An Efficient Two-Level Hybrid Signature File Method for Large Text Databases


The Transactions of the Korea Information Processing Society (1994 ~ 2000), Vol. 4, No. 4, pp. 923-932, Apr. 1997
DOI: 10.3745/KIPSTE.1997.4.4.923

Abstract

In this paper, we propose a two-level hybrid signature file method (THM) that uses a term discrimination concept to deal efficiently with large text databases. In addition, we apply Yoo's clustering scheme to the two-level hybrid signature file method. The clustering scheme groups similar signatures together according to the similarity of their highly discriminatory terms, so that retrieval performance improves. We provide a space-time analytical model of the proposed method. Based on the analytical model and experiments, we compare THM with the existing methods, i.e., the bit-sliced method (BM), the two-level method (TM), and the hybrid method (HM). The results show that THM achieves the best retrieval performance in a large database with 100,000 records when the number of matching records is less than 160.
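
For readers unfamiliar with signature files, the following is a minimal sketch of the generic superimposed-coding technique that methods such as BM, TM, HM, and THM build on. It is not the THM algorithm from the paper: the signature width, the number of bits per term, the hash function, and all function names are assumptions chosen for illustration only. Each record's signature is the bitwise OR of its term signatures, and a query is answered by filtering on signatures first and then checking the surviving records to discard false drops.

```python
import hashlib

SIG_BITS = 64       # signature width in bits (assumed value)
BITS_PER_TERM = 3   # hash positions per term (assumed value)


def term_signature(term: str) -> int:
    """Map a term to a bit pattern with at most BITS_PER_TERM bits set."""
    sig = 0
    for i in range(BITS_PER_TERM):
        digest = hashlib.sha256(f"{i}:{term.lower()}".encode()).digest()
        pos = int.from_bytes(digest[:4], "big") % SIG_BITS
        sig |= 1 << pos
    return sig


def record_signature(terms) -> int:
    """Superimpose (bitwise OR) the signatures of all terms in a record."""
    sig = 0
    for term in terms:
        sig |= term_signature(term)
    return sig


def candidates(query_terms, signatures):
    """Return indexes of records whose signature covers the query signature.

    The filter never misses a true match, but it may return false drops.
    """
    q = record_signature(query_terms)
    return [i for i, s in enumerate(signatures) if s & q == q]


def search(query_terms, records):
    """Filter by signature, then verify candidates against the actual terms."""
    signatures = [record_signature(r) for r in records]
    wanted = {t.lower() for t in query_terms}
    return [
        i
        for i in candidates(query_terms, signatures)
        if wanted <= {t.lower() for t in records[i]}
    ]


records = [
    ["signature", "file", "text"],
    ["bit", "sliced", "method"],
    ["hybrid", "signature", "method"],
]
matches = search(["signature"], records)
```

In this flat sketch every signature is scanned for each query. The methods compared in the abstract differ in how they organize the signatures so that fewer of them need to be examined.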



Cite this article
[IEEE Style]
J. S. Yoo and H. I. Kang, "An Efficient Two-Level Hybrid Signature File Method for Large Text Databases," The Transactions of the Korea Information Processing Society (1994 ~ 2000), vol. 4, no. 4, pp. 923-932, 1997. DOI: 10.3745/KIPSTE.1997.4.4.923.

[ACM Style]
Yoo Jae Soo and Kang Hyung Il. 1997. An Efficient Two-Level Hybrid Signature File Method for Large Text Databases. The Transactions of the Korea Information Processing Society (1994 ~ 2000), 4, 4, (1997), 923-932. DOI: 10.3745/KIPSTE.1997.4.4.923.