Incremental Generation of A Decision Tree Using Global Discretization For Large Data


The KIPS Transactions:PartB, Vol. 12, No. 4, pp. 487-498, Aug. 2005
DOI: 10.3745/KIPSTB.2005.12.4.487

Abstract

Recent research has focused on decision tree algorithms that can handle large datasets. However, because most of these algorithms process data in batch mode, they must rebuild the tree from scratch whenever new data is added. Building the tree incrementally is a more efficient way to reduce this rebuilding cost. BOAT and ITI are representative incremental tree construction algorithms, and most such algorithms use a local discretization method to handle numeric data. However, because discretization requires sorted numeric data, a global discretization method that sorts all data only once is more suitable for large datasets than a local discretization method that sorts at every node. This paper proposes an incremental tree construction method that efficiently rebuilds a tree using global discretization to handle numeric data. When new data is added, the categories affected by that data must be recreated, and the tree structure must then be changed to reflect the category changes. To recreate categories efficiently, the proposed method extracts sample points and performs discretization from those sample points; to adjust the tree structure to the category changes, it uses confidence intervals and a tree restructuring method. An experiment using a people database was conducted to compare the proposed method with an existing method that uses local discretization.
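The idea of global discretization described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: equal-frequency binning is assumed here as a stand-in for the discretization criterion, the function names global_cut_points and categorize are hypothetical, and the ages list is invented for illustration. The point is only that the numeric column is sorted once, after which every tree node can reuse the same cut points instead of re-sorting its own subset as a local method would.

```python
from bisect import bisect_right


def global_cut_points(values, n_bins):
    """Sort the column once and return up to n_bins - 1 equal-frequency cut points.

    Equal-frequency binning is an assumption made for this sketch; the paper's
    actual discretization criterion is not given in the abstract.
    """
    ordered = sorted(values)  # the single, global sort
    n = len(ordered)
    cuts = []
    for k in range(1, n_bins):
        cut = ordered[(k * n) // n_bins]
        if not cuts or cut > cuts[-1]:  # skip duplicate cut points
            cuts.append(cut)
    return cuts


def categorize(value, cuts):
    """Map a numeric value to the index of the category (interval) it falls in."""
    return bisect_right(cuts, value)


# Hypothetical numeric attribute (e.g. ages in a people database).
ages = [23, 35, 47, 51, 62, 29, 41, 58]
cuts = global_cut_points(ages, n_bins=4)
categories = [categorize(a, cuts) for a in ages]
print(cuts)
print(categories)
```

Once the cut points exist, a tree builder can treat the attribute as categorical at every node. When new data arrives, only the cut points (and the categories they define) need to be revisited, which is the step the paper addresses with sample points and confidence intervals.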



Cite this article
[IEEE Style]
K. S. Han and S. W. Lee, "Incremental Generation of A Decision Tree Using Global Discretization For Large Data," The KIPS Transactions:PartB, vol. 12, no. 4, pp. 487-498, 2005. DOI: 10.3745/KIPSTB.2005.12.4.487.

[ACM Style]
Kyong Sik Han and Soo Won Lee. 2005. Incremental Generation of A Decision Tree Using Global Discretization For Large Data. The KIPS Transactions:PartB, 12, 4, (2005), 487-498. DOI: 10.3745/KIPSTB.2005.12.4.487.