A Text Mining-based Intrusion Log Recommendation in Digital Forensics


KIPS Transactions on Computer and Communication Systems, Vol. 2, No. 6, pp. 265-276, Jun. 2013
DOI: 10.3745/KTCCS.2013.2.6.265

Abstract

In digital forensics, log files are stored as large data sets so that users' past behavior can be traced. It is difficult for investigators to analyze such large log data manually without clues. In this paper, we propose a text mining technique that extracts intrusion logs from a large log set and recommends them to investigators as reliable evidence. In the training stage, the proposed method preprocesses a training log set and extracts intrusion association words from it with the Apriori algorithm, then computes the probability of intrusion for each association word by combining support and confidence. Robinson's method of computing confidence, originally used for filtering spam mail, is applied here to extracting intrusion logs. As a result, an association word knowledge base is constructed that includes the intrusion probability of each association word as a weight, which improves accuracy. In the test stage, the probability that a log in the test log set is an intrusion log and the probability that it is a normal log are each computed with Fisher's inverse chi-square classification algorithm based on the association word knowledge base, and intrusion logs are extracted by combining the two results. The intrusion logs are then recommended to investigators. Because the proposed method is trained in a way that clearly analyzes the meaning of unstructured large log data, it compensates for the loss of accuracy caused by data ambiguity. In addition, because it recommends intrusion logs using Fisher's inverse chi-square classification algorithm, it reduces the false positive (FP) rate and the laborious effort of extracting evidence manually.
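
The abstract names Robinson's method and Fisher's inverse chi-square combination but does not give formulas, so the following is only a minimal sketch based on the formulation commonly used in spam filtering, not the paper's implementation. The function names, the `strength` and `prior` parameters, and the example weights are all assumptions made for illustration. How the paper combines support and confidence into a raw probability is not stated in the abstract, so `raw_prob` is simply taken as given.

```python
import math


def robinson_weight(raw_prob, count, strength=1.0, prior=0.5):
    """Robinson-style smoothing: pull rarely seen association words toward a prior.

    raw_prob -- raw intrusion probability of an association word (assumed given)
    count    -- number of training logs in which the association word was seen
    strength, prior -- assumed tuning parameters, not taken from the paper
    """
    return (strength * prior + count * raw_prob) / (strength + count)


def chi2_survival(chi_sq, dof):
    """P(X >= chi_sq) for a chi-square variable with an even number of degrees of freedom."""
    half = chi_sq / 2.0
    term = math.exp(-half)
    total = term
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return min(total, 1.0)


def intrusion_score(weights, eps=1e-9):
    """Combine per-association-word weights into one indicator in [0, 1].

    Values near 1 suggest an intrusion log, values near 0 a normal log,
    and values near 0.5 mean the evidence is inconclusive.
    """
    if not weights:
        return 0.5
    # Clip away exact 0 and 1 so the logarithms stay finite.
    clipped = [min(max(w, eps), 1.0 - eps) for w in weights]
    dof = 2 * len(clipped)
    # Sum logs instead of multiplying probabilities to avoid underflow.
    intrusion_side = chi2_survival(-2.0 * sum(math.log(w) for w in clipped), dof)
    normal_side = chi2_survival(-2.0 * sum(math.log(1.0 - w) for w in clipped), dof)
    return (1.0 + intrusion_side - normal_side) / 2.0


# Hypothetical weights for the association words found in two test logs.
suspicious_log = [0.99, 0.95]
ordinary_log = [0.01, 0.05]
print(intrusion_score(suspicious_log) > 0.5)  # high score: recommend to the investigator
print(intrusion_score(ordinary_log) < 0.5)    # low score: treat as a normal log
```

Computing both an intrusion-side and a normal-side value and combining them is what lets the score settle near 0.5 when a log contains conflicting or weak evidence, which is one plausible reading of why the abstract credits this step with lowering the false positive rate.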



Cite this article
[IEEE Style]
S. Ko, "A Text Mining-based Intrusion Log Recommendation in Digital Forensics," KIPS Transactions on Computer and Communication Systems, vol. 2, no. 6, pp. 265-276, 2013. DOI: 10.3745/KTCCS.2013.2.6.265.

[ACM Style]
Sujeong Ko. 2013. A Text Mining-based Intrusion Log Recommendation in Digital Forensics. KIPS Transactions on Computer and Communication Systems, 2, 6, (2013), 265-276. DOI: 10.3745/KTCCS.2013.2.6.265.