https://scholars.lib.ntu.edu.tw/handle/123456789/456051
Title: | Relationship of Jaccard and edit distance in malware clustering and online identification (Extended abstract) | Authors: | Dolev, S.; Ghanayim, M.; Binun, A.; Frenkel, S.; Sun, Y.S.; YEALI SUN |
Publication date: | 2017 | Volume: | 2017-January | Pages (start-end): | 1-5 | Source publication: | 2017 IEEE 16th International Symposium on Network Computing and Applications, NCA 2017 | Abstract: | In this paper, we examine the possibility to utilize the well-known approximations of Jaccard metric in order to reduce computational complexity of Edit Distance metric estimation. The scope of our analytical results is the representing strings rather than the original (raw) textual data, still in practice we obtained a solid indication that the results can be applied to (raw) strings that have low n-gram repetitions. We formulate inequalities between the Jaccard metric and the Edit Distance, that impose upper and lower bounds on the Edit Distance values in terms of the Jaccard values. We validate our inequality over strings of API call traces where (the small) clusters obtained are refined by applying Edit Distance. Jaccard is a measure of similarity between two sets, while Edit Distance is a measure for two strings, such as traces of API calls. The computation associated with creating n-grams and using Jaccard similarity is much more efficient than the computation of Edit Distance (linear versus quadratic time complexity). Thus, our new bounds on the Edit Distance given the Jaccard value are of practical interest. Another new aspect we coped with in our research is the inherent imbalance between malicious and benign API traces that are harvested from the system, as most of the traces are benign. We performed clustering only on the malware traces where each cluster concentrates malware with some specific common essence. The obtained clustering is used with great success in classifying new query traces for being either benign or malware. The traces for our research were obtained from the KVM hypervisor Runtime Execution Introspection and Profiling (REIP) system based on Virtual Machine Introspection (VMI) techniques to profile hooked Windows API calls. © 2017 IEEE. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/456051 | DOI: | 10.1109/NCA.2017.8171380 | SDG/Keywords: | Complex networks; Computer crime; Analytical results; Extended abstracts; Measure of similarities; On-line identification; Quadratic time; Run-time execution; Upper and lower bounds; Virtual machine introspection; Malware |
Appears in: | Department of Information Management |
Items in the IR system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.
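
Illustrative note (not part of the record): the abstract contrasts n-gram Jaccard similarity, which is cheap to compute, with Edit Distance, which takes quadratic time. The Python sketch below shows both measures on two short API-call traces so the difference is concrete. It is only a minimal illustration under stated assumptions: the function names, the choice of n = 2, and the sample traces are hypothetical and do not come from the paper, and the sketch does not reproduce the paper's inequalities.

```python
def ngrams(seq, n=2):
    """Return the set of contiguous n-grams (as tuples) of a sequence."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def jaccard_distance(a, b, n=2):
    """1 - |A ∩ B| / |A ∪ B| over the n-gram sets of a and b.

    Building the sets takes time roughly linear in the trace lengths
    (assuming constant-time hashing of each n-gram).
    """
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 0.0  # two traces with no n-grams are treated as identical
    return 1.0 - len(ga & gb) / len(ga | gb)


def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic program.

    Runs in O(len(a) * len(b)) time, keeping only two rows in memory.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


# Hypothetical traces of hooked Windows API calls (made up for illustration).
trace_a = ["CreateFileW", "ReadFile", "WriteFile", "CloseHandle"]
trace_b = ["CreateFileW", "ReadFile", "CloseHandle"]

jd = jaccard_distance(trace_a, trace_b)
ed = edit_distance(trace_a, trace_b)
```

In a pipeline like the one the abstract outlines, the cheaper Jaccard value would be computed first and the Edit Distance reserved for refining small clusters; how the two are combined in the actual system is not specified in this record.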