Flattened data in convolutional neural networks: Using malware detection as case study
Journal
Proceedings of the 2016 Research in Adaptive and Convergent Systems, RACS 2016
Pages
130-135
Date Issued
2016
Author(s)
Abstract
Convolutional neural networks (CNNs) are powerful variants of the multilayer perceptron, inspired by the neural system of the human brain, that reveal local spatial correlations in a series of data. Although CNNs are now widely used for image recognition, they can also be applied in other areas, such as the detection of malicious software. In this paper, we show how CNNs may be used to improve the classification of malicious software, owing to their high-level feature abstraction and their equal-variance property against noise. Because they rely on convolution kernels, CNNs are naturally suited to pattern recognition on images only. For this application, we therefore introduce a new transformation technique that converts a series of event logs into flattened data with two-dimensional features, so that CNNs can be trained to detect malicious behaviors effectively. With the combination property and the proposed flattened input format, a CNN can perform a k-skip-n-gram dimensionality reduction that learns more flexible and complex patterns than traditional solutions. Our preliminary results show that our latest CNN-based malware detection engine reaches 93.012% prediction accuracy and a 12.9% false negative rate (FNR) with a training set of 32,000 samples. To our knowledge, this is the first paper to discuss the application and effectiveness of CNNs for malware detection. © 2016 ACM.
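The abstract does not spell out the transformation itself, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: the event log is a list of event-type names, "flattening" is taken to mean a one-hot grid with one row per time step and one column per event type, and a k-skip-n-gram is an ordered selection of n events that skips at most k events in total. The function names `flatten_events` and `k_skip_n_grams` and the sample event names are hypothetical and do not come from the paper.

```python
from itertools import combinations


def flatten_events(events, vocabulary):
    """One-hot encode an event sequence as a 2-D grid.

    Rows are time steps and columns are event types. This layout is an
    assumption made for illustration, not the paper's exact format.
    """
    index = {name: i for i, name in enumerate(vocabulary)}
    grid = [[0] * len(vocabulary) for _ in events]
    for t, name in enumerate(events):
        grid[t][index[name]] = 1
    return grid


def k_skip_n_grams(events, n, k):
    """Return the set of ordered n-grams that skip at most k events in total."""
    grams = set()
    for start in range(len(events)):
        # An n-gram with at most k skips spans at most n + k positions.
        window = range(start + 1, min(start + n + k, len(events)))
        for rest in combinations(window, n - 1):
            grams.add((events[start],) + tuple(events[i] for i in rest))
    return grams


# Hypothetical event log, used only to exercise the two helpers.
log = ["open", "read", "send", "close"]
vocab = sorted(set(log))
grid = flatten_events(log, vocab)
skip_bigrams = k_skip_n_grams(log, n=2, k=1)
```

In this sketch, a convolution kernel sliding over the rows of `grid` would see several consecutive time steps at once, which is loosely how a CNN could pick up patterns resembling the skip-grams enumerated by `k_skip_n_grams`.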
SDGs
Other Subjects
Complex networks; Computer crime; Convolution; Dynamic analysis; Image recognition; Learning systems; Metadata; Multilayer neural networks; Neural networks; Pattern recognition; Android; Convolutional neural network; Dimensionality reduction; Prediction accuracy; Spatial correlations; Transformation techniques; Two-dimensional features; Variance properties; Malware
Type
conference paper