Survey Paper | Computer Science & Engineering | India | Volume 4 Issue 12, December 2015
Study of Dataset Feature Filtering of OpCode for Malware Detection Using SVM Training Phase
Abstract: Malware can be defined as any type of malicious code that has the potential to harm a computer or network. To detect unknown malware families, the frequency of appearance of Opcode (Operation Code) sequences, obtained through dynamic analysis, is used. Opcode n-gram analysis is used to extract features from the inspected files, and the Opcode n-grams serve as features during classification, with the aim of identifying unknown malicious code. A support vector machine (SVM) is used to create a reference model, against which two methods of feature reduction are evaluated: area of intersect and subspace analysis using eigenvectors. The SVM is configured to traverse the dataset, searching for Opcodes that have a positive impact on the classification of benign and malicious software. The dataset is constructed by representing each executable file as a set of Opcode density histograms. Classification involves separating the dataset into training and test data, with the training samples labeled as benign or malicious software. In the area of intersect method, the characteristics of benign and malicious Opcodes are plotted as normal distributions and grouped per Opcode, so that each plot shows the benign and malicious density curves of a single Opcode. The key feature to note is the overlapping area of the two density curves. In subspace analysis, the importance of individual Opcodes is investigated through the eigenvalues and eigenvectors of the subspace, with PCA used for data compression and mapping. The Opcodes selected by the eigenvector filter coincide with the Opcode features the SVM uses to classify malware.
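The area of intersect idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the numerical integration scheme, and the example means and standard deviations are all hypothetical, and the ranking assumes that a smaller overlap between the benign and malicious density curves marks a more discriminating Opcode.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def area_of_intersect(mean_a, std_a, mean_b, std_b, steps=20000):
    """Overlapping area of two normal density curves (0.0 to 1.0).

    Integrates min(pdf_a, pdf_b) with the midpoint rule over a range
    that covers eight standard deviations around both means.
    """
    lo = min(mean_a - 8 * std_a, mean_b - 8 * std_b)
    hi = max(mean_a + 8 * std_a, mean_b + 8 * std_b)
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += min(normal_pdf(x, mean_a, std_a),
                     normal_pdf(x, mean_b, std_b)) * dx
    return total


def rank_opcodes(stats):
    """Sort Opcodes by overlap, smallest (assumed most discriminating) first.

    stats maps an Opcode name to (benign_mean, benign_std,
    malicious_mean, malicious_std) of its density values.
    """
    overlaps = {op: area_of_intersect(*params) for op, params in stats.items()}
    return sorted(overlaps.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical per-Opcode statistics, for illustration only.
    example_stats = {
        "mov": (0.30, 0.05, 0.31, 0.05),   # curves nearly coincide
        "xor": (0.02, 0.01, 0.08, 0.02),   # curves well separated
    }
    for opcode, overlap in rank_opcodes(example_stats):
        print(f"{opcode}: overlap = {overlap:.3f}")
```

Under these assumptions, an Opcode whose two curves nearly coincide scores close to 1 and would be a candidate for filtering out, while a well-separated Opcode scores close to 0 and would be kept as a feature.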
Keywords: SVM, N-gram analysis, obfuscation, area of intersect
Edition: Volume 4 Issue 12, December 2015
Pages: 474 - 479
How to Cite this Article?
Bhushan Kinholkar, "Study of Dataset Feature Filtering of OpCode for Malware Detection Using SVM Training Phase", International Journal of Science and Research (IJSR), https://www.ijsr.net/get_abstract.php?paper_id=NOV151981, Volume 4 Issue 12, December 2015, 474 - 479