Enhancing Malware Detection Through Machine Learning Techniques

Zeina S. Jassim; Mohamad M. Kassir

doi:10.51173/ijds.v1i1.4

Authors

Zeina S. Jassim, Department of Computer Engineering, University of Qom, Qom, Iran
Mohamad M. Kassir, Department of Computer Engineering, University of Qom, Qom, Iran

Keywords:

Anomaly Detection, Decision Tree, ID3, Machine Learning, Malware Detection

Abstract

Malware detection is central to computer network security because malware is the principal attack vector against modern enterprises. Firms must therefore detect and remove malicious software from their computer systems. The ideal solution is to apply artificial intelligence, specifically machine learning techniques, operating in real time alongside the IT system. The problem remains open and significant, because limited processing power and memory constrain such techniques. The most popular approach to evaluating such systems and intrusion detection models uses Application Program Interface (API) calls via the KDD-CUP99 data set. KDD-CUP99 has more than three hundred thousand samples, each with 54 features. The attributes of the data set were designed and chosen to yield a high malware detection rate, and in this work the number of attributes was reduced to obtain the desired results. The reduction relies on data cleaning and data transformation. Data cleaning eliminates inaccurate, unnecessary, duplicated, or missing information. Data transformation includes discretization, which converts continuous data into discrete form. Compared with earlier research that applied the same machine learning classifiers to lengthy sequences of API calls, using more advanced algorithms to perform the task with the best precision and the least expense improves accuracy and performance. The data set was classified into two categories using a Support Vector Machine (SVM), a Decision Tree (DT), and Iterative Dichotomiser 3 (ID3). The findings show that little previous research uses a five-class classification strategy for malware detection, and that the accuracy obtained in the proposed work is comparable to that of several earlier works.
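
The cleaning, discretization, and ID3 ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy rows, the column meanings (duration, bytes, label), the equal-width binning scheme, and the function names are all assumptions made for illustration. Information gain is shown because it is the standard split criterion of ID3.

```python
import math
from collections import Counter


def clean(rows):
    """Drop rows with missing values and exact duplicates (first copy is kept)."""
    seen = set()
    out = []
    for row in rows:
        if any(v is None for v in row):
            continue  # missing information
        key = tuple(row)
        if key in seen:
            continue  # duplicated information
        seen.add(key)
        out.append(key)
    return out


def discretize(values, n_bins=3):
    """Equal-width binning: map each continuous value to a bin index in 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy reduction from splitting `labels` by the discrete `feature` values."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records: (duration, bytes, label). Not taken from KDD-CUP99.
rows = [
    (0.1, 200.0, "normal"),
    (0.2, 220.0, "normal"),
    (0.1, 200.0, "normal"),   # exact duplicate, removed by clean()
    (None, 300.0, "attack"),  # missing value, removed by clean()
    (9.5, 5000.0, "attack"),
    (8.7, 4800.0, "attack"),
]

cleaned = clean(rows)
durations = discretize([r[0] for r in cleaned], n_bins=2)
labels = [r[2] for r in cleaned]
gain = information_gain(durations, labels)
print(durations, gain)
```

On this toy input the discretized duration separates the two classes perfectly, so its information gain equals the full label entropy. On real data an ID3-style learner would compute this gain for every discretized attribute and split on the largest.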
