Enhancing Malware Detection Through Machine Learning Techniques

Authors

  • Zeina S. Jassim, Department of Computer Engineering, University of Qom, Qom, Iran
  • Mohamad M. Kassir, Department of Computer Engineering, University of Qom, Qom, Iran

DOI:

https://doi.org/10.51173/ijds.v1i1.4

Keywords:

Anomaly Detection, Decision Tree, ID3, Machine Learning, Malware Detection

Abstract

Malware is a principal attack vector against modern enterprises, so detecting it is central to computer network security, and firms must be able to remove it from their computer systems. Machine learning techniques that operate in real time alongside an IT system are a promising solution to this problem. The problem remains open, however, because limited processing power and memory constrain such techniques. The most popular method for evaluating systems and intrusion detection models, adopted here, uses Application Program Interface (API) calls together with the KDD-CUP99 data set. KDD-CUP99 has more than three hundred thousand samples, each with 54 features. Although the data set attributes were designed and chosen to provide a high malware detection rate, the attributes were reduced in this work to obtain the desired results. This preprocessing consists of data cleaning and data transformation. Data cleaning eliminates inaccurate, unnecessary, duplicated, or missing information. Data transformation includes discretization, which converts continuous data into discrete form. The study is compared with earlier research that employed lengthy sequences of API calls with the same machine-learning classifiers. Using more advanced algorithms to do the task at hand with the best precision and the least expense increases accuracy and performance. The data set was divided into two categories using a Support Vector Machine (SVM), a Decision Tree (DT), and Iterative Dichotomiser 3 (ID3). The findings revealed that little previous research uses a five-class classification strategy for malware detection. The accuracy of several earlier works is comparable to the accuracy obtained in the proposed work.
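
The preprocessing steps named in the abstract (data cleaning, discretization) and the split criterion that ID3 relies on (information gain) can be illustrated with a minimal sketch. This is not the authors' implementation: the equal-width binning scheme, the function names, and the toy rows are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def clean(rows):
    """Drop rows with missing values (None) and exact duplicates, keeping order."""
    seen = set()
    kept = []
    for row in rows:
        if any(v is None for v in row):
            continue  # missing information
        key = tuple(row)
        if key in seen:
            continue  # duplicated information
        seen.add(key)
        kept.append(key)
    return kept


def discretize(values, n_bins):
    """Equal-width binning (an assumed scheme): map each value to a bin index."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy reduction from splitting on a discrete feature (ID3's criterion)."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy rows: (continuous feature, class label).
rows = [
    (0.1, "normal"),
    (0.2, "normal"),
    (0.2, "normal"),   # exact duplicate, removed by clean()
    (None, "attack"),  # missing value, removed by clean()
    (0.9, "attack"),
    (1.0, "attack"),
]

cleaned = clean(rows)
bins = discretize([r[0] for r in cleaned], n_bins=2)
labels = [r[1] for r in cleaned]
gain = information_gain(bins, labels)
```

In this toy case the discretized feature separates the two labels perfectly, so the information gain equals the full label entropy. ID3 would pick the feature with the highest gain at each node and recurse on the resulting subsets.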

Published

2024-06-30

How to Cite

Jassim, Z. S., & Kassir, M. M. (2024). Enhancing Malware Detection Through Machine Learning Techniques. InfoTech Spectrum: Iraqi Journal of Data Science, 1(1), 1–15. https://doi.org/10.51173/ijds.v1i1.4

Section

Information Security and Cybersecurity