Improved network traffic classification using hashing techniques in machine and deep learning

Mohammed Altaimimi

IJAAS
International Journal of Advanced and Applied Sciences. EISSN: 2313-3724, Print ISSN: 2313-626X, Frequency: 12

Volume 12, Issue 5 (May 2025), Pages: 255-261

Original Research Paper

Author(s): Mohammed Altaimimi *

Affiliation(s): Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha’il, Ha’il, Saudi Arabia

* Corresponding Author. Corresponding author's ORCID profile: https://orcid.org/0000-0002-4170-6910

Digital Object Identifier (DOI)

https://doi.org/10.21833/ijaas.2025.05.024

Abstract

The rapid global growth of the internet, driven by advancements in fiber and 5G technology, multi-device access, and affordable services, has increased the pressure on internet service providers to classify network traffic efficiently. Accurate traffic classification and protocol identification are critical for detecting malicious activity. This study introduces a new method that enhances machine learning and deep learning models by applying hashing techniques to convert string-based IP addresses into numerical values. The improved models demonstrate a significant boost in accuracy, increasing from 76% to 83%, along with better recall and F1-scores in key categories. These findings highlight the potential of hashing techniques to improve the performance of machine learning models in network traffic classification tasks.

© 2025 The Authors. Published by IASE. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Keywords

Network traffic, Machine learning, Deep learning, Hashing techniques, Traffic classification

Article history

Received 31 December 2024, Received in revised form 8 May 2025, Accepted 20 May 2025

Acknowledgment

This research has been funded by the Scientific Research Deanship at the University of Ha’il, Saudi Arabia, through project number BA-2207.

Compliance with ethical standards

Conflict of interest: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Citation:

Altaimimi M (2025). Improved network traffic classification using hashing techniques in machine and deep learning. International Journal of Advanced and Applied Sciences, 12(5): 255-261

Figures: Fig. 1

Tables: Table 1, Table 2
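
The abstract describes converting string-based IP addresses into numerical values through hashing before they are passed to a classifier. The following is a minimal sketch of that general idea, not the author's implementation: the abstract does not name the hash function, the output range, or the dataset columns, so the use of SHA-256, the `n_buckets` parameter, and the field names `src_ip`, `dst_ip`, and `bytes` are all assumptions made for illustration.

```python
import hashlib


def hash_ip(ip: str, n_buckets: int = 2**20) -> int:
    """Map an IP address string to a stable integer in [0, n_buckets).

    Assumption: SHA-256 is used here only because it is deterministic
    across runs (unlike Python's built-in hash() for strings); the paper's
    actual hash function is not stated in the abstract.
    """
    digest = hashlib.sha256(ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer,
    # then fold it into the requested number of buckets.
    return int.from_bytes(digest[:8], "big") % n_buckets


def encode_flows(flows, n_buckets: int = 2**20):
    """Turn flow records into purely numeric feature rows.

    Each flow is a dict with hypothetical keys 'src_ip', 'dst_ip', 'bytes'.
    """
    rows = []
    for flow in flows:
        rows.append([
            hash_ip(flow["src_ip"], n_buckets),
            hash_ip(flow["dst_ip"], n_buckets),
            flow["bytes"],
        ])
    return rows


if __name__ == "__main__":
    # Hypothetical sample records, invented for this example.
    sample = [
        {"src_ip": "192.168.1.10", "dst_ip": "10.0.0.5", "bytes": 1500},
        {"src_ip": "192.168.1.10", "dst_ip": "172.16.0.9", "bytes": 320},
    ]
    for row in encode_flows(sample):
        print(row)
```

In this sketch the same address always maps to the same integer, so a model can treat repeated endpoints consistently. Distinct addresses may collide when `n_buckets` is small, which is a trade-off of any bucketed hashing scheme.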
