
Volume 12, Issue 5 (May 2025), Pages: 255-261
----------------------------------------------
Original Research Paper
Improved network traffic classification using hashing techniques in machine and deep learning
Author(s):
Mohammed Altaimimi *
Affiliation(s):
Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha’il, Ha’il, Saudi Arabia
* Corresponding Author.
Corresponding author's ORCID profile: https://orcid.org/0000-0002-4170-6910
Digital Object Identifier (DOI)
https://doi.org/10.21833/ijaas.2025.05.024
Abstract
The rapid global growth of the internet, driven by advances in fiber and 5G technology, multi-device access, and affordable services, has increased the pressure on internet service providers to classify network traffic efficiently. Accurate traffic classification and protocol identification are critical for detecting malicious activity. This study introduces a new method that enhances machine learning and deep learning models by applying hashing techniques to convert string-based IP addresses into numerical values. With this encoding, model accuracy increases from 76% to 83%, with better recall and F1-scores in key categories. These findings highlight the potential of hashing techniques to improve the performance of machine learning models in network traffic classification tasks.
© 2025 The Authors. Published by IASE.
This is an open access article under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/).
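The core idea in the abstract, turning string IP addresses into numbers a model can consume, can be illustrated with a minimal sketch. The abstract does not say which hash function, bucket count, or field names the study uses, so SHA-256, `n_buckets`, `src_ip`, and `dst_ip` below are assumptions chosen only for illustration, not the author's implementation.

```python
import hashlib


def hash_ip(ip: str, n_buckets: int = 2**20) -> int:
    """Map an IP address string to a stable integer in [0, n_buckets).

    Assumption: SHA-256 and the bucket count are illustrative choices;
    the source does not name a specific hash function.
    """
    digest = hashlib.sha256(ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer,
    # then fold it into the requested number of buckets.
    return int.from_bytes(digest[:8], "big") % n_buckets


def encode_flows(flows, n_buckets: int = 2**20):
    """Return copies of flow records with string IP fields replaced by hashes.

    Assumption: each record is a dict with hypothetical keys
    'src_ip' and 'dst_ip'; all other fields are kept unchanged.
    """
    return [
        {
            **flow,
            "src_ip": hash_ip(flow["src_ip"], n_buckets),
            "dst_ip": hash_ip(flow["dst_ip"], n_buckets),
        }
        for flow in flows
    ]


if __name__ == "__main__":
    sample = [{"src_ip": "192.168.1.10", "dst_ip": "10.0.0.5", "dst_port": 443}]
    print(encode_flows(sample))
```

A cryptographic digest is used here instead of Python's built-in `hash()` because the built-in is salted per process for strings, so the same address would map to different values across runs. Distinct addresses can still collide in the same bucket; a larger `n_buckets` makes that less likely.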
Keywords
Network traffic, Machine learning, Deep learning, Hashing techniques, Traffic classification
Article history
Received 31 December 2024, Received in revised form 8 May 2025, Accepted 20 May 2025
Acknowledgment
This research has been funded by the Scientific Research Deanship at the University of Ha’il, Saudi Arabia, through project number BA-2207.
Compliance with ethical standards
Conflict of interest: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Citation:
Altaimimi M (2025). Improved network traffic classification using hashing techniques in machine and deep learning. International Journal of Advanced and Applied Sciences, 12(5): 255-261