A novel approach to database selection using feedforward neural networks

Nidal A. Al-Dmour; Hani Al-Zoubi; Ghazi Al Naymat; Hanan Hussain

doi:10.21833/ijaas.2025.10.017

IJAAS: International Journal of Advanced and Applied Sciences
EISSN: 2313-3724, Print ISSN: 2313-626X, Frequency: 12





Volume 12, Issue 10 (October 2025), Pages: 150-158

Original Research Paper

Author(s): Nidal A. Al-Dmour ¹,*, Hani Al-Zoubi ¹, Ghazi Al Naymat ², Hanan Hussain ³

Affiliation(s):
¹ Department of Computer Engineering, College of Engineering, Mutah University, Karak, Jordan
² Artificial Intelligence Research Center (AIRC), College of Engineering and IT, Ajman University, Ajman, United Arab Emirates
³ College of Engineering and IT, University of Dubai, Dubai, United Arab Emirates

* Corresponding author. ORCID profile: https://orcid.org/0000-0002-2898-3905

DOI: https://doi.org/10.21833/ijaas.2025.10.017

Abstract

Selecting an appropriate database is a common challenge for professionals, including web developers and machine learning engineers. Choosing the most suitable database for an application is important for maximizing its performance. However, because many features of different databases overlap, manually predicting the best database is difficult and prone to errors. To address this issue, a new approach is proposed using a Feedforward Neural Network (FFNN) for database selection. This method involves four steps: feature selection, dataset generation, neural network modeling, and database prediction. In the feature selection step, important features of seven major relational databases (MySQL, MS SQL Server, Oracle, IBM DB2, PostgreSQL, SQLite, and Microsoft Access) are gathered through web searches. These features are used to create a ground truth table. During dataset generation, 2,400 combinations of 75 features are generated, and labels for each instance are calculated using a weighted average method. The neural network modeling step involves selecting an optimal feedforward neural network based on its parameters and performance.
The network is then trained using the Levenberg-Marquardt backpropagation algorithm. In testing, user input is provided (a selected set of features is fed into the pre-trained network), and the system predicts the best database with a mean squared error (MSE) of 5.16E-14.

© 2025 The Authors. Published by IASE. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Keywords

Database selection, Feature selection, Neural network, Dataset generation, Prediction accuracy

Article history

Received 11 May 2024, Received in revised form 12 September 2025, Accepted 1 October 2025

Acknowledgment

No Acknowledgment.

Compliance with ethical standards

Conflict of interest: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Citation:

Al-Dmour NA, Al-Zoubi H, Al Naymat G, and Hussain H (2025). A novel approach to database selection using feedforward neural networks. International Journal of Advanced and Applied Sciences, 12(10): 150-158

Figures: Fig. 1, Fig. 2, Fig. 3, Fig. 4
Tables: Table 1, Table 2, Table 3, Table 4
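
The dataset generation and labeling steps summarized in the abstract can be illustrated with a small sketch. The abstract does not give the exact weighting formula, the feature weights, or the contents of the ground truth table, so everything below is an assumption made for illustration: the toy table (3 databases, 5 features instead of 7 and 75), the `WEIGHTS` values, the random sampling of feature combinations, and the helper names `score`, `label`, `generate_dataset`, and `mse`. Training the feedforward network itself is not shown.

```python
import random

# Hypothetical ground-truth table: 1 means the database supports the feature.
# The paper's table covers 7 databases and 75 features; these entries are
# invented for illustration only.
GROUND_TRUTH = {
    "MySQL":      [1, 1, 0, 1, 0],
    "PostgreSQL": [1, 1, 1, 1, 1],
    "SQLite":     [1, 0, 0, 0, 1],
}

# Hypothetical importance weight per feature (assumption, not from the paper).
WEIGHTS = [3.0, 2.0, 2.0, 1.0, 1.0]


def score(selected, support, weights):
    """Weighted average of a database's support over the selected features."""
    total = sum(w for s, w in zip(selected, weights) if s)
    if total == 0:
        return 0.0  # no feature selected, so nothing to score
    hit = sum(w * p for s, p, w in zip(selected, support, weights) if s)
    return hit / total


def label(selected):
    """Return per-database scores and the name of the best-scoring database."""
    scores = {db: score(selected, sup, WEIGHTS) for db, sup in GROUND_TRUTH.items()}
    best = max(scores, key=scores.get)
    return scores, best


def generate_dataset(n_instances, seed=0):
    """Sample random feature selections and attach a label to each one."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_instances):
        selected = [rng.randint(0, 1) for _ in range(len(WEIGHTS))]
        scores, best = label(selected)
        data.append((selected, scores, best))
    return data


def mse(predicted, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
```

On this toy table, `label([1, 0, 1, 0, 0])` selects the first and third features and returns PostgreSQL as the best match, since it is the only entry that supports both. A trained network would be evaluated by passing its predicted scores and the labeled scores to `mse`.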
