Performance comparison of second order conjugate algorithms in neural networks for predictive data mining

Sehgal, et al.

International Journal of Advanced and Applied Sciences

Int. j. adv. appl. sci.

EISSN: 2313-3724

Print ISSN: 2313-626X

Volume 4, Issue 8 (August 2017), Pages: 68-73

Title: Performance comparison of second order conjugate algorithms in neural networks for predictive data mining

Author(s): Parveen Sehgal ¹,*, Sangeeta Gupta ², Dharminder Kumar ³

Affiliation(s):

¹Department of Computer Science and Engineering, NIMS University, Jaipur, Rajasthan-303121, India
²Guru Nanak Institute of Management, Guru Gobind Singh Indraprastha University, New Delhi-110026, India
³Guru Jambheshwar University of Science and Technology, Hisar, Haryana-125001, India

https://doi.org/10.21833/ijaas.2017.08.010

Abstract:

This paper compares the performance of several variants of the non-linear conjugate gradient method. Neural network-based prediction models for the life insurance sector were developed and trained with a variety of first- and second-order algorithms to identify an efficient training algorithm, with the focus kept on conjugate gradient-based methods. Traditional second-order methods require second-order derivatives and must compute the Hessian to achieve quadratic termination, which is a tedious and memory-intensive task. Here we employ conjugate gradient methods, which bypass the computation of the Hessian but still achieve quadratic termination and thus prove to be memory efficient.
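
The claim that conjugate gradient methods reach quadratic termination without forming the Hessian can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Fletcher-Reeves update, a secant line search that is exact only for quadratic objectives, and a toy 3-variable quadratic. The names `fletcher_reeves_cg`, `A`, and `b` are hypothetical and chosen for illustration only.

```python
import numpy as np

def fletcher_reeves_cg(grad, x0, max_iter=100, tol=1e-8):
    """Non-linear conjugate gradient (Fletcher-Reeves variant, assumed here).

    Only gradient evaluations are used; no Hessian is formed or stored.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                      # first direction: steepest descent
    iterations = 0
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Secant line search along d from two directional derivatives.
        # This step length is exact when the objective is quadratic.
        slope0 = g @ d
        slope1 = grad(x + d) @ d
        alpha = slope0 / (slope0 - slope1)
        x = x + alpha * d
        g_new = grad(x)
        # Fletcher-Reeves coefficient keeps successive directions conjugate.
        beta = (g_new @ g_new) / (g @ g)
        d = -g_new + beta * d
        g = g_new
        iterations += 1
    return x, iterations

# Toy objective f(x) = 0.5 * x^T A x - b^T x with A symmetric positive definite.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

def grad(x):
    # Gradient of the toy quadratic; the solver only ever calls this function.
    return A @ x - b

x_min, n_iter = fletcher_reeves_cg(grad, np.zeros(3))
print(x_min, n_iter)
```

On an n-variable quadratic with exact line searches, the method needs at most n iterations in exact arithmetic, which is the quadratic termination property the abstract refers to. The memory cost is a few vectors of length n rather than an n-by-n matrix. A neural network error surface is not quadratic, so in practice the line search is inexact and the direction is restarted periodically.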

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Keywords: Artificial neural networks, Conjugate gradient, Line search, Numerical optimization, Prediction modeling

Article History: Received 19 March 2017, Received in revised form 15 May 2017, Accepted 10 June 2017

Citation:

Sehgal P, Gupta S, and Kumar D (2017). Performance comparison of second order conjugate algorithms in neural networks for predictive data mining. International Journal of Advanced and Applied Sciences, 4(8): 68-73

http://www.science-gate.com/IJAAS/V4I8/Sehgal.html
