ABSTRACT
Objective: To implement a multilayer neural network (MLNN) with the sigmoid activation function and its approximations for the diagnosis of hepatitis disease.
Methods: Artificial neural networks (ANNs) are efficient tools in common use for medical diagnosis. In hardware-based architectures, activation functions play an important role in ANN behavior. The sigmoid function is the most frequently used activation function because of its smooth response. Thus, the sigmoid function and its close approximations were implemented as activation functions. The dataset was taken from the UCI machine learning repository.
Results: For the diagnosis of hepatitis disease, an MLNN structure was implemented and the Levenberg-Marquardt (LM) algorithm was used for learning. Our method of classifying hepatitis disease produced accuracies of 91.9% to 93.8% via 10-fold cross validation.
Conclusion: When compared with previous work that diagnosed hepatitis disease using artificial neural networks and the identical dataset, our results are promising for reducing the size and cost of neural-network-based hardware. Thus, hardware-based diagnosis systems can be developed effectively by using approximations of the sigmoid function.
Key words: Hepatitis disease diagnosis, multilayer neural network, 10-fold cross validation, approximations of sigmoid activation function
INTRODUCTION
The liver is the largest organ in the body and carries out many of its most important functions [1]. Hepatitis is characterized by inflammation of the liver; it can be caused by bacterial infections, viruses, drugs, or toxins [2].
Medical diagnosis of disease is one of the main problems in medicine, and artificial neural networks (ANNs) are efficient tools currently in common use for this purpose [2]. Many techniques for the classification of hepatitis disease have been presented in the literature [1-16]. Chen et al. proposed a hybrid system named LFDA-SVM, which integrates a feature extraction method (Local Fisher Discriminant Analysis, LFDA) with a classification algorithm (Support Vector Machine, SVM), and obtained an accuracy of 96.8% [1]. Polat and Gunes used a medical diagnosis method involving three stages, feature selection, fuzzy weighted pre-processing and an Artificial Immune Recognition System (AIRS), and obtained 94.1% classification accuracy in the test phase [3]. Dogantekin et al. proposed a hepatitis disease diagnosis system based on LDA and an Adaptive Network based Fuzzy Inference System (ANFIS); the classification accuracy of the LDA-ANFIS system was 94.1% [4]. Calisir and Dogantekin obtained 95.0% classification accuracy using a method based on Principal Component Analysis (PCA) and a Least Squares Support Vector Machine (LSSVM) classifier (PCA-LSSVM) [5]. Sartakhti et al. used a method (SVM-SA) that hybridizes SVM and Simulated Annealing (SA) and obtained 96.2% classification accuracy [6].
In the techniques above, hybrid systems were proposed that combine feature extraction methods with classification algorithms. The hardware implementations of such hybrid systems require large-scale multipliers and chip resources. For disease diagnosis systems, multilayer neural networks (MLNNs) have been the most commonly used tools [17]. Different types of learning algorithms can be used to train an MLNN [18,22]. The Levenberg-Marquardt (LM) algorithm, regarded as one of the most efficient training algorithms, is a second-order method and converges much faster than first-order algorithms. In this study, we used the LM algorithm, which uses the Hessian matrix to obtain better estimates and improve convergence, to determine the weights of the connections [22,29].
Our aim here is to diagnose hepatitis disease using an MLNN with the sigmoid activation function and its approximations. The activation function plays an important role in determining the outputs. The sigmoid activation function contains the exponential expression e^{-x}, so it is difficult to realize in hardware-based architectures and requires large chip resources [30]. In this study, approximations of the sigmoid function were used to improve the calculation speed of the activation function and reduce the size of the hardware. We took the dataset from the University of California at Irvine (UCI) machine learning repository [31]. 10-fold cross validation, a widely used performance evaluation technique, was used to obtain the classification accuracy [9].
METHODS
Hepatitis disease dataset
The hepatitis disease dataset taken from the UCI machine learning repository was used so that the performance of our classification system could be compared with previous studies that used the same dataset [31]. This dataset, donated by the Jozef Stefan Institute, Yugoslavia, is commonly used to test the performance of networks [1,8]. The dataset comprises 155 samples in two classes: Class 1, death cases (32), and Class 2, live cases (123). Each sample has the 19 attributes shown in Table 1.
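As an illustration only, and not part of the original study's tooling, the dataset can be loaded directly from the UCI repository with pandas. The file URL and the column names below are assumptions based on the repository's hepatitis.data and hepatitis.names files; missing attribute values are coded as "?".

```python
# Illustrative sketch (not the authors' code): load the UCI hepatitis dataset.
# The URL and column names are assumptions based on the repository's
# hepatitis.data / hepatitis.names files; missing values are coded as "?".
import pandas as pd

URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/hepatitis/hepatitis.data"
columns = [
    "class", "age", "sex", "steroid", "antivirals", "fatigue", "malaise",
    "anorexia", "liver_big", "liver_firm", "spleen_palpable", "spiders",
    "ascites", "varices", "bilirubin", "alk_phosphate", "sgot", "albumin",
    "protime", "histology",
]  # 1 class label + 19 attributes

data = pd.read_csv(URL, header=None, names=columns, na_values="?")
print(data.shape)                    # expected: (155, 20)
print(data["class"].value_counts())  # 1 = die (32), 2 = live (123)
```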
Multilayer neural network
ANNs are efficient tools currently in common use for medical diagnosis [2]. For the diagnosis of hepatitis disease, an MLNN consisting of an input layer, two hidden layers and an output layer was used. The structure of the MLNN is shown in Figure 1.
The output layer had 2 neurons and the hidden layers had 30 and 15 neurons, respectively. All neurons in the MLNN architecture used the sigmoid function or one of its approximations. In this study, we used the Levenberg-Marquardt (LM) learning algorithm to determine the weights of the connections. The LM algorithm, a second-order algorithm and an approximation of Newton's method, uses the Hessian matrix to obtain better estimates and improve convergence. The sum of the mean squared errors is calculated by [29,32]:
E = \frac{1}{2} \sum_{p=1}^{P} \sum_{n=1}^{N} e_{pn}^{2}    (1)

where e_{pn} is the difference between the actual value and the desired value, E is the mean squared error function, P is the number of training patterns and N is the number of outputs. The weights are updated by:

g = \nabla E(w)    (2)

H = \nabla^{2} E(w)    (3)

w_{k+1} = w_{k} - (H + \lambda I)^{-1} g    (4)

where g is the gradient of the mean squared error, H is the Hessian, I is a unit matrix, k is the iteration number, \lambda is a scalar value, w_{k+1} is the weight vector and w_{k} is the weight vector in the preceding iteration. The applications of the LM learning algorithm for MLNNs can be found in [29,31].
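As a minimal numerical sketch only, and not the MLNN training code used in this study, the update of Eq. (4) can be exercised on a small curve-fitting problem. The sketch assumes the usual Gauss-Newton forms g = J^T e and H ≈ J^T J, where J is the Jacobian of the errors with respect to the weights.

```python
# Minimal sketch of the Levenberg-Marquardt update of Eq. (4), assuming the
# usual Gauss-Newton forms g = J^T e and H ~= J^T J.  Toy curve fit only,
# not the authors' MLNN training code.
import numpy as np

np.random.seed(0)
x = np.linspace(0.0, 4.0, 50)
y = 2.0 * np.exp(-1.3 * x) + 0.05 * np.random.randn(50)   # noisy target curve

def residuals(w):                      # e: model output minus target
    return w[0] * np.exp(-w[1] * x) - y

def jacobian(w):                       # partial derivatives of e w.r.t. w
    J = np.empty((x.size, 2))
    J[:, 0] = np.exp(-w[1] * x)
    J[:, 1] = -w[0] * x * np.exp(-w[1] * x)
    return J

w, lam = np.array([1.0, 1.0]), 1e-2
for k in range(50):
    e, J = residuals(w), jacobian(w)
    g, H = J.T @ e, J.T @ J                          # gradient and Hessian approx.
    step = np.linalg.solve(H + lam * np.eye(2), g)   # (H + lambda*I)^-1 g
    w_new = w - step                                 # Eq. (4)
    if np.sum(residuals(w_new) ** 2) < np.sum(e ** 2):
        w, lam = w_new, lam * 0.5                    # accept step, trust the model more
    else:
        lam *= 2.0                                   # reject step, behave more like gradient descent
print(w)   # should approach roughly [2.0, 1.3]
```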
10-fold cross validation was used to obtain the classification accuracy. In k-fold cross validation, a commonly used performance evaluation method, the dataset is randomly partitioned into k subsets and the process is repeated k times [9]. Each time, a single subset is used for testing the model and the remaining k-1 subsets are used for training.
The k results from the folds can then be averaged (or otherwise combined) to produce a single estimate. The advantage of this method is that all data points are used for both training and validation. For classification accuracy, we used the following equations [9,21,33]:
\mathrm{accuracy}(N) = \frac{\sum_{i=1}^{|N|} \mathrm{assess}(n_i)}{|N|}, \quad n_i \in N    (5)

\mathrm{assess}(n) = \begin{cases} 1 & \text{if } \mathrm{classify}(n) = n_c \\ 0 & \text{otherwise} \end{cases}    (6)

where N is the set of data items to be classified (the test set), n \in N, n_c is the class of the item n, and classify(n) returns the classification of n by the neural network.
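A sketch of the 10-fold protocol and the accuracy measure of Eqs. (5)-(6) is given below, assuming scikit-learn. Since MLPClassifier provides no Levenberg-Marquardt solver, the "lbfgs" solver is used purely as a stand-in, and the 30/15-neuron logistic (sigmoid) hidden layers follow the structure described above; missing attribute values would need to be imputed before calling this function.

```python
# Sketch of 10-fold cross validation with an MLP using two logistic (sigmoid)
# hidden layers of 30 and 15 neurons.  Assumes scikit-learn; MLPClassifier has
# no Levenberg-Marquardt solver, so "lbfgs" is used here only as a stand-in.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def ten_fold_accuracy(X, y, seed=0):
    """X: (n_samples, 19) feature array, y: class labels (1 = die, 2 = live)."""
    folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    fold_scores = []
    for train_idx, test_idx in folds.split(X, y):
        model = make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(30, 15), activation="logistic",
                          solver="lbfgs", max_iter=2000, random_state=seed),
        )
        model.fit(X[train_idx], y[train_idx])
        correct = np.sum(model.predict(X[test_idx]) == y[test_idx])  # Eq. (6)
        fold_scores.append(correct / len(test_idx))                  # Eq. (5)
    return float(np.mean(fold_scores))                               # averaged over the folds
```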
Sigmoid activation function
Neural networks require the use of an activation function at the output of each neuron [34].
The most frequently used activation function, the sigmoid function, can be formulated by:
y = \frac{1}{1 + e^{-x}}    (7)

where x is the net input of the artificial neuron (the weighted sum of its inputs) and y is the neuron output. The sigmoid function is shown in Figure 2.
The sigmoid activation function is difficult to realize in digital implementations because it involves an infinite exponential series, but there are different ways to implement the sigmoid function or its close approximations.
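For reference, a minimal floating-point sketch of Eq. (7) is given below, together with a helper that measures how closely an approximation tracks it over an input range. This is illustrative only: a hardware design would use fixed-point arithmetic, and the default range of [-8, 8] is an arbitrary choice.

```python
# Reference sigmoid of Eq. (7) and a helper that measures how closely an
# approximation tracks it (maximum absolute error over an input range).
# Floating-point sketch only; a hardware design would use fixed-point arithmetic.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))        # Eq. (7): y = 1 / (1 + e^(-x))

def max_abs_error(approx, lo=-8.0, hi=8.0, n=10001):
    x = np.linspace(lo, hi, n)
    return float(np.max(np.abs(approx(x) - sigmoid(x))))
```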
Dataflow implementation of sigmoid function
The sigmoid activation function contains the exponential expression e^{-x}, which is difficult to calculate. For this reason, a dataflow approximation can be used instead of the sigmoid function. This function is a simple polynomial that does not involve any transcendentals, and can be formulated by [35]:
... (8)
The sigmoid function and dataflow approximation are shown in Figure 3.
Piecewise linear approximation
Various approximations of the sigmoid activation function for MLNNs are discussed in the literature [36]. The piecewise linear approximation used here is presented in Table 2 and plotted in Figure 4. The piecewise linear technique gives a close approximation to the sigmoid function. Detailed computational issues concerning the piecewise linear approximation of the sigmoid function can be found in [37,38].
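Since the segment breakpoints of Table 2 are not reproduced in the text, the sketch below uses the well-known PLAN piecewise linear scheme of Amin et al. [37] as a concrete stand-in; the segments actually used in this study may differ.

```python
# Sketch of a piecewise linear sigmoid approximation using the PLAN scheme of
# Amin et al. [37].  The exact segments of Table 2 are not reproduced in the
# text, so this may differ from the breakpoints used in the study.
import numpy as np

def plan_sigmoid(x):
    x = np.asarray(x, dtype=float)
    a = np.abs(x)
    y = np.where(a >= 5.0, 1.0,
        np.where(a >= 2.375, 0.03125 * a + 0.84375,
        np.where(a >= 1.0,   0.125   * a + 0.625,
                             0.25    * a + 0.5)))
    return np.where(x >= 0.0, y, 1.0 - y)   # symmetry: f(-x) = 1 - f(x)
```

With the max_abs_error helper from the earlier sketch, this scheme stays within a few hundredths of the true sigmoid over the tested range, while using only comparisons, shifts and additions.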
Taylor series expansion
A Taylor series approximates an analytic function around a single point using only the derivatives of the function at that point, which makes it a suitable method for approximating functions. The sigmoid function was also generated by Taylor series expansion. This implementation uses three intervals to generate the sigmoid function and is formulated by [39]:
... (9)
The Taylor series expansion gives the closest approximation to the sigmoid function and provides much higher accuracy than the previous approximations. The approximated function is shown in Figure 5.
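The exact three-interval construction of Eq. (9) is given in [39] and is not reproduced in the text. Purely to illustrate the idea, the sketch below uses a truncated Taylor expansion of the sigmoid about x = 0, clipped to [0, 1] because the polynomial is only accurate near the origin.

```python
# Illustration of a Taylor-series sigmoid: truncated expansion about x = 0,
#   sigmoid(x) ~= 1/2 + x/4 - x^3/48 + x^5/480,
# clipped to [0, 1] since the polynomial leaves that range for large |x|.
# This shows the idea only; the three-interval construction of Eq. (9) follows [39].
import numpy as np

def taylor_sigmoid(x):
    x = np.asarray(x, dtype=float)
    y = 0.5 + x / 4.0 - x**3 / 48.0 + x**5 / 480.0
    return np.clip(y, 0.0, 1.0)
```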
RESULTS
In this paper, a number of approximations of the sigmoid function for hepatitis disease diagnosis were presented. For this purpose, the MLNN structure was implemented and the LM algorithm was used for learning. 10-fold cross validation was used to obtain the classification accuracy.
The sigmoid function and its approximations were applied in turn as the activation function to obtain classification results. These activation functions are the sigmoid function, the dataflow implementation of the sigmoid function, the piecewise linear approximation and the Taylor series expansion approximation. The classification accuracies obtained with these approximations are presented in Table 3. It is clear that the Taylor series expansion gives the closest approximation to the sigmoid function.
The classification accuracies of this study were also compared with the results of previous studies on the diagnosis of hepatitis disease that used the same dataset. The comparison of the previous studies and our study is given in Table 4.
In this study, we used the LM algorithm, which uses the Hessian matrix to obtain better estimates and improve convergence, to obtain promising results. According to Table 4, the classification accuracies of the MLNN implemented in this study are better than the accuracies of the other MLNN (MLP) structures. This may be because the LM algorithm converges much faster than first-order algorithms; however, it can cause a memorization effect when over-training occurs, so an over-trained MLNN with the LM algorithm can impact performance negatively. The accuracy values can be checked during the training process to prevent the memorization effect [40]. On the other hand, Bascil and Temurtas reported 91.9% classification accuracy using an MLNN with LM [11]. This result is quite similar to the results obtained in our study, but we used approximations of the sigmoid activation function that do not involve any transcendentals. These approximations can easily be implemented, and our results are promising for reducing the size and cost of neural-network-based hardware. One can see that the classification accuracies of FS-Fuzzy-AIRS, LDA-ANFIS, PCA-LSSVM and SVM-SA, which are hybrid methods, are better than the results of this study. However, although these classification-specific methods provide better accuracies, they are too complex for digital implementation.
DISCUSSION
In this study, the sigmoid function and its close approximations were implemented for digital applications. In hardware-based architectures, activation functions play an important role in ANN behavior. The sigmoid function and its approximations are suitable for training because of their smooth response. The approximations mentioned above can be used to develop learning strategies for implementing ANNs on adaptive hardware. In addition, approximations of the sigmoid function can be used in its place, because the sigmoid function itself is hard to realize on adaptive hardware. The results showed that the approximations of the sigmoid function can easily be implemented in hardware-based architectures. Comparing the obtained results, Table 3 shows that the approximated functions achieve classification accuracy as good as the sigmoid function. When compared with previous work that diagnosed hepatitis disease using artificial neural networks and the identical dataset, hybrid methods achieved the best classification accuracies. On the other hand, the hardware implementations of hybrid systems require large-scale multipliers and chip resources.
Acknowledgement: We sincerely thank the UCI machine learning repository for providing the hepatitis disease dataset.
REFERENCES
1. Chen H-L, et al. A new hybrid method based on local fisher discriminant analysis and support vector machines for hepatitis disease diagnosis. Expert Syst Applicat 2011;38:11796-11803.
2. Ansari S, et al. Diagnosis of liver disease induced by hepatitis virus using artificial neural networks. Multitopic Conference (INMIC), 2011 IEEE 14th International 2011;8-12.
3. Polat K, Gunes S. A hybrid approach to medical decision support systems: combining feature selection, fuzzy weighted pre-processing and AIRS. Comput Methods Programs Biomed 2007;88:164-174.
4. Dogantekin E, Dogantekin A, Avci D. Automatic hepatitis diagnosis system based on Linear Discriminant Analysis and Adaptive Network based on Fuzzy Inference System. Expert Syst Applicat 2009;36:11282-11286.
5. Calisir D, Dogantekin E. A new intelligent hepatitis diagnosis system: PCA LSSVM. Expert Syst Applicat 2011;38:10705-10708.
6. Sartakhti JS, et al. Hepatitis disease diagnosis using a novel hybrid method based on support vector machine and simulated annealing (SVM-SA). Comput Methods and Programs in Biomed 2011.
7. Ozyilmaz L, Yildirim T. Artificial neural networks for diagnosis of hepatitis disease, in: International Joint Conference on Neural Networks (IJCNN) 2003;1:586-589.
8. http://www.is.umk.pl/projects/datasets.html
9. Polat K, Gunes S. Hepatitis disease diagnosis using a new hybrid system based on feature selection (FS) and artificial immune recognition system with fuzzy resource allocation. Digital Signal Process 2006;16:889-901.
10. Polat K, Gunes S. Medical decision support system based on artificial immune recognition immune system (AIRS), fuzzy weighted pre-processing and feature selection. Expert Syst Applicat 2007;33:484-490.
11. Bascil MS, Temurtas F. A study on hepatitis disease diagnosis using multilayer neural network with Levenberg Marquardt Training Algorithm. J Med Syst 2011;35:433-436.
12. Duch W, et al. Minimal distance neural methods. Neural Networks Proceedings, 1998. IEEE World Congress on Computational Intelligence. The 1998 IEEE International Joint Conference on, 1998;2:1299-1304.
13. Duch W, Adamczak R, Grabczewski K. Optimization of logical rules derived by neural procedures. Neural Networks, 1999. IJCNN '99. International Joint Conference on, 1999;1:669-674.
14. Ster B, Dobnikar A. Neural Networks in Medical Diagnosis: Comparison with Other Methods. Proceedings of the International Conference EANN96 1996;1:427-430.
15. Tan KC, et al. A hybrid evolutionary algorithm for attribute selection in data mining. Expert Syst Applicat 2009;36:8616-8630.
16. Bascil MS, Oztekin H. A study on hepatitis disease diagnosis using probabilistic neural network. J Med Syst 2010.
17. Er O, Tanrikulu AC, Abakay A. Use of artificial intelligence techniques for diagnosis of malignant pleural mesothelioma. Dicle Medical Journal 2015;42:467-470.
18. Haykin S. Neural Networks: A Comprehensive Foundation. New York, Macmillan Publishing 1994.
19. Kayaer K, Yildirim T. Medical diagnosis on Pima Indian Diabetes using general regression neural networks. In Proc. of International Conference on Artificial Neural Networks and Neural Information Processing (ICANN/ICONIP), Istanbul:181-184.
20. Delen D, Walker G, Kadam A. Predicting breast cancer survivability: A comparison of three data mining methods. Artificial Intelligence in Medicine 2005;34:113-127.
21. Temurtas F. A comparative study on thyroid disease diagnosis using neural networks. Expert Syst Applicat 2009;36:944-949.
22. Er O, Temurtas F. A study on chronic obstructive pulmonary disease diagnosis using multilayer neural networks. J Med Syst 2008;32:429-432.
23. Rumelhart DE, Hinton GE. Williams RJ. Learning internal representations by error propagation. In Rumelhart DE, and McClelland JL. (Eds.) Parallel Distributed Processing: Explorations in the Microstructure of Cognition, MIT Press, Cambridge, MA, 1986;1:318-362.
24. Brent RP. Fast training algorithms for multilayer neural nets. IEEE Trans. Neural Networks 1991;2:346-354.
25. Gori M, Tesi A. On the problem of local minima in backpropagation. IEEE Trans Pattern Anal Machine Intell 1992;14:76-86.
26. Hagan MT, Menhaj M. Training feed forward networks with the Marquardt algorithm. IEEE Trans Neural Networks 1994;5:989-993.
27. Hagan MT, Demuth HB, Beale MH. Neural Network Design, PWS Publishing, Boston, MA, 1996.
28. Gulbag A, Temurtas F. A study on quantitative classification of binary gas mixture using neural networks and adaptive neuro fuzzy inference systems. Sens Actuators B 2006;115:252-262.
29. Rumelhart DE, et al. Backpropagation: The basic theory. In: Smolensky P, Mozer MC, Rumelhart DE. (Eds.) Mathematical Perspectives on Neural Networks, Hillsdale, NJ, Erlbaum, 1996;533-566.
30. Ozdemir AT, Danisman K. Fully parallel ANN-based arrhythmia classifier on a single-chip FPGA: FPAAC. Turkish Journal of Elec Eng and Computer Sci 2011;19:667-687.
31. http://archive.ics.uci.edu/ml/datasets/Hepatitis, last accessed: 20 March 2013.
32. Wilamowski BM, Yu H. Improved computation for Levenberg-Marquardt training. IEEE Trans Neural Networks 2010;21:930-937.
33. Watkins A. AIRS: A resource limited artificial immune classifier. Master Thesis, Mississippi State University, 2001.
34. Myers DJ, Hutchinson RA. Efficient implementation of piecewise linear activation function for digital VLSI neural Networks. Electronics Letters 1989;25:1662-1663.
35. Bharkhada BK. Efficient FPGA implementation of a generic function approximator and its application to neural net computation. Master Thesis, University of Cincinnati, 2003.
36. Nordström T, Svensson B. Using and designing massively parallel computers for artificial neural networks. Journal of Parallel and Distributed Computing 1992;14:260-285.
37. Amin H, Curtis KM, Hayes-Gill BR. Piecewise linear approximation applied to nonlinear function of a neural network. IEE Proceedings-Circuits Devices and Systems 1997;144:313-317.
38. Tommiska MT. Efficient digital implementation of the sigmoid function for reprogrammable logic. IEE Proceedings - Computers and Digital Techniques 2003;150:403-411.
39. Arroyo Leon MAA, Ruiz Castro A, Leal Ascencio RR. An artificial neural network on a field programmable gate array as a virtual sensor. Design of Mixed-Mode Integrated Circuits and Applications, 1999. Third International Workshop on, 1999;114-117.
40. Temurtas H, Yumusak N, Temurtas F. A comparative study on diabetes disease diagnosis using neural networks. Expert Syst Applicat 2009;36:8610-8615.
Onursal Çetin1, Feyzullah Temurtas1, Senol Gülgönül2
1 Bozok University, Department of Electrical and Electronics Engineering, Yozgat, Turkey
2 TURKSAT Satellite Communication and Cable TV AS, Ankara, Turkey
Correspondence: Onursal Çetin,
Bozok University, Dept. of Electrical and Electronics Engineering. Email: [email protected]
Received: 25.03.2015, Accepted: 20.04.2015
Copyright © Dicle Tip Dergisi 2015. All rights reserved.