Classification techniques in data mining

Since credit scoring depends on classifying borrowers into defaulters and non-defaulters, it is considered a data mining classification problem [1]. Classification is the most popular task in data mining. It covers a wide range of techniques, such as nearest neighbor (k-nearest neighbor) classifiers, Logistic Regression (LR), Decision Trees (DT), Artificial Neural Networks (ANN), Support Vector Machines (SVM), rough sets, Genetic Algorithms (GAs), fuzzy sets, and Bayesian networks. Most of them have been used to develop credit scoring models. Below we review some classification techniques in data mining.

Artificial Neural Network (ANN)

The brain is one of the most complex parts of the human body. Neurons form the basis of the brain: a typical human brain consists of billions of neurons, each with thousands of connections to other neurons [D]. An ANN is a mathematical, non-linear predictive technique that mimics human neurophysiology. It consists of a number of very simple, highly connected processors (neurons). These neurons are connected by weighted links that pass signals from one neuron to another. The weights express the importance of each neuron input, and an ANN learns through repeated adjustment of these weights [C]. An ANN is made up of multiple layers (input, hidden, output). The input layer first passes the input features to the hidden layer. The hidden layer then applies its weights and a transfer function, such as the logistic function, before sending the result to the output layer.
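The flow of signals through the layers described above can be illustrated with a minimal sketch. It assumes a single hidden layer, no bias terms, and the logistic function as the transfer function in both the hidden and the output layer. The weights and the two input features are hypothetical, hand-picked values, not taken from any credit data set, and the function names are illustrative only.

```python
import math


def logistic(z):
    """Logistic transfer function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights):
    """Compute the outputs of one layer.

    weights[j][i] is the weight of the link from input i to neuron j.
    Each neuron takes the weighted sum of its inputs and applies the
    logistic transfer function to it.
    """
    outputs = []
    for neuron_weights in weights:
        z = sum(w * x for w, x in zip(neuron_weights, inputs))
        outputs.append(logistic(z))
    return outputs


def forward(features, hidden_weights, output_weights):
    """Pass the input features through the hidden layer, then the output layer."""
    hidden = layer_forward(features, hidden_weights)
    return layer_forward(hidden, output_weights)


# Hypothetical weights: 2 input features -> 2 hidden neurons -> 1 output neuron.
hidden_weights = [[0.5, -0.4], [0.3, 0.8]]
output_weights = [[1.2, -0.7]]

# Hypothetical, already-scaled features of one borrower.
score = forward([0.6, 0.2], hidden_weights, output_weights)[0]
print(f"network output: {score:.3f}")
```

The output is a single number between 0 and 1 that could be read as a score for one of the two classes. Training, which is not shown here, would repeatedly adjust `hidden_weights` and `output_weights`.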
[A] The neurons of the network are arranged along these layers.

The main steps to build an ANN [B]:

1. Design a specific network architecture (a specific number of “layers”, each consisting of a certain number of “neurons”, and how they are connected).
2. Train the network: an iterative process is applied to the inputs to adjust the weights of the network so that it optimally predicts the sample data on which the “training” is performed.
3. After the phase of learning from an existing data set, the new network is ready and can be used to generate predictions.

The resulting “network” developed in the process of “learning” represents a pattern detected in the data.

Recently, ANNs have been widely used in credit scoring problems. For example, Eiman Kambal et al. (2013) developed efficient and suitable credit scoring models for the Sudanese credit data set by comparing the most widely used data mining techniques: ANN, PCA-ANN, GA-ANN, DT, PCA-DT and GA-DT. There were two experiments: the first used holdout and the second used 10-fold cross-validation to assess classifier accuracy. These experiments showed that the GA-ANN credit scoring model outperformed all other models in terms of accuracy.

Decision Trees (DT)

A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute value, each branch represents an outcome of the test, and the tree leaves represent classes or class distributions [4]. It represents sets of decisions. DTs are understandable and easy to interpret, and they work with both numerical and categorical data [10].
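The tree structure described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the tree is built by hand rather than learned from data, and the attribute names (`has_collateral`, `income`), the income threshold, and the class labels are hypothetical. One test is categorical and one is numerical, to show both kinds of data.

```python
class Leaf:
    """A tree leaf: holds the class assigned to records that reach it."""

    def __init__(self, label):
        self.label = label


class Node:
    """An internal node: tests one attribute and has one branch per outcome."""

    def __init__(self, attribute, test, branches):
        self.attribute = attribute  # name of the attribute being tested
        self.test = test            # function: attribute value -> outcome
        self.branches = branches    # dict: outcome -> subtree (Node or Leaf)


def classify(tree, record):
    """Follow the branches matching the record's values until a leaf is reached."""
    while isinstance(tree, Node):
        outcome = tree.test(record[tree.attribute])
        tree = tree.branches[outcome]
    return tree.label


# Hand-built, hypothetical tree (not learned from any data set).
tree = Node(
    "has_collateral",
    lambda value: value,  # categorical test: the value itself is the outcome
    {
        "yes": Leaf("non-defaulter"),
        "no": Node(
            "income",
            lambda value: "high" if value >= 1000 else "low",  # numerical test
            {"high": Leaf("non-defaulter"), "low": Leaf("defaulter")},
        ),
    },
)

applicant = {"has_collateral": "no", "income": 500}
print(classify(tree, applicant))
```

Because every prediction is just a path of tests from the root to a leaf, the decision for any single record can be read off directly, which is what makes such trees easy to interpret.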