Neural Networks in Data Mining

- 1. Neural Networks in Data Mining - “An Overview”
- 2. Agenda • Introduction • Data Mining Techniques • Neural Networks for Data Mining: Classification, Pruning, Rule Extraction • Conclusion • Questions?
- 3. Data Mining: Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases. It is an essential step in the process of knowledge discovery.
- 4. Steps of Knowledge Discovery: • data cleaning • data integration • data selection • data transformation • data mining • pattern evaluation • knowledge presentation.
- 5. Data Mining: A KDD Process. Data mining is the core of the knowledge discovery process. (Diagram: Databases feed a Data Warehouse via Data Cleaning and Data Integration; Selection yields Task-relevant Data, which passes through Data Mining to Pattern Evaluation.)
- 6. Why Data Mining? The data explosion problem: automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories. We are drowning in data, but starving for knowledge! Solution: data warehousing and data mining.
- 7. Tasks of Data Mining: Concept Description • Association • Classification • Prediction • Cluster Analysis • Outlier Analysis
- 8. Classification: It is the process of finding a model that can predict the class of objects whose label is unknown. For example, it can classify the customers who can repay a loan based on the existing records in the bank database.
- 9. Classification Methods: Decision trees • Bayesian classification • Neural networks • Genetic algorithms • Memory-Based Reasoning, etc.
- 10. Why Neural Networks? • High tolerance of noisy data. • Ability to classify patterns on which they have not been trained. • Can be used when there is little knowledge of the relationships between attributes and classes.
- 11. Why Neural Networks? (contd.) • Well suited for continuous-valued inputs and outputs, unlike most decision tree algorithms. • Rules can be extracted easily from a trained neural network by available techniques.
- 12. Neural Networks: The study of how to make computers reach sensible decisions and learn from ordinary experience, as we do.
- 13. Neurons: The human brain has about 100 billion neurons and 100 trillion connections (synapses) between them. (Figure: a typical neuron.) Many highly specialized types of neurons exist, and these differ widely in appearance. Characteristically, neurons are highly asymmetric in shape.
- 14. Multilayer Feed-forward Neural Network: It consists of an input layer, one or more hidden layers and an output layer. (Figure: Input Layer, Hidden Layer, Output Layer.)
- 15. Backpropagation: Backpropagation is a neural network learning algorithm. It learns by iteratively processing a dataset of training examples, comparing the network's prediction for each example with the actual known target value.
- 16. Overview of BP: The backpropagation algorithm trains the network by iteratively processing the np training examples of a dataset, comparing the network's output ok for each example with the desired known target value dk for each target class k in the dataset.
- 17. Overview of BP (contd.): Consider a fully connected three-layer feed-forward neural network as in the figure. (Figure: inputs x1, x2, …, xi, …, xl connect through weights w11, w12, …, wlm to hidden units h1, h2, …, hm, which connect through weights v11, v12, …, vmn to outputs o1, …, on; each hidden and output unit also receives a fixed bias input of -1.)
- 18. Overview of BP (contd.): The network consists of l input neurons, m hidden neurons and n output neurons. Let np be the number of examples used for training, xip the ith input unit of the pth example (i = 1, 2, …, l), and wij the weight from input neuron i to hidden neuron j (j = 1, 2, …, m).
- 19. Overview of BP (contd.): Let vjk be the weight from hidden neuron j to output neuron k (k = 1, 2, …, n). Initially the weights wij and vjk take random values between -1 and 1. Let hj be the activation value of hidden neuron j, and ok the actual output of the kth output neuron.
- 20. Bias • It is a threshold value that serves to vary the activity of the neuron. • The bias input is fixed and always equals -1. Overview of BP – Contd.
- 21. Overview of BP (contd.): The activation value of hidden neuron hj for the pth example is hj = f( Σi wij·xip − θj ), where θj is the bias weight of the hidden neuron (applied to the fixed bias input of -1) and f(z) = 1 / (1 + e^(−z)) is the sigmoid activation function.
- 22. Overview of BP (contd.): The actual output is ok = f( Σj vjk·hj − θk ), with the same sigmoid f and θk the bias weight of output neuron k.
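The two activation formulas above can be sketched as a forward pass in NumPy (an illustrative sketch; the function and parameter names `forward`, `th_h`, `th_o` are my own, and the network shown here matches the slides' convention of a fixed bias input of -1 with bias weights θ):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation f(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W, th_h, V, th_o):
    """Forward pass for one example x of length l.
    W: (l, m) input-to-hidden weights wij; V: (m, n) hidden-to-output weights vjk.
    th_h, th_o: bias weights; subtracting them implements the fixed -1 bias input."""
    h = sigmoid(x @ W - th_h)   # hidden activations hj
    o = sigmoid(h @ V - th_o)   # actual outputs ok
    return h, o
```

With all weights and biases at zero, every unit outputs sigmoid(0) = 0.5, which is a quick sanity check for the implementation.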
- 23. Overview of BP (contd.): Weights are modified for each example so as to minimize the mean squared error (mse). For one example, mse = (1/n) Σk (dk − ok)².
- 24. Overview of BP (contd.): Weight updates are made in the backward direction, i.e. from the output layer through the hidden layers to the input layer.
- 25. Overview of BP (contd.): The learning rate λ typically takes a value between 0.0 and 1.0. A well-chosen learning rate helps avoid local minima (where the weights appear to converge but are not at the optimal solution) and encourages finding the global minimum.
- 26. Overview of BP (contd.): For each unit k in the output layer, compute the error Errk = ok(1 − ok)(dk − ok). For each weight vjk in the network, compute the increment Δvjk = λ·Errk·hj and update vjk = vjk + Δvjk.
- 27. Overview of BP (contd.): For each unit j in the hidden layers, from the last to the first hidden layer, compute the error Errj = hj(1 − hj) Σk Errk·vjk. For each weight wij in the network, compute the increment Δwij = λ·Errj·xip and update wij = wij + Δwij.
- 28. Overview of BP (contd.): For each bias θj in the network, compute the increment Δθj = λ·Errj and update θj = θj + Δθj.
- 29. Overview of BP (contd.): The algorithm stops learning when • the mean squared error falls below a threshold value, or • a pre-specified number of epochs has expired.
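The whole training procedure from the preceding slides (random initial weights in [-1, 1], per-example updates with the Errk/Errj rules, and the two stopping criteria) can be sketched as follows. This is an illustrative sketch, not the authors' exact code; names like `train_bp` and `mse_goal` are my own, and the bias update sign follows this sketch's convention of subtracting θ in the net input:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(X, D, m, lam=0.5, mse_goal=1e-3, max_epochs=5000):
    """Train a three-layer feed-forward network by backpropagation.
    X: (np, l) inputs; D: (np, n) targets; m hidden units; lam = learning rate λ."""
    n_ex, l = X.shape
    n = D.shape[1]
    # Initial weights and bias weights take random values between -1 and 1.
    W = rng.uniform(-1, 1, (l, m)); th_h = rng.uniform(-1, 1, m)
    V = rng.uniform(-1, 1, (m, n)); th_o = rng.uniform(-1, 1, n)
    for epoch in range(max_epochs):
        sq_err = 0.0
        for x, d in zip(X, D):                 # per-example (online) updates
            h = sigmoid(x @ W - th_h)          # hidden activations hj
            o = sigmoid(h @ V - th_o)          # actual outputs ok
            err_o = o * (1 - o) * (d - o)      # Errk = ok(1-ok)(dk-ok)
            err_h = h * (1 - h) * (V @ err_o)  # Errj = hj(1-hj) Σk Errk·vjk
            V += lam * np.outer(h, err_o)      # Δvjk = λ·Errk·hj
            W += lam * np.outer(x, err_h)      # Δwij = λ·Errj·xip
            th_o -= lam * err_o                # bias updates; sign flipped because
            th_h -= lam * err_h                # the net input here is (Σ w·x − θ)
            sq_err += np.sum((d - o) ** 2)
        if sq_err / n_ex < mse_goal:           # stop when mse is below threshold
            break                              # (or when max_epochs has expired)
    return W, th_h, V, th_o
```

For example, training on the four patterns of logical AND drives the mean squared error close to zero within a few thousand epochs.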
- 30. Data selection methods: • Random data selection: the training and testing examples are taken randomly from each class. Example: the iris dataset has 3 classes with 50 examples each; from each class, 25 examples are taken randomly for training and another 25 randomly for testing the network. • K-fold cross-validation.
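The random per-class selection described above (e.g. 25 of 50 iris examples per class for training) can be sketched like this; the helper name `per_class_split` and the `seed` parameter are my own additions for reproducibility:

```python
import random
from collections import defaultdict

def per_class_split(examples, labels, n_train, seed=0):
    """Randomly draw n_train examples from each class for training;
    the remaining examples of each class form the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(examples, labels):
        by_class[y].append(x)
    train, test = [], []
    for y, xs in by_class.items():
        rng.shuffle(xs)                          # random selection within the class
        train += [(x, y) for x in xs[:n_train]]  # first n_train go to training
        test  += [(x, y) for x in xs[n_train:]]  # the rest go to testing
    return train, test
```

For the iris example on the slide, calling `per_class_split(X, y, n_train=25)` would yield 75 training and 75 test examples, 25 of each class on each side.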
- 31. Performance Measures: • Accuracy: the percentage of the test dataset that is correctly classified by the classifier. • Speed: the computational time and cost involved in generating and using a given classifier.
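The accuracy measure above is straightforward to compute; a minimal sketch (the function name is my own):

```python
def accuracy(predictions, targets):
    """Accuracy: percentage of test examples whose predicted class
    matches the true class label."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return 100.0 * correct / len(targets)
```

For instance, 3 correct predictions out of 4 test examples gives an accuracy of 75.0%.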
- 32. Evolving Network Architectures: The success of ANNs largely depends on their architecture. Small networks require long training times and can easily get trapped in local minima. Large networks learn fast and tend to avoid local minima, but generalize poorly. The optimal architecture is a network large enough to learn the problem and small enough to generalize well.
- 33. Approaches for optimizing neural networks: • Constructive methods: new hidden units are added during the training process; also called growing methods. • Destructive methods: a large network is trained and then unimportant nodes or weights are removed; also called pruning methods. • Hybrid methods: can both add and remove.
- 34. What is Pruning? Pruning is defined as network trimming within the assumed initial architecture. It can be accomplished by estimating the sensitivity of the total error to the exclusion of each weight or neuron in the network. The weights or neurons that are insensitive to error changes can be discarded after each step of training. The trimmed network is smaller and is likely to give higher accuracy than before trimming.
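One way to realize the sensitivity idea above is a greedy destructive pass: tentatively remove each hidden neuron and keep it removed only if the error barely changes. This is an illustrative sketch of that general scheme, not the slides' exact algorithm; biases are omitted for brevity and the names `prune_hidden` and `tol` are my own:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def net_mse(W, V, X, D):
    """Mean squared error of a bias-free sigmoid network on dataset (X, D)."""
    H = sigmoid(X @ W)
    O = sigmoid(H @ V)
    return np.mean((D - O) ** 2)

def prune_hidden(W, V, X, D, tol=1e-3):
    """Greedily drop hidden neurons whose removal raises the training-set
    mse by less than tol (insensitive neurons), relative to the full network."""
    keep = list(range(W.shape[1]))
    base = net_mse(W, V, X, D)
    for j in range(W.shape[1]):
        if j not in keep or len(keep) == 1:
            continue
        trial = [k for k in keep if k != j]      # network without neuron j
        if net_mse(W[:, trial], V[trial, :], X, D) - base < tol:
            keep = trial                          # neuron j is insensitive: prune it
    return W[:, keep], V[keep, :]
```

A hidden neuron whose outgoing weights are all zero contributes nothing to the output, so such a neuron is always pruned by this scheme, which is a useful sanity check.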
- 35. Hepatitis Pruning Results:
Step | Current Architecture | Acc_test (%) | Epochs | Pruned Neurons
1 | 19-25-2 | 78.2 | 200 | 18 hidden neurons
2 | 19-7-2 | 80.5 | 50 | 5 hidden neurons
3 | 19-2-2 | 83.95 | 50 | Pruning stops
The original network with architecture 19-25-2 and accuracy 78.2% is reduced to architecture 19-2-2. Obtaining the pruned network takes 0.76 seconds.
- 36. Rule Extraction. Why rule extraction? An important drawback of neural networks is their lack of explanation capability, i.e. it is very difficult to understand how an ANN has solved a problem. To overcome this problem, various rule extraction algorithms have been developed. Rule extraction turns a black-box system into a white-box system by translating the internal knowledge of a neural network into a set of symbolic rules. The classification process of a neural network can then be described by a set of simple rules.
- 37. Extracted rules from 6 real datasets.
- 38. In the future, NNs might allow: • robots that can see, feel, and predict the world around them • improved stock prediction • common usage of self-driving cars • composition of music • handwritten documents to be automatically transformed into formatted word processing documents • trends found in the human genome to aid in the understanding of the data compiled by the Human Genome Project • self-diagnosis of medical problems using neural networks • and much more!
- 40. QUESTIONS