2. Outline
● Introduction to Web Mining For Extraction
● Purpose
● Method 1- Supervised Learning
● Method 2- Unsupervised Learning
● Comparison
3. Introduction to Web Mining for
Extraction
● Web mining describes the practice of applying
conventional data mining techniques to web resources,
and has facilitated the further development of these
techniques to take the specific structures of web
data into account.
● The analysed web resources comprise the actual web
sites, the hyperlinks connecting these sites and the
paths that online users take on the web to reach a
particular site.
4. Continue..
● Web usage mining then refers to the derivation of
useful knowledge from the input data. While the input
data are mostly web server logs and other primarily
technical data, the expected output is an
understanding of user behaviour in domains such as
online data search, online shopping and online
learning.
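As a rough illustration of how web server logs become input for usage mining, the sketch below parses log lines and groups each visitor's requests into sessions. The Common Log Format layout, the 30-minute inactivity timeout, the sample lines and the names parse_line and sessionize are assumptions made for this example, not part of the original slides.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed layout (Common Log Format): host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def parse_line(line):
    """Return (host, timestamp, url) for a log line, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("host"), ts, m.group("url")

def sessionize(lines, timeout=timedelta(minutes=30)):
    """Group requests per host into sessions split by an inactivity timeout."""
    by_host = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed:
            host, ts, url = parsed
            by_host[host].append((ts, url))
    sessions = []
    for host, hits in by_host.items():
        hits.sort()
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                # Gap too long: close the current session and start a new one.
                sessions.append((host, current))
                current = []
            current.append(url)
        sessions.append((host, current))
    return sessions

# Hypothetical log lines, invented for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2023:13:56:10 +0000] "GET /shop/cart HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Oct/2023:14:00:00 +0000] "GET /courses HTTP/1.1" 200 1024',
    '10.0.0.1 - - [10/Oct/2023:15:30:00 +0000] "GET /index.html HTTP/1.1" 200 2326',
]

sessions = sessionize(SAMPLE_LOG)
for host, urls in sessions:
    print(host, urls)
```

Each resulting session (a host plus its ordered list of URLs) is the kind of record from which behaviour such as search, shopping or learning paths could then be studied.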
5. Purpose
● Web usage mining helps to deal with certain web
scaling problems such as analysis of user surfing
trends, traffic flow analysis, distributed control
handling, web traffic management and many more.
● Session tracking, website reorganization and
distributed traffic sharing across distributed servers
can be identified, and analysis based on web data
becomes possible, using neural network concepts.
6. Continue..
● A neural network is very different from a static
network: each of its nodes is self-intelligent, hence
the network as a whole becomes intelligent. Web users
can therefore use this network more and more.
7. Method 1- Supervised Learning
● In supervised learning the task is to automatically
induce a model from a set of N labelled instances,
called training data.
● This model is then used to assign labels to new
instances with unknown labels, using only the values
of their predictor variables.
● An artificial neural network is based on simulating
the structure and behaviour of biological neural
networks.
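The idea of inducing a model from N labelled instances and then labelling new ones can be sketched with a single artificial neuron (a perceptron). The features (pages viewed, minutes on site), the labels (1 = made a purchase) and all values below are hypothetical, chosen only to make the example linearly separable.

```python
def predict(model, x):
    """Assign a label to instance x using only its predictor variables."""
    weights, bias = model
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def train_perceptron(instances, labels, epochs=1000):
    """Induce a linear model (weights, bias) from N labelled training instances."""
    weights = [0.0] * len(instances[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(instances, labels):
            error = y - predict((weights, bias), x)  # -1, 0 or +1
            if error != 0:
                # Move the decision boundary towards the misclassified instance.
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
                mistakes += 1
        if mistakes == 0:  # every training instance is classified correctly
            break
    return weights, bias

# Hypothetical training data: [pages viewed, minutes on site] -> purchased (1) or not (0).
training_instances = [[1, 1], [2, 1], [1, 2], [6, 5], [7, 8], [8, 6]]
training_labels = [0, 0, 0, 1, 1, 1]

model = train_perceptron(training_instances, training_labels)

# New instances with unknown labels.
print(predict(model, [1.3, 1.3]))
print(predict(model, [7, 7]))
```

A single neuron can only separate classes with a straight line; it is used here because it is the smallest model that shows the train-then-label workflow.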
8. Two Approaches for Web Mining in AI
Approach 1- Neuro-Fuzzy Approach for Web Mining
Approach 2- Reduction of Neuro-Fuzzy Stages by
Implementing Backpropagation
11. Continue..
● If web mining researchers apply backpropagation,
they can obtain better results than with other
implemented web mining techniques, because of its
top-down and bottom-up weights.
● Backpropagation also makes it easier to reduce the
number of steps in web mining compared with the
neuro-fuzzy approach.
12. Continue..
● The neuro-fuzzy approach uses five major steps to
produce the web usage pattern forecast and the web
usage data analyzer: web log data collection, data
pre-processing, self-organizing map, web usage data
clustering, and fuzzy inference system.
● Backpropagation uses only three steps: web log data
collection, data pre-processing, and backpropagation
itself.
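The three backpropagation steps can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the method from the slides: the session records, the scaling to [0, 1], the network size (2 inputs, 3 hidden units, 1 output), the learning rate and the names TinyNetwork and mean_squared_error are all invented for the example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNetwork:
    """One hidden layer and one sigmoid output, trained by backpropagation."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row ends with a trailing bias term.
        self.hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                       for _ in range(n_hidden)]
        self.output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + row[-1])
             for row in self.hidden]
        y = sigmoid(sum(w * hi for w, hi in zip(self.output, h)) + self.output[-1])
        return h, y

    def train_step(self, x, target, lr=0.5):
        h, y = self.forward(x)
        # Output delta for the squared error 0.5 * (y - target)^2.
        delta_out = (y - target) * y * (1.0 - y)
        # Propagate the error back to the hidden units before changing any weight.
        delta_hidden = [delta_out * self.output[j] * h[j] * (1.0 - h[j])
                        for j in range(len(h))]
        for j, hj in enumerate(h):
            self.output[j] -= lr * delta_out * hj
        self.output[-1] -= lr * delta_out
        for j, row in enumerate(self.hidden):
            for i, xi in enumerate(x):
                row[i] -= lr * delta_hidden[j] * xi
            row[-1] -= lr * delta_hidden[j]

def mean_squared_error(net, data):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in data) / len(data)

# Step 1: web log data collection (hypothetical per-session records:
# pages viewed, minutes on site, purchased yes/no).
raw_sessions = [(1, 2, 0), (2, 1, 0), (3, 4, 0), (8, 15, 1), (9, 12, 1), (10, 20, 1)]

# Step 2: data pre-processing, here scaling each feature to [0, 1].
max_pages = max(s[0] for s in raw_sessions)
max_minutes = max(s[1] for s in raw_sessions)
data = [([p / max_pages, m / max_minutes], label) for p, m, label in raw_sessions]

# Step 3: backpropagation itself.
net = TinyNetwork(n_inputs=2, n_hidden=3)
error_before = mean_squared_error(net, data)
for _ in range(2000):
    for x, target in data:
        net.train_step(x, target)
error_after = mean_squared_error(net, data)
print(error_before, error_after)
```

Comparing the error before and after training is the simplest check that the weight updates are moving in the right direction; it says nothing about how the network would perform on real log data.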
13. Method 2- Unsupervised Learning
❖ Clustering using SOM
Self-organizing maps (SOM) are regarded as a highly
effective visualization tool for high-dimensional,
complex data with inherent relationships between the
various features that make up the data.
The SOM's output emphasizes the salient features of the
data and subsequently leads to the automatic
formation of clusters of similar data items.
14. Continue..
● The Self-Organizing Map (SOM) has proven to be one
of the most powerful algorithms in data visualization
and exploration. Application areas include various
fields of science and technology, e.g., complex
industrial processes, telecommunications systems,
document and image databases, and even financial
applications.
● The SOM maps the high-dimensional input vectors
onto a two-dimensional grid of prototype vectors and
orders them.
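The mapping of high-dimensional input vectors onto a two-dimensional grid of prototype vectors can be sketched as below. This is a simplified SOM under assumptions: a 3 x 3 grid, a Gaussian neighbourhood, linearly decaying learning rate and radius, and hypothetical 4-dimensional inputs already scaled to [0, 1].

```python
import math
import random

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

class SOM:
    def __init__(self, rows, cols, dim, seed=0):
        rng = random.Random(seed)
        self.rows, self.cols = rows, cols
        # One prototype vector per grid cell, keyed by (row, col).
        self.prototypes = {(r, c): [rng.random() for _ in range(dim)]
                           for r in range(rows) for c in range(cols)}

    def best_matching_unit(self, x):
        """Grid coordinates of the prototype closest to input vector x."""
        return min(self.prototypes,
                   key=lambda rc: squared_distance(self.prototypes[rc], x))

    def train(self, data, epochs=50, lr0=0.5, radius0=None):
        radius0 = radius0 or max(self.rows, self.cols) / 2.0
        for epoch in range(epochs):
            # Learning rate and neighbourhood radius shrink over time.
            decay = 1.0 - epoch / epochs
            lr = lr0 * decay
            radius = max(radius0 * decay, 0.5)
            for x in data:
                bmu = self.best_matching_unit(x)
                for rc, proto in self.prototypes.items():
                    # Units near the winner on the grid are pulled more strongly.
                    grid_d2 = (rc[0] - bmu[0]) ** 2 + (rc[1] - bmu[1]) ** 2
                    influence = math.exp(-grid_d2 / (2.0 * radius ** 2))
                    for i in range(len(proto)):
                        proto[i] += lr * influence * (x[i] - proto[i])

# Hypothetical 4-dimensional usage vectors, scaled to [0, 1].
inputs = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.1, 0.0, 0.1],
    [0.9, 0.8, 0.9, 1.0],
    [0.8, 0.9, 1.0, 0.9],
]

som = SOM(rows=3, cols=3, dim=4)
som.train(inputs)
for x in inputs:
    print(x, "->", som.best_matching_unit(x))
```

Because neighbouring grid units are updated together, nearby cells tend to end up with similar prototypes, which is what gives the grid its ordering.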
15. Continue..
● For a human interpreter, the ordered prototype
vectors are easier to visualize and explore than the
original data. The SOM has been widely implemented
in various software tools. Post-processing the SOM
extracts qualitative or quantitative information from
the data.
17. Table 1. Comparison of K-Means and SOM
with respect to SSE for different numbers
of clusters and cases
18. Continue..
● K-Means covers more URLs, but SOM works
better for larger numbers of cases. As the
amount of data increases, the learning process
of SOM becomes more accurate and a larger
number of clusters can be considered. SOM is
also more time-efficient than K-Means. Thus we
can conclude that SOM performs better than
K-Means.
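For reference, the SSE (sum of squared errors) used in this kind of comparison can be computed as in the sketch below, shown with a minimal K-Means. The four sample points and the choice of seeding the centroids with the first k points are assumptions for illustration; they do not reproduce the data behind Table 1.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(point, centroids[k]))

def k_means(points, k, iterations=100):
    # Assumption: the first k points seed the centroids (deterministic, not robust).
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        assignment = [nearest(p, centroids) for p in points]
        new_centroids = []
        for idx in range(k):
            members = [p for p, a in zip(points, assignment) if a == idx]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[idx])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return centroids

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(squared_distance(p, centroids[nearest(p, centroids)]) for p in points)

# Hypothetical 2-D points forming two obvious groups.
points = [(0, 0), (10, 0), (0, 2), (10, 2)]
centroids = k_means(points, k=2)
print(centroids)            # [[0.0, 1.0], [10.0, 1.0]]
print(sse(points, centroids))  # 4.0
```

The same sse function could score a SOM by passing its prototype vectors as the centroids, so that both methods are measured on equal terms.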
19. Comparison
● Supervised learning is much more effective than
unsupervised learning. Previously, unsupervised
extraction used extraction patterns that make
assumptions about the regularity of the structure in
the data. We relax this assumption by exploiting
reference sets to aid the extraction.
20. Continue..
● SOM-based clustering is much faster and more
accurate, which helps further in artificial neural
network mining: the network analyses the patterns
defined in the training set, which are then compared
against many unorganised test sets. The comparison
goes through pre-processing, classification,
clustering and analysis.