5 - Unsupervised Learning

• Unsupervised Learning
  • Introduction
  • Statistical Clustering
  • Conceptual Clustering
  • UNIMEM
  • COBWEB

Introduction

• Learner receives no explicit information about classification of input examples.
• Information is implicit.
• Aim of learning process - to discover regularities in the input data.
• Typically consists of partitioning instances into classes (based on some similarity metric).
  • i.e. finding clusters of instances in the instance space.
• Not surprising that unsupervised learning systems sometimes closely resemble statistical clustering systems.
What is Clustering?

• Common problem - construction of meaningful classifications of observed objects or situations.
• Often known as numerical taxonomy - since it involves production of a class hierarchy (classification scheme) using a mathematical measure of similarity over the instances.

Simple Clustering Algorithm

• Initialize
  • Set D to be the set of singleton sets such that each set contains a unique instance.
• Until D contains only 1 element, do the following:
  • Form a matrix of similarity values for all elements of D
    • Using some given similarity function
  • Merge those elements of D which have a maximum similarity value.
• Often known as agglomerative clustering.
  • Works bottom-up - trying to build larger clusters.
• Alternative - divisive clustering.
  • Works top-down (cf ID3)
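The agglomerative loop above can be sketched in Python. The `similarity` function here (negative distance between cluster means of 1-D points) is an assumed placeholder - any similarity function over clusters could be substituted.

```python
from itertools import combinations

def similarity(c1, c2):
    """Assumed placeholder metric: negative distance between the
    means of two clusters of 1-D points."""
    mean = lambda c: sum(c) / len(c)
    return -abs(mean(c1) - mean(c2))

def agglomerate(instances):
    """Bottom-up clustering: start with singleton sets, repeatedly
    merge the most similar pair until one cluster remains.
    Returns the merge history (most recent merge last)."""
    D = [[x] for x in instances]          # D = set of singleton sets
    history = []
    while len(D) > 1:
        # "matrix" of similarity values over all pairs of elements of D
        i, j = max(combinations(range(len(D)), 2),
                   key=lambda p: similarity(D[p[0]], D[p[1]]))
        merged = D[i] + D[j]
        D = [c for k, c in enumerate(D) if k not in (i, j)] + [merged]
        history.append(merged)
    return history

history = agglomerate([1.0, 1.1, 5.0, 5.2])
print(history[-1])   # → [1.0, 1.1, 5.0, 5.2]
```

The nearby points are merged first, illustrating the bottom-up construction of larger clusters.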
Clustering

• Traditional techniques
  • Often inadequate - as they arrange objects into classes solely on the basis of a numerical measure of object similarity.
  • Only information used is that contained in the instances themselves.
  • Algorithms unable to take account of semantic relationships among instance attributes or global concepts that might be of relevance in forming a classification scheme.
• Conceptual Clustering
  • Idea first introduced by R S Michalski - 1980
  • Defined as the process of constructing a concept network characterizing a collection of objects, with nodes marked by concepts describing object classes & links marked by the relationships between the classes.

Clustering

• Consider this example:

  [Figure: points arranged in two diamond shapes; A and B lie close together but in different diamonds.]

• We would not cluster A and B together - but would cluster them into the 2 diamonds.
• Partitioning using concept membership rather than distance.
• Points are placed in the same cluster if collectively they represent the same concept.
• This is the basis of conceptual clustering.
Conceptual Clustering

• Can be regarded as:
• Given:
  • A set of objects
  • A set of attributes to be used to characterise objects
  • A body of background knowledge - includes problem constraints, properties of attributes, criteria for evaluating quality of constructed classifications.
• Find:
  • A hierarchy of object classes
  • Each node should form a coherent concept
    • Compact
    • Easily represented in terms of a definition or rule that has a natural interpretation for humans

Conceptual Clustering

• Given animal descriptors:

  name       body-cover      heart-chamber   body-temp    fertilisation
  mammal     hair            four            regulated    internal
  bird       feathers        four            regulated    internal
  reptile    cornified-skin  imperfect-four  unregulated  internal
  amphibian  moist-skin      three           unregulated  external
  fish       scales          two             unregulated  external

• Classification hierarchy produced:

  animals
  ├── mammals/birds
  │   ├── mammal
  │   └── bird
  ├── reptile
  └── amphibians/fish
      ├── amphibian
      └── fish
Conceptual Clustering

• Michalski - 1980
• Conjunctive conceptual clustering
  • Concept class consists of conjunctive statements involving relations on selected object attributes.
  • Method arranges objects into a hierarchy of classes.
• CLUSTER/2
  • Used to construct a classification hierarchy of a large collection of Spanish folk songs.

UNIMEM

• Lebowitz - 1987
• Essentially a divisive clustering algorithm
• Uses a decision tree structure as its basic representation.
• If asked to classify an instance - searches down through the tree, testing attributes, & returns a classification based on the relevant leaf nodes.
• If asked to update the tree so as to represent a new instance - searches down through the tree looking for a suitable place to add in new structure.
UNIMEM

• Basic clustering principle:
  • Add new nodes into the tree as & when they appear to be warranted by the presented instances.
• UNIMEM actually stores each presented instance at all nodes which cover it.
• If two instances stored at a node are particularly similar - then create an extra child node whose definition covers the two instances in question.
  • The two instances are then relocated to this node.
• As new instances are processed - new nodes are created & the hierarchy grows downwards.

UNIMEM

• An instance matches a node if it is covered by that node (concept).
  • Matching determined by testing to see what proportion of the instance's attributes are associated with the node.
• Search process returns all the most specific nodes that explain (cover) the new instance.
• UNIMEM then generalizes each node in this set as necessary in order to account for the new instance.
• The new instance is then classified with all other instances stored at the node.
UNIMEM Algorithm

• Initialize decision tree to be an empty root node.
• Apply the following steps to each instance:
  • Search the tree depth-first for the most specific concept nodes that the instance matches.
  • Add the new instance to the tree at or below these nodes.
    • Involves comparing the new instance to ones already stored there & creating new subnodes if appropriate.

UNIMEM as Memory

• UNIMEM actually stores new instances inside the tree.
• Can thus be viewed as a type of memory.
  • GBM - Generalisation-Based Memory
• Structure of the hierarchy enables classes of instances to be accessed much more efficiently than would be the case if all instances were stored in a linear memory structure.
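The matching test described above - the proportion of an instance's attributes associated with a node - can be sketched as follows. The attribute names and node concept are hypothetical, and real UNIMEM nodes also store instances and per-feature confidence counts.

```python
def match_score(instance, concept):
    """Proportion of the instance's attribute-value pairs that the
    node's concept also records (the matching test above)."""
    shared = sum(1 for a, v in instance.items() if concept.get(a) == v)
    return shared / len(instance)

# Hypothetical node concept and new instance
node = {"body-temp": "regulated", "fertilisation": "internal"}
instance = {"body-cover": "hair", "body-temp": "regulated",
            "fertilisation": "internal"}

score = match_score(instance, node)
print(round(score, 2))   # → 0.67, i.e. 2 of 3 attributes match the node
```

A threshold on this score would then decide whether the node covers the instance; the threshold value is an implementation choice not fixed here.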
COBWEB

• Fisher - 1987
• Based on the principle that a good clustering should minimize distance between two points within a cluster & maximize distance between points in different clusters.
• Good clustering defined as:
  • One which maximizes intra-cluster similarity & minimizes inter-cluster similarity.
• Goal of COBWEB - to find the optimum tradeoff between these two!

COBWEB

• Incremental system for hierarchical conceptual clustering.
• Carries out hill-climbing search through a space of hierarchical classification schemes, using operators which enable bidirectional travel through this space.
• Features of COBWEB:
  • Heuristic evaluation function to guide search.
  • State representation - structure of hierarchies & representation of concepts.
  • Operators used to build classification schemes.
  • Control strategy.
Category Utility

• Can be viewed as a function which rewards similarity of objects within the same class & dissimilarity of objects in different classes.
• Gluck & Corter - 1985
• Category utility function:

  CU = (1/n) ∑_{k=1}^{n} P(C_k) [ ∑_i ∑_j P(A_i = V_ij | C_k)² − ∑_i ∑_j P(A_i = V_ij)² ]

Representation

• Choice of category utility as heuristic measure dictates a concept representation different to the logical, typically conjunctive representations used in AI.
• Probabilistic representation of {fish, amphibian, mammal}:

  Attribute       Values & Probabilities
  body-cover      scales (0.33), moist-skin (0.33), hair (0.33)
  heart-chamber   two (0.33), three (0.33), four (0.33)
  body-temp       unregulated (0.67), regulated (0.33)
  fertilisation   external (0.67), internal (0.33)

• Each node in the classification tree is a probabilistic concept which represents an object class & summarises the objects classified under the node.
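The category utility function can be evaluated directly from probability tables like the one above. A minimal sketch, assuming a hypothetical two-class partition of the five animals ({mammal, bird} vs the rest) described by the body-temp attribute only:

```python
def category_utility(partition):
    """Category utility of a partition given as a list of
    (P(Ck), {attribute: {value: P(A=v | Ck)}}) pairs, following
    CU = (1/n) * sum_k P(Ck) [ sum P(A=v|Ck)^2 - sum P(A=v)^2 ]."""
    n = len(partition)
    # Unconditional P(A = v) by the law of total probability
    uncond = {}
    for p_ck, attrs in partition:
        for a, values in attrs.items():
            for v, p in values.items():
                uncond[(a, v)] = uncond.get((a, v), 0.0) + p_ck * p
    base = sum(p * p for p in uncond.values())
    total = 0.0
    for p_ck, attrs in partition:
        within = sum(p * p
                     for values in attrs.values()
                     for p in values.values())
        total += p_ck * (within - base)
    return total / n

# Hypothetical partition: {mammal, bird} (2/5) vs the rest (3/5),
# using only the body-temp attribute from the table above.
partition = [
    (0.4, {"body-temp": {"regulated": 1.0}}),
    (0.6, {"body-temp": {"unregulated": 1.0}}),
]
print(round(category_utility(partition), 4))   # → 0.24
```

The positive score reflects that body-temp is perfectly predictable within each class, so this split increases the expected number of correctly guessed attribute values.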
Operators

• Incorporation of a new object into the tree is a process of classifying the object by descending the tree along an appropriate path & performing one of several operations at each level.
• Operators include:
  • Classifying the object with respect to an existing class.
  • Creating a new class.
  • Combining two classes into a single class.
  • Dividing a class into several classes.

Operators contd ...

• Classifying the object in an existing class
  • To determine which category best "hosts" a new object, COBWEB tentatively places the object in each category.
  • The partition which results from adding the object to a given node is evaluated using the category utility function.
  • The node which results in the best partition (highest CU) is identified as the best existing host for the new object.
• Creating a new class
  • Quality of the partition resulting from placing the object in the best existing host is compared to the partition resulting from creation of a new singleton class containing the object.
  • Depending on which partition is best - the object is placed in the best existing class or a new class is created.
Example

• Existing classification structure (fish & amphibian):

  C0: P(C0) = 1.0, P(scales|C0) = 0.5, ...
  ├── C1: P(C1) = 0.5, P(scales|C1) = 1.0, ...
  └── C2: P(C2) = 0.5, P(moist|C2) = 1.0, ...

• Add "mammal":

  C0: P(C0) = 1.0, P(scales|C0) = 0.33, ...
  ├── C1: P(C1) = 0.33, P(scales|C1) = 1.0, ...
  ├── C2: P(C2) = 0.33, P(moist|C2) = 1.0, ...
  └── C3: P(C3) = 0.33, P(hair|C3) = 1.0, ...

• Add "bird":

  C0: P(C0) = 1.0, P(scales|C0) = 0.25, ...
  ├── C1: P(C1) = 0.25, P(scales|C1) = 1.0, ...
  ├── C2: P(C2) = 0.25, P(moist|C2) = 1.0, ...
  └── C3: P(C3) = 0.5, P(hair|C3) = 0.5, ...
      ├── C4: P(C4) = 0.5, P(hair|C4) = 1.0, ...
      └── C5: P(C5) = 0.5, P(feath|C5) = 1.0, ...

Operators contd ...

• While the first two operators are effective in many ways - by themselves they are very sensitive to the ordering of the input data.
• Merging & splitting operators implemented to guard against these effects.
• Merging
  • Two nodes of a level are combined in the hope that the resultant partition is of better quality.
  • Involves creating a new node.
  • The two original nodes are made children of the newly created node.
• Splitting
  • A node may be deleted and its children promoted.
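The merging and splitting operators can be sketched as simple tree manipulations. This is a structural sketch only - real COBWEB nodes also carry probability counts, and the operators are applied only when they improve category utility.

```python
class Node:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

def merge(parent, a, b):
    """Node merging: replace children a & b of parent with a newly
    created node whose children are a & b."""
    merged = Node(a.name + "+" + b.name, [a, b])
    parent.children = [c for c in parent.children if c not in (a, b)]
    parent.children.append(merged)
    return merged

def split(parent, node):
    """Node splitting: delete the node & promote its children
    into the parent at the same position."""
    i = parent.children.index(node)
    parent.children[i:i + 1] = node.children

a, b, c = Node("A"), Node("B"), Node("C")
p = Node("P", [a, b, c])
m = merge(p, a, b)
print([n.name for n in p.children])   # → ['C', 'A+B']
split(p, m)
print([n.name for n in p.children])   # → ['C', 'A', 'B']
```

Note that split exactly undoes merge, which is what makes the search through classification schemes bidirectional.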
Merging & Splitting Operators

• Node Merging

  [Diagram: children A & B of parent P are replaced by a new node whose children are A & B.]

• Node Splitting

  [Diagram: the reverse - a child node of P is deleted & its children are promoted to P.]

COBWEB Control Structure

COBWEB ( Object, Root of classification tree )

1. Update the counts of the Root.
2. IF Root is a leaf
   THEN Return the expanded leaf to accommodate Object
   ELSE Find the child of Root which best hosts Object & perform one of the following:
     a. Consider creating a new class & do so if appropriate.
     b. Consider node merging & do so if appropriate, call COBWEB ( Object, Merged node ).
     c. Consider node splitting & do so if appropriate, call COBWEB ( Object, Root ).
     d. IF none of the above were performed
        THEN call COBWEB ( Object, Best child of Root ).
AutoClass
• Cheeseman et al - 1988
• Bayesian statistical technique
• Bayes' theorem - formula for combining probabilities
• Technique determines:
• Most probable number of classes
• Their probabilistic descriptions
• Probability that each object is a member of each class
• AutoClass does not do absolute partitioning of data into
classes.
• Calculates the probability of each object's membership in
each class.
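The soft membership calculation follows from Bayes' theorem: P(Ck | x) ∝ P(Ck) · P(x | Ck), normalised over all classes. A minimal sketch with two hypothetical one-dimensional Gaussian classes - not AutoClass's actual model, which also searches over the number of classes:

```python
import math

def gaussian(mu, sigma):
    """Return a 1-D Gaussian density function (assumed class model)."""
    return lambda x: (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
                      / (sigma * math.sqrt(2 * math.pi)))

def membership(x, classes):
    """Posterior P(Ck | x) for each class via Bayes' theorem:
    normalise prior * likelihood over all classes."""
    joint = {k: prior * density(x) for k, (prior, density) in classes.items()}
    total = sum(joint.values())
    return {k: v / total for k, v in joint.items()}

# Two hypothetical classes with equal priors
classes = {"A": (0.5, gaussian(0.0, 1.0)),
           "B": (0.5, gaussian(4.0, 1.0))}

post = membership(1.0, classes)
print(round(post["A"], 3))   # → 0.982: x = 1.0 mostly belongs to class A
```

Every object gets a probability under every class, so nearby objects receive graded memberships rather than an absolute partition.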