Résumé

Over the last few decades, the precision of readily available data has improved considerably. Geolocation, for example, has become increasingly accurate and can now identify which room of a building a user is in. Yet is this level of precision always necessary, or even useful? A user may simply want to know their approximate position through sentences such as "You are near the train station" or "You are still far from the centre".

These two sentences highlight two particular words, "near" and "far". Why? Because they are vague by nature: we can read and understand them, yet we cannot give them a precise meaning. Their interpretation is not fixed; it depends on the context, on the beliefs of the person who utters them, or on some phenomenon still unknown.

Human communication is full of vague words; we use them every day, at work and in society, without really paying attention to them. Surprisingly, we manage to communicate without creating ambiguity even when our definitions of these words differ. Moreover, the use of vague words leaves a communication open to interpretation: that task falls to the listener, who must adapt the semantics of the words to the person who used them. Vagueness can be seen as a degree of belief about a concept; it therefore depends on the person who uses the words and requires particular processing by the listener.

This report explores the use of vague terms such as tall, small and big for the automatic production of summaries from numeric data. Nowadays, in the era of Big Data, large amounts of numeric data are available, but little work focuses on the presentation of these data. This project explores the humanization of data with the help of vague terms and machine learning algorithms, in order to propose an elegant and adaptive model for automatic linguistic summarization.

Automatic summarization tries to generalize a concept over a dataset and aims to transmit only the relevant information. That is why vague words seem to be the right tool and a serious avenue to explore: by their very nature, vague terms leave more room for interpretation by the listener and thus allow the generalization of a concept.
Abstract

In the last decade there has been a major increase in the precision of some types of readily available data. Geolocation, for example, has become more and more precise and can now pinpoint a location to a particular room of a certain building. Yet is this level of precision always required, or even helpful? An individual may simply want to know their approximate location through sentences like "You are near the train station" or "You are still away from the centre". These two examples highlight two words, "near" and "away". Why? Because they are vague: you read such words without noticing that, in fact, you cannot give them a precise definition. Their interpretation is not fixed; it depends on the context, on the beliefs of the person who utters them, or on some unknown phenomenon.

Human communication is full of vagueness; we use it unconsciously every day of our lives, at work and in society. Surprisingly, we manage to communicate without ambiguity even when we do not share the same definitions of such words. Moreover, vagueness gives communication a freedom of interpretation, leaving to the listener the difficult task of computing the semantics. Vagueness can be seen as a degree of belief about a particular concept; it therefore depends on the speaker's state of mind and also requires particular treatment by the listener.

This report explores the use of vagueness for producing automatic summaries of numeric data, using vague terms such as tall, small and big. Nowadays, in the big data era, a great deal of numeric data is available, yet there is a lack of work on the presentation of these data. This project explores the humanization of data with the help of vague terms and machine learning, in order to propose an elegant and adaptive model for automatic summarization.

Automatic summarization tries to generalize a concept over a dataset and aims to transmit only the useful information. In that sense vagueness seems to be the perfect tool: by their nature, vague terms allow a wider range of interpretation and thus allow the generalization of a concept.
HOST ESTABLISHMENT PRESENTATION
A. Bristol University
The University of Bristol is a red brick research university located in Bristol, United Kingdom. It received its royal charter in 1909; its predecessor institution, University College Bristol, had been in existence since 1876. Bristol is organised into six academic faculties composed of multiple schools and departments running over 200 undergraduate courses, largely situated in the Clifton area along with three of its nine halls of residence. The other six halls are located in Stoke Bishop, an outer city suburb 1.8 miles away. The university is the largest independent employer in Bristol.

The University of Bristol is ranked 11th in the UK for its research, according to the Research Excellence Framework. It is ranked 37th by the QS World University Rankings 2015/16, placing it amongst the top ten UK universities. It is the youngest British university to be ranked among the top 40 institutions in the world according to the QS World University Rankings, and has also been ranked 15th in the world in terms of reputation.
B. Laboratory: Intelligent Systems Laboratory
The University of Bristol has a long tradition of excellence in Artificial Intelligence, with
research groups in Engineering dating back to the 1970s and 1980s. Now all these traditions
have converged to form the Intelligent Systems Laboratory (ISL), a leading research unit
counting 15 members of staff (four professors) and about 50 PhD students and postdocs.
Research activities include foundational work in machine learning (many of the ISL members
work in this central area of research), and applications to web intelligence, machine translation,
bioinformatics, semantic image analysis, robotics, as well as natural intelligent systems. Besides
these applications, research in ISL is a key enabler in a number of strategic research directions.
Data Science is one of the main frontiers for modern AI, dealing with vast masses of data, both
enabling their exploitation and benefiting from them. Another key frontier for intelligent
systems research is interacting with modern biology, both taking inspiration from it and providing tools for it.
C. Supervisor: Doctor Jonathan Lawry
Dr Lawry is focussed on developing probabilistic models of vagueness (fuzziness) and
applying them across a number of application domains in Artificial Intelligence. His approach is
the identification of vagueness or fuzziness with linguistic (semantic) uncertainty. This
approach allows for a much more flexible representation framework in which both propositions
and valuations can be ordered in terms of their relative vagueness, and in which we can
capture both stronger and weaker versions of an assertion e.g. absolutely short, quite short
etc. This opens the possibility of developing choice models of assertion in which there is a clear
rationale for choosing a vague statement over a (more) crisp one in the presence of
uncertainty.
Table of Contents
I. Definition of Vagueness
A. What is vagueness? Why is language vague?
B. Reasoning and use of vagueness
C. Modelling of vagueness
II. Summarization of numeric data with words
A. State of the art
B. Vagueness in automatic summary, a fuzzy question
III. Mathematical framework for automatic summary
A. General presentation
B. Detailed framework, end to be vague
IV. Results
V. Case study
A. Application to the web
B. Results
VI. Overview
A. Contribution of this framework within the axes of investigation
B. Future investigation and improvements
VII. Conclusion
A. Personal experience
B. Professional overview
C. Greetings
Bibliography
I. Definition of Vagueness

A. What is Vagueness? Why is Language Vague?

An unexpected new trend is emerging in apps and web services: the information presented to the user is growing vague. So what is vagueness?

Vagueness can be explained with the help of classical logic, in which a sentence is either true or false; this is known as Boolean logic. But for some words and concepts of human language, such as "tall" and "small", classical logic does not work well. These words are what we call vague. If you ask two people to classify a set of humans according to their height into two categories, small and tall, the two will reach roughly the same classification, but some people will be classified as tall by one and as small by the other. This is vagueness: even if in some cases you will find no difference from Boolean logic, in other cases an object can be tall and not tall at the same time. This effect of vagueness cannot be represented with Boolean logic; it is called a borderline case. There is no precise, known height which defines the line between a person who is tall and a person who is not. That is why the term borderline is used: there is a range in which you cannot be absolutely sure whether someone is tall or not.

Why, then, do we not simply adopt as a definition that "tall" means above a particular threshold? How do humans represent vagueness and process it?

From a communication point of view, we might first think that vagueness is suboptimal because it can create confusion and thus lead to misunderstanding. But in reality our brains and our communication deal perfectly well with vagueness, leaving to us the task of interpreting it. Even if, because of your experience, you do not think a given house is tall, you can imagine and try to understand the other point of view. Vagueness allows different interpretations depending on the protagonists of the conversation; yet even with this ambiguity, they manage to understand each other and accomplish complex tasks based on vague information. This is the reason why vagueness is treated in the field of artificial intelligence, and more particularly in reasoning under uncertainty.

"A concept expression is vague if it is indeterminate, for some objects, whether or not the concept applies."
Gottlob Frege
B. Reasoning and Use of Vagueness

We saw in the previous section the nature of vagueness and its particularities, but what can we do with it? Vagueness is proper to language, and so to human communication; the first application of vagueness we might think of is therefore humanizing information. For example, in conversation a person will rarely say "I just bought a new shirt for one hundred and twenty-five pounds and fifty-five pence"; instead they will say something like "I just bought an expensive shirt" or "I just bought a shirt for around a hundred pounds". In this example we see, just by reading, the difference between the crisp assertion and the vague one. One seems natural and could have been uttered by a human; the other is clearly unnatural, emotionless, too mathematical.

Our brain generalizes information from our environment every second, and it does the same when we talk with other people. We sum up the information that we want to transmit using vagueness; this is done naturally, unconsciously, and we have no trouble understanding each other. This is because we focus on the important information, the information that transmits the true concept. In the example the price was not the primary information; the important fact was that I bought a shirt. Furthermore, with the word "expensive" and the information I have about the speaker, I can predict in what range the price could be.

Reasoning with vagueness, as we see in this little example, is a matter of balancing crisp and vague terms to express information, concepts, feelings and emotions. But the particularity of vagueness is that it depends on the interpretations of the two people involved, and on how they can reach a common understanding. One well-known failure mode of human communication is the misunderstanding that arises when speaker and listener interpret a concept differently without realising it. Vagueness is always present in the communication process and lies at the base of human speech. Any model of vagueness will therefore involve a speaker and a listener, because vagueness relies on the beliefs and goals of the two people involved in the discussion.
C. Modelling of Vagueness

So the first question we have to explore is: how can vagueness be modelled? We have seen that classical logic cannot do it, but some researchers have proposed variations of Boolean logic. New mathematical tools have emerged for modelling vagueness; they have their basis in philosophy, and more especially in the problem of the representation of knowledge. In order to understand these tools we first have to define the difference between the truth value of a vague term and that of a classical logical term.

A classical term has only two truth values, true or false. Vague terms follow this pattern, but on top of it they admit borderline cases. If we take the example of a person's height and its characterisation by the words small and tall, we can find a threshold below which we are absolutely sure that the person is small, and another threshold above which we are definitely sure that the person is tall.
For a person of height h, with thresholds τ_S and τ_T for Small and Tall respectively:

Tall is true if h ≥ τ_T. Small is true if h ≤ τ_S. Borderline case: τ_S < h < τ_T.

Figure 1: Vagueness with borderline cases
According to Figure 1, we have crisp thresholds to state whether a person is tall or small. But in the borderline case we cannot determine whether the person is tall or small: it is undefined. Moreover, the value of the thresholds is problematic: how should these thresholds be chosen?
Epistemicists about vagueness want to retain classical logic, and they endorse the somewhat surprising claim that there actually is such a threshold (they claim we know the existential generalisation 'there is an n such that such-and-such' even if there is no particular n of which we know that such-and-such). Many philosophers, however, find this claim too hard to swallow and take it as evidence that classical logic should be modified.

One problem that all these philosophers try to solve, and where classical logic cannot be used, is the sorites paradox. There are many variants of the sorites paradox, but all of them rely on the difficulty of representing and dealing with vague terms.

Heap Paradox
1 grain of wheat does not make a heap.
If 1 grain of wheat does not make a heap then 2 grains of wheat do not.
If 2 grains of wheat do not make a heap then 3 grains do not.
…
If 9,999 grains of wheat do not make a heap then 10,000 do not.
Therefore, 10,000 grains of wheat do not make a heap.

Figure 2: The sorites paradox

Solving this problem with classical logic forces us to reach absurd inferences, such as that 10,000 grains do not form a heap. That is why different logics have emerged to deal with this kind of paradox; for each model below we will return to this issue and see how the model proposes to solve it.

(All the following accounts of the modelling of vagueness are drawn from the book [1] by the researcher Kees van Deemter.)
1. Supervaluationism
According to supervaluationists, borderline statements lack a truth value. This neatly explains why it is impossible to know the truth value of a borderline statement (recall that the truth value of a statement p, in standard logic, is either True or False). Supervaluationism keeps the law of excluded middle when treating vague terms: for example, instead of evaluating the predicate "Charles is a baby" directly, it holds that "Charles is a baby or it is not the case that Charles is a baby" is true. Thus the supervaluationist method allows one to retain all the theorems of standard logic while admitting truth-value gaps. The basic thought underlying supervaluationism is that vagueness is a matter of underdetermination of meaning. This thought is captured by the idea that the use we make of an expression does not decide between a number of admissible candidates for making the expression precise. For example, we can make it precise by saying that x is a baby just in case x is less than one year old; but the use of the expression will allow other ways of making it precise, like 'less than one year plus a second'. If Martin is one year old, the sentence 'Martin is a baby' will be true on some ways of making 'baby' precise and false on others. Since our use does not decide which of the ways of making precise is correct, the truth value of the sentence 'Martin is a baby' is left unsettled. By supervaluationist standards, a sentence is true just in case it is true on every way of making precise the vague expressions contained in it (that is, 'truth is supertruth'). A precisification is a way of making precise all the expressions of the language, so that every sentence gets a truth value (true or false but not both) in each precisification. In this sense, a precisification is a classical truth-value assignment.

As part of their solution to the sorites paradox, supervaluationists will assert 'There is an n such that n grains are not a heap but n+1 grains are', for this statement comes out true under all admissible precisifications of heap. However, when pressed, the supervaluationist will add an unofficial clarification: "Oh, of course I do not mean that there really is a sharp threshold for a heap."
2. Fuzzy Logic

Fuzzy logic, introduced by the mathematician Lotfi A. Zadeh, is a form of many-valued logic in which the truth value of a variable lies in the range [0, 1] and is called a membership degree. Fuzzy logic, in opposition to Boolean logic, deals with sets of objects for which there is no precisely defined criterion of membership. Fuzzy set theory handles this kind of class: to define the set old, for example, there is no precise way to decide whether an object is old or not. Instead, a fuzzy set works with a membership function which assigns to each object a continuous value expressing how strongly the object belongs to the set.

So how can fuzzy logic be used for vagueness? As we said before, a vague term is a particular object that admits borderline cases; to illustrate how fuzzy logic can deal with vagueness we will use the set old. In pure set theory, "the class of old people" cannot constitute a set or class in the usual mathematical sense of these terms. But with fuzzy theory it can, because of the continuous value of the membership function: the class of old people is represented with a degree of confidence. To come closer to the way humans process this kind of linguistic term, fuzzy set theory combines different terms; most often the antonym is taken to represent the complementary class of a vague term. In our example the antonym of old is young, so we can define two fuzzy sets, as shown in the following figure.
Figure 3: Fuzzy set example for the vague term Age
With this representation, for an input x representing the age, you will get different membership values for young, middle-aged and old. The theory tries to model how our brain works with different beliefs: every person has a particular mechanism for representing a vague class such as old people. In that sense you will probably roughly agree with another person when classifying someone as old according to their age, that is, you will share quite similar membership functions and thresholds. But in some cases, because of the thresholds chosen, you will infer that a particular age counts as old where another person would call it middle-aged. Inference in fuzzy logic is called defuzzification; it consists in choosing, given the different membership values, which one to retain. The simplest way of doing this is to take the maximum of the membership functions to classify the object.
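To make this concrete, here is a minimal Python sketch of such fuzzy sets and of defuzzification by maximum membership; the term boundaries below are invented for illustration and are not a standard reference partition:

    def trapezoid(x, a, b, c, d):
        """Trapezoidal membership: 0 outside (a, d), fully 1 on [b, c]."""
        if x <= a or x >= d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)

    # Hypothetical fuzzy sets over the age domain (boundaries chosen for the example).
    AGE_SETS = {
        "young":      lambda x: trapezoid(x, -1, 0, 25, 40),
        "middle age": lambda x: trapezoid(x, 25, 40, 55, 70),
        "old":        lambda x: trapezoid(x, 55, 70, 120, 121),
    }

    def classify(age):
        """Defuzzification by maximum: keep the label with the highest membership."""
        return max(AGE_SETS, key=lambda label: AGE_SETS[label](age))

    print({label: round(f(32), 2) for label, f in AGE_SETS.items()})
    print(classify(32))  # 'young': membership 0.53 against 0.47 for 'middle age'

Note that an age of 32 gets non-zero membership in two sets at once: this is exactly the borderline case that Boolean logic cannot express.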
Fuzzy set theory is a promising theory and can also be viewed in terms of degrees of belief about classification into sets. Moreover, fuzzy set theory proposes an elegant way to solve the sorites paradox. Taking back our example of the heap: with classical implication rules we were forced to infer that 10,000 grains do not form a heap, but with fuzzy logic the paradox is dissolved by the membership function. When the input variable x representing the number of grains is high, it will have a very low membership value for the class "not a heap" and a greater one for the class "is a heap". The sorites paradox is solved without difficulty because in fuzzy logic heaphood is just a matter of degree of membership.

Even if fuzzy logic solves the sorites paradox, a bias remains: modelling a fuzzy set requires choosing, for the membership function, thresholds below or above which you are sure that the concept is true or false (as in Figure 1). The choice of the membership function and of the associated thresholds is crucial; these choices are made by empirical experiment, and as yet there is no accepted and shared methodology for doing so. What is missing in fuzzy logic is an emergent process to compute these thresholds automatically, and to choose the shape of the membership functions.
3. Many-valued Logic

Many-valued logics come from the field of non-classical logics and differ from the most common logic, Boolean logic. They are similar to classical logic in that they accept the principle of truth-functionality, namely that the truth of a compound sentence is determined by the truth values of its component sentences (and so remains unaffected when one of its component sentences is replaced by another sentence with the same truth value). But they differ from classical logic in the fundamental fact that they do not restrict the number of truth values to only two: they allow a larger set W of truth degrees.

Fuzzy set theory, which we have already presented, comes from this point of view; we will introduce another theory, three-valued logic. Three-valued logic is a good starting point for understanding the mechanisms behind many-valued logics. This logic was introduced by the mathematician Kleene and consists of three truth values {0, ½, 1}, where ½ corresponds to an undefined truth value. The theory can be applied to vagueness, where crisp terms always take values in the domain {0, 1} and vague terms in the domain {0, ½, 1}. The mechanism of inference in this logic relies on the two major connectives of disjunction and conjunction: disjunction is generally computed as the maximum of the truth values, and conjunction as the minimum.

But this is only one way to treat these operators; another widely used semantics mixes in probability theory. The following figure sums up the different semantics for many-valued logic and groups the two most used definitions of disjunction and conjunction.

Figure 4: Logical operators

With the probabilistic interpretation of these operators the sorites paradox can be solved: the chain of implications decreases the truth value of the concept. The truth value of each sentence is the product of the truth values of the previous ones; since at some point a step has truth value ½, the product decreases and approaches zero. Thus the concept of heap will at some point have a truth value of zero and become false.
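A small Python sketch of this probabilistic reading; the borderline region below is an invented example, and the product semantics is one of the interpretations grouped in Figure 4:

    # Kleene-style connectives on truth degrees (valid on {0, 0.5, 1} and on [0, 1]).
    def conj(a, b):  # conjunction: minimum
        return min(a, b)

    def disj(a, b):  # disjunction: maximum
        return max(a, b)

    def not_heap_truth(n, borderline_start=100, borderline_end=10_000):
        """Truth of "n grains do not make a heap" as the product of the truth
        values of all the implication steps leading up to n."""
        truth = 1.0
        for k in range(1, n):
            # Each step "if k grains are not a heap then k+1 are not" is fully
            # true below the borderline region and only half-true inside it.
            step = 1.0 if k < borderline_start else 0.5 if k < borderline_end else 0.0
            truth *= step
        return truth

    print(not_heap_truth(50))   # 1.0: clearly not a heap
    print(not_heap_truth(120))  # ~1e-06: the truth value collapses towards zero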
4. Contextualism

Epistemic contextualism (EC) is a recent and hotly debated position. EC is roughly the view that what is expressed by a knowledge attribution, a claim to the effect that S 'knows' that p, depends partly on something in the context of the attributor, and hence the view is often called 'attributor contextualism'. So EC, of the sort which concerns us here, is a semantic thesis: it concerns the truth conditions of knowledge sentences, and/or the propositions expressed by utterances of them. The thesis is that it is only relative to a contextually determined standard that a knowledge sentence expresses a complete proposition: change the standard, and you change what the sentence expresses; acontextually, no such proposition is expressed. In this respect, knowledge utterances are supposed to resemble utterances involving uncontroversially context-sensitive terms. For instance, just what proposition is expressed by an utterance of:

1. He is a tall person
2. That's red
3. He is a nice person

depends in certain obvious ways upon such facts as the location or identity of the speaker, and/or the referent of the demonstrative. Contextualism rests on the idea that the truth value of a predicate always depends on the context. To illustrate this view, let us suppose we have a jury of two people who have to classify persons according to their height, and consider two scenarios:

Each person to be judged comes onto the stage one by one and then leaves.
Each person to be judged comes onto the stage and stays.

In the first scenario the context is not built dynamically, because each person leaves the stage once they have been classified. There will be disparities in the jury's judgements, because each judge relies on their personal knowledge and interpretation of the predicates tall and small. In the second scenario, on the other hand, the judges have all the people in front of them; the context is then built dynamically from the set of people presented. They will therefore reach more or less the same classification, because they have built the same context. With this reasoning the sorites paradox is solved.
But it is solved only theoretically: contextualism does not propose a mathematical model of this view. It is more a philosophical position on which one can start to build a more concrete implementation using mathematical tools.

Many criticisms have been made of this solution, because contextualism only displaces the issue of vagueness into the human mind. For contextualists, every definition we can give to a vague term depends entirely on our psychological state, which was shaped by our life experience. A good example is a comparison of two children:

One who grew up in a wealthy family.
One who grew up in a poor family.

As teenagers they will not have the same definition of a vague term such as "expensive". They have built two different bodies of knowledge about this term according to their development and the environment they were confronted with. It is as if, during childhood development, and especially in the first years of life, the brain learns and fixes by habit the thresholds used to represent vague terms. That is why, when you are older, it is harder to change the meaning you give to a vague predicate: you are conditioned by your personal experience.
Personal choice

There are many other possible ways to deal with vagueness, but we chose to explore the most used ones. After careful reflection, we chose in our project to explore the use of fuzzy logic mixed with contextualism. The motivation is the wish to develop an adaptive, autonomous framework to represent vagueness. The contextualist point of view can be explored with unsupervised learning algorithms, in order to propose an emergent mechanism for learning the right thresholds for vague terms. Furthermore, we think this methodology is closer to the mechanisms at work in the human brain; this is supported by readings in psychology and cognitive science on children's language development.
II. Summarization of Numeric Data with Words

The summarization of data is an active field of research, especially in artificial intelligence and data mining. With the growth of data, a lot of information can be extracted, yet we struggle with the presentation of this information. Much progress has been made, but it focuses on machine learning and classification tasks, and understanding and handling the extracted knowledge remains reserved to specialists. Interest is now returning to the presentation of data: the goal is to make the information understandable by most people, without requiring special knowledge. This field brings together researchers from natural language processing, artificial intelligence, mathematics and beyond. Vagueness, as a basic ingredient of human language, is widely exploited in this field. We will introduce and present the different methods explored by researchers in this field and see how vagueness plays an important role in this area.
A. State of the Art

Automatic summarization provides a new tool to extract knowledge from a large set of information. It is a communication process in which, given some inputs, information must be extracted and transmitted to someone else. To represent this process, a two-agent model can be used, composed of a speaker and a listener. The speaker is the major actor: he is in charge of processing the input information in order to extract patterns. These patterns are then translated into a communication channel and transmitted to the second agent, the listener. The listener is a passive actor who only receives the information from the speaker and interprets it. In some applications he has to choose an action to perform depending on the message he received. This kind of problem requires a more complex model; the field of game theory is often used, where the speaker-listener pair is represented as a game. The goal is then to find the best summary to transmit in order to maximise the reward, as in a typical game model; the difficulty is to find the right reward for the application.

The speaker is the important part: the listener, in most cases, is human, and no work has to be done on that side. The modelling of the speaker agent depends on the application; it is composed of different, specific parts. For example, the speaker must implement an algorithm to process the input information and extract patterns and knowledge; this is where machine learning is used.
The information inferred by this part is indigestible for a human; that is why the speaker must also implement a linguistic process to communicate the extracted patterns. The field of natural language generation, together with game theory, proposes elegant architectures and mathematical frameworks. In the case where the listener uses the information from the speaker to perform an action, the speaker must take this into account in the summarization process: he has to adopt a mechanism that exploits the beliefs and actions of the listener to produce and choose the right message. This kind of problem relies on treatment and decision under uncertainty, and game-theoretic algorithms are again widely explored.

To sum up and make precise what has been said, Yager in [2] proposes an elegant and modular model for automatic summary:

Figure 5: Yager's summary model

The summarizer is a linguistic sentence, most of the time a vague one. To illustrate this model, let us take an easy example:

S = middle age
Q = most
T = degree of truth (computed over the dataset; Yager uses fuzzy logic to compute it).
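As a minimal sketch of how T might be computed in such a model; the membership function for "middle age" and the ramp for the quantifier "most" below are illustrative assumptions, not Yager's exact definitions:

    def middle_age(x):
        """Illustrative membership function for the summarizer S = 'middle age'."""
        return max(0.0, 1.0 - abs(x - 45) / 20)  # peaks at 45, zero outside [25, 65]

    def most(p):
        """Illustrative fuzzy quantifier Q = 'most': a ramp over the proportion p."""
        return min(1.0, max(0.0, (p - 0.3) / 0.4))  # 0 below 30%, 1 above 70%

    ages = [38, 42, 47, 51, 23, 44, 60, 49]

    # Degree of truth T of "most of the clients are middle-aged".
    proportion = sum(middle_age(a) for a in ages) / len(ages)
    T = most(proportion)
    print(round(T, 2))  # 0.84 on this toy sample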
Yager lays the foundations of summarization and proposes a general architecture that can be made more complex depending on the application. For the computation of T, fuzzy logic is used most of the time. P. Villar et al. in [2] use Yager's model to propose automatic summarization of opinion data about tourism hotels. Villar et al. go deeper in the use of linguistic terms to describe heterogeneous data and propose a fuzzy model based on semantic translation as a tool to produce linguistic summaries. Their goal is to benefit from the use of linguistic terms so that both data analysts and non-specialists can understand and exploit the inferred information. Their work starts from Yager's model with, upstream, a process to identify and classify the input information; in their application, sentiment classification.
This classification is important because it is used to determine the reputation of a hotel from different textual and numeric inputs.

To model vagueness they chose fuzzy logic, with trapezoidal curves as membership functions. Their main work focuses on the combination and calculation of degrees of truth from different vague terms. They go much further in this direction and propose several methods for the aggregation of vague terms, and different approaches to balance the loss of information induced by vagueness.

A. Ramos et al. in [3] also explore the use of fuzzy logic to produce automatic textual short-term weather forecasts for the region of Galicia. Their approach is mainly based on fuzzy and crisp operators, but they innovate with the use of an intermediate language to capture vagueness and produce linguistic terms. In contrast to Yager's model, the architecture of A. Ramos implements a pre-processing stage where information is extracted and computed into intermediate code, from which natural language templates finally produce the linguistic weather forecast.

Figure 6: Architecture for short-term weather forecasts

Even if many architectures for linguistic summarization exist, they all build on Yager's architecture. The two other architectures we saw make it more complex and focus on specific parts of the summary, such as a finer modelling of vagueness, and are dependent on the application.

In the next part we will focus on and go deeper into the mathematical formalism adopted to model and represent vagueness in these two articles. Fuzzy logic will be further explained with a presentation of the work of D. Dubois, a leading figure in this field.
B. Vagueness in Automatic Summary, a Fuzzy Question

In this part we focus on the formalism adopted in [2], [3] and [4], and in particular on the use of fuzzy logic. Fuzzy logic was introduced by Zadeh to formalise human knowledge; Dubois in [5] explains how fuzzy set theory and vagueness are related, even if Zadeh wanted a distinction. The claim that fuzzy sets are a basic tool for addressing the vagueness of linguistic terms has been around for a long time. But some researchers, such as Novák, oppose vagueness to uncertainty; on this view a vague term must fulfil three features: the existence of borderline cases, unsharp boundaries, and susceptibility to the sorites paradox.

Fuzzy logic has been controversial among philosophers, many of whom are reluctant to consider a truth-value system different from the Boolean one. One of the reasons for the misunderstanding between fuzzy sets and the philosophy of vagueness may lie in the fact that Zadeh was trained in engineering mathematics, not in philosophy. In particular, vagueness is often understood as a defect of natural language (since it is not appropriate for devising formal proofs, it calls into question the usual rational forms of reasoning). Indeed, the vagueness of linguistic terms was considered a logical nightmare by early twentieth-century philosophers. In contrast, for Zadeh, going from Boolean logic to fuzzy logic is a positive move: it captures tolerance to errors (softening blunt threshold effects in algorithms) and may account for the flexible use of words by people. It also helps information summarization: detailed descriptions are sometimes hard to make sense of, while summaries, even if imprecise, are easier to grasp. The link between fuzzy set theory and vague terms can be argued from the idea that it is natural to represent incomplete knowledge with sets. Fuzzy logic has been understood in various ways; it helps to model uncertainty and degrees of belief, and can even be connected with modal logic.

Vagueness is a phenomenon observed in the way people use language, and is characterised by variability in the use of some concepts between listener and speaker. It may be that one cause of such variability is the gradual perception of some concepts or some words in natural language. This variability of interpretation and perception can be used in automatic summary to capture a concept, as in [4], where A. Ramos-Soto et al. use it to generate weather forecasts. In [3] and [5] the authors also use vagueness to produce automatic summaries; even though they all use fuzzy logic to represent vagueness, they differ in the way they implement it.
Ramos-Soto: Weather Forecasts

Ramos et al. in [4] proposed an architecture to automatically produce weather forecast summaries for the 315 Galician municipalities. Formally, each municipality M has an associated forecast data series which includes data series for the input variables considered: sky state, wind, and maximum and minimum temperature. For clarity, in what follows we will consider a single municipality's data series.

For each forecast data series, Ramos et al. obtain linguistic descriptions of seven forecast variables, namely cloud coverage, precipitation, wind, maximum and minimum temperature variation, and maximum and minimum temperature climatic behaviour. For this they have devised a computational method divided into several linguistic description generation operators. Here is the process where fuzzy logic is used to translate these features into vague terms; we will take the sky data as an illustration.

The first stage of Ramos et al.'s application is to transform the chronological data series into temporal linguistic terms. To do so they use fuzzy sets to represent the linguistic temporal terms {Beginning, Half, End}, each with an associated membership function.

The second stage is to capture the concept associated with the main feature; here the sky data are translated into the fuzzy sets CCL = {C, PC, VC} ("clear", "partly cloudy", "very cloudy"). The procedure then concatenates all these temporal descriptions, taking the maximum degree of membership in the fuzzy sets. The output is an intermediate code with vague terms which describes the weather in precise time windows. The overall process is repeated for each feature. Ramos et al. choose to first translate the numeric data into vague terms in order to finally produce, with template and NLG methods, a linguistic weather forecast.
Ramón A. Carrasco: Automatic Summaries for Tourism Web Data

In this paper [3], Carrasco et al. propose a novel model to aggregate heterogeneous data from various websites containing opinions about hotels. Carrasco et al. focus on the mathematical modelling of vague terms and on how to solve the issue of crisp boundaries. They use the same architecture as Yager in [1] but go deeper into the representation of vagueness and the formalism adopted. As in [4], they gather various data from websites about hotel opinions, but they differ in that they use linguistic input from forums and from the comment sections of rating websites (TripAdvisor and others). We will only focus on the way they treat and implement fuzzy logic with vague terms, as this is the core issue; to illustrate the discussion we will take as an example the age of the clients of a hotel.
A set of seven terms for the age of the hotel guest could be given as follows: s0 = baby, s1 = child, s2 = teenager, s3 = young, s4 = adult, s5 = mature and s6 = old. The semantics, i.e. the membership value, is calculated with unbalanced trapezoidal functions. A trapezoidal function is represented by a 4-tuple (a, b, c, d), where [b, c] is the interval on which the concept is totally true (membership 1), and a and d are the lower and upper thresholds beyond which the concept is false (membership 0). The vague intervals are (a, b) and (c, d), and the membership value always lies in [0, 1].
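A sketch of such a term set in Python; the 4-tuples below are invented for illustration and differ from the paper's actual partition:

    def trapmf(x, a, b, c, d):
        """4-tuple trapezoidal membership: fully true on [b, c], false outside (a, d)."""
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        if x <= c:
            return 1.0
        return (d - x) / (d - c)

    # Hypothetical unbalanced linguistic partition of guest age into seven terms.
    AGE_TERMS = {
        "baby":     (-1, 0, 1, 2),
        "child":    (1, 2, 10, 13),
        "teenager": (10, 13, 17, 20),
        "young":    (17, 20, 30, 35),
        "adult":    (30, 35, 50, 55),
        "mature":   (50, 55, 63, 67),
        "old":      (63, 67, 120, 121),
    }

    age = 33
    print({term: round(trapmf(age, *t), 2) for term, t in AGE_TERMS.items()})
    # age 33 is partly 'young' (0.4) and partly 'adult' (0.6): borderline by construction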
To soften the crisp thresholds, another parameter, D, is added: it represents the range of translation of the interval on which the concept is true. To handle this, two sets, one for high translation and one for low translation, are created in order to compute the truth degree of a concept. The idea of the translation is to capture the different possible interpretations arising from the choice of the thresholds. Furthermore, the authors propose a weighted model, for example to highlight metrics from valuable clients. They define two aggregation operators for this purpose: the first is just a weighted sum, the second is a more complex aggregation that can be viewed as a quasi-arithmetic average. This procedure is applicable to vague terms but also to crisp ones; to deal with both, the authors propose a grammar G in which they store the space of interpretation of the terms. For example, primary terms have no D parameter, whereas high/low comparative terms do.
The following figure summarises the process proposed by Carrasco et al.:

Figure 7: Fuzzy model based on semantic translation
Janusz Kacprzyk: Fuzzy Logic for Linguistic Summarization of Databases

In his paper, Kacprzyk aims to produce a new query interface to present the information contained in a database. The system is based on fuzzy logic and builds on Yager's architecture {S: summarizer, Q: quantity in agreement, T: degree of truth}. The innovation of the paper lies not in the way Kacprzyk uses fuzzy logic to represent vagueness, but in the way he treats the combinatorial issue of automatic summarization.

To highlight this issue, let us take the case study of Kacprzyk's paper, a computer retailer. To summarize the sales of computers, many options can arise, such as "most of the sales are second-hand"; a summary can be made more precise by adding conjunctions and disjunctions, as in "most of the sales are second-hand and/or recent computers". Given a set of attributes A and a vocabulary V to describe the features, with connectives like AND and OR, the search space of possible summaries is huge and becomes an issue to compute over a large database.

Calculating the validity of each summary is a considerable task, and George and Srikanth (1996) use a genetic algorithm to find the most appropriate summary in the search space. In his approach, Kacprzyk also uses a genetic algorithm, and the overall quality (goodness) of a summary is given by the weighted sum of partial quality indicators T_1, ..., T_s (cf. Kacprzyk and Yager, 1999), where the weights are derived from expert testimony through pairwise comparisons between the indicators using Saaty's AHP method. Thus, basically, the problem is to find an optimal summary S* in the set of candidate summaries {S} such that:

S* = argmax over S of ( w_1 T_1(S) + ... + w_s T_s(S) )

where w_i is the weight associated with the quality indicator T_i.

In this paper the author highlights the combinatorial issue of automatic summarization; in our work a special focus is placed on this part, in order to propose a fast and reliable algorithm to generate the optimal summary.
Vagueness to Classify

We saw in all these different papers that vagueness is a widely used phenomenon, especially in summarization tasks. This is because the goal of summarization is to find the most global summary that encompasses different concepts, such as "well-paid workers". In this summary the interpretation of "well paid" admits a larger set of possible objects than the sentence "workers paid 120$ per hour". Summarization can thus be viewed as a classification problem where the goal is to create sets using vague concepts, and this is why vague adjectives are a good way to do so: they allow a larger range of interpretation, and can therefore capture fuzzy concepts and carry more information.

In our work we build on all the techniques we saw previously; our wish was to propose a modular architecture for automatic summary. But we focus more on classification as the final goal, for example discriminating between diabetic and non-diabetic patients with the use of vagueness. Moreover, our work also focused on proposing solutions for the biases that exist in the modelling of vagueness, such as fixed thresholds and decision making. In order to be closer to human thinking, we explored papers in psychology and psycholinguistics on children's language development, to propose a mathematical model directly inspired by the behaviour of the human brain.
III. Mathematical Framework for Automatic Summary

A. General presentation

We saw in the previous sections the nature of vagueness and the ways to use it for automatic summary. In the papers [3], [4] and [5] the authors propose new models and architectures for different applications of summarization with vagueness, more precisely with fuzzy logic. In [2] Yager lays out a basic architecture which can be made more complex according to the application. Furthermore, we saw that modelling vagueness implies some biases depending on the choice of logic; the most critical one, when fuzzy logic is used, is the choice of the membership function and of the thresholds. To come closer to human thinking, we explored the development of language and of the meaning of vague adjectives in children. S. Andersen in [7] explores the process by which children treat vague concepts such as the words cup and glass. The idea is thus to take inspiration from the emergence of language in children: how do they treat vague concepts? How do they learn the boundaries involved in a vague concept? The overall process explored in this work builds on progress in cognitive science, that is, on mixing points of view from mathematics, psychology and logic to understand and model human cognition. We wanted to follow this line of reasoning to propose a general and reliable framework inspired by the behaviour of the human brain.

To do so, we split the framework into different parts, each focused on a specific treatment; the goal was to propose, based on Yager's model [1], a cognitively inspired architecture.

Figure 8: Framework architecture
Extraction of Data

This part deals with the extraction of information from a dataset. In this work we use the UCI repository, a well-known repository for machine learning; data extraction is implemented for the CSV format. CSV stands for comma-separated values; on top of this convention, the class attribute has to be the last one in each row. The nature of the information is not restricted to numeric data: we extend it to nominal values, but only to attributes with fewer than 8 possible values.
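A minimal sketch of this extraction step (the file name is a placeholder), assuming as described that the class attribute is the last column:

    import csv

    def load_dataset(path):
        """Load a UCI-style CSV file whose last column is the class attribute."""
        with open(path, newline="") as f:
            rows = list(csv.reader(f))
        header, data = rows[0], rows[1:]
        records = []
        for row in data:
            *features, label = row
            # Numeric attributes are parsed as floats; anything else stays nominal.
            records.append(([_parse(v) for v in features], label))
        return header[:-1], records

    def _parse(value):
        try:
            return float(value)
        except ValueError:
            return value  # nominal value kept as a string

    # attributes, records = load_dataset("pima-diabetes.csv")  # hypothetical file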
This is not a crucial part of the project, but it is essential for the rest of the framework; it became central when we proposed a concrete application, a web data extractor. The idea is to extract data directly from a website using a web crawler; in this case the Selenium library was used. In the interest of having a modular framework, an inheritance model was built: to adapt the process to a specific website, only a few parameters have to be set. This mechanism allows us to test our framework both on real data from the internet and on regular datasets like those of the UCI repository.

Model

In this framework a model can be viewed as the definition of an object, that is, all the features that describe this object: the names of the attributes, the number of classes and their names, and the ranges of values for nominal features. The model part is the basis of the framework: it is the definition of the object and provides the right linguistic terms to produce the summary at the end. For example, to describe the Pima diabetes dataset, several features are used, such as the plasma glucose concentration in the blood after 2 hours, the age, and so on. These attributes are used to produce sentences at the end of the summary process, with connectives (AND, OR) to link them and produce a more convincing summary. This is inspired by Ramos et al. in [4], who use templates for the generation of linguistic summaries; the motivation is to focus only on the vague terms and not on the crisp ones.

Furthermore, this modelling allows the framework to be adapted to different models: for any dataset, only one class has to be written to describe the object.
Math: Fuzzy Sets

This part is the mathematical definition of fuzzy sets; it gathers all the fuzzy terms with their mathematical definitions. The choice of membership functions and of the other parameters proper to fuzzy set theory is made in this part of the framework. Together with the other mathematical parts, it constitutes the basis for treating vagueness according to the contextualist point of view.

Math: Statistics / Machine Learning

Given a dataset to summarize, a lot of statistical metrics can be extracted, such as the distribution over one or several particular features, or dependencies learned between sets of objects. In this framework, this part focuses on extracting distribution graphics such as cumulative histograms and representations of objects in several dimensions. It relies more on computational modelling than on algorithmic processes; its role is to translate the input data into a mathematical presentation. We use libraries such as SciPy and NumPy to allow easy manipulation. Libraries such as scikit-learn have been widely used to test and include machine learning algorithms, like clustering along several dimensions. Following our idea of a "cognitively inspired architecture", this part is the translation stage, where raw data are processed to allow the extraction of knowledge. It is linked with the context part, as shown in Figure 8; these two parts embody the contextualist point of view. The context is dynamically built from the input data, which are translated and computed to produce the mathematical context for treating vagueness.

Math: Context

To recall the contextualist point of view: to treat vagueness, this logic proposes that the thresholds proper to vague terms exist but are not fixed. These thresholds are computed dynamically according to the set of objects presented. This class is directly inspired by that philosophy: the context is represented by a fraction of the dataset, and for every vague term the thresholds are computed on the fly.
Language

This part groups all the vague vocabulary used to produce summaries, such as big, low, high, normal. Following Zadeh's fuzzy theory, as in [8], where fuzzy quantifiers are used to modify the sense of vague terms, we used very and most. For example, very is a quantifier which can be used to amplify the sense of a vague term such as tall, and thus influences the membership function. Other quantifiers are used to capture the group characterised by the summary, depending on the distribution, that is, the fraction of objects represented by the summary. For example, when someone says "some of the birds are smart", the word some refers to a portion of a set, but not a precise one. That is why it can be used when the summary does not fit the whole target concept (i.e. the class), to balance the summary and keep its interpretation true. Moreover, as Zadeh proposed in [8], it can be combined with other distributional vague terms such as "most of all" or "the majority", in order to cover and transmit the maximum information about the target concept with the summary.
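A sketch of such a mapping from the fraction of the target concept covered by a summary to a quantifier; the cut-off points are assumptions for illustration:

    def quantity_in_agreement(fraction):
        """Map the covered fraction of the target class to a linguistic quantifier."""
        if fraction >= 0.95:
            return "all"
        if fraction >= 0.7:
            return "most"
        if fraction >= 0.4:
            return "many"
        if fraction >= 0.1:
            return "some"
        return "few"

    print(quantity_in_agreement(0.82))  # 'most', as in "most of the birds are smart"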
Summarizer

This part follows the architecture proposed by Yager in [2]; it groups the 3-tuple {S, Q, T}: summarizer, quantity in agreement, truth value. The difference is that in this framework the summarizer is viewed as a classification task, the goal being to find the most accurate linguistic summary to discriminate between the different classes present in the input dataset. In our model the quantity in agreement and the truth value use the same mathematical metric: the distribution of a given summary over the dataset. That is, according to the probability assigned to a given summary, a quantity in agreement is computed in order to capture the distribution and transmit it with a linguistic term. From an information-theoretic perspective, this process can be viewed as a choice problem: given a specific distribution, a quantity in agreement has to be chosen so that the summary transmits the right interpretation.

Decision Making

Here the mathematics of decision under uncertainty, i.e. with probabilities, is widely used. In computational terms, two methods were explored to deal with this issue. The first consists of translating numeric features into words before the summary process, for example "1.86 meters" into "high". This method keeps the framework very fast, but on the other hand it discards some information that is useful for the decision-making part. That is why, in the second method, all the membership values are kept and treated at the end, where a mathematical decision equation is used to perform the classification choice.
Display Summary
In order to produce a linguistic summary, a specific part is devoted to it: from the output of the decision-making part and the model part, a summary is generated. A very basic NLG pipeline is used here to produce the sentences; most of the production is done with the template, i.e. the model.

We have described here the general parts of the framework, and the behaviour and role of each of them. The whole architecture was inspired by neurofunctional knowledge; in the next part a direct link will be made, through the evolution of the different versions of the framework.
B. Detailed Framework, End to Be Vague

In the following part the architecture is detailed, with arguments for the design choices and the mathematical tools used in each part.

1. Framework Version 1.0

The first version of the framework was very simple and focused on trying different hypotheses before modelling the whole architecture. We started with this very simple architecture:

Figure 9: Framework v1 architecture

The first architecture, shown in Figure 9, is quite basic and was tested on the Iris dataset from the UCI repository. The idea of this first version is to test the following hypothesis:
Suppose that our data has n attributes a_1, ..., a_n, so that attribute a_j takes values in a domain Ω_j for j = 1, ..., n. Let Ω = Ω_1 × ... × Ω_n, and let x = (x_1, ..., x_n) denote the vector of attribute values. A dataset then takes the form D = {x^(1), ..., x^(m)}, where x^(i) ∈ Ω.

Now we define a language L with propositional variables which are, in our case, the vague terms Low_j and High_j for each attribute. Then, for every attribute of the dataset, we ground the propositional variables of L according to the following rule: given a value alpha, two thresholds are computed over the cumulative histogram F_j of the attribute. The lower threshold is given by θ_L = F_j^-1(alpha), and the upper threshold by θ_U = F_j^-1(1 - alpha); we then have θ_L ≤ θ_U, and the condition on alpha is alpha < 0.5, due to the upper bound given by F_j^-1(1 - alpha).
Figure 10: Example of threshold calculation
Given these thresholds over all the attributes of D, we can, with fuzzy set theory, calculate the membership values of the propositional variables. To model the membership functions we chose trapezoidal curves, of type Right for the lower bound (Low) and of type Left for the upper bound (High); this choice is purely empirical.
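A sketch of this stage, reading the two thresholds off the empirical cumulative distribution as quantiles and building the two shoulder-shaped membership functions (the toy data are invented):

    import numpy as np

    def thresholds(values, alpha):
        """Thresholds from the cumulative distribution: the values below which a
        fraction alpha (resp. 1 - alpha) of the data falls."""
        assert 0 < alpha < 0.5
        return np.quantile(values, alpha), np.quantile(values, 1 - alpha)

    def mu_low(x, lo, hi):
        """Right-shoulder trapezoid: fully 'Low' below lo, not 'Low' at all above hi."""
        return float(np.clip((hi - x) / (hi - lo), 0.0, 1.0))

    def mu_high(x, lo, hi):
        """Left-shoulder trapezoid: fully 'High' above hi, not 'High' at all below lo."""
        return float(np.clip((x - lo) / (hi - lo), 0.0, 1.0))

    sepal_length = np.array([4.6, 5.0, 5.4, 5.8, 6.1, 6.5, 7.0, 7.7])
    lo, hi = thresholds(sepal_length, alpha=0.2)
    print(mu_low(5.0, lo, hi), mu_high(7.0, lo, hi))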
With these two membership functions, a truth value for each propositional variable is calculated for every record in the dataset. The decision maker in this version is based on a maximum of averages; to illustrate it, we take the Iris dataset. Given a class c and an attribute a_j, with the language L = {Low, High}, the goal is to find the best sentence to describe that attribute for the given class. For the class Iris-setosa with the attribute sepal length, the right vague term is the one that maximises the average membership over the records D_c of the class:

w* = argmax over w in {Low, High} of (1/|D_c|) Σ over x in D_c of μ_w(x_j)

Figure 11: Decision making in framework v1
This first version of the framework was not guided by a cognitive perspective; the goal was to see the discrimination offered by the restricted language L = {Low, High}. We did not explore the classification of classes using these linguistic descriptions; that is done in the next version of the framework.
2. Framework Version 2.0
After the results highlighted by the first version of the framework, we decided to take inspiration from neuroscience, and particularly from the treatment of language. In the human brain two different areas are mainly involved in the production and comprehension of language. The first one, Broca's area, was discovered by Paul Broca in the nineteenth century when he examined the brain of a deceased patient who had suffered from an unusual disorder: the patient had been unable to talk, even though no motor lesion of his tongue or mouth was noticed. Broca examined the brain and discovered a lesion in the posterior portion of the frontal lobe of the left hemisphere. Years later Carl Wernicke, a German neurologist, discovered another part of the brain, involved in understanding language, in the posterior portion of the left temporal lobe. People with a lesion at this location could speak, but their speech was often incoherent and made no sense.

From this we decided to build an architecture that tries to mimic this organisation by dissociating language from its semantics. In our model this was done by adding two parts: a Language part, which plays the role of Broca's area, and a Context part, which plays the role of Wernicke's area. The first gathers all the vocabulary and the rules to produce the right sentences. The second is involved in making sense of these words, in our case by assigning semantics to vague terms with the use of fuzzy logic.

Figure 12: Framework v2 architecture
The method used to assign semantics to vague terms is the same as in the first version of the framework, with thresholds computed from the cumulative distribution of each attribute. Furthermore, we focused on using summarization and vagueness for a classification task: the goal was to find the summary that allows the best discrimination between the different classes. The problem then became to find the best summary, given a vocabulary, that has the highest probability of being true for the target class and the lowest probability of being true for the other classes. This problem can be viewed as a game, so algorithms from game theory can be used, especially tree search guided by a heuristic.

Taking the same definitions as previously, with a dataset D described by a vector of features, and a language L composed of propositional variables, we add the logical connectives ∧ and ∨. The task becomes to find the conjunction or disjunction of attributes that best discriminates a specific class, using the following heuristic to guide the search in the space:

h(θ) = P(θ | C_target) - P(θ | C_others)

where θ is the sentence explored in the tree search, a combination of conjunctions or disjunctions of attributes with vague terms. The heuristic directly translates the idea that a sentence has to be mostly true for the target class and only slightly true for the other classes.
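A sketch of the search under this heuristic; the exhaustive enumeration over small conjunctions below stands in for the tree search, and the data layout (records already translated into vague labels, as in the next paragraph) is an assumption of the example:

    from itertools import combinations

    def truth(sentence, record):
        """Truth of a conjunction of (attribute, vague term) pairs for one record,
        where `record` maps each attribute to its vague label."""
        return all(record[attr] == term for attr, term in sentence)

    def heuristic(sentence, target, others):
        """Often true on the target class, rarely true on the other classes."""
        p_target = sum(truth(sentence, r) for r in target) / len(target)
        p_others = sum(truth(sentence, r) for r in others) / len(others)
        return p_target - p_others

    def best_summary(vocabulary, target, others, max_terms=2):
        """Enumerate conjunctions of up to `max_terms` pairs and keep the best one."""
        candidates = [
            conj for k in range(1, max_terms + 1)
            for conj in combinations(vocabulary, k)
        ]
        return max(candidates, key=lambda s: heuristic(s, target, others))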
For computational and performance reasons, we chose to first translate all the numeric data into our language, in this case L = {Low, Normal, High}. Following Zadeh in [8], we introduced the quantifiers very and quite into our language; these quantifiers act directly on the membership function. The quantifier very raises the membership value of each word to a power (the square), while for the quantifier quite the square root is used. The same membership functions were taken to model Low and High; for the word Normal we chose a Gaussian membership function. This choice was made because even if you cannot describe how a single random event happens, a whole mass of them together will act like a Gaussian.
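A sketch of the two modifiers very and quite mentioned above, following the concentration/dilation convention just described:

    import math

    def very(mu):
        """'very' concentrates a membership value: squaring pushes it down."""
        return mu ** 2

    def quite(mu):
        """'quite' dilates a membership value: the square root pushes it up."""
        return math.sqrt(mu)

    mu_tall = 0.7
    print(very(mu_tall))   # 0.49: being 'very tall' is harder to satisfy
    print(quite(mu_tall))  # ~0.84: being 'quite tall' is easier to satisfy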
So, given a dataset with features described by the vector x, we translate the numeric data into linguistic data by keeping, for each attribute, the term of L with the maximum membership value. Figure 13 shows the result of this process, with the translation of numeric input into words:

Figure 13: Example of numeric-to-words translation

This translation gives the search process a better computational time: the algorithm exhaustively searches for word matchings over the sentences θ generated from the vocabulary.
The search algorithm, guided by the heuristic, computes for a given class the summary that best discriminates it. Here is an example of the output generated by this version of the framework:

Figure 14: Iris summary for the class Iris-versicolor (class 1)

The other issue treated in this version of the framework is the choice of the thresholds over the cumulative distribution for the upper and lower bounds of our vague vocabulary. One approach we explored took inspiration from Kacprzyk in [5], who used a genetic algorithm to find the right summary (where our framework uses a tree search). We took inspiration from this and tried to apply a genetic algorithm to find the right thresholds, as sketched below, where:

The population is made of threshold vectors covering all the attributes.
The crossover function is based on taking a max/mean of the selected individuals.
The fitness function is the probability computed for a given summary of a specific class.
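A toy sketch of this genetic search; the averaging crossover with Gaussian mutation is a simplifying assumption, and `fitness` stands for the summary probability described above:

    import random

    def evolve_thresholds(fitness, bounds, pop_size=20, generations=50):
        """Genetic search for a threshold vector (one value per entry of `bounds`).
        `fitness` maps a threshold vector to the quality of the best summary."""
        population = [
            [random.uniform(lo, hi) for lo, hi in bounds]
            for _ in range(pop_size)
        ]
        for _ in range(generations):
            population.sort(key=fitness, reverse=True)
            parents = population[: pop_size // 2]
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = random.sample(parents, 2)
                # Crossover by averaging the parents, plus a small mutation.
                children.append([
                    (x + y) / 2 + random.gauss(0, 0.05 * (hi - lo))
                    for x, y, (lo, hi) in zip(a, b, bounds)
                ])
            population = parents + children
        return max(population, key=fitness)

    # best = evolve_thresholds(my_fitness, bounds=[(4.0, 8.0), (2.0, 4.5)])  # hypothetical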
The application of a genetic algorithm to find thresholds was not a complete success: the computational time was too high relative to the results. We chose to explore another method to dynamically find the right thresholds for vague terms, drawing on findings about children's language development.
3. Framework Version 3.0
In the last version of the framework we decided to focus first on the computation of the thresholds for vague terms. After several readings on children's language development, especially in psycholinguistics, we found an interesting alternative to fixed thresholds. The study in [7] highlights that a child learning new word concepts, such as cup, works like a clustering algorithm. That is, to capture the link between a concept and a word, the child first creates sets of different concepts and refines them during development. When new words with new concepts arise in the child's vocabulary, conflicts of words and meanings force the child to refine the definitions. The conjunction of these concepts creates more precise boundaries, and a concept thus becomes more precise through learning and error making. To treat vague terms we explored a similar mechanism: we state, like contextualism, that the semantics of vague terms comes from experience. During development the child learns vague concepts such as distance (away, close) or size (small, tall) from the environment. In our framework, the environment from which the knowledge is first extracted is a fraction of the input data, the training set (TS). This TS is used to find the thresholds with an unsupervised algorithm, k-means. In the k-means algorithm the goal is to find the centroids that best partition our data. In our case the vocabulary is {Low, Normal, High}, so a 3-means algorithm is used on the TS for each attribute, the goal being to output the three centroid values found by the algorithm. These centroids are then used, as previously, as thresholds for the membership functions; the motivation for doing this is the adaptability of the method.
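A sketch of this stage with scikit-learn's KMeans; the toy values are invented, and in the framework the procedure runs on the training-set values of each attribute:

    import numpy as np
    from sklearn.cluster import KMeans

    def vague_thresholds(train_values):
        """Cluster one attribute into three groups and use the sorted centroids
        as the anchors of the Low / Normal / High membership functions."""
        X = np.asarray(train_values, dtype=float).reshape(-1, 1)
        km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
        low_c, normal_c, high_c = sorted(float(c[0]) for c in km.cluster_centers_)
        return low_c, normal_c, high_c

    ages = [22, 25, 27, 41, 43, 45, 47, 68, 71, 74]
    print(vague_thresholds(ages))  # roughly (~25, ~44, ~71) on this toy sample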
Moreover, we focused on the decision part. In the previous version the numeric data were first translated into the language L; the main issue is that we lose information with the max operator. Given a summary for a class, for each record to classify we now keep the membership values in [0, 1] and inject them into a decision equation.
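A sketch of this second method; the product-based decision equation below is one simple choice, shown as an assumption rather than the framework's exact equation:

    def graded_truth(summary, record, memberships):
        """Graded truth of a summary for one record: the product of the membership
        values of its (attribute, vague term) pairs, instead of a hard 0/1 match."""
        value = 1.0
        for attribute, term in summary:
            value *= memberships[attribute][term](record[attribute])
        return value

    def decide(record, summaries, memberships):
        """Assign the class whose summary keeps the highest graded truth."""
        return max(summaries, key=lambda c: graded_truth(summaries[c], record, memberships))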
The final model proposed is the evolution of version two, incorporating the dynamic computation of thresholds and a new decision process; the model is presented in Figure 8.