 
Choosing Vague Words:
A Tool for Humanising Data

By
Gonzalez Jonas

Department of Engineering Mathematics
UNIVERSITY OF BRISTOL

Supervisor: Doctor Jonathan Lawry - Faculty of Engineering, University of Bristol

February 2016 - July 2016
 
 
 
 
Abstract 
 
In the last decade there has been a major increase in the precision of some types of readily available data. For example, geolocalisation has become more and more precise and can now pinpoint locations to within a particular room of a certain building. Yet is this level of precision always required, or even helpful? An individual may instead want to know their approximate location through sentences like "You are near the train station" or "You are still far from the centre". These two examples highlight two words, "near" and "far". Why? Because they are vague: you read such words without noticing that, in fact, you cannot give them a precise definition. They are not fixed in their interpretation; they depend on the context and on the beliefs of the person who utters them, or reflect some phenomenon that remains unknown.
Human communication is full of vagueness: we use it unconsciously every day of our lives, at work and in society. Surprisingly, we manage to communicate without ambiguity even if we do not share the same definitions of these kinds of words. Moreover, vagueness allows a freedom of interpretation in communication, leaving to the listener the difficult task of computing the semantics. Vagueness can be seen as a degree of belief about a particular concept; it therefore depends on the speaker's state of mind and also implies a particular treatment by the listener.
This paper aims to explore the use of vagueness for producing automatic summaries of numeric data, using vague terms such as "tall", "small" or "big". Nowadays, in the big data era, a lot of numeric data is available, yet there is a lack of work on the presentation of this data. This project explores the humanisation of data with the help of vague terms and machine learning, in order to propose an elegant and adaptive model for automatic summarization.
Automatic summarization tries to generalize a concept over a dataset and aims to transmit only the useful information. In that sense vagueness seems to be the right tool because of its nature: vague terms allow a wider range of interpretation and thus allow the generalization of a concept.
 
 
 
 
 
 
 
 
HOST ESTABLISHMENT PRESENTATION
A. Bristol University
The University of Bristol is a red brick research university located in Bristol, United                           
Kingdom. It received its royal charter in 1909, and its predecessor institution, University College                           
Bristol, had been in existence since 1876. Bristol is organised into six academic faculties                           
composed of multiple schools and departments running over 200 undergraduate courses                     
situated in the Clifton area along with three of its nine halls of residence. The other six halls are                                     
located in Stoke Bishop, an outer city suburb located 1.8 miles away. It is the largest                               
independent   employer   in   Bristol. 
The University of Bristol is ranked 11th in the UK for its research, according to the Research Excellence Framework. It is ranked 37th by the QS World University Rankings 2015-16, and is amongst the top ten UK universities. It is the youngest British university to be ranked among the top 40 institutions in the world according to the QS World University Rankings, and has also been ranked 15th in the world in terms of reputation.
B. Laboratory: Intelligent Systems Lab
The University of Bristol has a long tradition of excellence in Artificial Intelligence, with                           
research groups in Engineering dating back to the 1970s and 1980s. Now all these traditions                             
have converged to form the Intelligent Systems Laboratory (ISL), a leading research unit                         
counting 15 members of staff (four professors) and about 50 PhD students and postdocs.                           
Research activities include foundational work in machine learning (many of the ISL members                         
work in this central area of research), and applications to web intelligence, machine translation,                           
bioinformatics, semantic image analysis, robotics, as well as natural intelligent systems. Besides                       
these applications, research in ISL is a key enabler in a number of strategic research directions.                               
Data Science is one of the main frontiers for modern AI, dealing with vast masses of data, both                                   
enabling their exploitation and benefiting from them. Another key frontier for intelligent                       
systems research is interacting with modern biology, both taking inspiration by it, and providing                           
tools   for   it. 
C. Supervisor: Doctor Jonathan Lawry
Dr Lawry is focussed on developing probabilistic models of vagueness (fuzziness) and                       
applying them across a number of application domains in Artificial Intelligence. His approach is                           
the identification of vagueness or fuzziness with linguistic (semantic) uncertainty. This                     
approach allows for a much more flexible representation framework in which both propositions                         
and valuations can be ordered in terms of their relative vagueness, and in which we can                               
capture both stronger and weaker versions of an assertion e.g. absolutely short, quite short                           
etc. This opens the possibility of developing choice models of assertion in which there is a clear                                 
rationale for choosing a vague statement over a (more) crisp one in the presence of                             
uncertainty. 
 
   
Table of Contents

I. Definition of Vagueness
   A. What is vagueness? Why is language vague?
   B. Reasoning and use of vagueness
   C. Modelling of vagueness
II. Summarization of numeric data with words
   A. State of the Art
   B. Vagueness in automatic summary, a fuzzy question
III. Mathematical framework for automatic summary
   A. General presentation
   B. Detailed framework, end to be vague
IV. Results
V. Case study
   A. Application to web
   B. Results
VI. Overview
   A. Contribution of this framework within the axes of investigation
   B. Future investigation and improvements
VII. Conclusion
   A. Personal experience
   B. Professional overview
   C. Acknowledgements
Bibliography
 
I. Definition of Vagueness
A. What is Vagueness? Why is language vague?
 
An unexpected new trend is emerging in apps and web services: the information presented to the user is growing vague. So what is vagueness?
Vagueness can be explained with the help of classical logic, where a sentence is either True or False; this is known as Boolean logic. But for some words and concepts of human language, such as "tall" and "small", classical logic does not work well. These words are what we call vague. Suppose you ask two people to classify a set of humans, according to their height, into two categories, small and tall. The two people will reach roughly the same classification, but some individuals will be classified as tall by one and small by the other. This is vagueness: even if in some cases you will find no difference from Boolean logic, in other cases an object can be tall and not tall at the same time. This effect of vagueness cannot be represented in Boolean logic; it is called a borderline case. There is no precise, known height which defines the line between a person who is tall and a person who is not. The term borderline is used because there is a range in which you cannot be absolutely sure that someone is tall or not.
Why, then, do we not simply adopt as a definition that "tall" means above a particular threshold? How do humans represent vagueness and process it?
 
From a communication point of view we might first think that vagueness is suboptimal, because it can create confusion and thus lead to misunderstanding. But in reality our brains and our communication deal perfectly well with vagueness, leaving to us the task of interpreting it. Even if, because of your experience, you do not think that a house is tall, you can imagine and try to understand the other point of view. Vagueness allows different interpretations depending on the protagonists of the conversation, but even with this ambiguity they manage to understand each other and accomplish complex tasks based on vague information. This is the reason why vagueness is treated in the field of artificial intelligence, and more particularly in reasoning under uncertainty.
 
"A concept-expression is vague if it is indeterminate, for some objects, whether or not the concept applies"
Gottlob Frege
   
 
B. Reasoning and use of Vagueness
 
We saw in the previous section the nature of vagueness and its particularities, but what can we do with vagueness? Vagueness is proper to language, and hence to human communication, so the first application we might think of is the humanising of information. For example, in a conversation a person will never say "I just bought a new shirt for one hundred and twenty-five pounds and fifty-five pence"; you would rather say something like "I just bought an expensive shirt" or "I just bought a shirt for around a hundred pounds". In this example, just by reading, we see the difference between the crisp assertion given first and the vague assertion that follows. One seems natural and could have been uttered by a human; the other is clearly unnatural and seems emotionless, too mathematical.
Our brain generalizes information from our environment every second, and it does the same when we talk with other people. We condense the information that we want to transmit using vagueness; this is done naturally and unconsciously, and we have no trouble understanding each other. This is because we focus on the important information, the information that will transmit the true concept. In the example, the price was not the primary, important information; that was the fact that I bought a shirt. Furthermore, with the word "expensive", the listener can predict, from the information they have about the speaker, the range of prices it could fall in.
Reasoning with vagueness, as we see in this little example, is focused on balancing crisp and vague terms to express information, concepts, feelings and emotions. But the particularity of vagueness is that it depends on the interpretations of the two people involved, and on how they can reach a common understanding. One well-known failure of human communication is the misunderstanding that arises when speaker and listener interpret a concept differently without realising it. Vagueness is always present in the communication process and is at the base of human speech. Any model of vagueness will therefore involve a speaker and a listener, because vagueness relies on the beliefs and goals of the two people involved in the discussion.
 
 
   
 
C. Modelling of Vagueness
 
So the first question we have to explore is: how can we model vagueness? We have seen that classical logic cannot do it, but some researchers have proposed variations of Boolean logic. New mathematical tools have emerged for modelling vagueness; they take their basis from philosophy, and more specifically from the problem of the representation of knowledge. In order to understand these tools we first have to define the difference between the truth values of vague terms and those of classical logic.
A classical term has only two truth values, True or False; vague terms follow this pattern but, on top of it, they admit borderline cases. If we take the example of a person's height and the characterisation of this feature with "small" and "tall", we can find a threshold below which we are absolutely sure that the person is small, and another threshold above which we are definitely sure that the person is tall.
 
Height of a person: $h \in \mathbb{R}^{+}$

Tall is true if: $h \geq \theta_{tall}$        Small is true if: $h \leq \theta_{small}$
Borderline case: $\theta_{small} < h < \theta_{tall}$

where $\theta_{small}$ and $\theta_{tall}$ correspond to the thresholds for Small and Tall.
Figure 1: Vagueness with borderline cases
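The idea of Figure 1 can be made concrete in a few lines of code. This is a minimal sketch, with illustrative threshold values of our own choosing (the report does not fix any):

```python
# A minimal sketch of the crisp-thresholds-plus-borderline view of Figure 1.
# The threshold values are illustrative assumptions, not taken from the report.

THETA_SMALL = 1.60  # below this height (in metres) the person is surely small
THETA_TALL = 1.85   # above this height the person is surely tall

def classify_height(h: float) -> str:
    """Return 'small', 'tall' or 'borderline' for a height h in metres."""
    if h <= THETA_SMALL:
        return "small"
    if h >= THETA_TALL:
        return "tall"
    return "borderline"  # Boolean logic has no truth value for this region

print(classify_height(1.70))  # -> 'borderline'
```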
 
 
Figure 1 shows that we have crisp thresholds to state whether a person is tall or small. But in the borderline case we cannot determine whether the person is tall or small: it is undefined. Moreover, the values of the thresholds are themselves problematic: how do we choose them?
 
 
 
 
 
Epistemicists about vagueness want to retain classical logic, and they endorse the somewhat surprising claim that there actually is such a threshold (they claim we know the existential generalization "there is an n such that such-and-such" even if there is no particular n of which we know that such-and-such). Many philosophers, however, find this claim too hard to swallow and take it as evidence that classical logic should be modified.
One problem that all these philosophers try to solve, and where classical logic cannot be used, is the sorites paradox. There are a lot of variants of the sorites paradox, but all of them rely on the difficulty of representing and dealing with vague terms.
 
Heap Paradox

1 grain of wheat does not make a heap.
If 1 grain of wheat does not make a heap then 2 grains of wheat do not.
If 2 grains of wheat do not make a heap then 3 grains do not.
…
If 9,999 grains of wheat do not make a heap then 10,000 do not.
Therefore, 10,000 grains of wheat do not make a heap.

Figure 2. Sorites Paradox
 
Solving this problem with classical logic forces us to reach absurd inferences, such as that 10,000 grains do not form a heap. That is why different logics have emerged to treat and deal with this kind of paradox; for every model below we will return to this issue and see how the model proposes to solve it.
 
 
 
(All the following accounts of the modelling of vagueness are based on the book [1] by the researcher Kees van Deemter.)
 
   
 
1. Supervaluationism 
 
According to supervaluationists, borderline statements lack a truth value. This neatly explains why it is universally impossible to know the truth value of a borderline statement (recall that the truth value of a statement p, in standard logic, is either True or False). Supervaluationism preserves the law of excluded middle when treating vague terms: instead of evaluating the bare predicate "Charles is a baby", it counts "Charles is a baby or it is not the case that Charles is a baby" as true. Thus the method of the supervaluationists allows one to retain all the theorems of standard logic while admitting "truth-value gaps". The basic thought underlying supervaluationism is that vagueness is a matter of underdetermination of meaning. This thought is captured by the idea that the use we make of an expression does not decide between a number of admissible candidates for making the expression precise. For example, we can make it precise by saying that x is a baby just in case x is less than one year old; but the use of the expression will allow other ways of making it precise, like "less than one year plus a second". If Martin is one year old, the sentence "Martin is a baby" will be true on some ways of making "baby" precise and false on others. Since our use does not decide which of the ways of making precise is correct, the truth value of the sentence "Martin is a baby" is left unsettled. By supervaluationist standards, a sentence is true just in case it is true on every way of making precise the vague expressions contained in it (that is, "truth is supertruth"). A precisification is a way of making precise all the expressions of the language, so that every sentence gets a truth value (true or false, but not both) in each precisification. In this sense, a precisification is a classical truth-value assignment.
As part of his solution to the sorites paradox, the supervaluationist will assert "There is an n such that n grains do not make a heap but n+1 grains do", for this statement comes out true under all admissible precisifications of "heap". However, when pressed, the supervaluationist will add an unofficial clarification: "Oh, of course I do not mean that there really is a sharp threshold for a heap."
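The notion of supertruth can be illustrated directly. Below is a minimal sketch, assuming a toy set of admissible cut-offs (in months) for the predicate "baby"; the cut-offs are our own invention for the example:

```python
# A minimal sketch of supervaluationist supertruth for the predicate "baby".
# The admissible precisifications (cut-offs, in months) are illustrative.

ADMISSIBLE_CUTOFFS = [12, 13, 14, 15, 16, 17, 18]

def is_baby(age_months: int, cutoff: int) -> bool:
    """One precisification: 'baby' means strictly younger than the cut-off."""
    return age_months < cutoff

def supertruth(age_months: int):
    """True if true on every precisification, False if false on every one,
    and None (a truth-value gap) otherwise."""
    values = {is_baby(age_months, c) for c in ADMISSIBLE_CUTOFFS}
    if values == {True}:
        return True
    if values == {False}:
        return False
    return None  # borderline: neither supertrue nor superfalse

print(supertruth(6))   # True  -> a baby on every precisification
print(supertruth(14))  # None  -> borderline case
print(supertruth(24))  # False -> not a baby on any precisification
```

Note that "is a baby or is not a baby" comes out true on every precisification, even for the borderline case, which is the excluded-middle point made above.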
 
 
   
2. Fuzzy Logic
 
Fuzzy logic, introduced by the mathematician Lotfi A. Zadeh, is a form of many-valued logic where the truth value of a variable lies in the range [0,1] and is called a membership degree. Fuzzy logic, in opposition to Boolean logic, deals with sets of objects for which there is no precisely defined criterion of membership. Fuzzy set theory then deals with this kind of class: for example, to define the set "old" there is no precise way of deciding whether an object is old or not. Instead, fuzzy sets work with a membership function which assigns to each object a continuous value expressing how strongly the object belongs to the set.
So how can fuzzy logic be used for vagueness? As we said before, a vague term is a particular object for which you have borderline cases; to illustrate how fuzzy logic can be used to deal with vagueness we will use the set "old". In pure set theory "the class of old people" cannot constitute a set or class in the usual mathematical sense of these terms. But with fuzzy theory it can, thanks to the continuous values of the membership function: the class of old people is represented with a degree of confidence. To come closer to the way humans process this kind of linguistic term, fuzzy set theory combines different terms; most often the antonym is taken to represent the complementary class of a vague term. In our example the antonym of "old" is "young", so we can define two fuzzy sets, as shown in the following figure.
Figure 3. Fuzzy set example for the vague term: Age
 
With this representation, for an input x representing the age you will obtain different membership values for young, middle-aged and old. This theory tries to model how our brain works with different beliefs: every person has a particular mechanism for representing a vague class such as old people. In that sense you will probably find agreement with another person on the classification of an old person according to their age; that is, you will share roughly the same membership functions and thresholds. But in some cases, because of the thresholds taken, you will infer that a particular age counts as old while another person considers it middle-aged. Inference in fuzzy logic is called defuzzification: it consists in choosing, among the different membership values, which one to retain. The simplest way of doing this is to take the maximum of the membership functions to classify the object.
Fuzzy set theory is a promising theory and can also be viewed in terms of degrees of belief about classification into sets. Moreover, fuzzy set theory proposes an elegant way to solve the sorites paradox. Taking back our example of the heap: while the implication rules of classical logic led us to infer that 10,000 grains do not form a heap, with fuzzy logic the sorites paradox is not an issue. When the input variable x representing the number of grains is high, it will have a very low membership value for the class "not a heap" and a greater one for the class "is a heap". The sorites paradox is solved without difficulty, because in fuzzy logic it is just a matter of degree of membership.
Even if fuzzy logic solves the sorites paradox, a bias remains: the construction of a fuzzy set requires choosing, for the membership function, thresholds below or above which the concept is fully True or False (as in Figure 1). The choice of the membership function and of the associated thresholds is crucial; these choices are made by empirical experiment, and there is as yet no accepted, shared methodology for making them. What is missing in fuzzy logic is an emergent process to compute these thresholds, and the choice of the shape of the membership functions, automatically.
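The mechanics of membership functions and max-membership defuzzification can be sketched as follows; the trapezoidal breakpoints for the age terms are invented for the example, not taken from the report:

```python
# A minimal sketch of fuzzy sets for the vague term "age" and of
# max-membership defuzzification. Breakpoints are illustrative assumptions.

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

AGE_SETS = {
    "young":       lambda x: trapezoid(x, -1, 0, 25, 40),
    "middle-aged": lambda x: trapezoid(x, 25, 40, 55, 70),
    "old":         lambda x: trapezoid(x, 55, 70, 120, 121),
}

def defuzzify(x):
    """Keep the label with the maximum membership value."""
    memberships = {label: mu(x) for label, mu in AGE_SETS.items()}
    return max(memberships, key=memberships.get), memberships

label, memberships = defuzzify(65)
print(label, memberships)  # 'old' wins, 'middle-aged' keeps a non-zero degree
```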
   
3. Many-valued Logic
 
Many-valued logics come from the field of non-classical logic and differ from the most common logic, Boolean logic. They are similar to classical logic in that they accept the principle of truth-functionality, namely, that the truth of a compound sentence is determined by the truth values of its component sentences (and so remains unaffected when one of its component sentences is replaced by another sentence with the same truth value). But they differ from classical logic by the fundamental fact that they do not restrict the number of truth values to only two: they allow for a larger set W of truth degrees.
Fuzzy set theory, which we have already presented, came from this point of view; here we introduce another theory: three-valued logic. Three-valued logic is a good starting point for understanding the mechanisms behind many-valued logics. This logic was introduced by the mathematician Kleene and consists of three truth values {0, ½, 1}, where ½ corresponds to an undefined truth value. This theory can be applied to vagueness, where crisp terms always take values in the domain {0, 1} and vague terms in the domain {0, ½, 1}. The mechanism of inference in this logic relies on the two major connectives of disjunction and conjunction. Disjunction in three-valued logic generally works with the maximum of the truth values, and for conjunction the lowest value is taken.
But this is only one way to treat these operators; another widely used semantics mixes in probability theory. The following figure sums up the different semantics for many-valued logic and groups the two most used definitions of disjunction and conjunction.

Figure 4. Logical operators (image not recoverable: the two most used semantics are min/max, where a ∧ b = min(a, b) and a ∨ b = max(a, b), and the probabilistic semantics, where a ∧ b = a·b and a ∨ b = a + b − a·b)
With the probabilistic interpretation of these operators the sorites paradox can be solved: the recursive use of implication decreases the truth value of the concept. The truth value is the product of the truth values of the previous sentences; as at some point a premise takes the truth value ½, the product then decreases and approaches zero. Thus the concept of heap will at some point have a truth value of (almost) zero and then be false.
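This truth-value decay is easy to simulate. Here is a minimal sketch under the product semantics, where the step at which premises become borderline (truth ½) is an illustrative assumption:

```python
# A minimal sketch of the sorites chain under the probabilistic (product)
# semantics. The point where premises turn borderline is an assumption.

def premise_truth(n: int) -> float:
    """Truth of 'if n grains are not a heap then n+1 grains are not a heap':
    fully true for small n, borderline (0.5) once n approaches heap-hood."""
    return 1.0 if n < 50 else 0.5

def chained_truth(n_max: int) -> float:
    """Truth of the conclusion 'n_max grains are not a heap', taken as the
    product of the base case and all intermediate premises."""
    truth = 1.0  # '1 grain is not a heap' is fully true
    for n in range(1, n_max):
        truth *= premise_truth(n)
    return truth

print(chained_truth(10))      # 1.0    -> still clearly not a heap
print(chained_truth(60))      # ~0.001 -> the conclusion has lost its force
print(chained_truth(10_000))  # ~0.0   -> '10,000 grains are not a heap' fails
```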
4. Contextualism  
 
Epistemic contextualism (EC) is a recent and hotly debated position. EC is roughly the view that what is expressed by a knowledge attribution (a claim to the effect that S "knows" that p) depends partly on something in the context of the attributor, and hence the view is often called "attributor contextualism". So EC, of the sort which will concern us here, is a semantic thesis: it concerns the truth conditions of knowledge sentences, and/or the propositions expressed by utterances thereof. The thesis is that it is only relative to a contextually determined standard that a knowledge sentence expresses a complete proposition: change the standard, and you change what the sentence expresses; acontextually, however, no such proposition is expressed. In this respect, knowledge utterances are supposed to resemble utterances involving uncontroversially context-sensitive terms. For instance, just what proposition is expressed by an utterance of:

1. He is a tall person
2. That's red
3. He is a nice person

depends in certain obvious ways upon such facts as the location (1) or identity (2) of the speaker, and/or the referent of the demonstrative (in 3). Contextualism relies on the simple fact that the truth value of a predicate depends on the context. To illustrate this vision, let us suppose we have a jury of two people who have to classify persons according to their height, and suppose that we have two scenarios:

- Each person to be judged passes on stage one by one and then leaves the scene.
- Each person to be judged comes on stage and stays.

In the first scenario the context is not built dynamically, because all the people leave the stage once they have been classified. There will be disparity in the judgements of the jury, due to the fact that both jurors rely on their personal knowledge and interpretation of the predicates tall and small. In the second scenario, on the other hand, the jurors have all the people in front of them; the context is then built dynamically from the set of people presented. So they will reach more or less the same classification, because they have built the same context; with this reflection the sorites paradox is solved. A small sketch of this idea follows.
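The following is a minimal sketch of a contextually built standard for "tall", assuming the deliberately simple rule that the threshold is the mean height of the people currently on stage (both the rule and the heights are illustrative assumptions):

```python
# A minimal sketch of a context-determined standard for "tall": both jurors
# derive the threshold from the shared set on stage, so they agree.

from statistics import mean

def classify_in_context(heights):
    """Classify everyone on stage against a threshold built from the
    context itself (here, simply the mean height of the group)."""
    threshold = mean(heights)  # the contextually determined standard
    return {h: ("tall" if h > threshold else "small") for h in heights}

stage = [1.55, 1.62, 1.70, 1.78, 1.91]
print(classify_in_context(stage))
# In the first scenario, each juror would instead apply a private,
# experience-based threshold, and their classifications could diverge.
```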
But it is solved only theoretically: contextualism does not propose a mathematical model of this view. It is more a philosophical point of view, on which you can start to build a more concrete implementation by using mathematical tools.
Many criticisms have been levelled at this solution, because contextualism only displaces the issue of vagueness into the human mind. For contextualists, every definition we can give to a vague term depends entirely on our psychological state, which was shaped by our life experience. A good example compares two children:
- one who grew up in a wealthy family;
- the other in a poor one.
As teenagers they will not have the same definition of a vague term such as "expensive". They have built two different bodies of knowledge about this term according to their development and the environment they were confronted with. It is as if, during our childhood development and especially in the first years of life, the brain learns and fixes by habit the thresholds used to represent vague terms. That is why, when you are older, it is more complicated to change the meaning of a vague predicate: you are conditioned by your personal experience.
 
 
 
Personal choice
 
There are many other possible ways to deal with vagueness, but we chose to explore the most used ones. After careful reflection we chose in our project to explore the use of fuzzy logic mixed with contextualism. This choice is motivated by the wish to develop an adaptive, autonomous framework to represent vagueness. The contextualist point of view can be explored with unsupervised learning algorithms, in order to propose an emergent mechanism to learn and find the right thresholds for vague terms. Furthermore, we think that this methodology is closer to the mechanisms involved in the human brain; this is supported by readings in psychology and cognitive science on children's development of language.
   
II. Summarization of numeric data with words
The summarization of data is an active field of research, especially in artificial intelligence and data mining. With the increase of data a lot of information can be extracted, yet we struggle with the presentation of this information. A lot of progress has actually been made, but it focuses on machine learning and classification tasks, and understanding and handling the extracted knowledge remains reserved to specialists. There is now renewed interest in the presentation of data; the goal is to make the information understandable by the majority of people without requiring special knowledge. This field gathers researchers from natural language processing, artificial intelligence, mathematics and beyond. Vagueness is widely exploited in this field, as it is a basis of human language, and especially so in computer science. We will introduce and present the different methods explored by researchers in this field and see how vagueness plays an important role in this area.
A. State of the Art
Automatic summarization provides a new tool to extract knowledge from a large set of information. It is a communication process in which, from some inputs, information must be extracted and transmitted to another party. To represent this process a two-agent model can be used, composed of a speaker and a listener. The speaker is the major actor: it is in charge of processing the input information in order to extract patterns. These patterns are then translated into a communication channel and transmitted to the second agent, the listener. The listener is a passive actor that only receives the information from the speaker and interprets it. In some applications the listener has to choose an action to perform depending on the message received. This kind of issue requires a more complex model; the field of game theory is often used, where the speaker-listener pair is represented as a game. The goal is then to find the best summary/information to transmit in order to maximize the reward, as in a typical game model; the difficulty is to find the right reward for the application.

The speaker is the important part, since the listener is in most cases a human and no work has to be done on that side. The model of the speaker agent depends on the application; it is composed of different, specific parts. For example, the speaker must implement an algorithm to process the input information and extract patterns and knowledge; this is where the field of machine learning is used.
The information inferred by this part is indigestible for a human, which is why the speaker also has to implement a linguistic process to communicate the extracted patterns. The field of natural language generation, together with game theory, proposes elegant architectures and mathematical frameworks. In the case where the listener uses the information from the speaker to perform an action, the speaker must take this into account in the summarization process: it has to implement a mechanism exploiting the beliefs and actions of the listener in order to produce and choose the right message. This kind of problem relies on treatment and decision under uncertainty, where game-theoretic algorithms are again widely explored. To sum up and make more precise what has been said, Yager in [2] proposes an elegant and modular model for automatic summarization:
Figure 5. Yager's summary model: a summary is characterised by a summarizer S, a quantity in agreement Q and a degree of truth T
 
The summarizer is a linguistic sentence, and most of the time a vague one. To illustrate this model, let us take an easy example:

S = "middle-aged"
Q = "most"
T = degree of truth (computed from the dataset; fuzzy logic is used by Yager to compute it).
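How T can be computed in this fuzzy-logic style is easy to sketch: average the memberships of the records in S, then pass that proportion through a fuzzy quantifier for Q. The membership functions below are illustrative assumptions, not the ones of the cited papers:

```python
# A minimal sketch of evaluating the summary "most people are middle-aged"
# in the Yager style. Membership functions are illustrative assumptions.

def mu_middle_aged(age):
    """Triangular membership for S = 'middle-aged', peaking at 45."""
    return max(0.0, 1.0 - abs(age - 45) / 20)

def mu_most(proportion):
    """Fuzzy quantifier Q = 'most': 0 below 30%, 1 above 80%."""
    return min(1.0, max(0.0, (proportion - 0.3) / 0.5))

def degree_of_truth(ages):
    """T = mu_Q applied to the average membership of the records in S."""
    avg = sum(mu_middle_aged(a) for a in ages) / len(ages)
    return mu_most(avg)

ages = [38, 42, 47, 51, 55, 29, 44]
print(degree_of_truth(ages))  # T of "most people are middle-aged", ~0.76
```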
 
Yager lays down the basics for summarization and proposes a general architecture that can be made more complex depending on the application. For the computation of T, fuzzy logic is used most of the time. P. Villar et al. in [2] use Yager's model to propose an automatic summarization of opinions in tourism hotel data. Villar et al. go deeper into the use of linguistic terms to describe heterogeneous data and propose a fuzzy model based on semantic translation as a tool to produce linguistic summaries. Their goal is to benefit from the use of linguistic terms so that both a data analyst and a non-specialist can understand and exploit the inferred information. Their work starts from Yager's model, with an upstream process to identify and classify the input information in order to extract, in their application, sentiment classifications.
This classification is important because it is used to determine the reputation of a hotel from different textual and numeric inputs.
To model vagueness they chose fuzzy logic, with trapezoidal curves as membership functions. Their main work focuses on the mixing and calculation of degrees of truth from different vague terms. They go much further in this direction and propose several methods for the aggregation of vague terms, and different approaches for balancing against the loss of information induced by vagueness.
 
A. Ramos et al. in [3] also explore the use of fuzzy logic to produce automatic textual short-term weather forecasts for the region of Galicia. Their approach is mainly based on fuzzy operators together with crisp ones, but they innovate with the use of an intermediate language to capture vagueness and produce linguistic terms. In opposition to the model of Yager, the architecture of A. Ramos implements a pre-processing step where information is extracted and compiled into intermediate code, from which natural language templates finally produce a linguistic weather forecast.
Figure 6. Architecture for short-term weather forecast
 
Even though many architectures for linguistic summarization exist, they all build on the basics of the Yager architecture. The two other architectures we saw make it more complex and focus on some specific part of the summary, such as a finer modelling of vagueness, and are dependent on the application.
In the next part we will focus on, and go deeper into, the mathematical formalism adopted to model and represent vagueness in these two articles. Fuzzy logic will be further explained with the presentation of the work of D. Dubois, a reference in this field.
B. Vagueness in automatic summary, a fuzzy question
In this part we will focus on the formalism adopted in [2], [3], [4], and in particular on the use of fuzzy logic. Fuzzy logic was introduced by Zadeh to formalize human knowledge; Dubois in [5] explains how fuzzy set theory and vagueness are related, even though Zadeh wanted a distinction. The claim that fuzzy sets are a basic tool for addressing the vagueness of linguistic terms has been around for a long time. But some researchers, such as Novak, oppose vagueness to uncertainty; on that view a vague term must fulfil three features: the existence of borderline cases, unsharp boundaries, and susceptibility to the sorites paradox.
 
Fuzzy logic has been controversial among philosophers, many of whom are reluctant to consider a truth-value system different from the Boolean one. One of the reasons for the misunderstanding between fuzzy sets and the philosophy of vagueness may lie in the fact that Zadeh was trained in engineering mathematics, not in philosophy. In particular, vagueness is often understood as a defect of natural language (since it is not appropriate for devising formal proofs, it questions the usual rational forms of reasoning). Indeed, the vagueness of linguistic terms was considered a logical nightmare by early 20th-century philosophers. In contrast, for Zadeh, going from Boolean logic to fuzzy logic is viewed as a positive move: it captures tolerance to errors (softening blunt threshold effects in algorithms) and may account for the flexible use of words by people. It also helps with information summarization: detailed descriptions are sometimes hard to make sense of, while summaries, even if imprecise, are easier to grasp. The link between fuzzy set theory and vague terms can be argued from the idea that it is natural to represent incomplete knowledge with sets. Fuzzy logic has been understood in various ways: it helps to model uncertainty and degrees of belief, and can even be connected with modal logic.
 
Vagueness is a phenomenon observed in the way people use language, and is characterized by variability in the use of some concepts between the listener and the speaker. One cause of such variability may be the gradual perception of some concepts or some words in natural language. This variability of interpretation and perception can be used in automatic summarization to capture a concept, as in [4], where A. Ramos-Soto et al. use it to generate weather forecasts. In [3] and [5] the authors also use vagueness to produce automatic summaries; even though they all use fuzzy logic to represent vagueness, they all differ in the way they implemented it.
 
Ramos-Soto: Weather forecast
Ramos et al. in [4] proposed an architecture to automatically produce weather forecast summaries over the 315 Galician municipalities. Formally, each municipality M has an associated forecast data series $D_M = (D_{sky}, D_{wind}, D_{T_{max}}, D_{T_{min}})$, which includes data series for the input variables considered: sky state ($D_{sky}$), wind ($D_{wind}$), and maximum ($D_{T_{max}}$) and minimum ($D_{T_{min}}$) temperature. For clarity, in what follows we will consider a single municipality's data series.
For each forecast data series $D_M$, Ramos et al. obtain linguistic descriptions of seven forecast variables, namely cloud coverage, precipitation, wind, maximum and minimum temperature variation, and maximum and minimum temperature climatic behaviour. For this, they have devised a computational method divided into several linguistic description generation operators. Here is the process where fuzzy logic is used to translate these features into vague terms; we will take the sky data as an illustration:
- The first stage in Ramos et al.'s application is to transform the chronological data series into temporal linguistic terms. To do so they use fuzzy sets to represent the temporal terms {Beginning, Half, End}, each with an associated membership function.
- The second stage is to capture the concept associated with the main feature; here the sky data are translated into the fuzzy sets CCL = {C, PC, VC} ("clear", "partly cloudy", "very cloudy").
 
The procedure then concatenates all these temporal descriptions, taking the maximum degree of membership in the fuzzy sets. The output is an intermediate code with vague terms which describes the weather in precise time windows. The global process is repeated for each feature: Ramos et al. choose to first translate the numeric data into vague terms in order to produce, at the end, a linguistic weather forecast with template and NLG methods.
A figure in the original report summarizes this two-stage process (image not recoverable).
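The two stages can be sketched as follows; the membership functions for the temporal partition and for the cloud-coverage labels are invented for the example, only the label sets come from the paper:

```python
# A minimal sketch of the two-stage fuzzification of the sky data:
# stage 1 maps a time index to {Beginning, Half, End}, stage 2 maps a cloud
# coverage to CCL = {clear, partly cloudy, very cloudy}. All breakpoints are
# illustrative assumptions.

def triangular(x, a, b, c):
    """Triangular membership rising on [a, b] and falling on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

TEMPORAL = {  # normalised time index in [0, 1]
    "Beginning": lambda t: triangular(t, -0.01, 0.0, 0.5),
    "Half":      lambda t: triangular(t, 0.0, 0.5, 1.0),
    "End":       lambda t: triangular(t, 0.5, 1.0, 1.01),
}

CCL = {  # cloud coverage as a percentage
    "clear":         lambda c: triangular(c, -1, 0, 40),
    "partly cloudy": lambda c: triangular(c, 20, 50, 80),
    "very cloudy":   lambda c: triangular(c, 60, 100, 101),
}

def describe(series):
    """Map (time, coverage) pairs to (temporal term, coverage term) pairs,
    taking the maximum-membership label at each stage."""
    def best(sets, x):
        return max(sets, key=lambda k: sets[k](x))
    return [(best(TEMPORAL, t), best(CCL, c)) for t, c in series]

# A cloudy morning clearing up during the day:
print(describe([(0.1, 85), (0.5, 45), (0.9, 10)]))
# [('Beginning', 'very cloudy'), ('Half', 'partly cloudy'), ('End', 'clear')]
```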
Ramon A. Carrasco: Automatic summary for tourism web data
 
In this paper [3] Ramon et al. propose a novel model to aggregate heterogeneous data from various websites containing opinions about hotels. Ramon et al. focus on the mathematical modelling of vague terms and on how to solve the issue of crisp boundaries. They use the same architecture as Yager in [1], but go deeper into the representation of vagueness and the formalism adopted. As in [4] they gather various data about hotel opinions from websites, but they differ in that they use linguistic input from forums or the comment sections of some rating websites (TripAdvisor…). We will only focus on the way they treat and implement fuzzy logic with vague terms, as this is the core issue; to illustrate the discussion we will take as an example the ages of the clients of a hotel.
A set of seven terms for the age of the hotel guests could be given as follows: $s_0$ = baby, $s_1$ = child, $s_2$ = teenager, $s_3$ = young, $s_4$ = adult, $s_5$ = mature and $s_6$ = old. The semantics, i.e. the membership value, is calculated with unbalanced trapezoidal functions. A trapezoidal function is represented by the 4-tuple $(a, b, c, d)$, where $b$ and $c$ delimit the interval where the concept is totally true, i.e. 1, and $a$ and $d$ represent the two thresholds, respectively lower and upper, beyond which the concept is false, i.e. 0. The vague regions are represented by the intervals $[a, b]$ and $[c, d]$, and the membership value can only lie in the [0, 1] interval.
To solve the issue of crisp thresholds another metric, $D$, is added: it represents the range of translation over which the concept remains true. To deal with this, two sets for high and low translation are created in order to compute the truth degree of a concept. The idea of the translation is to capture the different possible interpretations arising from the choice of the thresholds. Furthermore, the authors propose a weighted model, for example to highlight metrics from valuable clients. They determine two operators to do so: the first one is just a weighted sum, the second is a more complex aggregation and can be viewed as a quasi-arithmetic average. This procedure is applicable to vague terms but also to crisp terms; to deal with both, the authors propose a grammar G in which they store the space of interpretation of the terms. For example, primary terms have no $D$ parameter, whereas some terms, such as high/low comparative terms, do have the $D$ parameter.
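One plausible reading of the translation mechanism can be sketched as follows; both the 4-tuple and the way the low and high translations are combined are our own illustrative assumptions, not the exact operators of the cited paper:

```python
# A minimal sketch of a trapezoidal linguistic term softened by a
# translation parameter D. Tuple values and the averaging rule are
# illustrative assumptions.

def trapezoid(x, a, b, c, d):
    """Membership for the 4-tuple (a, b, c, d): 1 on [b, c], 0 outside (a, d)."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def translated_truth(x, a, b, c, d, D):
    """Evaluate the term under a low and a high translation of size D and
    average the two, capturing several admissible threshold choices."""
    low = trapezoid(x, a - D, b - D, c - D, d - D)
    high = trapezoid(x, a + D, b + D, c + D, d + D)
    return (low + high) / 2

YOUNG = (16, 20, 28, 35)  # hypothetical 4-tuple for s3 = 'young'
print(trapezoid(33, *YOUNG))              # membership with crisp thresholds
print(translated_truth(33, *YOUNG, D=3))  # softened by the translation range
```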
 
 
 
 
 
 
This figure summarizes the process proposed by Ramon et al.:

Figure 7. Fuzzy model based on semantic translation
 
Janusz Kacprzyk: Fuzzy logic for linguistic summarization of databases
 
Kacprzyk, in his paper, aims to produce a new query-system interface to display the information from a database. This system is based on fuzzy logic and builds on the Yager architecture {S: summarizer, Q: quantity in agreement, T: degree of truth}. The innovation in this paper is not in the way Kacprzyk uses fuzzy logic to represent vagueness but in the way he treats the combinatorial issue of automatic summarization.
To highlight this issue, let us take the case study from his paper: a computer retailer.
For example, to summarize the sales of computers many options can arise, like "most of the sales are second-hand", but a summary can be made more precise by adding conjunctions and disjunctions: "most of the sales are second-hand and/or recent computers". Given a set of attributes A and a vocabulary V to describe the features, with connectives like AND and OR, the search space of possible summaries is huge and becomes an issue to compute over a large database.
Calculating the validity of each summary is a considerable task; George and Srikanth (1996) use a genetic algorithm to find the most appropriate summary in the search space. In his approach, Janusz also uses a genetic algorithm, and the overall quality (goodness) of a summary is given by the weighted sum of some partial quality indicators $T_1, \dots, T_s$ (cf. Kacprzyk and Yager, 1999); the weights are derived from expert testimony, obtained from pairwise comparisons between the particular indicators using Saaty's AHP method. Thus, basically, the problem is to find an optimal summary $S^{*} \in \{S\}$ such that:

$S^{*} = \arg\max_{S} \sum_{i=1}^{s} w_i T_i(S)$

where $w_i$ represents the weight associated to the partial quality indicator $T_i$.
In this paper the author highlights the combinatorial issue of automatic summarization; in our work a special focus is made on this part, in order to propose a fast and reliable algorithm to generate the optimal summary.
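To make this optimisation concrete, the sketch below scores candidate summaries by a weighted sum of partial quality indicators and returns the best one; the candidate summaries, indicator values and weights are invented for the example, and the exhaustive loop merely stands in for the genetic search used in the paper.

```python
import numpy as np

def goodness(indicators, weights):
    """Overall quality of a summary: weighted sum of its partial
    quality indicators T1..Ts (weights e.g. from Saaty's AHP)."""
    return float(np.dot(weights, indicators))

# Hypothetical candidates: summary text -> (T1, T2, T3) indicator values
candidates = {
    "most of the sales are second hand":            [0.9, 0.4, 0.7],
    "most of the sales are second hand AND recent": [0.6, 0.8, 0.9],
}
weights = [0.5, 0.2, 0.3]   # would come from pairwise AHP comparisons

best = max(candidates, key=lambda s: goodness(candidates[s], weights))
print(best)   # the optimal summary S* under this toy scoring
```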
 
Vagueness to Classify
 
We saw across these different papers that vagueness is a widely used phenomenon, especially in summarization tasks. This is because in a summarization task the goal is to find the most general summary, one that subsumes different concepts, as in "well paid workers". In this summary the interpretation of "well paid" admits a larger set of possible objects than the sentence "workers paid $120 per hour". This can be viewed as a classification problem where the goal is to create sets using vague concepts, and this is why vague adjectives are a good tool for the job: they allow a larger range of interpretation, and thus can theoretically capture fuzzy concepts and carry more information.
Our work builds on all the techniques we saw previously; our wish was to propose a modular architecture for automatic summarization. But we focus more on classification as the final goal, for example the discrimination between diabetic and non-diabetic patients with the use of vagueness. Moreover, our work also focused on proposing a solution for the biases that exist in the modelling of vagueness, such as fixed thresholds and decision making. In order to be closer to human thinking we explored papers in psychology and psycholinguistics on childhood language development, to propose a mathematical model directly inspired by human brain behavior.
 
 
 
 
 
 
 
III. Mathematical Framework for automatic summary
A. General presentation
We saw in the last section the nature of vagueness and the ways to use it for automatic summarization. In the papers [3], [4], [5] the authors propose new models and architectures for different applications of summarization with vagueness, more precisely with fuzzy logic. In [2] Yager poses a basic architecture which can be extended according to the application. Furthermore, we saw that the modelling of vagueness, depending on the choice of logic, implies some biases; the most critical one, when fuzzy logic is used, is the choice of the membership function and of its thresholds. To be closer to human thinking we explored the development of language and the meaning of vague adjectives in the development of the child. S. Andersen in [7] explores the process by which children treat vague concepts such as the words cup or glass. The idea is thus to take inspiration from the emergence of language in children: how do they treat vague concepts? How do they learn the boundaries involved in vague concepts?
The overall process explored in this work builds on progress in the field of cognitive science, that is, the mix of points of view from mathematics, psychology and logic to understand and model human cognition. We wanted to follow this process of reasoning to propose a general and reliable framework inspired by human brain behavior.
To do so we split the framework into different parts, each focused on a specific treatment; the goal was to propose, based on Yager's model [2], a cognitively inspired architecture.
Figure 6. Framework Architecture
 
 
Extraction of Data
This part deals with the extraction of the information from the dataset. In this work we use the UCI repository, a well-known repository for machine learning; the extraction of data is implemented for the CSV format. CSV stands for comma-separated values; on top of this convention, the class attribute has to be the last one in each row. The nature of the information is not restricted to numeric data: we extend it to nominal values, but only to those with fewer than 8 possible values.
This is not a crucial part of the project, but it is essential for the rest of the framework; it became so when we proposed a concrete application, a web data extractor. The idea is to extract the data directly from a website using a web crawler; in this case the Selenium library was used. In the interest of having a modular framework, an inheritance model was built: to adapt the process to a specific website, only a few parameters have to be set. This mechanism allows us to test our framework both on real data from the internet and on regular datasets like the UCI repository.
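Below is a minimal sketch of the CSV convention described above (class attribute last, nominal attributes kept only when they have fewer than 8 distinct values); the function names are illustrative, not the framework's actual code.

```python
import csv

def load_dataset(path):
    """Read a CSV file whose last column is, by convention, the class."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    header, data = rows[0], rows[1:]
    X = [row[:-1] for row in data]   # feature columns
    y = [row[-1] for row in data]    # class attribute, last by convention
    return header, X, y

def is_usable_nominal(column):
    """Keep a nominal attribute only if it has fewer than 8 values."""
    return len(set(column)) < 8
```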
 
Model
In this framework a model can be viewed as the definition of an object, that is, all the features that describe this object: the names of the attributes, the number of classes and their names, and the range of values for nominal features. The model part is the basis of the framework: it defines the object and provides the right linguistic terms to produce at the end of the summary. For example, to describe the Pima diabetes dataset, several features are taken, like the plasma glucose concentration after 2 hours, the age, and so on. These attributes are used to produce sentences at the end of the summarization process, with connectives (AND, OR) to link them and produce a more convincing summary. This is inspired by Ramos et al. in [4], where templates are used for the generation of linguistic summaries; the motivation for doing so is to focus only on vague terms and not on crisp terms.
Furthermore, this modelling allows the framework to be adapted to different models: for any dataset, only one class has to be written to describe the object.
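As an illustration, a model description of this kind could be as small as the following sketch; the class and field names are hypothetical, not the report's actual code.

```python
from dataclasses import dataclass, field

@dataclass
class Model:
    """Definition of the summarized object: attribute names, class
    names, and the admissible values of nominal features."""
    attributes: list                  # e.g. ["plasma glucose", "age", ...]
    classes: list                     # e.g. ["diabetic", "non diabetic"]
    nominal_values: dict = field(default_factory=dict)

pima = Model(
    attributes=["plasma glucose concentration after 2 hours", "age"],
    classes=["diabetic", "non diabetic"],
)
```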
 
 
 
 
 
 
 
Math : Fuzzy set
This part is the mathematical definition of the fuzzy sets; it gathers all the fuzzy terms with their mathematical definitions. The choice of the membership functions and of the other parameters proper to fuzzy set theory is made in this part of the framework. Together with the other math parts it constitutes the basis for treating vagueness according to the contextualist point of view.
 
Math : Statistics / Machine learning
Given a dataset to summarize, many statistical metrics can be extracted, such as the distribution over one or several features, or dependencies learned between sets of objects. In this framework, this part focuses on extracting distributional graphics like cumulative histograms or representations of the objects in several dimensions. It relies more on computational modelling than on algorithmic processing; its role is to translate the input data into a mathematical representation. We use libraries such as SciPy and NumPy to allow easy manipulation. Libraries such as scikit-learn have been widely used to test and include machine learning algorithms like clustering along several dimensions. To follow our idea of a "cognitively inspired architecture", this part would be the translation one, where raw data are processed to allow the extraction of knowledge. This part is linked with the context part, as can be seen in figure 6; these two parts embody the contextualist point of view. The context is dynamically built from the input data, which are translated and computed to produce the mathematical context in which vagueness is treated.
 
 
Math : Context
To recall the contextualist point of view: to treat vagueness, this logic proposes that every threshold proper to a vague term exists but is not fixed. These thresholds are computed dynamically according to the set of objects presented. This class is directly inspired by this philosophy: the context is represented by a fraction of the dataset, so for every vague term the thresholds are computed on the fly.
 
 
 
 
 
 
Language
This part groups all the vague vocabulary used to produce summaries, such as: big, low, high, normal... Following Zadeh's fuzzy theory as in [8], where fuzzy quantifiers are used to modify the sense of vague terms, we used: very, most. For example very is a quantifier which can be used to amplify the sense of a vague term such as tall, and thus influences the membership function. Other quantifiers are used to characterize the group captured by the summary, depending on the distribution, i.e. the fraction of objects represented by the summary. For example, when someone says "some of the birds are smart", the word some refers to a portion of a set, but not a precise one. That is why it can be used, when the summary does not fit the whole target concept (i.e. the class), to balance the summary and keep its interpretation true. Moreover it can be combined, as Zadeh proposed in [8], with other distributional vague terms such as "most of all" or "the majority", in order to cover and transmit the maximum information about the target concept with the summary.
 
Summarizer
This part follows the architecture proposed by Yager in [2]; it groups the 3-tuple Summarizer, Quantity in agreement, Truth value: { S, Q, T }. The difference is that in this framework the summarizer is viewed as a classification task, the goal being to find the most accurate linguistic summary to discriminate between the different classes present in the input dataset. In our model the quantity in agreement and the truth value use the same mathematical metric: the distribution of a given summary over the dataset. That is, according to the probability assigned to a given summary, a quantity in agreement is computed in order to capture the distribution and transmit it with linguistic terms. In information-theoretic terms this process can be viewed as a choice problem where, given a specific distribution, a quantity in agreement has to be chosen so that the summary transmits the right interpretation.
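As an illustration of this choice problem, here is a minimal sketch mapping the fraction of the target class covered by a summary to a linguistic quantifier; the cut-off values are invented for the example.

```python
def quantity_agreement(proportion):
    """Pick the quantifier that best transmits how much of the
    target class a summary covers (illustrative cut-offs)."""
    if proportion > 0.95:
        return "all"
    if proportion > 0.6:
        return "most"
    if proportion > 0.2:
        return "some"
    return "few"

print(quantity_agreement(0.72))   # -> "most"
```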
 
Decision making
Here the mathematical tools for deciding under uncertainty, i.e. with probabilities, are widely used. In the computational process, two methods were explored to deal with this issue. The first one consists of translating numeric features into words, for example "1.86 meters" -> "high", before the summarization process. This method keeps the framework very fast but, on the other hand, discards some information that is useful for the decision-making part. That is why, with the second method, all the membership values are kept and treated at the end, where a mathematical decision equation is used to perform the classification choice.
 
Display Summary
A specific part is dedicated to producing the linguistic summary: from the output of the decision-making part and of the Model part, a summary is generated. A very basic NLG flow is used here to produce the sentences; most of the production is done with the template, i.e. the Model.
 
 
 
We have described here the general parts of the framework, and the behavior and role of each of them. The whole architecture was inspired by neurofunctional knowledge; in the next part a direct link will be made with it, through the evolution of the different versions of the framework.
 
 
 
   
B. Detailed Framework, end to be vague
In the following part the architecture is detailed, with arguments for the design choices and for the mathematical tools used in every part.
 
1. Framework version 1.0
The first version of the framework was very simple and focused more on trying different hypotheses before modelling the whole architecture. We started with this very simple architecture:
 
Figure 7. Framework-V1 architecture
 
 
The first architecture, as shown in figure 7, is quite basic and was tested on the Iris dataset from the UCI repository. The idea in this first version is to test the following hypothesis:
Suppose that our data has $n$ attributes $a_1, \dots, a_n$ so that $a_i$ takes values in $\Omega_i$ for $i = 1, \dots, n$. Let $\Omega = \Omega_1 \times \dots \times \Omega_n$ and let $\vec{x} = (x_1, \dots, x_n)$ denote the vector of attribute values. A data set then takes the form $D = \{\vec{x}^{(1)}, \dots, \vec{x}^{(N)}\}$ where $\vec{x}^{(j)} \in \Omega$.
Now we define a language $L$ with propositional variables, which are in our case the vague terms High and Low. Then for every attribute of the data set we match each propositional variable of $L$ according to this procedure:
Given an $\alpha$, two thresholds are computed over the cumulative histogram $F_i$ of attribute $a_i$. The lower threshold is given by $t_{low} = F_i^{-1}(\alpha)$.
   
The upper threshold is given by $t_{high} = F_i^{-1}(1 - \alpha)$; we then have $t_{low} < t_{high}$, the condition on $\alpha$ being $\alpha < 0.5$ due to the upper bound given by $1 - \alpha$.
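A minimal sketch of this computation, using empirical quantiles (which amounts to inverting the cumulative histogram); alpha = 0.25 and the synthetic data are illustrative.

```python
import numpy as np

def alpha_thresholds(values, alpha=0.25):
    """Thresholds from the cumulative distribution: t_low at quantile
    alpha, t_high at quantile 1 - alpha (alpha < 0.5 keeps t_low < t_high)."""
    assert alpha < 0.5
    return np.quantile(values, alpha), np.quantile(values, 1 - alpha)

sepal_length = np.random.default_rng(0).normal(5.8, 0.8, 150)  # toy data
t_low, t_high = alpha_thresholds(sepal_length)
```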
 
 
Figure 8. Example of threshold computation
 
Given these thresholds over all the attributes of $D$ we can, with fuzzy set theory, compute the membership value of the propositional variables. To model the membership functions we chose trapezoidal curves, of type Right for the lower bound and of type Left for the upper bound; this choice is purely empirical.
 
With these two membership functions, for each data point in the dataset a truth value for each propositional variable is calculated. The decision maker in this version is based on a maximum average; to illustrate it we will take the Iris dataset. Given a class $c$ and an attribute $a_i$, with the language $L = \{High, Low\}$, the goal is to find the best sentence to describe a specific attribute according to a given class.
 
 
 
 
 
For the class Iris-setosa with the attribute sepal length, the right vague term $\theta^{*}$ to describe it is found with this formula:

$\theta^{*} = \arg\max_{\theta \in L} \ \frac{1}{|D_c|} \sum_{\vec{x} \in D_c} \mu_{\theta}(x_i)$

i.e. the vague term with the highest average membership over the examples of the class.
 
Figure 9. Decision making, Framework-V1
 
This first version of the framework was not guided by a cognitive perspective; the goal was to see the discrimination offered by the restricted language $L = \{High, Low\}$. We did not explore the classification of classes using this linguistic description; this is done in the next versions of the framework.
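A minimal sketch of this max-average decision rule; the one-sided ramp memberships stand in for the Right/Left trapezoidal curves described above.

```python
import numpy as np

def mu_low(x, t_low, t_high):
    """Right-shoulder ramp: fully Low below t_low, fading out by t_high."""
    return np.clip((t_high - x) / (t_high - t_low), 0.0, 1.0)

def mu_high(x, t_low, t_high):
    """Left-shoulder ramp: fading in from t_low, fully High above t_high."""
    return np.clip((x - t_low) / (t_high - t_low), 0.0, 1.0)

def best_term(class_values, t_low, t_high):
    """Pick the vague term with the highest average membership over
    the examples of one class (the V1 'maximum average' decision)."""
    scores = {"Low": mu_low(class_values, t_low, t_high).mean(),
              "High": mu_high(class_values, t_low, t_high).mean()}
    return max(scores, key=scores.get)
```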
 
   
 
2. Framework version 2
 
After the results highlighted by the first version of the framework, we decided to take inspiration from neuroscience, and particularly from the treatment of language. In the human brain, two different areas are mainly involved in the production and comprehension of language. The first one, Broca's area, was discovered by Paul Broca in the nineteenth century; he examined the brain of a deceased patient who had an unusual disorder. The patient was not able to talk even though no motor lesion of his tongue or mouth could be found. Broca examined the brain and discovered a lesion in the posterior portion of the frontal lobe of the left hemisphere. Years later Carl Wernicke, a German neurologist, discovered another part of the brain, this one involved in understanding language, in the posterior portion of the left temporal lobe. People who had a lesion at this location could speak, but their speech was often incoherent and made no sense.
From this we decided to build an architecture that tries to mimic this process by dissociating the language from its semantics. In our model this was done by adding two parts: a Language one, which represents Broca's area, and a Context one, which represents Wernicke's area. The first part gathers all the vocabulary and the rules to produce the right sentences. The second one is involved in making sense of these words, in our case putting semantics on vague terms with the use of fuzzy logic.
 
 
Figure 10. Framework-V2 Architecture
 
 
   
The method used to put semantics on vague terms is the same as in the first version of the framework, with thresholds computed from the cumulative distribution of each attribute. Furthermore, we focused on using summarization and vagueness for a classification task: the goal was to find the summary that allows the best discrimination between the different classes. The problem then became to find the best summary, given a vocabulary, that has the highest probability of being true for the target class and the lowest probability of being true for the other classes. This problem can be viewed as a game, and algorithms from game theory can then be used, especially tree search guided by a heuristic.
Taking the same definitions as previously, with a dataset $D$, a vector of features $\vec{x}$, and a language $L$ composed of propositional variables, we add the logical connectives $\wedge$ and $\vee$.
The task became to find the conjunction or disjunction of attributes that best discriminates a specific class $c$, using the following heuristic to guide the search in the space:

$h(\theta) = P(\theta \mid C = c) - P(\theta \mid C \neq c)$

$\theta$ is the sentence explored in the tree search, a combination of conjunctions or disjunctions of attributes with vague terms. The heuristic directly translates the idea that a sentence has to be mostly true for the target class $c$ and only slightly true for the other classes.
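A minimal sketch of such a heuristic, assuming a candidate sentence has already been evaluated as a truth value on every data point; the difference form is one plausible reading of "most true for the target class, little true for the others", not necessarily the report's exact equation.

```python
import numpy as np

def heuristic(truth_values, labels, target):
    """Score a candidate sentence theta: its mean truth on the target
    class minus its mean truth on all the other classes."""
    truth_values = np.asarray(truth_values, dtype=float)
    labels = np.asarray(labels)
    in_c = labels == target
    return truth_values[in_c].mean() - truth_values[~in_c].mean()
```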
 
For computational performance reasons, we chose to first translate all the numeric data into our language, in this case L = {Low, Normal, High}. Following Zadeh in [8] we introduce quantifiers into our language, here very and quite; these quantifiers act directly on the membership function. The quantifier very raises the membership value of each word to a power, classically the square; for the quite quantifier the square root is used. The same membership functions were taken to model Low and High; for the word Normal we chose a Gaussian membership function. This choice was made because, even if you cannot describe how a single random event happens, a whole mess of them together will act like a Gaussian.
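A minimal sketch of these Zadeh-style modifiers acting on a membership value, assuming the classical exponents (2 for very, 1/2 for quite):

```python
import math

def very(mu):
    """'very' concentrates a membership value (classically: square it)."""
    return mu ** 2

def quite(mu):
    """'quite' dilates a membership value (here: square root)."""
    return math.sqrt(mu)

print(very(0.8), quite(0.8))   # 0.64 and about 0.894
```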
 
So, given a dataset with features described by the vector $\vec{x}$, we translate the numeric data into linguistic data with:

$word(x_i) = \arg\max_{\theta \in L} \mu_{\theta}(x_i)$

i.e. each numeric value is replaced by the vague term of highest membership. In this figure we can see the result of this process, with the translation of the numeric input into words:
 
 
Figure 11. Example of numeric-to-words translation
 
This translation allows the search process to achieve a better computational time: the algorithm exhaustively searches for word matchings over the candidate sentences ($\theta$) given by the vocabulary.
The search algorithm, guided by the heuristic, computes for a given class the summary that best discriminates this class. This is an example of the output generated by this version of the framework:
   
Figure 12. Iris Summary, Class Iris-versicolor (Class 1)
 
The other issue treated in this version of the framework is the choice of the thresholds over the cumulative distribution for the upper and lower bounds of our vague vocabulary. One approach took inspiration from Janusz in [5], where a genetic algorithm is used to find the right summary, whereas our framework uses a tree search. We took inspiration from Janusz and tried to apply a genetic algorithm to find the right thresholds (a sketch is given after the list below), where:

- The population is made of threshold vectors over all the attributes
- The crossover function is based on taking a max/mean of the selected individuals
- The fitness function is the probability computed for a given summary for a specific class
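A compact sketch of this genetic search under the design just listed: individuals are threshold vectors, crossover takes the mean of two parents, and the fitness function (here a stub) would return the probability of the best summary for the class.

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve(fitness, n_attrs, pop_size=30, generations=50, sigma=0.1):
    """Genetic search over threshold vectors (one value per attribute)."""
    pop = rng.random((pop_size, n_attrs))        # initial thresholds in [0, 1]
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]  # keep the best half
        kids = [(parents[rng.integers(len(parents))] +
                 parents[rng.integers(len(parents))]) / 2   # mean crossover
                for _ in range(pop_size - len(parents))]
        kids = np.array(kids) + rng.normal(0, sigma, (len(kids), n_attrs))
        pop = np.vstack([parents, kids])
    return pop[np.argmax([fitness(ind) for ind in pop])]

# Stub fitness: a real one would score the summary induced by the thresholds
best_thresholds = evolve(lambda t: -np.sum((t - 0.5) ** 2), n_attrs=4)
```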
 
The application of a genetic algorithm to find the thresholds was not a complete success: the computational time was too high relative to the results. We chose to explore another method to dynamically find the right thresholds for vague terms, by exploring results from child language development.
 
   
 
3. Framework version 3
 
In the last version of the framework we decided to focus first on the computation of the thresholds for vague terms. After several readings on child language development, especially in psycholinguistics, we found an interesting alternative to fixed thresholds. The study in [7] highlights that a child learning new word concepts, such as cup, works like a clustering algorithm. That is, to catch the link between a concept and a word, the child first creates sets of different concepts and refines them during his development. When new words with new concepts arise in his vocabulary, conflicts between words and meanings force the child to refine his definitions. The conjunction of these concepts creates more precise boundaries, and a concept thus becomes more precise through learning and error making. To treat vague terms we explored a similar mechanism; we state, as contextualism does, that the semantics of vague terms comes from experience. During his development the child learns vague concepts such as distance (away, close) or size (small, tall) from his environment. In our framework, the environment from which the knowledge is first extracted is a fraction of the input data (the training set, TS). This TS is used to find the thresholds with an unsupervised algorithm: k-means. In the k-means algorithm the goal is to find the centroids that best partition the data. In our case the vocabulary is {Low, Normal, High}, so a 3-means algorithm is used on every TS, the goal being to output the three centroid values found by the algorithm. These centroids are then used as before, as thresholds for the membership functions; the motivation for doing this is the adaptability of the method.
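A minimal sketch of this step with scikit-learn: a 3-means clustering of one attribute of the training set, whose sorted centroids anchor the Low / Normal / High membership functions (the data here are synthetic).

```python
import numpy as np
from sklearn.cluster import KMeans

def centroid_thresholds(train_values, k=3):
    """Cluster one attribute of the training set; the sorted centroids
    anchor the membership functions of Low, Normal and High."""
    x = np.asarray(train_values, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(x)
    return np.sort(km.cluster_centers_.ravel())

ages = np.random.default_rng(1).normal(40, 12, 200)   # toy training set
low_c, normal_c, high_c = centroid_thresholds(ages)
```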
Moreover, we focused on the decision part. In the last version the numeric data were first translated into the language L, and the main issue is that we lose information with the max operator. Given a summary over a class c, for each data point to classify we therefore keep the membership value in [0, 1] and inject it into a decision equation.
The final model proposed is the evolution of version two, incorporating the dynamic computation of the thresholds and a new decision process; the model is presented in Figure 6.
 
 
   
37/55 
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport
GonzalezJonas_M2IAICI-Rapport

Weitere ähnliche Inhalte

Andere mochten auch

shareNL symposium autodelen 2016, Bart Stoffels, Wordt de zelfrijdende auto e...
shareNL symposium autodelen 2016, Bart Stoffels, Wordt de zelfrijdende auto e...shareNL symposium autodelen 2016, Bart Stoffels, Wordt de zelfrijdende auto e...
shareNL symposium autodelen 2016, Bart Stoffels, Wordt de zelfrijdende auto e...shareNL
 
Ensayo sobre escaleras mecanicas
Ensayo sobre escaleras mecanicasEnsayo sobre escaleras mecanicas
Ensayo sobre escaleras mecanicasvaleriaandr
 
Creative Outlets Mississippi 2016
Creative Outlets Mississippi 2016Creative Outlets Mississippi 2016
Creative Outlets Mississippi 2016Brian Housand
 
Curiosity Mississippi 2016
Curiosity Mississippi 2016Curiosity Mississippi 2016
Curiosity Mississippi 2016Brian Housand
 
группа зайчик
группа зайчикгруппа зайчик
группа зайчикGBDOU №51
 
Inmersión Fotográfica a la Arquitectura Colonial
Inmersión Fotográfica a la Arquitectura ColonialInmersión Fotográfica a la Arquitectura Colonial
Inmersión Fotográfica a la Arquitectura ColonialErica De Sousa
 
Marom Bikson on EEG guided tES / TDCS
Marom Bikson on EEG guided tES / TDCSMarom Bikson on EEG guided tES / TDCS
Marom Bikson on EEG guided tES / TDCSmbikson
 

Andere mochten auch (9)

shareNL symposium autodelen 2016, Bart Stoffels, Wordt de zelfrijdende auto e...
shareNL symposium autodelen 2016, Bart Stoffels, Wordt de zelfrijdende auto e...shareNL symposium autodelen 2016, Bart Stoffels, Wordt de zelfrijdende auto e...
shareNL symposium autodelen 2016, Bart Stoffels, Wordt de zelfrijdende auto e...
 
Ensayo sobre escaleras mecanicas
Ensayo sobre escaleras mecanicasEnsayo sobre escaleras mecanicas
Ensayo sobre escaleras mecanicas
 
IB Small Group Project
IB Small Group ProjectIB Small Group Project
IB Small Group Project
 
Creative Outlets Mississippi 2016
Creative Outlets Mississippi 2016Creative Outlets Mississippi 2016
Creative Outlets Mississippi 2016
 
Curiosity Mississippi 2016
Curiosity Mississippi 2016Curiosity Mississippi 2016
Curiosity Mississippi 2016
 
группа зайчик
группа зайчикгруппа зайчик
группа зайчик
 
Inmersión Fotográfica a la Arquitectura Colonial
Inmersión Fotográfica a la Arquitectura ColonialInmersión Fotográfica a la Arquitectura Colonial
Inmersión Fotográfica a la Arquitectura Colonial
 
Marom Bikson on EEG guided tES / TDCS
Marom Bikson on EEG guided tES / TDCSMarom Bikson on EEG guided tES / TDCS
Marom Bikson on EEG guided tES / TDCS
 
ESTUDO SOBRE EVÓDIA E SÍNTIQUE
ESTUDO SOBRE EVÓDIA E SÍNTIQUEESTUDO SOBRE EVÓDIA E SÍNTIQUE
ESTUDO SOBRE EVÓDIA E SÍNTIQUE
 

Ähnlich wie GonzalezJonas_M2IAICI-Rapport

Objectification Is A Word That Has Many Negative Connotations
Objectification Is A Word That Has Many Negative ConnotationsObjectification Is A Word That Has Many Negative Connotations
Objectification Is A Word That Has Many Negative ConnotationsBeth Johnson
 
2820181Phil 2 Puzzles and ParadoxesProf. Sven B.docx
2820181Phil 2 Puzzles and ParadoxesProf. Sven B.docx2820181Phil 2 Puzzles and ParadoxesProf. Sven B.docx
2820181Phil 2 Puzzles and ParadoxesProf. Sven B.docxlorainedeserre
 
Making our mark: the important role of social scientists in the ‘era of big d...
Making our mark: the important role of social scientists in the ‘era of big d...Making our mark: the important role of social scientists in the ‘era of big d...
Making our mark: the important role of social scientists in the ‘era of big d...The Higher Education Academy
 
Proposal Essay Outline
Proposal Essay OutlineProposal Essay Outline
Proposal Essay OutlineKenya Lucas
 
Insights into Innovation, Tokyo 8-6-10, Martha G. Russell
Insights into Innovation, Tokyo 8-6-10, Martha G. RussellInsights into Innovation, Tokyo 8-6-10, Martha G. Russell
Insights into Innovation, Tokyo 8-6-10, Martha G. RussellMartha Russell
 
Crowdsourcing and Cognitive Data Analytics for Conflict Transformation - Istv...
Crowdsourcing and Cognitive Data Analytics for Conflict Transformation - Istv...Crowdsourcing and Cognitive Data Analytics for Conflict Transformation - Istv...
Crowdsourcing and Cognitive Data Analytics for Conflict Transformation - Istv...Istvan Csakany
 
Informal Essay Examples. Informal Interview Essay Interview Essays Free 30...
Informal Essay Examples. Informal Interview Essay  Interview  Essays  Free 30...Informal Essay Examples. Informal Interview Essay  Interview  Essays  Free 30...
Informal Essay Examples. Informal Interview Essay Interview Essays Free 30...Ashley Rosas
 
An Empirical Study on Comment Classification
An Empirical Study on Comment ClassificationAn Empirical Study on Comment Classification
An Empirical Study on Comment Classificationijtsrd
 
How the Foundation Model is Changing the Landscape of Natural Language Proces...
How the Foundation Model is Changing the Landscape of Natural Language Proces...How the Foundation Model is Changing the Landscape of Natural Language Proces...
How the Foundation Model is Changing the Landscape of Natural Language Proces...Ciente
 
ASIS&T Diane Sonnenwald Information Science as a Career
ASIS&T Diane Sonnenwald Information Science as a Career ASIS&T Diane Sonnenwald Information Science as a Career
ASIS&T Diane Sonnenwald Information Science as a Career ASIS&T
 
Expository Essay Thesis Statement.pdf
Expository Essay Thesis Statement.pdfExpository Essay Thesis Statement.pdf
Expository Essay Thesis Statement.pdfKristen Farnsworth
 
Cloud Computing Essays.pdf
Cloud Computing Essays.pdfCloud Computing Essays.pdf
Cloud Computing Essays.pdfJessica Spyrakis
 
Use Your Words: Content Strategy to Influence Behavior
Use Your Words: Content Strategy to Influence BehaviorUse Your Words: Content Strategy to Influence Behavior
Use Your Words: Content Strategy to Influence BehaviorLiz Danzico
 
From Crowdsourcing to BigData - how ePatients, and their machines, are transf...
From Crowdsourcing to BigData - how ePatients, and their machines, are transf...From Crowdsourcing to BigData - how ePatients, and their machines, are transf...
From Crowdsourcing to BigData - how ePatients, and their machines, are transf...Ferdinando Scala
 
sxsw-interactive-festival-2013)
sxsw-interactive-festival-2013)sxsw-interactive-festival-2013)
sxsw-interactive-festival-2013)Kristin Milburn
 
Thesis For Narrative Essay. Sample narrative essay
Thesis For Narrative Essay. Sample narrative essayThesis For Narrative Essay. Sample narrative essay
Thesis For Narrative Essay. Sample narrative essayHeidi Andrews
 
Understanding Context for UX Strategy UXSTRAT 2015
Understanding Context for UX Strategy UXSTRAT 2015 Understanding Context for UX Strategy UXSTRAT 2015
Understanding Context for UX Strategy UXSTRAT 2015 Andrew Hinton
 
Internet Essay Topics. Informative Essay Examples sample, Bookwormlab
Internet Essay Topics. Informative Essay Examples sample, BookwormlabInternet Essay Topics. Informative Essay Examples sample, Bookwormlab
Internet Essay Topics. Informative Essay Examples sample, BookwormlabNicole Muyeed
 

Ähnlich wie GonzalezJonas_M2IAICI-Rapport (20)

City Essay
City EssayCity Essay
City Essay
 
Objectification Is A Word That Has Many Negative Connotations
Objectification Is A Word That Has Many Negative ConnotationsObjectification Is A Word That Has Many Negative Connotations
Objectification Is A Word That Has Many Negative Connotations
 
2820181Phil 2 Puzzles and ParadoxesProf. Sven B.docx
2820181Phil 2 Puzzles and ParadoxesProf. Sven B.docx2820181Phil 2 Puzzles and ParadoxesProf. Sven B.docx
2820181Phil 2 Puzzles and ParadoxesProf. Sven B.docx
 
Making our mark: the important role of social scientists in the ‘era of big d...
Making our mark: the important role of social scientists in the ‘era of big d...Making our mark: the important role of social scientists in the ‘era of big d...
Making our mark: the important role of social scientists in the ‘era of big d...
 
Proposal Essay Outline
Proposal Essay OutlineProposal Essay Outline
Proposal Essay Outline
 
Insights into Innovation, Tokyo 8-6-10, Martha G. Russell
Insights into Innovation, Tokyo 8-6-10, Martha G. RussellInsights into Innovation, Tokyo 8-6-10, Martha G. Russell
Insights into Innovation, Tokyo 8-6-10, Martha G. Russell
 
Crowdsourcing and Cognitive Data Analytics for Conflict Transformation - Istv...
Crowdsourcing and Cognitive Data Analytics for Conflict Transformation - Istv...Crowdsourcing and Cognitive Data Analytics for Conflict Transformation - Istv...
Crowdsourcing and Cognitive Data Analytics for Conflict Transformation - Istv...
 
Research 36. How to Write Significance. Code.601.pptx
Research 36. How to Write Significance.  Code.601.pptxResearch 36. How to Write Significance.  Code.601.pptx
Research 36. How to Write Significance. Code.601.pptx
 
Informal Essay Examples. Informal Interview Essay Interview Essays Free 30...
Informal Essay Examples. Informal Interview Essay  Interview  Essays  Free 30...Informal Essay Examples. Informal Interview Essay  Interview  Essays  Free 30...
Informal Essay Examples. Informal Interview Essay Interview Essays Free 30...
 
An Empirical Study on Comment Classification
An Empirical Study on Comment ClassificationAn Empirical Study on Comment Classification
An Empirical Study on Comment Classification
 
How the Foundation Model is Changing the Landscape of Natural Language Proces...
How the Foundation Model is Changing the Landscape of Natural Language Proces...How the Foundation Model is Changing the Landscape of Natural Language Proces...
How the Foundation Model is Changing the Landscape of Natural Language Proces...
 
ASIS&T Diane Sonnenwald Information Science as a Career
ASIS&T Diane Sonnenwald Information Science as a Career ASIS&T Diane Sonnenwald Information Science as a Career
ASIS&T Diane Sonnenwald Information Science as a Career
 
Expository Essay Thesis Statement.pdf
Expository Essay Thesis Statement.pdfExpository Essay Thesis Statement.pdf
Expository Essay Thesis Statement.pdf
 
Cloud Computing Essays.pdf
Cloud Computing Essays.pdfCloud Computing Essays.pdf
Cloud Computing Essays.pdf
 
Use Your Words: Content Strategy to Influence Behavior
Use Your Words: Content Strategy to Influence BehaviorUse Your Words: Content Strategy to Influence Behavior
Use Your Words: Content Strategy to Influence Behavior
 
From Crowdsourcing to BigData - how ePatients, and their machines, are transf...
From Crowdsourcing to BigData - how ePatients, and their machines, are transf...From Crowdsourcing to BigData - how ePatients, and their machines, are transf...
From Crowdsourcing to BigData - how ePatients, and their machines, are transf...
 
sxsw-interactive-festival-2013)
sxsw-interactive-festival-2013)sxsw-interactive-festival-2013)
sxsw-interactive-festival-2013)
 
Thesis For Narrative Essay. Sample narrative essay
Thesis For Narrative Essay. Sample narrative essayThesis For Narrative Essay. Sample narrative essay
Thesis For Narrative Essay. Sample narrative essay
 
Understanding Context for UX Strategy UXSTRAT 2015
Understanding Context for UX Strategy UXSTRAT 2015 Understanding Context for UX Strategy UXSTRAT 2015
Understanding Context for UX Strategy UXSTRAT 2015
 
Internet Essay Topics. Informative Essay Examples sample, Bookwormlab
Internet Essay Topics. Informative Essay Examples sample, BookwormlabInternet Essay Topics. Informative Essay Examples sample, Bookwormlab
Internet Essay Topics. Informative Essay Examples sample, Bookwormlab
 

GonzalezJonas_M2IAICI-Rapport

  • 3. RESUME    Durant les dernières décennies il y a eu une importante amélioration de la précision dans                              la lecture des données disponibles. Par exemple la géolocalisation est devenu de plus en plus                              précise et peut maintenant identifier dans quelle pièce d’un bâtiment un utilisateur se trouve.                            Pourtant ce niveau de précision est il toujours nécessaire ou même utile ? Par exemple un                                utilisateur peut vouloir savoir sa position approximative avec l’aide de phrases tel que “ Vous                              êtes   proche   de   la   station   de   train”   ou   “   Vous   êtes   toujours   loin   du   centre”.   Avec ces deux phrases nous pouvons mettre en lumières deux mots particuliers “ proche” et                              “loin”, pourquoi ? Car ils sont de nature vague, nous pouvons les lire et les comprendre                                cependant il nous est impossible de leurs donner un sens précis. Ces mots ne sont pas fixe                                  dans leur interprétation, ils dépendent du context, croyances de la personne qui les prononcent                            ou   dépendent   d’un   phénomène   jusque   là   inconnu.  La communication humaine est pleine de mots vague, nous les utilisons quotidiennement dans                          notre travail, en société sans vraiment y prêter attention. Etonnamment nous parvenons à                          communiquer sans créer d'ambiguïté même si nous avons des définitions opposée de ces mots.                            De plus l’utilisation de mot vague permet dans une communication d’avoir une liberté                          d’interprétation, cette tâche est laissé à l’interlocuteur qui doit adapter la sémantique de ces                            mots par rapport à la personne qui les a utilisés. Le phénomène d’être vague peut être vu                                  comme un degré de croyance à propos d’un concept et dépend donc de la personne qui les                                  utilises   et   implique   un   traitement   particulier   de   la   part   de   l’interlocuteur.  Ce papier vise à explorer l’utilisation de terme vague comme : grand, petit, gros; pour la                                production automatique de résumé à partir de données numériques. De nos jour avec l’ère du                              Big Data beaucoup de données numériques sont disponibles mais il y a peu de travaux se                                concentrant sur la présentation de ces données. Ce projet explore l’humanisation des données                          avec l’aide de termes vagues et d’algorithme d’apprentissage automatique pour proposer un                        modèle   élégant   et   adaptatif   pour   le   résumé   linguistique   automatique.  Le résumé automatique essai de généraliser un concept sur une base de données et a pour but                                  de transmettre uniquement l’information pertinente. C’est pourquoi les mots vagues semblent                      être le bon outil et une piste sérieuse à explorer de part leurs nature, l’utilisation de termes                                  vague permet une plus grande interprétation pour l’interlocuteur et permet donc une                        généralisation   d’un   concept.    3/55 
  • 4. Abstract    In the last Decade there has been a major increase in precision of some types of readily                                  available data. For example geo­localisation has become more and more precise and can now                            pinpoint locations to within a particular room of a certain building. Yet this level of precision                                always really required or even helpful ? For example an individual may want to know their                                approximate locations with sentences like “ You are near the train station “ or “You are still                                  away from the center”. With this two examples we can highlight two words “near” and “away”,                                why ? Because they are vague, you read this words without noticing that in fact you can’t give                                    a precise definition. They are not fix in their interpretation, they depends of the context,                              beliefs   of   the   person   who   utter   it   or   are   the   effect   of   an   unknown   phenomenon.  Human communication is full of vagueness, we use unconsciously everyday of our life, in work                              environment, in the society... Surprisingly we manage to communicate without ambiguity even                        if we don’t share the same definition on this kind of words. Moreover vagueness allow us in                                  communication to have a freedom of interpretation, letting to the listener the difficult task of                              computing the semantics. Vagueness can be seen has a degree of belief about a particular                              concept and thus depend of the speaker state of mind and imply also a particular treatment for                                  the   listener.  This paper aim to explore the use vagueness for producing automatic summary of                          numeric data with the use of vague terms such as : tall,small, big… Nowadays in the big data                                    era a lot of numeric data are available but yet there is a lack of work for the presentation of                                        this data. This project explore the humanisation of data with the help of vague term and                                machine   learning   in   order   to   propose   an   elegant   and   adaptive   model   for   automatic   summary.  Automatic summary try to generalize a concept over a dataset and aim to transmit only the                                useful information. In that sense vagueness seems to be the perfect tools because of its                              nature, vague term allow a bigger range of interpretation and thus allow generalization of a                              concept.                4/55 
  • 5.   HOST   ESTABLISHMENT   PRESENTATION  A. Bristol   University   The University of Bristol is a red brick research university located in Bristol, United                            Kingdom. It received its royal charter in 1909, and its predecessor institution, University College                            Bristol, had been in existence since 1876. Bristol is organised into six academic faculties                            composed of multiple schools and departments running over 200 undergraduate courses                      situated in the Clifton area along with three of its nine halls of residence. The other six halls are                                      located in Stoke Bishop, an outer city suburb located 1.8 miles away. It is the largest                                independent   employer   in   Bristol.  The University of Bristol is ranked 11th in the UK for its research, according to the                                Research Excellence Framework The University of Bristol is ranked 37th by the QS World                            University Rankings 2015­16, and is ranked amongst the top ten of UK universities. The                            University of Bristol is the youngest British university to be ranked among the top 40                              institutions in the world according to the QS World University Rankings, and has also been                              ranked   at   15th   in   the   world   in   terms   of   reputation.  B.   Laboratory   :   Intelligent   System   Lab  The University of Bristol has a long tradition of excellence in Artificial Intelligence, with                            research groups in Engineering dating back to the 1970s and 1980s. Now all these traditions                              have converged to form the Intelligent Systems Laboratory (ISL), a leading research unit                          counting 15 members of staff (four professors) and about 50 PhD students and postdocs.                            Research activities include foundational work in machine learning (many of the ISL members                          work in this central area of research), and applications to web intelligence, machine translation,                            bioinformatics, semantic image analysis, robotics, as well as natural intelligent systems. Besides                        these applications, research in ISL is a key enabler in a number of strategic research directions.                                Data Science is one of the main frontiers for modern AI, dealing with vast masses of data, both                                    enabling their exploitation and benefiting from them. Another key frontier for intelligent                        systems research is interacting with modern biology, both taking inspiration by it, and providing                            tools   for   it.  5/55 
  • 6. B. Supervisor   :   Doctor   Jonathan   Lawry  Dr Lawry is focussed on developing probabilistic models of vagueness (fuzziness) and                        applying them across a number of application domains in Artificial Intelligence. His approach is                            the identification of vagueness or fuzziness with linguistic (semantic) uncertainty. This                      approach allows for a much more flexible representation framework in which both propositions                          and valuations can be ordered in terms of their relative vagueness, and in which we can                                capture both stronger and weaker versions of an assertion e.g. absolutely short, quite short                            etc. This opens the possibility of developing choice models of assertion in which there is a clear                                  rationale for choosing a vague statement over a (more) crisp one in the presence of                              uncertainty.        6/55 
  • 7. Table   of   Contents  I. Definition   of   Vagueness  A. What   is   vagueness   ?   Why   Language   is   vague   ?  B. Reasoning   and   use   of   vagueness  C. Modelisation   of   vagueness  II. Summarization   of   numeric   data   with   words   A. State   of   Art  B. Vagueness   in   automatic   summary,   a   fuzzy   question  III. Mathematical   framework   for   automatic   summary  A. General   presentation  B. Detailed   framework,   end   to   be   vague  IV. Results  V. Case   study  A. Application   to   web  B. Results  VI. Overview  A. Participation   of   this   framework   within   the   axes   of   investigation  B. Future   investigation   and   amelioration  VII. Conclusion  A. Personal   experience  B. Professional   overview  C. Greeting                                Bibliography    page      8    page      9  page   10  page   18    page   21  page   26    page   31  page   42  page   48    page   49  page   50    page   51  page   53    page   54    page   55  7/55 
  • 8. I. Definition   of   Vagueness   A. What   is   Vagueness   ?   Why   is   language   vague   ?     An unexpected new trend is emerging in apps and web services, indeed the                          information   presented   to   the   user   is   growing   vague.   So   what   is   vagueness   ?   Vagueness can be explained with the help of Classic logic where a sentence is either True or                                  False, this is known as the Boolean logic. But for some words and concepts of Human language                                  such as “Tall” and “Small” classical logic don’t work well. This words are what we call vague,                                  that is if you ask two people to classify a set of human according to their height into two                                      category small and tall. The two person will reach quite the same classification but some                              person will be classify as tall for one and small for the other. This is vagueness, even if in some                                        case you will find no difference with Boolean logic, in some case an object can be tall and not                                      tall as the same time. This effect of vagueness can’t be represented with Boolean logic, it’s                                called borderline case. There is no precise, known height which defines the line between a                              person who is tall and a person who is not. That is why borderline term is used because there                                      is   a   range   where   you   can’t   be   absolutely   sure   that   someone   is   tall   or   not.  Why then we don’t simply adopt as a definition that “tall” will mean above a particular                                threshold   ?   How   Human   represent   vagueness   and   process   with   it   ?    In a communication process view we can first think that vagueness is suboptimally                          because it can create confusion and thus lead to misunderstanding. But in reality our brain and                                communication deal perfectly with vagueness, letting to us the task of interpreting vagueness.                          Even if you don’t think that house is tall because of your experience you can imagine and try to                                      understand the other point of view. Vagueness allow different interpretation depending of the                          protagonist of the conversation, but even with this ambiguity they manage to understand and                            accomplish complex task based on vague information. This is the reason why vagueness is                            treated   in   the   field   of   artificial   intelligence   and   more   particularly   in   reasoning   under   uncertainty.    “    A   concept­expression   is   vague   if   it   is   indeterminate,   for   some   objects,   whether   or   not  the   concept   applies   “  Gottlob   Frege      8/55 
  • 9.   B. Reasoning   and   use   of   Vagueness     We saw in the previous section the nature of vagueness and its particularity, but                            what can we do with vagueness ? Vagueness is proper to language so to Human                              communication, then the first use we might think for application of vagueness is for                            humanising information. For example in a communication process a person will never say “ I                              just bought a new shirt for one hundred twenty five pounds and fifty five penny” instead you                                  will rather say something like “ I just bought an expensive shirt” or “ I just buy a shirt for an                                          hundred of pounds”. In this example we see the difference between the crisp assertion in the                                first place followed by the vague assertion just by reading. One seems more natural, could                              have been uttered by an Human the other one is clearly unnatural and seems emotionless too                                much   mathematical.   Our brain generalize information from our environment every second, and it do the same when                              we talk with other people. We sum the information that we want to transmit using vagueness,                                this is done naturally, unconsciously and we don’t have issue to understand each other. This is                                because we focus on the important information, the information that will transmit the true                            concept. In the example the price was not the primary, important information that was the                              fact that i buy a shirt, furthermore with the word expansive i can predict with the information                                  that   i   have   over   the   speaker   in   what   range   of   price   it   could   be.  Reasoning with vagueness as we see in this little example is focused on balancing with                              crisp and vague terms to express an information, concept, feelings and emotions. But the                            particularity of vagueness is that it's depends of the interpretation of the two people and how                                they can reach a common understanding. One hack of human communication well know is the                              misunderstanding when both speaker and listener interpret differently a concept but don’t                        realise it. Vagueness is always present in communication process and is at the base of human                                speech. The modelization of vagueness will always imply a speaker and a listener, because                            vagueness   rely   on   the   belief   and   goals   of   the   two   people   involved   in   the   discussion.          9/55 
C. Modelling of vagueness

The first question we have to explore is: how do we model vagueness? We saw that classical logic cannot do it, but researchers have proposed variations of Boolean logic. New mathematical tools have emerged to model vagueness; they are grounded in philosophy and more specifically in the problem of knowledge representation. In order to understand these tools we first have to define the difference between the truth value of a vague term and that of a classical term.

A classical term has only two truth values, True or False. A vague term follows this pattern but on top of it admits borderline cases. If we take the example of a person's height and its characterisation as small or tall, we can find a threshold below which we are absolutely sure that the person is small, and another threshold above which we are definitely sure that the person is tall. For a height $h$ and thresholds $t_S < t_T$:

Tall is true if $h \geq t_T$
Small is true if $h \leq t_S$
Borderline case: $t_S < h < t_T$

where $t_S$ and $t_T$ correspond to the thresholds for Small and Tall.

Figure 1: Vagueness with borderline case

According to Figure 1 we have crisp thresholds to state whether a person is tall or small. But in the borderline case we cannot determine whether the person is Tall or Small; it is undefined. Moreover, the value of the thresholds is itself problematic: how should they be chosen?
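To make the three-way split of Figure 1 concrete, here is a minimal Python sketch; the threshold values are illustrative assumptions, not values from the text.

    # Borderline region of Figure 1, with assumed thresholds in metres.
    T_S, T_T = 1.60, 1.85   # thresholds for Small and Tall (illustrative)

    def classify(height):
        if height <= T_S:
            return "small"       # definitely small
        if height >= T_T:
            return "tall"        # definitely tall
        return "borderline"      # neither truth value can be asserted

    # classify(1.72) -> "borderline": Boolean logic has no answer here.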
Epistemicists about vagueness want to retain classical logic, and they endorse the somewhat surprising claim that such thresholds actually exist (they claim we know the existential generalisation "there is an n such that ..." even if there is no particular n of which we know that it is the threshold). Many philosophers, however, find this claim too hard to swallow and take it as evidence that classical logic should be modified.

One problem that all these philosophers try to solve, and where classical logic cannot be used, is the sorites paradox. There are many variants of the sorites paradox, but all of them rest on the difficulty of representing and dealing with vague terms.

Heap paradox

1 grain of wheat does not make a heap.
If 1 grain of wheat does not make a heap then 2 grains of wheat do not.
If 2 grains of wheat do not make a heap then 3 grains do not.
...
If 9,999 grains of wheat do not make a heap then 10,000 do not.
Therefore 10,000 grains of wheat do not make a heap.

Figure 2. Sorites paradox

Solving this problem with classical logic forces us to accept absurd inferences such as "10,000 grains do not form a heap". This is why different logics emerged to treat this kind of paradox; for every model presented below we will return to this issue and see how the model proposes to solve it.

(All the following definitions of models of vagueness are taken from the book [1] by the researcher Kees van Deemter.)
1. Supervaluationism

According to supervaluationists, borderline statements lack a truth value. This neatly explains why it is universally impossible to know the truth value of a borderline statement (recall that in standard logic the truth value of a statement p is either True or False). Supervaluationism retains the law of excluded middle when treating vague terms: for example, the sentence "Charles is a baby or it is not the case that Charles is a baby" remains true. Thus the supervaluationist method allows one to retain all the theorems of standard logic while admitting "truth-value gaps". The basic thought underlying supervaluationism is that vagueness is a matter of underdetermination of meaning. This thought is captured by the idea that the way we use an expression does not decide between a number of admissible candidates for making the expression precise. For example, we can make "baby" precise by saying that x is a baby just in case x is less than one year old; but our use of the expression also allows other precisifications, such as "less than one year plus a second". If Martin is one year old, the sentence "Martin is a baby" will be true on some ways of making "baby" precise and false on others. Since our use does not decide which precisification is correct, the truth value of the sentence "Martin is a baby" is left unsettled. By supervaluationist standards, a sentence is true just in case it is true on every way of making precise the vague expressions contained in it (that is, "truth is supertruth"). A precisification is a way of making precise all the expressions of the language so that every sentence gets a truth value (true or false but not both); in this sense, a precisification is a classical truth-value assignment.

As part of his solution to the sorites paradox, the supervaluationist will assert "There is an n such that n grains do not make a heap but n+1 grains do", for this statement comes out true under every admissible precisification of heap. However, when pressed, the supervaluationist will add an unofficial clarification: "Oh, of course I do not mean that there really is a sharp threshold for a heap."
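The notion of supertruth can be sketched in a few lines of Python; the set of precisifications below (age cut-offs for "baby") is an illustrative assumption.

    # Supertruth: a sentence is true iff true under every precisification.
    PRECISIFICATIONS = [1.0, 1.1, 1.25]   # assumed cut-offs, in years

    def is_baby(age, cutoff):
        return age < cutoff

    def supertruth(age):
        """True if 'x is a baby' holds under all precisifications, False if
        it fails under all, None (truth-value gap) otherwise."""
        values = {is_baby(age, c) for c in PRECISIFICATIONS}
        if values == {True}:
            return True
        if values == {False}:
            return False
        return None

    # supertruth(0.5) -> True; supertruth(2.0) -> False; supertruth(1.05) -> None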
2. Fuzzy logic

Fuzzy logic, introduced by the mathematician Lotfi A. Zadeh, is a form of many-valued logic in which the truth value of a variable lies in the range [0,1] and is called a membership degree. In opposition to Boolean logic, fuzzy logic deals with sets of objects for which there is no precisely defined criterion of membership. Fuzzy set theory handles this kind of class: to define the set old, for instance, there is no precise way to decide whether an object is old or not. Instead a fuzzy set works with a membership function which assigns to each object a continuous value expressing how strongly the object belongs to the set.

So how can fuzzy logic be used for vagueness? As we said before, a vague term is a particular object for which borderline cases arise; to illustrate how fuzzy logic can deal with vagueness we will use the set old. In pure set theory "the class of old people" cannot constitute a set or class in the usual mathematical sense of these terms. With fuzzy theory it can, thanks to the continuous values of the membership function: the class of old people is represented with a degree of confidence. To stay closer to the way humans process this kind of linguistic term, fuzzy set theory combines different terms; most often the antonym is taken to represent the complementary class of a vague term. In our example the antonym of old is young, so we can define two fuzzy sets, as shown in this figure.

Figure 2. Fuzzy set example for the vague term: Age

With this representation, for an input x representing the age you obtain different membership values for young, middle age and old. This theory tries to model how our brain works with different beliefs: every person has a particular mechanism to represent a vague class such as old people. In that sense you will probably agree with another person about the classification of an old person according to their age, that is, you will share roughly the same membership functions and thresholds. But in some cases, because of the thresholds chosen, you will infer that a particular age counts as old while for another person it counts as middle age. Inference in fuzzy logic is called defuzzification; it consists in choosing,
among the different membership values, which one to retain. The simplest way of doing this is to take the maximum of the membership functions to classify the object.

Fuzzy set theory is a promising theory and can also be viewed in terms of a degree of belief about classification into sets. Moreover, fuzzy set theory proposes an elegant way to solve the sorites paradox. Taking back our example of the heap: with classical implication rules we reach the absurd conclusion that 10,000 grains do not form a heap, but with fuzzy logic the sorites paradox is not an issue. When the input variable x representing the number of grains is high, it has a very low membership value for the class "not a heap" and a higher one for the class "is a heap". The paradox dissolves because in fuzzy logic heaphood is just a matter of degree of membership.

Even if fuzzy logic solves the sorites paradox, a bias remains: modelling a fuzzy set requires choosing, for the membership function, thresholds below or above which the concept is certainly True or False (as in Figure 1). The choice of the membership function and of the associated thresholds is crucial; these choices are made by empirical experiment, and there is as yet no accepted, shared methodology for making them. What is missing in fuzzy logic is an emergent process to compute these thresholds automatically, together with the choice of the shape of the membership functions.
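As a minimal sketch of these ideas, the following Python fragment defines trapezoidal fuzzy sets for young, middle age and old, and the max-based defuzzification described above; the breakpoints are illustrative assumptions, precisely the kind of empirical choice criticised in the last paragraph.

    def trapezoid(x, a, b, c, d):
        """Membership 0 outside (a, d), 1 on [b, c], linear in between."""
        if x <= a or x >= d:
            return 0.0
        if b <= x <= c:
            return 1.0
        return (x - a) / (b - a) if x < b else (d - x) / (d - c)

    AGE_SETS = {
        "young":      lambda x: trapezoid(x, -1, 0, 25, 40),
        "middle age": lambda x: trapezoid(x, 25, 40, 55, 70),
        "old":        lambda x: trapezoid(x, 55, 70, 120, 121),
    }

    def defuzzify(x):
        """Simplest defuzzification: keep the label with maximal membership."""
        return max(AGE_SETS, key=lambda label: AGE_SETS[label](x))

    # defuzzify(30) -> "young"; around 62 the middle-age and old memberships
    # get close, exactly the borderline behaviour fuzzy sets are meant to grade.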
3. Many-valued logic

Many-valued logics come from the field of non-classical logics and differ from the most common logic, Boolean logic. They are similar to classical logic in that they accept the principle of truth-functionality, namely that the truth of a compound sentence is determined by the truth values of its component sentences (and so remains unaffected when one of its component sentences is replaced by another sentence with the same truth value). But they differ from classical logic in the fundamental fact that they do not restrict the number of truth values to two: they allow for a larger set W of truth degrees.

Fuzzy set theory, which we have already presented, grew out of this point of view; here we introduce another theory, three-valued logic, which is a good starting point for understanding the mechanisms behind many-valued logics. This logic was introduced by the mathematician Kleene and consists of three truth values {0, ½, 1}, where ½ corresponds to an undefined truth value. The theory can be applied to vagueness: crisp terms always take values in the domain {0, 1} and vague terms in the domain {0, ½, 1}. Inference in this logic relies on the two major connectives of disjunction and conjunction: disjunction in three-valued logic is generally computed as the maximum of the truth values, and conjunction as the minimum.

But this is only one way to treat these operators; another widely used semantics mixes them with probability theory. Figure 3 sums up the different semantics for many-valued logics and groups the two most used definitions of disjunction and conjunction.

Figure 3. Logic operators

With the probabilistic interpretation of these operators the sorites paradox can be solved: the recursive implications decrease the truth value of the concept. The truth value is the product of the truth values of the previous sentences; as at some point a predicate takes the truth value ½, the product then decreases and eventually reaches zero. Thus the concept of heap at some point receives a zero truth value and becomes false.
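A minimal sketch of the two semantics contrasted in Figure 3, with the min/max (Kleene) operators and the probabilistic (product) operators; truth degrees are plain floats.

    def kleene_and(a, b):    # conjunction: minimum of the truth values
        return min(a, b)

    def kleene_or(a, b):     # disjunction: maximum of the truth values
        return max(a, b)

    def product_and(a, b):   # probabilistic reading: degrees multiply
        return a * b

    def product_or(a, b):
        return a + b - a * b

    # Under the product reading, chaining the sorites implications multiplies
    # truth degrees, so "n grains do not make a heap" decays towards 0 as n grows.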
4. Contextualism

Epistemic contextualism (EC) is a recent and hotly debated position. EC is roughly the view that what is expressed by a knowledge attribution (a claim to the effect that S "knows" that p) depends partly on something in the context of the attributor, hence the name "attributor contextualism". So EC, of the sort which concerns us here, is a semantic thesis: it concerns the truth conditions of knowledge sentences, and/or the propositions expressed by utterances of them. The thesis is that only relative to a contextually determined standard does a knowledge sentence express a complete proposition: change the standard and you change what the sentence expresses; acontextually, no such proposition is expressed. In this respect, knowledge utterances are supposed to resemble utterances involving uncontroversially context-sensitive terms. For instance, just what proposition is expressed by an utterance of:

1. He is a tall person
2. That's red
3. He is a nice person

depends in obvious ways upon facts such as the location or identity of the speaker and the referent of the demonstrative. Contextualism rests on the idea that the truth value of a predicate depends on the context. To illustrate this view, suppose we have a jury of two people who must classify persons according to their height, under two scenarios:

- each person to be judged passes on stage one by one and then leaves the scene;
- each person to be judged comes on stage and stays.

In the first scenario the context is not built dynamically, because everyone leaves the stage once classified. There will be disparities in the jury's judgements, since both judges rely on their personal knowledge and interpretation of the predicates tall and small. In the second scenario, by contrast, the judges have all the people in front of them; the context is built dynamically from the set of people presented. They will therefore reach more or less the same classification because they build the same context, and with this reasoning the sorites paradox is solved.
But it is solved only theoretically: contextualism does not propose a mathematical model of this view. It is more a philosophical standpoint from which one can start to build a more concrete implementation using mathematical tools.

Many criticisms have been levelled at this solution, because contextualism only displaces the issue of vagueness into the human mind. For the contextualist, every definition we can give to a vague term depends entirely on our psychological state, which was shaped by our life experience. A good example is comparing two children:

- one who grew up in a wealthy family;
- the second in a poor one.

When they are teenagers they will not have the same definition of a vague term such as "expensive". They have built two different bodies of knowledge about this term according to their development and the environment they were confronted with. It is as if during childhood development, especially in the first years of life, the brain learns and fixes by habit the thresholds that represent vague terms. That is why, when you are older, it is harder to change the meaning of a vague predicate: you are conditioned by your personal experience.

Personal choice

There are many other ways to deal with vagueness, but we chose to explore the most used ones. After careful reflection we chose in this project to explore the use of fuzzy logic mixed with contextualism. This choice is motivated by the wish to develop an adaptive, autonomous framework to represent vagueness. The contextualist point of view can be explored with unsupervised learning algorithms in order to propose an emergent mechanism that learns the right thresholds for vague terms. Furthermore, we believe this methodology is closer to the mechanisms involved in the human brain; this is supported by readings in psychology and cognitive science on children's language development.
II. Summarization of numeric data with words

The summarisation of data is an active field of research, especially in artificial intelligence and data mining. With the growth of data a lot of information can be extracted, yet we struggle with the presentation of this information. Much progress has been made, but it focuses on machine learning and classification tasks, and understanding and handling the extracted knowledge remains reserved to specialists. Interest is now returning to the presentation of data: the goal is to make the information understandable by the majority of people, without requiring special knowledge. This field gathers researchers from natural language processing, artificial intelligence, mathematics and beyond. Vagueness is widely exploited in this field, as it is a cornerstone of human language, and especially so in computer science. We will introduce the different methods explored by researchers in this field and see how vagueness plays an important role in this area.

A. State of the art

Automatic summarisation provides a new tool to extract knowledge from a large set of information. It is a communication process in which, given some inputs, information must be extracted and transmitted to someone else. To represent this process a two-agent model can be used, composed of a speaker and a listener. The speaker is the major actor: he is in charge of processing the input information in order to extract patterns. These patterns are then translated into a communication channel and transmitted to the second agent, the listener. The listener is a passive actor who only receives the information from the speaker and interprets it. In some applications he must also choose an action to perform depending on the message received. This kind of problem requires a more complex model; the field of game theory is often used, where the speaker-listener pair is represented as a game. The goal is then to find the best summary to transmit in order to maximise the reward, as in typical game models; the difficulty is to define the right reward for the application.

The speaker is the important part, while the listener is in most cases a human, so little work is needed on that side. The model of the speaker agent depends on the application and is composed of different, specific parts. For example, the speaker must implement an algorithm to process the input information and extract patterns and knowledge; this is where machine learning comes in.
The information inferred from this part is indigestible if presented to a human as-is, which is why the speaker must also implement a linguistic process to communicate the extracted patterns. The field of natural language generation, together with game theory, proposes elegant architectures and mathematical frameworks. In the case where the listener uses the information from the speaker to perform an action, the speaker must take this into account in the summarisation process: he has to exploit the beliefs and possible actions of the listener to produce and choose the right message. This kind of problem relies on decision-making under uncertainty, where game-theoretic algorithms are again widely explored. To sum up and sharpen what has been said, Yager in [2] proposes an elegant and modular model for automatic summarisation:

Figure 4. Yager summary model

The summarizer is a linguistic sentence, most of the time a vague one. To illustrate this model let us take an easy example:

S = middle age
Q = most
T = degree of truth (computed from the dataset; Yager uses fuzzy logic to compute it).

Yager lays the groundwork for summarisation and proposes a general architecture that can be made more complex depending on the application. For the computation of T, fuzzy logic is used most of the time. P. Villar et al. in [2] use Yager's model to propose automatic summarisation of opinions in tourism hotel data. They go deeper into the use of linguistic terms to describe heterogeneous data and propose a fuzzy model based on semantic translation as a tool to produce linguistic summaries. Their goal is to let both a data analyst and a non-specialist understand and exploit the inferred information. Their work starts from the Yager model, with an upstream process to identify and classify the input information, which in their application is sentiment classification.
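Before returning to Villar's pipeline, here is a minimal sketch of evaluating such a summary, assuming the classic scheme $T = \mu_Q\!\left(\tfrac{1}{n}\sum_i \mu_S(x_i)\right)$; both membership functions below are illustrative assumptions.

    def mu_middle_age(age):            # summarizer S = "middle age" (assumed)
        return max(0.0, 1.0 - abs(age - 45) / 20)

    def mu_most(p):                    # quantifier Q = "most" on proportion p
        return min(1.0, max(0.0, (p - 0.3) / 0.5))

    def truth_degree(ages):
        """T: degree of truth of 'most of the records are middle age'."""
        p = sum(mu_middle_age(a) for a in ages) / len(ages)
        return mu_most(p)

    # truth_degree([44, 47, 50, 23, 61]) -> 0.52 on this toy data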
This classification is important because it is used to determine the reputation of a hotel from different textual and numeric inputs. To model vagueness they chose fuzzy logic, with trapezoidal curves as membership functions. Their main work focuses on combining and computing degrees of truth from different vague terms. They go much further in this direction and propose several methods for the aggregation of vague terms, and different approaches to balance the loss of information induced by vagueness.

A. Ramos et al. in [3] also explore the use of fuzzy logic to produce automatic short-term textual weather forecasts for the region of Galicia. Their approach is mainly based on fuzzy and crisp operators, but they innovate with the use of an intermediate language to capture vagueness and produce linguistic terms. In contrast to Yager's model, the architecture of A. Ramos implements a preprocessing stage where information is extracted and compiled into intermediate code, from which natural language templates finally produce the linguistic weather forecast.

Figure 5. Architecture for short-term weather forecast

Even though many architectures for linguistic summarisation exist, they all build on the Yager architecture. The two other architectures we saw extend it and focus on specific parts of the summary, such as a finer modelling of vagueness, and are dependent on the application.

In the next part we will focus on, and go deeper into, the mathematical formalism adopted to model and represent vagueness in these two articles. Fuzzy logic will be further explained through the presentation of the work of D. Dubois, a leading researcher in this field.
B. Vagueness in automatic summaries, a fuzzy question

In this part we focus on the formalism adopted in [2], [3], [4], and in particular on the use of fuzzy logic. Fuzzy logic was introduced by Zadeh to formalise human knowledge; Dubois in [5] explains how fuzzy set theory and vagueness are related, even though Zadeh wanted to keep them distinct. The claim that fuzzy sets are a basic tool for addressing the vagueness of linguistic terms has been around for a long time. But some researchers, such as Novák, oppose vagueness to uncertainty and require a vague term to fulfil three features: the existence of borderline cases, unsharp boundaries, and susceptibility to the sorites paradox.

Fuzzy logic has been controversial among philosophers, many of whom are reluctant to consider a truth-value system other than the Boolean one. One of the reasons for the misunderstanding between fuzzy sets and the philosophy of vagueness may lie in the fact that Zadeh was trained in engineering mathematics, not in philosophy. In particular, vagueness is often understood as a defect of natural language (since it is not appropriate for devising formal proofs, it challenges the usual rational forms of reasoning). Indeed, the vagueness of linguistic terms was considered a logical nightmare by early 20th-century philosophers. In contrast, for Zadeh, going from Boolean logic to fuzzy logic is a positive move: it captures tolerance to errors (softening blunt threshold effects in algorithms) and may account for the flexible use of words by people. It also helps with information summarisation: detailed descriptions are sometimes hard to make sense of, while summaries, even imprecise ones, are easier to grasp. The link between fuzzy set theory and vague terms can be argued from the idea that it is natural to represent incomplete knowledge with sets. But fuzzy logic has been understood in various ways; it helps to model uncertainty and degrees of belief, and can even be connected with modal logic.

Vagueness is a phenomenon observed in the way people use language, characterised by variability in the use of some concepts between listener and speaker. One cause of such variability may be the gradual perception of some concepts or words in natural language. This variability of interpretation and perception can be used in automatic summarisation to capture a concept, as in [4], where A. Ramos-Soto et al. use it to generate weather forecasts. In [3] and [5] vagueness is also used to produce automatic summaries; even though all these works use fuzzy logic to represent vagueness, they all differ in the way they implement it.
Ramos-Soto: weather forecasts

Ramos et al. in [4] proposed an architecture to automatically produce weather forecast summaries for the 315 Galician municipalities. Formally, each municipality $M$ has an associated forecast data series $D_M = \{D_{sky}, D_{wind}, D_{tmax}, D_{tmin}\}$, which includes a data series for each input variable considered: sky state ($D_{sky}$), wind ($D_{wind}$), and maximum ($D_{tmax}$) and minimum ($D_{tmin}$) temperature. For clarity, in what follows we consider the data series of a single municipality.

For each forecast data series, Ramos et al. obtain linguistic descriptions of seven forecast variables, namely cloud coverage, precipitation, wind, maximum and minimum temperature variation, and maximum and minimum temperature climatic behaviour. For this they have devised a computational method divided into several linguistic-description generation operators. This is where fuzzy logic is used to translate these features into vague terms; we take the sky data as illustration:

- The first stage in their application is to transform the chronological data series into temporal linguistic terms. To do so they use fuzzy sets to represent the temporal terms {Beginning, Half, End}, each with an associated membership function.
- The second stage is to capture the concept associated with the main feature; here the sky data are translated into the fuzzy sets CCL = {C, PC, VC} ("clear", "partly cloudy", "very cloudy").

The procedure then concatenates all these temporal descriptions, taking for each the maximum degree of membership in the fuzzy sets. The output is an intermediate code with vague terms describing the weather in precise time windows. The overall process is repeated for each feature; Ramos et al. chose to first translate the numeric data into vague terms so as to produce, at the end, a linguistic weather forecast using template and NLG methods.
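A minimal sketch of this two-stage translation for the sky series, assuming simple triangular membership functions; the real operators in [4] are considerably richer.

    def mu_period(t, n):
        """Fuzzy temporal terms over positions 0..n-1 of the series (n >= 2)."""
        x = t / (n - 1)
        return {"Beginning": max(0.0, 1 - 3 * x),
                "Half":      max(0.0, 1 - 3 * abs(x - 0.5)),
                "End":       max(0.0, 1 - 3 * (1 - x))}

    def mu_cover(c):
        """CCL fuzzy sets over a cloud-cover fraction c in [0, 1]."""
        return {"C":  max(0.0, 1 - 2 * c),               # clear
                "PC": max(0.0, 1 - 2 * abs(c - 0.5)),    # partly cloudy
                "VC": max(0.0, 2 * c - 1)}               # very cloudy

    def argmax(d):
        return max(d, key=d.get)

    def intermediate_code(series):
        """Label each point by its max-membership period and coverage term."""
        n = len(series)
        return [(argmax(mu_period(t, n)), argmax(mu_cover(c)))
                for t, c in enumerate(series)]

    # intermediate_code([0.1, 0.2, 0.6, 0.9]) ->
    # [("Beginning","C"), ("Half","C"), ("Half","PC"), ("End","VC")]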
Ramon A. Carrasco: automatic summaries for tourism web data

In this paper [3] Carrasco et al. propose a novel model to aggregate heterogeneous data from various websites carrying opinions about hotels. They focus on the mathematical modelling of vague terms and on how to solve the issue of crisp boundaries. They use the same architecture as Yager in [1] but go deeper into the representation of vagueness and the formalism adopted. As in [4] they gather various data about hotel opinions from websites, but they differ in that they also use linguistic input from forums or the comment sections of rating websites (TripAdvisor, ...). We will focus only on the way they treat and implement fuzzy logic with vague terms, as this is the core issue; to illustrate the discussion we take as an example the age of a hotel's guests.

A set of seven terms for the age of the hotel guest could be given as follows: $s_0$ = baby, $s_1$ = child, $s_2$ = teenager, $s_3$ = young, $s_4$ = adult, $s_5$ = mature and $s_6$ = old. The semantics, i.e. the membership value, is computed with unbalanced trapezoidal functions. A trapezoidal function is represented by the 4-tuple $(a, b, c, d)$, where $b$ and $c$ delimit the interval on which the concept is totally true (membership 1), and $a$ and $d$ are the lower and upper thresholds beyond which the concept is false (membership 0). The vague regions are the intervals $[a, b]$ and $[c, d]$, and the membership value can only lie in the interval [0, 1].

To soften the issue of crisp thresholds another parameter, $D$, is added; it represents the range of translation within which the concept remains true. To handle it, two sets for high and low translation are created in order to compute the truth degree of a concept. The idea of the translation is to capture the different possible interpretations arising from the choice of the thresholds. Furthermore, the authors propose a weighted model, for example to give more prominence to the metrics of valuable clients. They define two operators for this: the first is a simple weighted sum, the second a more complex aggregation which can be viewed as a quasi-arithmetic average. This procedure is applicable to vague terms but also to crisp ones; to deal with both, the authors propose a grammar G in which they store the space of interpretation of the terms. For example, primary terms have no D parameter, whereas terms carrying a high/low comparative have the D parameter.
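A minimal sketch of a trapezoidal term with a translation parameter, assuming, as an illustration only, that $D$ simply shifts the whole 4-tuple; the operators in [3] are richer than this.

    def trapezoid_4tuple(x, a, b, c, d):
        """Membership 1 on [b, c], 0 outside (a, d), linear in between."""
        if x <= a or x >= d:
            return 0.0
        if b <= x <= c:
            return 1.0
        return (x - a) / (b - a) if x < b else (d - x) / (d - c)

    def translated(x, term, D):
        """Evaluate a term (a, b, c, d) after translating its support by D."""
        a, b, c, d = term
        return trapezoid_4tuple(x, a + D, b + D, c + D, d + D)

    # e.g. with "young" as (15, 18, 25, 30), translated(x, (15, 18, 25, 30), 2)
    # captures a slightly "higher" reading of the same vague term.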
This figure summarises the process proposed by Carrasco et al.:

Figure 5. Fuzzy model based on semantic translation

Janusz Kacprzyk: fuzzy logic for linguistic summarisation of databases

Kacprzyk, in his paper, aims to produce a new query-system interface to display the information in a database. The system is based on fuzzy logic and builds on the Yager architecture {S: summarizer, Q: quantity in agreement, T: degree of truth}. The innovation of the paper lies not in the way fuzzy logic is used to represent vagueness but in the way the combinatorial issue of automatic summarisation is treated.

To highlight this issue let us take the case study from the paper: a computer retailer. To summarise the sales of computers many options arise, such as "most of the sales are second-hand", and the summary can be made more precise by adding conjunctions and disjunctions: "most of the sales are second-hand and/or recent computers". Given a set of attributes A and a vocabulary V to describe the features, with connectives such as AND and OR, the search space of possible summaries is huge and becomes an issue to compute over a large database.

Computing the validity of each summary is a considerable task, and George and Srikanth (1996) use a genetic algorithm to find the most appropriate summary in the search space. In his approach Kacprzyk also uses a genetic algorithm, and the overall quality (goodness) of a summary is given by the weighted sum of partial quality indicators $T_1, \ldots, T_s$ (cf. Kacprzyk and Yager, 1999), the weights being derived from expert testimonies through pairwise
comparisons between the particular indicators using Saaty's AHP method. Thus, basically, the problem is to find an optimal summary $S^* \in \{S\}$ such that:

$$S^* = \arg\max_{S} \sum_{i=1}^{s} w_i \, T_i(S)$$

where $w_i$ represents the weight associated with the partial quality indicator $T_i$.

In this paper the author highlights the combinatorial issue of automatic summarisation; in our work a special focus is placed on this part in order to propose a fast and reliable algorithm to generate the optimal summary.

Vagueness to classify

We saw in all these different papers that vagueness is a widely used phenomenon, especially in summarisation tasks. This is because in a summarisation task the goal is to find the most global summary covering different concepts, such as "well-paid workers". In this summary the interpretation of "well paid" allows a greater set of possible objects than the sentence "workers paid $120 per hour". This can be viewed as a classification problem where the goal is to create sets using vague concepts, which is why vague adjectives are a good tool for it: they allow a larger range of interpretations and thus can, in theory, capture fuzzy concepts and carry more information.

In our work we build on all the techniques reviewed above; our aim was to propose a modular architecture for automatic summarisation. But we focus more on classification as the final task, for example the discrimination between diabetic and non-diabetic patients with the use of vagueness. Our work also focused on proposing a solution for the biases that exist in the modelling of vagueness, such as fixed thresholds and decision-making. In order to be closer to human thinking we explored papers in psychology and psycholinguistics on childhood language development, to propose a mathematical model directly inspired by the behaviour of the human brain.
III. Mathematical framework for automatic summarisation

A. General presentation

We saw in the last section the nature of vagueness and the ways to use it for automatic summarisation. In the papers [3], [4], [5] the authors propose new models and architectures for different applications of summarisation with vagueness, more precisely with fuzzy logic. In [2] Yager lays out a basic architecture which can be made more complex according to the application. Furthermore, we saw that modelling vagueness implies some bias depending on the chosen logic; the most critical one is the choice of the membership function and of the thresholds when fuzzy logic is used. To be closer to human thinking we explored the development of language and of the meaning of vague adjectives in children. S. Andersen in [7] explores the process by which children treat vague concepts such as the words cup and glass. The idea is thus to take inspiration from the emergence of language in children: how do they treat vague concepts? How do they learn the boundaries involved in vague concepts?

The overall approach explored in this work builds on progress in the field of cognitive science, that is, the mix of viewpoints from mathematics, psychology and logic used to understand and model human cognition. We wanted to follow this style of reasoning to propose a general and reliable framework inspired by the behaviour of the human brain. To do so we split the framework into different parts, each focused on a specific treatment; the goal was to propose, based on the Yager model [1], a cognitively inspired architecture.

Figure 6. Framework architecture
Extraction of data

This part deals with the extraction of information from datasets. In this work we use the UCI repository, a well-known repository for machine learning; data extraction is implemented for the CSV format. CSV stands for comma-separated values; on top of this convention, the class attribute has to be the last one in each row (a minimal loading sketch is given below). The nature of the information is not restricted to numeric data: we extended it to nominal values, but only those with fewer than 8 possible values.

This is not a crucial part of the project but it is essential for the rest of the framework, and it became central when we proposed a concrete application: a web data extractor. The idea is to extract the data directly from a website using a web crawler; in this case the Selenium library was used. In the interest of having a modular framework an inheritance model was built: to adapt the process to a specific website only a few parameters have to be set. This mechanism allows us to test our framework both on real data from the internet and on standard datasets such as those of the UCI repository.

Model

In this framework a model can be viewed as the definition of an object, that is, all the features that describe this object: the names of the attributes, the number of classes, their names, and the range of values for nominal features. The model part is the foundation of the framework; it defines the object and provides the right linguistic terms to produce the final summary. For example, the diabetes-Pima dataset is described through several features such as the plasma glucose concentration two hours after ingestion, the age, and so on. These attributes are used at the end of the summary process to produce sentences, with connectives (AND, OR) to link them into a more convincing summary. This is inspired by Ramos et al. in [4], who use templates for the generation of linguistic summaries; the motivation is to focus only on the vague terms and not on the crisp ones. Furthermore, this design allows the framework to be adapted to different models: for any dataset only one class has to be written to describe the object.
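As referenced above, here is a minimal sketch of the CSV extraction step, assuming only the convention stated in this section (comma-separated values, class attribute last); the real extractor also handles the web-crawling path.

    import csv

    def load_dataset(path):
        """Return (rows of attribute values, class labels)."""
        rows, labels = [], []
        with open(path, newline="") as f:
            for record in csv.reader(f):
                if not record:            # skip blank lines
                    continue
                *attributes, label = record
                rows.append(attributes)
                labels.append(label)
        return rows, labels

    # e.g. X, y = load_dataset("pima-diabetes.csv")  # hypothetical file name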
Math: fuzzy set

This part contains the mathematical definition of the fuzzy sets; it gathers all the fuzzy terms with their mathematical definitions. The choice of membership functions and of the other parameters proper to fuzzy set theory is made in this part of the framework. Together with the other math parts it constitutes the base for treating vagueness according to the contextualist point of view.

Math: statistics / machine learning

Given a dataset to summarise, many statistical metrics can be extracted, such as the distribution of one or several features, or dependencies learned between sets of objects. In this framework this part focuses on extracting distributional representations such as cumulative histograms or multi-dimensional views of the objects. It relies more on computational modelling than on a single algorithmic process; its role is to translate the input data into mathematical representations. We use libraries such as SciPy and NumPy to allow easy manipulation, and scikit-learn has been widely used to test and include machine learning algorithms such as clustering along several dimensions. Following our idea of a "cognitively inspired architecture", this part is the translation stage, where raw data are processed to allow the extraction of knowledge. It is linked with the context part, as shown in Figure 6; together these two parts implement the contextualist point of view. The context is built dynamically from the input data, which are translated and processed to produce the mathematical context with which vagueness is treated.

Math: context

To recall the contextualist point of view: to handle vagueness, this logic proposes that the thresholds proper to each vague term exist but are not fixed; they are computed dynamically according to the set of objects presented. This class is directly inspired by that philosophy: the context is represented by a fraction of the dataset, and for every vague term the thresholds are computed on the fly.
Language

This part gathers all the vague vocabulary used to produce summaries, such as big, low, high, normal... Following Zadeh's fuzzy theory as in [8], where fuzzy quantifiers are used to modify the sense of vague terms, we use: very, most. For example, very is a quantifier which amplifies the sense of a vague term such as tall, and thus modifies its membership function. Other quantifiers are used to express what fraction of the group is characterised by the summary, depending on the distribution of objects it covers. For example, when someone says "some of the birds are smart", the word some refers to a portion of a set, but not a precise one. It can therefore be used, when the summary does not fit the whole target concept (i.e. the class), to qualify the summary so that it remains true under interpretation. Moreover, as Zadeh proposes in [8], it can be combined with other distributional vague terms such as "most of all" or "the majority", in order to cover and transmit the maximum information about the target concept.

Summarizer

This part follows the architecture proposed by Yager in [2]; it uses the 3-tuple summarizer, quantity in agreement, truth value: {S, Q, T}. The difference is that in this framework the summarizer is viewed as a classification task, the goal being to find the most accurate linguistic summary to discriminate the different classes present in the input dataset. In our model the quantity agreement and the truth value use the same mathematical metric: the distribution of a given summary over the dataset. That is, from the probability assigned to a given summary, a quantity agreement is computed in order to capture and transmit the distribution in linguistic terms (a minimal selection sketch is given below). Seen through information theory, this is a choice problem: given a specific distribution, a quantity agreement has to be chosen so that the summary transmits the right interpretation.

Decision making

Here the mathematics of decision under uncertainty, i.e. with probabilities, is widely used. Two computational methods were explored to deal with this issue. The first consists of translating numeric features into words, for example "1.86 meters" -> "high", before the summarisation process. This method keeps the framework very fast, but on the other hand it discards some information that is useful for decision-making. That is why, in the second method, all the membership values are kept and processed at the end, where a mathematical decision equation performs the classification choice.
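Returning to the quantity-agreement mechanism of the Summarizer part, here is a minimal selection sketch; the proportion cut-offs are illustrative assumptions, not the values used in the framework.

    def quantity_word(p):
        """Map the fraction p of the class covered by a summary to a word."""
        if p >= 0.95: return "all"
        if p >= 0.70: return "most"
        if p >= 0.40: return "many"
        if p >= 0.10: return "some"
        return "few"

    # A summary true for 78% of the class yields "most of the ... are ...".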
Display summary

A dedicated part produces the linguistic summary: from the output of the decision-making part and of the model part, a summary is generated. A very basic NLG pipeline is used to produce the sentences; most of the production is done with the template, i.e. the model.

We have described here the general parts of the framework and the behaviour and role of each of them. The whole architecture was inspired by neurofunctional knowledge; in the next part a direct link will be made, following the evolution of the different versions of the framework.
B. Detailed framework, end to be vague

In the following part the architecture is detailed, with arguments for the design choices and the mathematical tools used in every part.

1. Framework version 1.0

The first version of the framework was very simple and aimed at trying different hypotheses before modelling the whole architecture. We started with this very simple architecture:

Figure 7. Framework-V1 architecture

The first architecture, shown in Figure 7, is quite basic and was tested on the Iris dataset from the UCI repository. The idea of this first version is to test the following hypothesis.

Suppose that our data has $n$ attributes $a_1, \ldots, a_n$, so that $a_j$ takes values in $\Omega_j$ for $j = 1, \ldots, n$. Let $\Omega = \Omega_1 \times \cdots \times \Omega_n$ and let $\vec{x} = (x_1, \ldots, x_n)$ denote the vector of attribute values. A data set then takes the form $DB = \{\vec{x}^{(1)}, \ldots, \vec{x}^{(m)}\}$ where $\vec{x}^{(i)} \in \Omega$.

Now we define a language $\mathcal{L}$ with propositional variables, which in our case are the vague terms High and Low. Then for every attribute of the data set we ground each propositional variable of $\mathcal{L}$ as follows: given an $\alpha$, two thresholds are computed over the cumulative histogram $F_j$ of attribute $a_j$. The lower threshold is given by $t_{low} = F_j^{-1}(\alpha)$.
The upper threshold is given by $t_{up} = F_j^{-1}(1 - \alpha)$, so that $t_{low} \leq t_{up}$; the condition on $\alpha$ is $0 < \alpha \leq 0.5$, due to the upper bound $1 - \alpha$.

Figure 8. Example of threshold computation

Given these thresholds over all the attributes of the data set, we can use fuzzy set theory to compute the membership value of each propositional variable. To model the membership functions we chose trapezoidal curves, of right-shoulder type for the lower bound and left-shoulder type for the upper bound; this choice is purely empirical.

With these two membership functions, a truth value for each propositional variable is calculated for every item in the dataset. The decision maker in this version is based on a maximum average; to illustrate it we take the Iris dataset. Given a class $C$ and an attribute $a_j$, with the language $\mathcal{L} = \{Low, High\}$, the goal is to find the best sentence to describe a specific attribute for a given class.
• 33. For the class Iris-setosa with the attribute sepal length, the vague term $w \in L$ chosen to describe it is the one maximising the average membership over the data $D_c$ of the class:

$w^{*} = \arg\max_{w \in L} \frac{1}{|D_c|} \sum_{\vec{x} \in D_c} \mu_w(x_j)$

Figure 9. Decision making, Framework-V1

This first version of the framework was not guided by a cognitive perspective; the goal was to see the discrimination offered by the restricted language $L = \{\text{Low}, \text{High}\}$. We did not explore the classification of classes using this linguistic description; that is done in the next version of the framework.
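A minimal sketch of this maximum-average decision rule, with a single shoulder membership standing in for the trapezoids above; the threshold values are illustrative:

```python
# Sketch of the V1 decision maker: choose the term with maximal
# average membership over the class data. Thresholds are illustrative.
import numpy as np

def mu(x, t_l, t_h, word):
    """Shoulder membership for Low/High between the two thresholds."""
    r = (x - t_l) / (t_h - t_l)
    return float(np.clip(r if word == "High" else 1.0 - r, 0.0, 1.0))

def best_word(class_values, t_l, t_h, language=("Low", "High")):
    """Return the vague term with maximal average membership for the class."""
    scores = {w: np.mean([mu(x, t_l, t_h, w) for x in class_values])
              for w in language}
    return max(scores, key=scores.get)

print(best_word([4.9, 5.0, 5.1, 4.7, 4.6], t_l=5.2, t_h=6.5))  # -> Low
```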
• 34. 2. Framework version 2.0
After the results highlighted with the first version of the framework, we decided to take inspiration from neuroscience, particularly from the treatment of language. In the human brain, two different areas are mainly involved in the production and comprehension of language. The first one, Broca's area, was discovered by Paul Broca in the nineteenth century when he examined the brain of a dead patient who had had an unusual disorder: the patient had not been able to talk, even though no motor lesion of his tongue or mouth was noticed. Broca examined the brain and discovered a lesion in the posterior portion of the frontal lobe of the left hemisphere. Years later Carl Wernicke, a German neurologist, discovered another part of the brain, involved in understanding language, in the posterior portion of the left temporal lobe. People who had a lesion at this location could speak, but their speech was often incoherent and made no sense.
From this we decided to build an architecture that tries to mimic this process by dissociating the language from its semantics. In our model this was done by adding two parts: a Language part, which represents Broca's area, and a Context part, which represents Wernicke's area. The first part gathers all the vocabulary and the rules to produce the right sentences. The second one is involved in making sense of these words, in our case putting semantics on vague terms with the use of fuzzy logic.

Figure 10. Framework-V2 Architecture
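As a structural illustration of this dissociation, here is a minimal sketch with two objects, one holding the vocabulary and sentence rules and one holding the fuzzy semantics; all class names and membership functions are illustrative assumptions:

```python
# Minimal sketch of the V2 split: a Language part ("Broca" role) and a
# Context part ("Wernicke" role). Names and numbers are illustrative.
class LanguagePart:
    """Vocabulary and the rules to produce sentences."""
    vocabulary = ("Low", "Normal", "High")

    def sentence(self, attribute, word):
        return f"{attribute} is {word}"

class ContextPart:
    """Fuzzy semantics: maps each vague word to a membership function."""
    def __init__(self, membership_functions):
        self.mu = membership_functions  # word -> callable

    def truth(self, word, value):
        return self.mu[word](value)

language = LanguagePart()
context = ContextPart({
    "Low": lambda x: max(0.0, min(1.0, (6.5 - x) / 1.3)),
    "High": lambda x: max(0.0, min(1.0, (x - 5.2) / 1.3)),
})
print(language.sentence("sepal length", "Low"), context.truth("Low", 5.0))
```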
• 35. The method used to put semantics on vague terms is the same as in the first version of the framework, with thresholds computed from the cumulative distribution of each attribute. Furthermore, we focus on using summarisation and vagueness for a classification task: the goal is to find the summary that allows the best discrimination between the different classes. The problem then becomes finding the best summary, given a vocabulary, that has the highest probability of being true for the target class and the lowest probability of being true for the other classes. This problem can be viewed as a game, so algorithms from game theory can be used, especially tree search guided by a heuristic.
Taking the same definitions as previously, with a dataset $D$, a vector of features $\vec{x}$ and a language $L$ composed of propositional variables, we add the logical connectives $\wedge$ and $\vee$.
The task becomes finding the conjunction or disjunction of attributes that best discriminates a specific class $c$, using this heuristic to guide the exploration of the search space:

$h(\theta) = P(\theta \mid c) - \max_{c' \neq c} P(\theta \mid c')$

Theta is the sentence explored in the tree search: a combination of conjunctions or disjunctions of attributes with vague terms. The heuristic directly translates the idea that a sentence has to be mostly true for the target class $c$ and only a little true for the other classes $c'$.
For computational and performance reasons, we chose to first translate all the numeric data into our language, in this case $L = \{\text{Low}, \text{Normal}, \text{High}\}$. Following Zadeh in [8], we introduce the quantifiers very and quite to our language; these quantifiers act directly on the membership function. The quantifier very raises the membership value of each word to the power two, while for the quite quantifier the square root is used. The same membership functions as before were taken to model Low and High; for the word Normal we chose a Gaussian membership. This choice was made because even if you cannot describe how a single random thing happens, a whole mess of them together will act like a Gaussian.
So, given a dataset with features described by the vector $\vec{x}$, we translate the numeric data into linguistic data with:

$word(x_j) = \arg\max_{w \in L} \mu_w(x_j)$

In Figure 11 we can see the result of this process, with the translation of numeric input into words:

Figure 11. Example of numeric-to-words translation

This translation gives the search process a better computational time; the algorithm exhaustively searches for word matchings over the sentences $\theta$ given by the vocabulary.
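The following sketch illustrates the quantifiers and the numeric-to-word translation, together with the discrimination heuristic written as a difference of probabilities, which is one plausible reading of "most true for the target class, a little true for the others"; membership shapes and thresholds are illustrative assumptions:

```python
# Sketch of the V2 step: Zadeh-style quantifiers, word translation by
# the max operator, and the discrimination heuristic. Illustrative only.
import numpy as np

def memberships(x, t_l, t_h):
    """L = {Low, Normal, High}: shoulders for Low/High, Gaussian for Normal."""
    r = float(np.clip((x - t_l) / (t_h - t_l), 0.0, 1.0))
    mid, sigma = (t_l + t_h) / 2.0, (t_h - t_l) / 4.0
    return {"Low": 1.0 - r, "High": r,
            "Normal": float(np.exp(-((x - mid) ** 2) / (2 * sigma ** 2)))}

def very(m):  return m ** 2     # Zadeh concentration: squares the membership
def quite(m): return m ** 0.5   # Zadeh dilation: square root of the membership

def translate(x, t_l, t_h):
    """Translate a numeric value into the word with maximal membership."""
    mu = memberships(x, t_l, t_h)
    return max(mu, key=mu.get)

def heuristic(p_target, p_others):
    """Sentence score: very true for the target class, little elsewhere."""
    return p_target - max(p_others)

print(translate(6.9, t_l=5.2, t_h=6.5))  # -> High
print(heuristic(0.9, [0.2, 0.3]))        # -> 0.6
```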
• 36. The search algorithm with the heuristic computes, for a given class, the summary that best discriminates this class. This is an example of the output generated by this version of the framework:

Figure 12. Iris Summary, Class Iris-versicolor (Class 1)

The other issue treated in this version of the framework is the choice of the thresholds over the cumulative distribution for the upper and lower bounds of our vague vocabulary. One approach explored took inspiration from Janusz in [5], where a genetic algorithm is used to find the right summary, whereas in our framework we use a tree search. We took inspiration from Janusz and tried to apply a genetic algorithm to find the right thresholds, where:

- The population is made of threshold vectors over all the attributes
- The crossover function is based on taking a max/mean of all the selected individuals
- The fitness function is the probability computed for a given summary on a specific class

The application of the genetic algorithm to find the thresholds was not a complete success: the computational time was too high relative to the results. We chose to explore another method to dynamically find the right thresholds for vague terms, drawing on findings about child language development.
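A toy version of this genetic search over threshold vectors is sketched below; the mean crossover, the mutation scale and the toy fitness are illustrative assumptions standing in for the summary-probability fitness described above:

```python
# Toy sketch of the genetic algorithm over threshold vectors.
# Selection keeps the best half, crossover averages pairs of parents,
# mutation adds small Gaussian noise. All settings are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def evolve(fitness, dim, pop_size=20, generations=50):
    pop = rng.uniform(0.0, 1.0, (pop_size, dim))      # threshold vectors
    for _ in range(generations):
        scores = np.array([fitness(p) for p in pop])
        elite = pop[np.argsort(scores)][-pop_size // 2:]           # selection
        kids = (elite[rng.permutation(len(elite))] + elite) / 2.0  # mean crossover
        kids += rng.normal(0.0, 0.05, kids.shape)                  # mutation
        pop = np.vstack([elite, kids])
    return pop[np.argmax([fitness(p) for p in pop])]

# Toy fitness standing in for "probability of the summary on the class".
best = evolve(lambda v: -np.sum((v - 0.5) ** 2), dim=8)
print(best.round(2))
```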
• 37. 3. Framework Version 3
In the last version of the framework we decided to focus first on the computation of the thresholds for vague terms. After several readings on child language development, especially in psycholinguistics, we found an interesting alternative to fixed thresholds. The study in [7] highlights that a child learning new word concepts, such as cup, works like a clustering algorithm: to catch the link between a concept and a word, the child first creates sets of different concepts and refines them during his development. When new words with new concepts arise in his vocabulary, conflicts between words and meanings force the child to refine his definitions. The conjunction of these concepts creates more precise boundaries, and thus a concept becomes more precise through learning and error making. To deal with vague terms we explored a similar mechanism: we state, as contextualism does, that the semantics of vague terms comes from experience. During his development the child learns vague concepts such as distance (away, close) or size (small, tall) from his environment. In our framework, the environment from which the knowledge is first extracted is a fraction of the input data (the Training Set). This TS is used to find the thresholds with an unsupervised algorithm: k-means. In the k-means algorithm the goal is to find the centroids that best balance the data. In our case the vocabulary is $L = \{\text{Low}, \text{Normal}, \text{High}\}$, so a 3-means algorithm is used on every TS, the goal being to output the three values of the centroids found by the algorithm. These centroids are then used, as previously, as thresholds for the membership functions; the motivation for doing this is the adaptability of the method.
Moreover, we focused on the decision part: in the last version the numeric data were first translated into the language L, and the main issue is that we lose information with the max operator. Given a summary over a class $c$, for each data point to classify we now keep the membership value in [0, 1] and inject it into a decision equation.
The final model proposed is the evolution of version two, incorporating the dynamic computation of thresholds and a new decision process; the model is presented in Figure 6.
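A short sketch of this dynamic threshold computation, running 3-means on a training slice of one attribute; scikit-learn's KMeans is assumed available, and the sample values are illustrative:

```python
# Sketch of the V3 threshold learning: the three sorted 1-D centroids
# of a training set play the role of anchors for Low / Normal / High.
import numpy as np
from sklearn.cluster import KMeans

def learn_centroids(training_values):
    """Return the three sorted centroids of one attribute's training set."""
    x = np.asarray(training_values, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(x)
    return np.sort(km.cluster_centers_.ravel())

sepal_length_ts = [4.6, 4.9, 5.0, 5.4, 5.8, 6.1, 6.3, 6.7, 7.0, 7.4]
c_low, c_normal, c_high = learn_centroids(sepal_length_ts)
print(c_low, c_normal, c_high)  # centroids used as adaptive thresholds
```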