SlideShare ist ein Scribd-Unternehmen logo
1 von 17
The 5th International Joint
conference on Rules and Reasoning
Sept. 8–15, 2021
(virtually in) Leuven, Belgium
An Answer Set Programming based framework for High-Utility
Pattern Mining extended with Facets and Advanced Utility
Functions
Francesco Cauteruccio and Giorgio Terracina
DEMACS, University of Calabria, Italy
{cauteruccio, terracina}@mat.unical.it
Outline
• Context and Motivation
• Proposed Framework
• ASP Approach
• Experimental Evaluation
• Conclusion
1/17
Context and Motivation
• Pattern Mining is one of the most studied data mining branches
• Find interesting patterns (set of items) in a database of transactions
• Frequent pattern mining, sequential pattern mining, etc…
• High-Utility Pattern Mining (HUPM)
• Find patterns having a high-utility (w.r.t. some utility measure)
• Example: in a sales database, the utility of a pattern may be represented by the profit of some items bought
together.
• Basic assumption: each item is associated with one, static utility .
• However…
• The utility of an item can be defined from very different point of views,
• Transactions are not only flat lists of items but they can provide different level of abstractions.
Pattern Mining and High-Utility Pattern Mining
2/17
Context and Motivation
• We present a framework for HUPM extending basic notions and introducing:
• A higher level of abstraction with transaction set representation
• For each transaction, an Object, a Container and a Database level of aggregation can be defined.
• The notion of facet
• A facet is an attribute which can be associated with an item, a transaction, an object or a container;
• Each element may be characterized by more than one facet.
• A taxonomy of extended utility functions
• Based on the database structure and facets,
• They can be combined in several ways to fit different notions of utility.
An extended framework for HUPM
3/17
Proposed Framework
• The database representation is multi-layer
• 𝐷𝑎𝑡𝑎𝑏𝑎𝑠𝑒 → 𝐶𝑜𝑛𝑡𝑎𝑖𝑛𝑒𝑟 → 𝑂𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑟𝑎𝑛𝑠𝑎𝑐𝑡𝑖𝑜𝑛
• Given a database 𝐷 and a set of transactions {𝑇!, … , 𝑇"}
• 𝐷 is organized as a set of containers 𝐶 = {𝐶!, … , 𝐶"}
• 𝐶# can be associated with a set of objects 𝑂 = {𝑂!, … , 𝑂$}
• 𝑂% contains a set of transactions {𝑇!, … , 𝑇&}
Extending the database structure
Running example depicting a sales database
5/17
Proposed Framework
• The utility of 𝑖 may be defined from
different perspectives: facets.
• Each item 𝑖 can be associated with
one or more facets.
• Facets can also be defined for
transactions, objects and containers.
• To describe facets, we use the notion
of utility vector.
Facets
6/17
Proposed Framework
• Item utility vector for an item 𝑖
• 𝐼𝑈! = [𝑖𝑢", 𝑖𝑢#, … , 𝑖𝑢$], and 𝑖𝑢% describes a certain facet of 𝑖
• Transaction utility vector for a transaction 𝑇#
• 𝑇𝑈&!
= [𝑡𝑢", 𝑡𝑢#, … , 𝑡𝑢'], and 𝑡𝑢% describes a certain facet of 𝑇(
• Internal utility of 𝑖 in 𝑇( is available and denoted as 𝑞(𝑖, 𝑇()
• Object utility vector
• 𝑂𝑈) = 𝑜𝑢", 𝑜𝑢#, … , 𝑜𝑢* , and 𝑜𝑢% describes a certain facet of 𝑂
• Container utility vector
• 𝐶𝑈+ = [𝑐𝑢", 𝑐𝑢#, … , 𝑐𝑢,], and 𝑐𝑢% describes a certain facet of 𝐶
Facets and utility vectors
facets: price, weight
7/17
Proposed Framework
• Intra-pattern utility function
• Let 𝑃 be a pattern with 𝑟 items, and 𝑇( be a transaction containing it,
• Let 𝐼𝑈𝑆 = {𝐼𝑈", … , 𝐼𝑈-} be the set of item utility vectors of 𝑃
• The intra-pattern utility function 𝐼𝑈&!
= 𝑖𝑝(𝑃, 𝑇(, 𝐼𝑈𝑆) generates an
unique item utility vector for the pattern occurrence.
• 𝑖𝑝 ⋅ can be any function combining the utilities across the facets,
such as SUM, MAX, MIN, AVG, etc…
• Pattern utility function
• The occurrence utility vector for a pattern 𝑃 is 𝑂𝑐𝑐𝑈&!
=
[𝐼𝑈&!
, 𝑇𝑈&!
, 𝑂𝑈&!
, 𝐶𝑈&!
]
• The pattern utility vector 𝑈. is the collection of all the occurrence
utility vectors of 𝑃: 𝑈. = ⋃&!∈&"
𝑂𝑐𝑐𝑈&!
• 𝑈. is a matrix (occurrence utility vectors × facets).
Advanced utility functions
In this example, 𝑖𝑝 = ∑ 𝑓𝑎𝑐𝑒𝑡 × 𝑖𝑛𝑡𝑒𝑟𝑛𝑎𝑙𝑈𝑡𝑖𝑙𝑖𝑡𝑦
8/17
Proposed Framework
• The utility 𝑢 of a pattern 𝑃 can be obtained as an arbitrary
combination of the values of 𝑈! using a function 𝑢(𝑃)
• 𝑢(𝑃) can be classified as
• Horizontal first 𝑢 𝑃 = 𝑓?(𝑓@(𝑢.)) combines by row (facets), then by
column (occurrences)
• Vertical first 𝑢 𝑃 = 𝑓@(𝑓?(𝑢.)) combines by column (occurrences),
then by row (facets)
• Mixed 𝑢 𝑃 = 𝑓 𝑢. combines the values at once
• All of these classifications can be further classified in
• Inter-transaction utility
• Pattern-vs-object utility
• Pattern-vs-container utility
• These can be exploited to define utility measures relating
item/transaction facets and one of the object/container facets,
such as Pearson and Multiple correlation.
Advanced utility functions
As an example, suppose we select 𝑓# =
𝑓𝑖𝑙𝑡𝑒𝑟 ⋅ and 𝑓$ = max ⋅ , that is we first
filter for a single facet, and then we take
the maximum across all the occurrences.
9/17
Proposed Framework
• Given a pattern 𝑃, we say that 𝑃 is an extended high-utility pattern if its utility 𝑢(𝑃) is greater than
a minimum threshold 𝑡ℎ$ and it occurs in at least 𝑡ℎ% transactions.
• The problem of extended high-utility pattern mining (e-HUPM) is to discover all the extended
high-utility patterns in a given database 𝐷.
The e–HUPM problem
10/17
ASP Approach
• We want to provide as much flexibility as possible in the definition of what is a useful pattern.
• Encoding the problem in Answer Set Programming (ASP) helps achieving the desidered flexibility
and modularity.
• Classic guess-and-check scenario
• We generate one answer set for each valid pattern,
• Complex utility functions are executed by means of external functions (e.g., in DLVHEX, WASP, clingo)
• Pattern validity criteria and filter can be easily applied by encoding them as rules.
The why and the how
11/17
ASP Approach
The encoding
%%% Input schema:
%container(ContainerId)
%object(ObjectId,ContainerId)
%transaction(Tid, ObjectId)
%item(Item, Tid, Position, Q)
%itemUtilityVector(Item, I1, ..., Il)
%transactionUtilityVector(Tid, T1, ..., Tm)
%objectUtilityVector(ObjectId, O1, ..., On)
%containerUtilityVector(ContainerId, C1, ..., Co)
%%% Parameters
occurrencesThreshold(...). utilityTreshold(...).
%%% Item pre-filtering
usefulItem(I):- item(I,_,_,_),....any condition on the items.
%%% Candidate pattern generation
{inCandidatePattern(I)}:- usefulItem(I).
%%% Occurrences computation and check
inTransaction(Tid):- transaction(Tid,_), not incomplete(Tid).
incomplete(TiD):- transaction(Tid,_), inCandidatePattern(I), not contains(I,Tid).
contains(I,Tid):- item(I,Tid,_,_).
:- #count{ Tid : inTransaction(Tid)}=N, N < Tho, occurrencesThreshold(Tho).
%%% Utility computation
patternItemUtilityVectors(Tid,Item,I1,...,Il,Q):- inCandidatePattern(Item),
itemUtilityVector(Item, I1, ..., Il), inTransaction(Tid), item(Item, Tid, Position, Q).
intraPatternUtilityVector(Tid,I1,...,Il):-
&computeIntraPatternUtility[patternItemUtilityVectors](Tid,I1,...,Il).
occurrenceUtilityVector(Tid,I1,...,Il,T1,...Tm,O1,...On,C1,...,Co):-
inTransaction(Tid), intraPatternUtilityVector(Tid,I1,...,Il)
transactionUtilityVector(Tid, T1, ..., Tm), transaction(Tid, ObjectId),
objectUtilityVector(ObjectId, O1, ..., On), object(ObjectId , ContainerId)
containerUtilityVector(ContainerId, C1, ..., Co).
:- &computeUtility[occurrenceUtilityVector](U), U < Thu, utilityTreshold(Thu).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30 12/17
Experimental Evaluation
• Aspect-based sentiment analysis of scientific reviews [1]
• Each review is annotated with 8 different aspects: appropriataness, clarity, originality,
empirical/theoretical soudness, meaningful comparison, substance, impact and recommendation
• Each aspect has assigned one sentiment value from {𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒, 𝑛𝑒𝑔𝑎𝑡𝑖𝑣𝑒, 𝑛𝑒𝑢𝑡𝑟𝑎𝑙, 𝑎𝑏𝑠𝑒𝑛𝑡}.
• 814 papers and a total of 1148 reviews.
• 2230 annotated lines with a total of 15124 distinct words.
Dataset
[1] Chakraborty, S., Goyal, P., Mukherjee, A.: Aspect-based sentiment analysis of scientific reviews. In: JCDL ’20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5,
2020. pp. 207–216 (2020), ACM
13/17
Experimental Evaluation
• Average running time
• 𝑢(𝑃): disagreement between originality and decision (% of
pattern occurrences showing a positive originality and a
reject decision, thresholds: [15, 50, 75, 100])
Quantitative analysis
• Comparison with HUPM systems
• No analysis involving more than one
elements can be devised with such
systems
• Here, the utility of an item is the absolute
value of the appropriateness aspect in
the sentence it first appears in.
14/17
Experimental Evaluation
• 𝑢 𝑃 : Pearson correlation between an aspect 𝑋
and the final decision on the corresponding
paper.
• «The set of review sentences containing 𝑃 are
characterized by a direct correlation between the
sentiment on 𝑋 and the final decision on the
corresponding paper»
• 𝑢(𝑃): Multiple correlation between two aspects
𝑋 and 𝑌 and the final decision on the
corresponding paper.
Qualitative analysis
15/17
Conclusion
• We introduced a general framework for HUPM with several extensions.
• The framework allows to work with multi-dimensional data and with different utility measures.
• A versatile and modular ASP encoding has been developed.
• We employed a real use case on paper review to carry out both quantitative and qualitative
analyses.
• Facets and advanced utility functions help reducing the amount of relevant patterns
• Useful in providing deep insights on the data.
• Not an ending point!
• Apply the framework to new contexts,
• Derive ad-hoc algorithms for the extended utility functions.
16/17
Thanks for your attention!
These slides available at https://bit.ly/ehupm-ruleml21
Francesco Cauteruccio
Research Fellow @ DEMACS, University of Calabria
cauteruccio@mat.unical.it
francescocauteruccio.info
@finalfire

Weitere ähnliche Inhalte

Was ist angesagt?

The science behind predictive analytics a text mining perspective
The science behind predictive analytics  a text mining perspectiveThe science behind predictive analytics  a text mining perspective
The science behind predictive analytics a text mining perspectiveankurpandeyinfo
 
lazy learners and other classication methods
lazy learners and other classication methodslazy learners and other classication methods
lazy learners and other classication methodsrajshreemuthiah
 
A Modified KS-test for Feature Selection
A Modified KS-test for Feature SelectionA Modified KS-test for Feature Selection
A Modified KS-test for Feature SelectionIOSR Journals
 
Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...ranjit banshpal
 
Textmining Predictive Models
Textmining Predictive ModelsTextmining Predictive Models
Textmining Predictive Modelsguest0edcaf
 
2 introductory slides
2 introductory slides2 introductory slides
2 introductory slidestafosepsdfasg
 

Was ist angesagt? (12)

The science behind predictive analytics a text mining perspective
The science behind predictive analytics  a text mining perspectiveThe science behind predictive analytics  a text mining perspective
The science behind predictive analytics a text mining perspective
 
2.mathematics for machine learning
2.mathematics for machine learning2.mathematics for machine learning
2.mathematics for machine learning
 
4 module 3 --
4 module 3 --4 module 3 --
4 module 3 --
 
lazy learners and other classication methods
lazy learners and other classication methodslazy learners and other classication methods
lazy learners and other classication methods
 
A Modified KS-test for Feature Selection
A Modified KS-test for Feature SelectionA Modified KS-test for Feature Selection
A Modified KS-test for Feature Selection
 
Matlab:Regression
Matlab:RegressionMatlab:Regression
Matlab:Regression
 
Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...
 
Textmining Predictive Models
Textmining Predictive ModelsTextmining Predictive Models
Textmining Predictive Models
 
2 introductory slides
2 introductory slides2 introductory slides
2 introductory slides
 
7 decision tree
7 decision tree7 decision tree
7 decision tree
 
3 module 2
3 module 23 module 2
3 module 2
 
Matlab Data And Statistics
Matlab Data And StatisticsMatlab Data And Statistics
Matlab Data And Statistics
 

Ähnlich wie An Answer Set Programming based framework for High-Utility Pattern Mining extended with Facets and Advanced Utility Functions

Algorithms and Data Structures
Algorithms and Data StructuresAlgorithms and Data Structures
Algorithms and Data Structuressonykhan3
 
UNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptxUNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptxBhagyasriPatel2
 
Query optimization to improve performance of the code execution
Query optimization to improve performance of the code executionQuery optimization to improve performance of the code execution
Query optimization to improve performance of the code executionAlexander Decker
 
11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code execution11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code executionAlexander Decker
 
Predicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector RegressionPredicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector RegressionChittagong Independent University
 
Programming in python
Programming in pythonProgramming in python
Programming in pythonIvan Rojas
 
Introduction to Design Patterns in Javascript
Introduction to Design Patterns in JavascriptIntroduction to Design Patterns in Javascript
Introduction to Design Patterns in JavascriptSanthosh Kumar Srinivasan
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnBenjamin Bengfort
 
Data structure and algorithm.
Data structure and algorithm. Data structure and algorithm.
Data structure and algorithm. Abdul salam
 
Data Structure and Algorithms
Data Structure and AlgorithmsData Structure and Algorithms
Data Structure and Algorithmsiqbalphy1
 
cs 601 - lecture 1.pptx
cs 601 - lecture 1.pptxcs 601 - lecture 1.pptx
cs 601 - lecture 1.pptxGopalPatidar13
 
Data Analysis – Technical learnings
Data Analysis – Technical learningsData Analysis – Technical learnings
Data Analysis – Technical learningsInvenkLearn
 
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberPreservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberJISC KeepIt project
 
2021 icse reducedsylabiix-computer applications
2021 icse reducedsylabiix-computer applications2021 icse reducedsylabiix-computer applications
2021 icse reducedsylabiix-computer applicationsVahabshaik Shai
 
Data Structures - Lecture 1 - Unit 1.pptx
Data Structures  - Lecture 1 - Unit 1.pptxData Structures  - Lecture 1 - Unit 1.pptx
Data Structures - Lecture 1 - Unit 1.pptxDanielNesaKumarC
 
Scalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduceScalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReducesscdotopen
 
Algorithms Analysis.pdf
Algorithms Analysis.pdfAlgorithms Analysis.pdf
Algorithms Analysis.pdfShaistaRiaz4
 

Ähnlich wie An Answer Set Programming based framework for High-Utility Pattern Mining extended with Facets and Advanced Utility Functions (20)

Algorithms and Data Structures
Algorithms and Data StructuresAlgorithms and Data Structures
Algorithms and Data Structures
 
UNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptxUNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptx
 
Query optimization to improve performance of the code execution
Query optimization to improve performance of the code executionQuery optimization to improve performance of the code execution
Query optimization to improve performance of the code execution
 
11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code execution11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code execution
 
Predicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector RegressionPredicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector Regression
 
Programming in python
Programming in pythonProgramming in python
Programming in python
 
Introduction to Design Patterns in Javascript
Introduction to Design Patterns in JavascriptIntroduction to Design Patterns in Javascript
Introduction to Design Patterns in Javascript
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
Data structure and algorithm.
Data structure and algorithm. Data structure and algorithm.
Data structure and algorithm.
 
AutoML lectures (ACDL 2019)
AutoML lectures (ACDL 2019)AutoML lectures (ACDL 2019)
AutoML lectures (ACDL 2019)
 
algo 1.ppt
algo 1.pptalgo 1.ppt
algo 1.ppt
 
Data Structure and Algorithms
Data Structure and AlgorithmsData Structure and Algorithms
Data Structure and Algorithms
 
cs 601 - lecture 1.pptx
cs 601 - lecture 1.pptxcs 601 - lecture 1.pptx
cs 601 - lecture 1.pptx
 
Data Analysis – Technical learnings
Data Analysis – Technical learningsData Analysis – Technical learnings
Data Analysis – Technical learnings
 
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberPreservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
 
Query processing System
Query processing SystemQuery processing System
Query processing System
 
2021 icse reducedsylabiix-computer applications
2021 icse reducedsylabiix-computer applications2021 icse reducedsylabiix-computer applications
2021 icse reducedsylabiix-computer applications
 
Data Structures - Lecture 1 - Unit 1.pptx
Data Structures  - Lecture 1 - Unit 1.pptxData Structures  - Lecture 1 - Unit 1.pptx
Data Structures - Lecture 1 - Unit 1.pptx
 
Scalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduceScalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduce
 
Algorithms Analysis.pdf
Algorithms Analysis.pdfAlgorithms Analysis.pdf
Algorithms Analysis.pdf
 

Kürzlich hochgeladen

300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptxryanrooker
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .Poonam Aher Patil
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformationAreesha Ahmad
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.Silpa
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxDiariAli
 
Velocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.pptVelocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.pptRakeshMohan42
 
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)AkefAfaneh2
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flyPRADYUMMAURYA1
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professormuralinath2
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....muralinath2
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Silpa
 
Use of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxUse of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxRenuJangid3
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptxSilpa
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY1301aanya
 
Chemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfChemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfSumit Kumar yadav
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxSuji236384
 

Kürzlich hochgeladen (20)

300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
 
Velocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.pptVelocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.ppt
 
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Use of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxUse of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptx
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
Chemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfChemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdf
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 

An Answer Set Programming based framework for High-Utility Pattern Mining extended with Facets and Advanced Utility Functions

  • 1. The 5th International Joint conference on Rules and Reasoning Sept. 8–15, 2021 (virtually in) Leuven, Belgium An Answer Set Programming based framework for High-Utility Pattern Mining extended with Facets and Advanced Utility Functions Francesco Cauteruccio and Giorgio Terracina DEMACS, University of Calabria, Italy {cauteruccio, terracina}@mat.unical.it
  • 2. Outline • Context and Motivation • Proposed Framework • ASP Approach • Experimental Evaluation • Conclusion 1/17
  • 3. Context and Motivation • Pattern Mining is one of the most studied data mining branches • Find interesting patterns (set of items) in a database of transactions • Frequent pattern mining, sequential pattern mining, etc… • High-Utility Pattern Mining (HUPM) • Find patterns having a high-utility (w.r.t. some utility measure) • Example: in a sales database, the utility of a pattern may be represented by the profit of some items bought together. • Basic assumption: each item is associated with one, static utility . • However… • The utility of an item can be defined from very different point of views, • Transactions are not only flat lists of items but they can provide different level of abstractions. Pattern Mining and High-Utility Pattern Mining 2/17
  • 4. Context and Motivation • We present a framework for HUPM extending basic notions and introducing: • A higher level of abstraction with transaction set representation • For each transaction, an Object, a Container and a Database level of aggregation can be defined. • The notion of facet • A facet is an attribute which can be associated with an item, a transaction, an object or a container; • Each element may be characterized by more than one facet. • A taxonomy of extended utility functions • Based on the database structure and facets, • They can be combined in several ways to fit different notions of utility. An extended framework for HUPM 3/17
  • 5. Proposed Framework • The database representation is multi-layer • 𝐷𝑎𝑡𝑎𝑏𝑎𝑠𝑒 → 𝐶𝑜𝑛𝑡𝑎𝑖𝑛𝑒𝑟 → 𝑂𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑟𝑎𝑛𝑠𝑎𝑐𝑡𝑖𝑜𝑛 • Given a database 𝐷 and a set of transactions {𝑇!, … , 𝑇"} • 𝐷 is organized as a set of containers 𝐶 = {𝐶!, … , 𝐶"} • 𝐶# can be associated with a set of objects 𝑂 = {𝑂!, … , 𝑂$} • 𝑂% contains a set of transactions {𝑇!, … , 𝑇&} Extending the database structure Running example depicting a sales database 5/17
  • 6. Proposed Framework • The utility of 𝑖 may be defined from different perspectives: facets. • Each item 𝑖 can be associated with one or more facets. • Facets can also be defined for transactions, objects and containers. • To describe facets, we use the notion of utility vector. Facets 6/17
  • 7. Proposed Framework • Item utility vector for an item 𝑖 • 𝐼𝑈! = [𝑖𝑢", 𝑖𝑢#, … , 𝑖𝑢$], and 𝑖𝑢% describes a certain facet of 𝑖 • Transaction utility vector for a transaction 𝑇# • 𝑇𝑈&! = [𝑡𝑢", 𝑡𝑢#, … , 𝑡𝑢'], and 𝑡𝑢% describes a certain facet of 𝑇( • Internal utility of 𝑖 in 𝑇( is available and denoted as 𝑞(𝑖, 𝑇() • Object utility vector • 𝑂𝑈) = 𝑜𝑢", 𝑜𝑢#, … , 𝑜𝑢* , and 𝑜𝑢% describes a certain facet of 𝑂 • Container utility vector • 𝐶𝑈+ = [𝑐𝑢", 𝑐𝑢#, … , 𝑐𝑢,], and 𝑐𝑢% describes a certain facet of 𝐶 Facets and utility vectors facets: price, weight 7/17
  • 8. Proposed Framework • Intra-pattern utility function • Let 𝑃 be a pattern with 𝑟 items, and 𝑇( be a transaction containing it, • Let 𝐼𝑈𝑆 = {𝐼𝑈", … , 𝐼𝑈-} be the set of item utility vectors of 𝑃 • The intra-pattern utility function 𝐼𝑈&! = 𝑖𝑝(𝑃, 𝑇(, 𝐼𝑈𝑆) generates an unique item utility vector for the pattern occurrence. • 𝑖𝑝 ⋅ can be any function combining the utilities across the facets, such as SUM, MAX, MIN, AVG, etc… • Pattern utility function • The occurrence utility vector for a pattern 𝑃 is 𝑂𝑐𝑐𝑈&! = [𝐼𝑈&! , 𝑇𝑈&! , 𝑂𝑈&! , 𝐶𝑈&! ] • The pattern utility vector 𝑈. is the collection of all the occurrence utility vectors of 𝑃: 𝑈. = ⋃&!∈&" 𝑂𝑐𝑐𝑈&! • 𝑈. is a matrix (occurrence utility vectors × facets). Advanced utility functions In this example, 𝑖𝑝 = ∑ 𝑓𝑎𝑐𝑒𝑡 × 𝑖𝑛𝑡𝑒𝑟𝑛𝑎𝑙𝑈𝑡𝑖𝑙𝑖𝑡𝑦 8/17
  • 9. Proposed Framework • The utility 𝑢 of a pattern 𝑃 can be obtained as an arbitrary combination of the values of 𝑈! using a function 𝑢(𝑃) • 𝑢(𝑃) can be classified as • Horizontal first 𝑢 𝑃 = 𝑓?(𝑓@(𝑢.)) combines by row (facets), then by column (occurrences) • Vertical first 𝑢 𝑃 = 𝑓@(𝑓?(𝑢.)) combines by column (occurrences), then by row (facets) • Mixed 𝑢 𝑃 = 𝑓 𝑢. combines the values at once • All of these classifications can be further classified in • Inter-transaction utility • Pattern-vs-object utility • Pattern-vs-container utility • These can be exploited to define utility measures relating item/transaction facets and one of the object/container facets, such as Pearson and Multiple correlation. Advanced utility functions As an example, suppose we select 𝑓# = 𝑓𝑖𝑙𝑡𝑒𝑟 ⋅ and 𝑓$ = max ⋅ , that is we first filter for a single facet, and then we take the maximum across all the occurrences. 9/17
  • 10. Proposed Framework • Given a pattern 𝑃, we say that 𝑃 is an extended high-utility pattern if its utility 𝑢(𝑃) is greater than a minimum threshold 𝑡ℎ$ and it occurs in at least 𝑡ℎ% transactions. • The problem of extended high-utility pattern mining (e-HUPM) is to discover all the extended high-utility patterns in a given database 𝐷. The e–HUPM problem 10/17
  • 11. ASP Approach • We want to provide as much flexibility as possible in the definition of what is a useful pattern. • Encoding the problem in Answer Set Programming (ASP) helps achieving the desidered flexibility and modularity. • Classic guess-and-check scenario • We generate one answer set for each valid pattern, • Complex utility functions are executed by means of external functions (e.g., in DLVHEX, WASP, clingo) • Pattern validity criteria and filter can be easily applied by encoding them as rules. The why and the how 11/17
  • 12. ASP Approach The encoding %%% Input schema: %container(ContainerId) %object(ObjectId,ContainerId) %transaction(Tid, ObjectId) %item(Item, Tid, Position, Q) %itemUtilityVector(Item, I1, ..., Il) %transactionUtilityVector(Tid, T1, ..., Tm) %objectUtilityVector(ObjectId, O1, ..., On) %containerUtilityVector(ContainerId, C1, ..., Co) %%% Parameters occurrencesThreshold(...). utilityTreshold(...). %%% Item pre-filtering usefulItem(I):- item(I,_,_,_),....any condition on the items. %%% Candidate pattern generation {inCandidatePattern(I)}:- usefulItem(I). %%% Occurrences computation and check inTransaction(Tid):- transaction(Tid,_), not incomplete(Tid). incomplete(TiD):- transaction(Tid,_), inCandidatePattern(I), not contains(I,Tid). contains(I,Tid):- item(I,Tid,_,_). :- #count{ Tid : inTransaction(Tid)}=N, N < Tho, occurrencesThreshold(Tho). %%% Utility computation patternItemUtilityVectors(Tid,Item,I1,...,Il,Q):- inCandidatePattern(Item), itemUtilityVector(Item, I1, ..., Il), inTransaction(Tid), item(Item, Tid, Position, Q). intraPatternUtilityVector(Tid,I1,...,Il):- &computeIntraPatternUtility[patternItemUtilityVectors](Tid,I1,...,Il). occurrenceUtilityVector(Tid,I1,...,Il,T1,...Tm,O1,...On,C1,...,Co):- inTransaction(Tid), intraPatternUtilityVector(Tid,I1,...,Il) transactionUtilityVector(Tid, T1, ..., Tm), transaction(Tid, ObjectId), objectUtilityVector(ObjectId, O1, ..., On), object(ObjectId , ContainerId) containerUtilityVector(ContainerId, C1, ..., Co). :- &computeUtility[occurrenceUtilityVector](U), U < Thu, utilityTreshold(Thu). 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 12/17
  • 13. Experimental Evaluation • Aspect-based sentiment analysis of scientific reviews [1] • Each review is annotated with 8 different aspects: appropriataness, clarity, originality, empirical/theoretical soudness, meaningful comparison, substance, impact and recommendation • Each aspect has assigned one sentiment value from {𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒, 𝑛𝑒𝑔𝑎𝑡𝑖𝑣𝑒, 𝑛𝑒𝑢𝑡𝑟𝑎𝑙, 𝑎𝑏𝑠𝑒𝑛𝑡}. • 814 papers and a total of 1148 reviews. • 2230 annotated lines with a total of 15124 distinct words. Dataset [1] Chakraborty, S., Goyal, P., Mukherjee, A.: Aspect-based sentiment analysis of scientific reviews. In: JCDL ’20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5, 2020. pp. 207–216 (2020), ACM 13/17
  • 14. Experimental Evaluation • Average running time • 𝑢(𝑃): disagreement between originality and decision (% of pattern occurrences showing a positive originality and a reject decision, thresholds: [15, 50, 75, 100]) Quantitative analysis • Comparison with HUPM systems • No analysis involving more than one elements can be devised with such systems • Here, the utility of an item is the absolute value of the appropriateness aspect in the sentence it first appears in. 14/17
  • 15. Experimental Evaluation • 𝑢 𝑃 : Pearson correlation between an aspect 𝑋 and the final decision on the corresponding paper. • «The set of review sentences containing 𝑃 are characterized by a direct correlation between the sentiment on 𝑋 and the final decision on the corresponding paper» • 𝑢(𝑃): Multiple correlation between two aspects 𝑋 and 𝑌 and the final decision on the corresponding paper. Qualitative analysis 15/17
  • 16. Conclusion • We introduced a general framework for HUPM with several extensions. • The framework allows to work with multi-dimensional data and with different utility measures. • A versatile and modular ASP encoding has been developed. • We employed a real use case on paper review to carry out both quantitative and qualitative analyses. • Facets and advanced utility functions help reducing the amount of relevant patterns • Useful in providing deep insights on the data. • Not an ending point! • Apply the framework to new contexts, • Derive ad-hoc algorithms for the extended utility functions. 16/17
  • 17. Thanks for your attention! These slides available at https://bit.ly/ehupm-ruleml21 Francesco Cauteruccio Research Fellow @ DEMACS, University of Calabria cauteruccio@mat.unical.it francescocauteruccio.info @finalfire