SlideShare ist ein Scribd-Unternehmen logo
1 von 27
Downloaden Sie, um offline zu lesen
Pattern-Based Classification of Demographic Sequences
Dmitry I. Ignatov1, Danil Gizdatullin1, Ekaterina Mitrofanova1, Anna
Muratova1, Jaume Baixeries2
1National Research University Higher School of Economics, Moscow
2Universitat Polit`ecnica de Catalunya, Barcelona
Intelligent Data Processing 2016, Barcelona
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 1 / 27
Possible life events
First job (job)
The highest education degree is obtained (education)
Leaving parents’ home (separation)
First partner (partner)
First marriage (marriage)
First child birth (children)
Break-up (parting)
... (divorce)
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 2 / 27
Data and problem statement
[Ignatov et al., 2015],[Blockeel et al., 2001]
Generation and Gender Survey (GGS): three waves panel data for 11
generations of Russian citizens starting from 30s
Binary classification
1545 men
3312 women
Examples of sequential patterns
{education, separation}, {work}, {marriage}, {children} (m)
{work}, {marriage}, {children}{education} (f )
{partner}, {marriage, separation}, {children} (f )
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 3 / 27
Basic definitions
Texbooks of Han et al., Zaki & Meira, Aggarwal et al., etc
s = s1, ..., sk is the subsequence of s = s1, ..., sk (s s ) if
k ≤ k and there exist 1 ≤ r1 < r2 < ... < rk ≤ k such sj = srj for all
1 ≤ j ≤ k.
support(s, D) is the support of a sequence s in D, i.e. the number of
sequences in D such that s is their subsequence.
support(s, D) = |{s |s ∈ D, s s }|
s is a frequent closed sequence (sequential pattern) if there is no
s such that s s and
support(s, D) = support(s , D)
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 4 / 27
Example
Let D be a set of sequences:
Table: Dataset D.
s1 {a, b, c}{a, b}{b}
s2 {a}{a, c}{a}
s3 {a, b}{b, c}
I = {a, b, c} is the set of all items (atomic events)
{a, b}{b} belongs to s1 and s3 but it is missing in s2
supportD( {a, b}{b} ) = 2
{ {a} , {c} , {a}{c} , {a, b}{b} , {a, c}{a} } is the set of closed
sequences.
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 5 / 27
CAEP: Classification by Aggregating Emerging Patterns
G. Dong et al., 1999
Growth Rate
growth rateD →D (X) =



suppD (X)
suppD (X)
if suppD (X) = 0
0 if suppD (X) = supp(X) = 0
∞ if suppD (X) = 0 and suppD (X) = 0
Class score
score(s, C) =
e⊆s,e∈E(c)
growth rateC (e)
growth rateC (e) + 1
· suppc(e)
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 6 / 27
CAEP: Classification by Aggregating Emerging Patterns
Score normalization
normal score(s, C) =
score(s, C)
median({growth rateC (ei )})
Classification rule
class(s) =



C1, if normal score(s, C1) > normal score(s, C2)
C2, if normal score(s, C1) < normal score(s, C2)
undetermined if normal score(s, C1) = normal score(s, C2)
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 7 / 27
Gapless prefix-based sequential patterns
s = s1, ..., sk is a gapless prefix-based subsequence of
s = s1, ..., sk (s∗ = s ) if k ≤ k and ∀i ∈ k : si = si .
Support of gapless prefix-based sequences
Let T be a set of sequences.
support(s, T) =
|{s |s ∈ T, s∗ = s }|
|T|
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 8 / 27
Gapless sequential patterns
Let 0 < minSup ≤ 1 be a minimal support parameter and D is a set
of sequences then searching for prefix-based gapless sequential
patterns is the task of enumeration of all prefix-based gapless
sequences s such that support(s, D) ≥ minSup. Every sequence s
with support(s, D) ≥ minSup is called a prefix-based gapless
sequential pattern.
Prefix-based gapless sequential pattern (PGSP) p is called closed if
there is no PGSP d of greater of equal support such that d = p∗.
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 9 / 27
Gapless sequential patterns
Example
Table: D is a set of sequences.
s1 {a}{b}{d}
s2 {a}{b}{c}
s3 {a, b}{b, c}
s = {a}{b}
I = {a, b, c} is the set of all items (atomic events)
s1 = s∗; s2 = s∗
s3 = s∗
SuppD(s) =
2
3
{a}{b} is closed, {a} is not closed.
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 10 / 27
Pattern Structures
Ganter & Kuznetsov, 2001
(S, (D, ), δ) is a pattern structure
S is a set of objects, D is a set of their their possible descriptions
δ(g) is the description of g from S
Galois connection is given by operator as follows:
A :=
g∈A
δ(g) for A ⊆ S
d := {s ∈ S|d δ(g)} for d ∈ D
For two sequences may result in their largest common prefix
subsequence
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 11 / 27
Pattern concepts
A pair (A, d) is called a pattern concept of a pattern structure
(S, (D, ), δ) if
1 A ⊆ S
2 d ∈ D
3 A = d
4 d = A
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 12 / 27
Pattern Structures
Example
s1 : a, b, c
s2 : a, b, c
s3 : a, b, d
Tree
0
a(3)
b(3)
c(2) d(1)
Pattern concepts (PCs)
({s1, s2, s3}, a, b ); ({s1, s2}, a, b, c )
({s1}, a, b, c ) is not a PC; ({s3}, a, b, d )
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 13 / 27
Pattern-based JSM-hypotheses
[Finn, 1981], [Kuznetsov, 1993], [Ganter et al, 2004]
Positive, negative and undetermined pattern structures
K⊕ = (S⊕, (D, ), δ⊕)
K = (S , (D, ), δ )
There is a pattern structure of undetermined examples:
Kτ = (Sτ , (D, ), δτ )
Hypothesis
A hypothesis is a pattern intent that belongs to examples from a fixed
class only
A pattern intent h is a positive hypothesis (dually for negative hypotheses)
if
∀s ∈ S (s ∈ S⊕) : h s (h s⊕
)
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 14 / 27
Hypotheses generation: An example
Sequential classification rules
s1 : a, b, c - class 0
s2 : a, b, c - class 0
s3 : a, b, d - class 1
Prefix-tree
0
a(2; 1)
b(2; 1)
c(2; 0) d(0; 1)
Hypotheses
{a}, {b}, {c} is a hypothesis of class 0
{a}, {b}, {d} is a hypothesis of class 1
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 15 / 27
Classification via hypotheses
class(gτ ) =



positive if ∃h⊕, h⊕ δ(gτ ) and h , h δ(gτ )
negative if h⊕, h⊕ δ(gτ ) and ∃h , h δ(gτ )
undetermined if ∃h⊕, h⊕ δ(gτ ) and ∃h , h δ(gτ )
undetermined if h⊕, h⊕ δ(gτ ) and h , h δ(gτ )
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 16 / 27
Emerging patterns based on pattern structures
Growth Rate
GrowthRate(g, K⊕, K ) =
SupK⊕ (g)
SupK (g)
Emerging patterns
A pattern is called emerging pattern if its growth rate is greater than or
equal to Θmin
GrowthRate(g, K⊕, K ) > Θmin
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 17 / 27
Emerging patterns for classification
s is a new object
normal score⊕(s) =
p∈P⊕
GrowthRate(p, K⊕, K )
median(GrowthRate(P⊕))
: p s
normal score (s) =
p∈P GrowthRate(p, K , K⊕)
median(GrowthRate(P ))
: p s
Classification via emerging patterns
class(s) =



positive if normal score⊕(s) > score (s)
negative if normal score⊕(s) < score (s)
undetermined if normal score⊕(s) = normal score (s)
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 18 / 27
Classification algorithm for gapless prefix-based sequential
patterns
1 Build the prefix tree for the input sequences.
2 For each tree node calculate its Growth Rate.
3 For every new sequence traverse the tree and compute the Score for
each class.
4 Compare the Score value for different classes and classify the new
sequence.
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 19 / 27
Execution example
Input sequences
class 0 : { {a}{b}{c} , {b}{a}{c} , {b}{a}{c} , {b}{c} }
class 1 : { {a}{c}{b} , {b}{c}{a} , {b}{c}{a} }
Prefix tree
0
a(1; 1)
b(1; 0)
c(1; 0)
c(0; 1)
b(0; 1)
b(2; 2)
a(2; 0)
c(2; 0)
c(1; 2)
a(0; 2)
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 20 / 27
Counting Growth Rate
0
a(0.75; 1.33)
b(∞; 0)
c(∞; 0)
c(0; ∞)
b(0; ∞)
b(0.75; 1.33)
a(∞; 0)
c(∞; 0)
c(0.38; 2.67)
a(0; ∞)
New sequence
{b}; {c}; {a} −???
Score0 = 0
Score1 = 2.67 + ∞ = ∞
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 21 / 27
Comparison of closed and non-closed patterns
Figure: TPR vs FPR for closed and non-closed patterns
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 22 / 27
Experiments and results
Figure: TPR-FPR for classification via gapless prefix-based patterns
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 23 / 27
Interesting patterns (women)
( {work, separation}, {marriage}, {children}, {education} , [inf , 0.006])
( {separation, partner}, {marriage} , [inf , 0.006])
( {work, separation}, {marriage}, {children} , [inf , 0.008])
( {work, separation}, {marriage} , [inf , 0.009])
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 24 / 27
Interesting patterns (men)
( {education}, {marriage}, {work}, {children}, {separation} , [10.6, 0.006])
( {education}, {marriage}, {work}, {children} , [12.7, 0.007])
( {educ}, {work}, {part}, {mar}, {sep}, {ch} , [10.6, 0.006])
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 25 / 27
Conclusion
1 We have studied several pattern mining techniques for demographic
sequences including pattern-based classification in particular.
2 We have fitted existing approaches for sequence mining of a special
type (gapless and prefix-based ones).
3 The results for different demographic groups (classes) have been
obtained and interpreted.
4 In particular, a classifier based on emerging sequences and pattern
structures has been proposed.
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 26 / 27
Conclusion
Thank you!
Questions?
Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 27 / 27

Weitere ähnliche Inhalte

Was ist angesagt?

Internal workshop jub talk jan 2013
Internal workshop jub talk jan 2013Internal workshop jub talk jan 2013
Internal workshop jub talk jan 2013Jurgen Riedel
 
Introduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun NarayananIntroduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun NarayananArjun Narayanan
 
IRJET- Global Accurate Domination in Jump Graph
IRJET- Global Accurate Domination in Jump GraphIRJET- Global Accurate Domination in Jump Graph
IRJET- Global Accurate Domination in Jump GraphIRJET Journal
 
A method for constructing fuzzy test statistics with application
A method for constructing fuzzy test statistics with applicationA method for constructing fuzzy test statistics with application
A method for constructing fuzzy test statistics with applicationAlexander Decker
 
A common fixed point theorem for two random operators using random mann itera...
A common fixed point theorem for two random operators using random mann itera...A common fixed point theorem for two random operators using random mann itera...
A common fixed point theorem for two random operators using random mann itera...Alexander Decker
 
comments on exponential ergodicity of the bouncy particle sampler
comments on exponential ergodicity of the bouncy particle samplercomments on exponential ergodicity of the bouncy particle sampler
comments on exponential ergodicity of the bouncy particle samplerChristian Robert
 
FURTHER RESULTS ON ODD HARMONIOUS GRAPHS
FURTHER RESULTS ON ODD HARMONIOUS GRAPHSFURTHER RESULTS ON ODD HARMONIOUS GRAPHS
FURTHER RESULTS ON ODD HARMONIOUS GRAPHSgraphhoc
 
Bidimensionality
BidimensionalityBidimensionality
BidimensionalityASPAK2014
 
A common fixed point of integral type contraction in generalized metric spacess
A  common fixed point of integral type contraction in generalized metric spacessA  common fixed point of integral type contraction in generalized metric spacess
A common fixed point of integral type contraction in generalized metric spacessAlexander Decker
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerChristian Robert
 
Typing quantum superpositions and measurement
Typing quantum superpositions and measurementTyping quantum superpositions and measurement
Typing quantum superpositions and measurementAlejandro Díaz-Caro
 
On Convolution of Graph Signals and Deep Learning on Graph Domains
On Convolution of Graph Signals and Deep Learning on Graph DomainsOn Convolution of Graph Signals and Deep Learning on Graph Domains
On Convolution of Graph Signals and Deep Learning on Graph DomainsJean-Charles Vialatte
 
Rational points on elliptic curves
Rational points on elliptic curvesRational points on elliptic curves
Rational points on elliptic curvesmmasdeu
 

Was ist angesagt? (20)

Internal workshop jub talk jan 2013
Internal workshop jub talk jan 2013Internal workshop jub talk jan 2013
Internal workshop jub talk jan 2013
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
L(2,1)-labeling
L(2,1)-labelingL(2,1)-labeling
L(2,1)-labeling
 
Linear models2
Linear models2Linear models2
Linear models2
 
Introduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun NarayananIntroduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun Narayanan
 
Triggering patterns of topology changes in dynamic attributed graphs
Triggering patterns of topology changes in dynamic attributed graphsTriggering patterns of topology changes in dynamic attributed graphs
Triggering patterns of topology changes in dynamic attributed graphs
 
IRJET- Global Accurate Domination in Jump Graph
IRJET- Global Accurate Domination in Jump GraphIRJET- Global Accurate Domination in Jump Graph
IRJET- Global Accurate Domination in Jump Graph
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli...
 Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli... Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli...
 
Planted Clique Research Paper
Planted Clique Research PaperPlanted Clique Research Paper
Planted Clique Research Paper
 
A method for constructing fuzzy test statistics with application
A method for constructing fuzzy test statistics with applicationA method for constructing fuzzy test statistics with application
A method for constructing fuzzy test statistics with application
 
A common fixed point theorem for two random operators using random mann itera...
A common fixed point theorem for two random operators using random mann itera...A common fixed point theorem for two random operators using random mann itera...
A common fixed point theorem for two random operators using random mann itera...
 
comments on exponential ergodicity of the bouncy particle sampler
comments on exponential ergodicity of the bouncy particle samplercomments on exponential ergodicity of the bouncy particle sampler
comments on exponential ergodicity of the bouncy particle sampler
 
FURTHER RESULTS ON ODD HARMONIOUS GRAPHS
FURTHER RESULTS ON ODD HARMONIOUS GRAPHSFURTHER RESULTS ON ODD HARMONIOUS GRAPHS
FURTHER RESULTS ON ODD HARMONIOUS GRAPHS
 
Bidimensionality
BidimensionalityBidimensionality
Bidimensionality
 
A common fixed point of integral type contraction in generalized metric spacess
A  common fixed point of integral type contraction in generalized metric spacessA  common fixed point of integral type contraction in generalized metric spacess
A common fixed point of integral type contraction in generalized metric spacess
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
 
Typing quantum superpositions and measurement
Typing quantum superpositions and measurementTyping quantum superpositions and measurement
Typing quantum superpositions and measurement
 
On Convolution of Graph Signals and Deep Learning on Graph Domains
On Convolution of Graph Signals and Deep Learning on Graph DomainsOn Convolution of Graph Signals and Deep Learning on Graph Domains
On Convolution of Graph Signals and Deep Learning on Graph Domains
 
Vancouver18
Vancouver18Vancouver18
Vancouver18
 
Rational points on elliptic curves
Rational points on elliptic curvesRational points on elliptic curves
Rational points on elliptic curves
 

Andere mochten auch

Experimental Economics and Machine Learning workshop
Experimental Economics and Machine Learning workshopExperimental Economics and Machine Learning workshop
Experimental Economics and Machine Learning workshopDmitrii Ignatov
 
Searching for optimal patterns in Boolean tensors
Searching for optimal patterns in Boolean tensorsSearching for optimal patterns in Boolean tensors
Searching for optimal patterns in Boolean tensorsDmitrii Ignatov
 
NIPS 2016, Tensor-Learn@NIPS, and IEEE ICDM 2016
NIPS 2016, Tensor-Learn@NIPS, and IEEE ICDM 2016NIPS 2016, Tensor-Learn@NIPS, and IEEE ICDM 2016
NIPS 2016, Tensor-Learn@NIPS, and IEEE ICDM 2016Dmitrii Ignatov
 
Context-Aware Recommender System Based on Boolean Matrix Factorisation
Context-Aware Recommender System Based on Boolean Matrix FactorisationContext-Aware Recommender System Based on Boolean Matrix Factorisation
Context-Aware Recommender System Based on Boolean Matrix FactorisationDmitrii Ignatov
 
Поиск частых множеств признаков (товаров) и ассоциативные правила
Поиск частых множеств признаков (товаров) и ассоциативные правилаПоиск частых множеств признаков (товаров) и ассоциативные правила
Поиск частых множеств признаков (товаров) и ассоциативные правилаDmitrii Ignatov
 
On the Family of Concept Forming Operators in Polyadic FCA
On the Family of Concept Forming Operators in Polyadic FCAOn the Family of Concept Forming Operators in Polyadic FCA
On the Family of Concept Forming Operators in Polyadic FCADmitrii Ignatov
 
Pattern Mining and Machine Learning for Demographic Sequences
Pattern Mining and Machine Learning for Demographic SequencesPattern Mining and Machine Learning for Demographic Sequences
Pattern Mining and Machine Learning for Demographic SequencesDmitrii Ignatov
 
AIST 2016 Opening Slides
AIST 2016 Opening SlidesAIST 2016 Opening Slides
AIST 2016 Opening SlidesDmitrii Ignatov
 
Putting OAC-triclustering on MapReduce
Putting OAC-triclustering on MapReducePutting OAC-triclustering on MapReduce
Putting OAC-triclustering on MapReduceDmitrii Ignatov
 
RAPS: A Recommender Algorithm Based on Pattern Structures
RAPS: A Recommender Algorithm Based on Pattern StructuresRAPS: A Recommender Algorithm Based on Pattern Structures
RAPS: A Recommender Algorithm Based on Pattern StructuresDmitrii Ignatov
 
Введение в рекомендательные системы. 3 case-study без NetFlix.
Введение в рекомендательные системы. 3 case-study без NetFlix.Введение в рекомендательные системы. 3 case-study без NetFlix.
Введение в рекомендательные системы. 3 case-study без NetFlix.Dmitrii Ignatov
 
A One-Pass Triclustering Approach: Is There any Room for Big Data?
A One-Pass Triclustering Approach: Is There any Room for Big Data?A One-Pass Triclustering Approach: Is There any Room for Big Data?
A One-Pass Triclustering Approach: Is There any Room for Big Data?Dmitrii Ignatov
 
Boolean matrix factorisation for collaborative filtering
Boolean matrix factorisation for collaborative filteringBoolean matrix factorisation for collaborative filtering
Boolean matrix factorisation for collaborative filteringDmitrii Ignatov
 
Intro to Data Mining and Machine Learning
Intro to Data Mining and Machine LearningIntro to Data Mining and Machine Learning
Intro to Data Mining and Machine LearningDmitrii Ignatov
 
Improved method for pattern discovery in text mining
Improved method for pattern discovery in text miningImproved method for pattern discovery in text mining
Improved method for pattern discovery in text miningeSAT Journals
 
Modified Apriori Algorithm for Frequent Pattern Mining
Modified Apriori Algorithm for Frequent Pattern MiningModified Apriori Algorithm for Frequent Pattern Mining
Modified Apriori Algorithm for Frequent Pattern MiningPritish Yuvraj
 
Online Recommender System for Radio Station Hosting: Experimental Results Rev...
Online Recommender System for Radio Station Hosting: Experimental Results Rev...Online Recommender System for Radio Station Hosting: Experimental Results Rev...
Online Recommender System for Radio Station Hosting: Experimental Results Rev...Dmitrii Ignatov
 

Andere mochten auch (18)

Experimental Economics and Machine Learning workshop
Experimental Economics and Machine Learning workshopExperimental Economics and Machine Learning workshop
Experimental Economics and Machine Learning workshop
 
Searching for optimal patterns in Boolean tensors
Searching for optimal patterns in Boolean tensorsSearching for optimal patterns in Boolean tensors
Searching for optimal patterns in Boolean tensors
 
NIPS 2016, Tensor-Learn@NIPS, and IEEE ICDM 2016
NIPS 2016, Tensor-Learn@NIPS, and IEEE ICDM 2016NIPS 2016, Tensor-Learn@NIPS, and IEEE ICDM 2016
NIPS 2016, Tensor-Learn@NIPS, and IEEE ICDM 2016
 
Sequence mining
Sequence miningSequence mining
Sequence mining
 
Context-Aware Recommender System Based on Boolean Matrix Factorisation
Context-Aware Recommender System Based on Boolean Matrix FactorisationContext-Aware Recommender System Based on Boolean Matrix Factorisation
Context-Aware Recommender System Based on Boolean Matrix Factorisation
 
Поиск частых множеств признаков (товаров) и ассоциативные правила
Поиск частых множеств признаков (товаров) и ассоциативные правилаПоиск частых множеств признаков (товаров) и ассоциативные правила
Поиск частых множеств признаков (товаров) и ассоциативные правила
 
On the Family of Concept Forming Operators in Polyadic FCA
On the Family of Concept Forming Operators in Polyadic FCAOn the Family of Concept Forming Operators in Polyadic FCA
On the Family of Concept Forming Operators in Polyadic FCA
 
Pattern Mining and Machine Learning for Demographic Sequences
Pattern Mining and Machine Learning for Demographic SequencesPattern Mining and Machine Learning for Demographic Sequences
Pattern Mining and Machine Learning for Demographic Sequences
 
AIST 2016 Opening Slides
AIST 2016 Opening SlidesAIST 2016 Opening Slides
AIST 2016 Opening Slides
 
Putting OAC-triclustering on MapReduce
Putting OAC-triclustering on MapReducePutting OAC-triclustering on MapReduce
Putting OAC-triclustering on MapReduce
 
RAPS: A Recommender Algorithm Based on Pattern Structures
RAPS: A Recommender Algorithm Based on Pattern StructuresRAPS: A Recommender Algorithm Based on Pattern Structures
RAPS: A Recommender Algorithm Based on Pattern Structures
 
Введение в рекомендательные системы. 3 case-study без NetFlix.
Введение в рекомендательные системы. 3 case-study без NetFlix.Введение в рекомендательные системы. 3 case-study без NetFlix.
Введение в рекомендательные системы. 3 case-study без NetFlix.
 
A One-Pass Triclustering Approach: Is There any Room for Big Data?
A One-Pass Triclustering Approach: Is There any Room for Big Data?A One-Pass Triclustering Approach: Is There any Room for Big Data?
A One-Pass Triclustering Approach: Is There any Room for Big Data?
 
Boolean matrix factorisation for collaborative filtering
Boolean matrix factorisation for collaborative filteringBoolean matrix factorisation for collaborative filtering
Boolean matrix factorisation for collaborative filtering
 
Intro to Data Mining and Machine Learning
Intro to Data Mining and Machine LearningIntro to Data Mining and Machine Learning
Intro to Data Mining and Machine Learning
 
Improved method for pattern discovery in text mining
Improved method for pattern discovery in text miningImproved method for pattern discovery in text mining
Improved method for pattern discovery in text mining
 
Modified Apriori Algorithm for Frequent Pattern Mining
Modified Apriori Algorithm for Frequent Pattern MiningModified Apriori Algorithm for Frequent Pattern Mining
Modified Apriori Algorithm for Frequent Pattern Mining
 
Online Recommender System for Radio Station Hosting: Experimental Results Rev...
Online Recommender System for Radio Station Hosting: Experimental Results Rev...Online Recommender System for Radio Station Hosting: Experimental Results Rev...
Online Recommender System for Radio Station Hosting: Experimental Results Rev...
 

Ähnlich wie Pattern-based classification of demographic sequences

Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinChristian Robert
 
Bayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsBayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsCaleb (Shiqiang) Jin
 
Hypothesis testings on individualized treatment rules
Hypothesis testings on individualized treatment rulesHypothesis testings on individualized treatment rules
Hypothesis testings on individualized treatment rulesYoung-Geun Choi
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
Bayesian Graphical Models for Structural Vector Autoregressive Processes - D....
Bayesian Graphical Models for Structural Vector Autoregressive Processes - D....Bayesian Graphical Models for Structural Vector Autoregressive Processes - D....
Bayesian Graphical Models for Structural Vector Autoregressive Processes - D....SYRTO Project
 
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMINGChristian Robert
 
Generative models : VAE and GAN
Generative models : VAE and GANGenerative models : VAE and GAN
Generative models : VAE and GANSEMINARGROOT
 
Bayesian selection of best subsets in high-dimensional regression
Bayesian selection of best subsets in high-dimensional regressionBayesian selection of best subsets in high-dimensional regression
Bayesian selection of best subsets in high-dimensional regressionCaleb (Shiqiang) Jin
 
Machine Learning Today: Current Research And Advances From AMLAB, UvA
Machine Learning Today: Current Research And Advances From AMLAB, UvAMachine Learning Today: Current Research And Advances From AMLAB, UvA
Machine Learning Today: Current Research And Advances From AMLAB, UvAAdvanced-Concepts-Team
 
Locally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsLocally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsNTNU
 
Hunermund causal inference in ml and ai
Hunermund   causal inference in ml and aiHunermund   causal inference in ml and ai
Hunermund causal inference in ml and aiBoston Global Forum
 
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013Christian Robert
 

Ähnlich wie Pattern-based classification of demographic sequences (20)

Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
 
Interval Pattern Structures: An introdution
Interval Pattern Structures: An introdutionInterval Pattern Structures: An introdution
Interval Pattern Structures: An introdution
 
Bayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsBayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear models
 
Hypothesis testings on individualized treatment rules
Hypothesis testings on individualized treatment rulesHypothesis testings on individualized treatment rules
Hypothesis testings on individualized treatment rules
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
Bayesian Graphical Models for Structural Vector Autoregressive Processes - D....
Bayesian Graphical Models for Structural Vector Autoregressive Processes - D....Bayesian Graphical Models for Structural Vector Autoregressive Processes - D....
Bayesian Graphical Models for Structural Vector Autoregressive Processes - D....
 
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
 
Generative models : VAE and GAN
Generative models : VAE and GANGenerative models : VAE and GAN
Generative models : VAE and GAN
 
Bayesian selection of best subsets in high-dimensional regression
Bayesian selection of best subsets in high-dimensional regressionBayesian selection of best subsets in high-dimensional regression
Bayesian selection of best subsets in high-dimensional regression
 
Machine Learning Today: Current Research And Advances From AMLAB, UvA
Machine Learning Today: Current Research And Advances From AMLAB, UvAMachine Learning Today: Current Research And Advances From AMLAB, UvA
Machine Learning Today: Current Research And Advances From AMLAB, UvA
 
MUMS Undergraduate Workshop - A Biased Introduction to Global Sensitivity Ana...
MUMS Undergraduate Workshop - A Biased Introduction to Global Sensitivity Ana...MUMS Undergraduate Workshop - A Biased Introduction to Global Sensitivity Ana...
MUMS Undergraduate Workshop - A Biased Introduction to Global Sensitivity Ana...
 
Locally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsLocally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet Metrics
 
the ABC of ABC
the ABC of ABCthe ABC of ABC
the ABC of ABC
 
NCM Latin squares talk
NCM Latin squares talkNCM Latin squares talk
NCM Latin squares talk
 
Hunermund causal inference in ml and ai
Hunermund   causal inference in ml and aiHunermund   causal inference in ml and ai
Hunermund causal inference in ml and ai
 
Practive test 1
Practive test 1Practive test 1
Practive test 1
 
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
 
DIC
DICDIC
DIC
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 

Mehr von Dmitrii Ignatov

Interpretable Concept-Based Classification with Shapley Values
Interpretable Concept-Based Classification with Shapley ValuesInterpretable Concept-Based Classification with Shapley Values
Interpretable Concept-Based Classification with Shapley ValuesDmitrii Ignatov
 
AIST2019 – opening slides
AIST2019 – opening slidesAIST2019 – opening slides
AIST2019 – opening slidesDmitrii Ignatov
 
Turning Krimp into a Triclustering Technique on Sets of Attribute-Condition P...
Turning Krimp into a Triclustering Technique on Sets of Attribute-Condition P...Turning Krimp into a Triclustering Technique on Sets of Attribute-Condition P...
Turning Krimp into a Triclustering Technique on Sets of Attribute-Condition P...Dmitrii Ignatov
 
Personal Experiences of Publishing with Springer from both Editor and Author ...
Personal Experiences of Publishing with Springer from both Editor and Author ...Personal Experiences of Publishing with Springer from both Editor and Author ...
Personal Experiences of Publishing with Springer from both Editor and Author ...Dmitrii Ignatov
 
Social Learning in Networks: Extraction Deterministic Rules
Social Learning in Networks: Extraction Deterministic RulesSocial Learning in Networks: Extraction Deterministic Rules
Social Learning in Networks: Extraction Deterministic RulesDmitrii Ignatov
 
Orpailleur -- triclustering talk
Orpailleur -- triclustering talkOrpailleur -- triclustering talk
Orpailleur -- triclustering talkDmitrii Ignatov
 
CoClus ICDM Workshop talk
CoClus ICDM Workshop talkCoClus ICDM Workshop talk
CoClus ICDM Workshop talkDmitrii Ignatov
 
Radio recommender system for FMHost
Radio recommender system for FMHostRadio recommender system for FMHost
Radio recommender system for FMHostDmitrii Ignatov
 

Mehr von Dmitrii Ignatov (11)

Interpretable Concept-Based Classification with Shapley Values
Interpretable Concept-Based Classification with Shapley ValuesInterpretable Concept-Based Classification with Shapley Values
Interpretable Concept-Based Classification with Shapley Values
 
AIST2019 – opening slides
AIST2019 – opening slidesAIST2019 – opening slides
AIST2019 – opening slides
 
Turning Krimp into a Triclustering Technique on Sets of Attribute-Condition P...
Turning Krimp into a Triclustering Technique on Sets of Attribute-Condition P...Turning Krimp into a Triclustering Technique on Sets of Attribute-Condition P...
Turning Krimp into a Triclustering Technique on Sets of Attribute-Condition P...
 
Personal Experiences of Publishing with Springer from both Editor and Author ...
Personal Experiences of Publishing with Springer from both Editor and Author ...Personal Experiences of Publishing with Springer from both Editor and Author ...
Personal Experiences of Publishing with Springer from both Editor and Author ...
 
Aist2014
Aist2014Aist2014
Aist2014
 
Social Learning in Networks: Extraction Deterministic Rules
Social Learning in Networks: Extraction Deterministic RulesSocial Learning in Networks: Extraction Deterministic Rules
Social Learning in Networks: Extraction Deterministic Rules
 
Orpailleur -- triclustering talk
Orpailleur -- triclustering talkOrpailleur -- triclustering talk
Orpailleur -- triclustering talk
 
CoClus ICDM Workshop talk
CoClus ICDM Workshop talkCoClus ICDM Workshop talk
CoClus ICDM Workshop talk
 
Pseudo-triclustering
Pseudo-triclusteringPseudo-triclustering
Pseudo-triclustering
 
Radio recommender system for FMHost
Radio recommender system for FMHostRadio recommender system for FMHost
Radio recommender system for FMHost
 
CrowDM system
CrowDM systemCrowDM system
CrowDM system
 

Kürzlich hochgeladen

Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxRizalinePalanog2
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Silpa
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformationAreesha Ahmad
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑Damini Dixit
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...Lokesh Kothari
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and ClassificationsAreesha Ahmad
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...chandars293
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .Poonam Aher Patil
 

Kürzlich hochgeladen (20)

Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 

Pattern-based classification of demographic sequences

  • 1. Pattern-Based Classification of Demographic Sequences Dmitry I. Ignatov1, Danil Gizdatullin1, Ekaterina Mitrofanova1, Anna Muratova1, Jaume Baixeries2 1National Research University Higher School of Economics, Moscow 2Universitat Polit`ecnica de Catalunya, Barcelona Intelligent Data Processing 2016, Barcelona Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 1 / 27
  • 2. Possible life events First job (job) The highest education degree is obtained (education) Leaving parents’ home (separation) First partner (partner) First marriage (marriage) First child birth (children) Break-up (parting) ... (divorce) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 2 / 27
  • 3. Data and problem statement [Ignatov et al., 2015],[Blockeel et al., 2001] Generation and Gender Survey (GGS): three waves panel data for 11 generations of Russian citizens starting from 30s Binary classification 1545 men 3312 women Examples of sequential patterns {education, separation}, {work}, {marriage}, {children} (m) {work}, {marriage}, {children}{education} (f ) {partner}, {marriage, separation}, {children} (f ) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 3 / 27
  • 4. Basic definitions Texbooks of Han et al., Zaki & Meira, Aggarwal et al., etc s = s1, ..., sk is the subsequence of s = s1, ..., sk (s s ) if k ≤ k and there exist 1 ≤ r1 < r2 < ... < rk ≤ k such sj = srj for all 1 ≤ j ≤ k. support(s, D) is the support of a sequence s in D, i.e. the number of sequences in D such that s is their subsequence. support(s, D) = |{s |s ∈ D, s s }| s is a frequent closed sequence (sequential pattern) if there is no s such that s s and support(s, D) = support(s , D) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 4 / 27
  • 5. Example Let D be a set of sequences: Table: Dataset D. s1 {a, b, c}{a, b}{b} s2 {a}{a, c}{a} s3 {a, b}{b, c} I = {a, b, c} is the set of all items (atomic events) {a, b}{b} belongs to s1 and s3 but it is missing in s2 supportD( {a, b}{b} ) = 2 { {a} , {c} , {a}{c} , {a, b}{b} , {a, c}{a} } is the set of closed sequences. Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 5 / 27
  • 6. CAEP: Classification by Aggregating Emerging Patterns G. Dong et al., 1999 Growth Rate growth rateD →D (X) =    suppD (X) suppD (X) if suppD (X) = 0 0 if suppD (X) = supp(X) = 0 ∞ if suppD (X) = 0 and suppD (X) = 0 Class score score(s, C) = e⊆s,e∈E(c) growth rateC (e) growth rateC (e) + 1 · suppc(e) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 6 / 27
  • 7. CAEP: Classification by Aggregating Emerging Patterns Score normalization normal score(s, C) = score(s, C) median({growth rateC (ei )}) Classification rule class(s) =    C1, if normal score(s, C1) > normal score(s, C2) C2, if normal score(s, C1) < normal score(s, C2) undetermined if normal score(s, C1) = normal score(s, C2) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 7 / 27
  • 8. Gapless prefix-based sequential patterns s = s1, ..., sk is a gapless prefix-based subsequence of s = s1, ..., sk (s∗ = s ) if k ≤ k and ∀i ∈ k : si = si . Support of gapless prefix-based sequences Let T be a set of sequences. support(s, T) = |{s |s ∈ T, s∗ = s }| |T| Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 8 / 27
  • 9. Gapless sequential patterns Let 0 < minSup ≤ 1 be a minimal support parameter and D is a set of sequences then searching for prefix-based gapless sequential patterns is the task of enumeration of all prefix-based gapless sequences s such that support(s, D) ≥ minSup. Every sequence s with support(s, D) ≥ minSup is called a prefix-based gapless sequential pattern. Prefix-based gapless sequential pattern (PGSP) p is called closed if there is no PGSP d of greater of equal support such that d = p∗. Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 9 / 27
  • 10. Gapless sequential patterns Example Table: D is a set of sequences. s1 {a}{b}{d} s2 {a}{b}{c} s3 {a, b}{b, c} s = {a}{b} I = {a, b, c} is the set of all items (atomic events) s1 = s∗; s2 = s∗ s3 = s∗ SuppD(s) = 2 3 {a}{b} is closed, {a} is not closed. Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 10 / 27
  • 11. Pattern Structures Ganter & Kuznetsov, 2001 (S, (D, ), δ) is a pattern structure S is a set of objects, D is a set of their their possible descriptions δ(g) is the description of g from S Galois connection is given by operator as follows: A := g∈A δ(g) for A ⊆ S d := {s ∈ S|d δ(g)} for d ∈ D For two sequences may result in their largest common prefix subsequence Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 11 / 27
  • 12. Pattern concepts A pair (A, d) is called a pattern concept of a pattern structure (S, (D, ), δ) if 1 A ⊆ S 2 d ∈ D 3 A = d 4 d = A Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 12 / 27
  • 13. Pattern Structures Example s1 : a, b, c s2 : a, b, c s3 : a, b, d Tree 0 a(3) b(3) c(2) d(1) Pattern concepts (PCs) ({s1, s2, s3}, a, b ); ({s1, s2}, a, b, c ) ({s1}, a, b, c ) is not a PC; ({s3}, a, b, d ) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 13 / 27
  • 14. Pattern-based JSM-hypotheses [Finn, 1981], [Kuznetsov, 1993], [Ganter et al, 2004] Positive, negative and undetermined pattern structures K⊕ = (S⊕, (D, ), δ⊕) K = (S , (D, ), δ ) There is a pattern structure of undetermined examples: Kτ = (Sτ , (D, ), δτ ) Hypothesis A hypothesis is a pattern intent that belongs to examples from a fixed class only A pattern intent h is a positive hypothesis (dually for negative hypotheses) if ∀s ∈ S (s ∈ S⊕) : h s (h s⊕ ) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 14 / 27
  • 15. Hypotheses generation: An example Sequential classification rules s1 : a, b, c - class 0 s2 : a, b, c - class 0 s3 : a, b, d - class 1 Prefix-tree 0 a(2; 1) b(2; 1) c(2; 0) d(0; 1) Hypotheses {a}, {b}, {c} is a hypothesis of class 0 {a}, {b}, {d} is a hypothesis of class 1 Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 15 / 27
  • 16. Classification via hypotheses class(gτ ) =    positive if ∃h⊕, h⊕ δ(gτ ) and h , h δ(gτ ) negative if h⊕, h⊕ δ(gτ ) and ∃h , h δ(gτ ) undetermined if ∃h⊕, h⊕ δ(gτ ) and ∃h , h δ(gτ ) undetermined if h⊕, h⊕ δ(gτ ) and h , h δ(gτ ) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 16 / 27
  • 17. Emerging patterns based on pattern structures Growth Rate GrowthRate(g, K⊕, K ) = SupK⊕ (g) SupK (g) Emerging patterns A pattern is called emerging pattern if its growth rate is greater than or equal to Θmin GrowthRate(g, K⊕, K ) > Θmin Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 17 / 27
  • 18. Emerging patterns for classification s is a new object normal score⊕(s) = p∈P⊕ GrowthRate(p, K⊕, K ) median(GrowthRate(P⊕)) : p s normal score (s) = p∈P GrowthRate(p, K , K⊕) median(GrowthRate(P )) : p s Classification via emerging patterns class(s) =    positive if normal score⊕(s) > score (s) negative if normal score⊕(s) < score (s) undetermined if normal score⊕(s) = normal score (s) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 18 / 27
  • 19. Classification algorithm for gapless prefix-based sequential patterns 1 Build the prefix tree for the input sequences. 2 For each tree node calculate its Growth Rate. 3 For every new sequence traverse the tree and compute the Score for each class. 4 Compare the Score value for different classes and classify the new sequence. Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 19 / 27
  • 20. Execution example Input sequences class 0 : { {a}{b}{c} , {b}{a}{c} , {b}{a}{c} , {b}{c} } class 1 : { {a}{c}{b} , {b}{c}{a} , {b}{c}{a} } Prefix tree 0 a(1; 1) b(1; 0) c(1; 0) c(0; 1) b(0; 1) b(2; 2) a(2; 0) c(2; 0) c(1; 2) a(0; 2) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 20 / 27
  • 21. Counting Growth Rate 0 a(0.75; 1.33) b(∞; 0) c(∞; 0) c(0; ∞) b(0; ∞) b(0.75; 1.33) a(∞; 0) c(∞; 0) c(0.38; 2.67) a(0; ∞) New sequence {b}; {c}; {a} −??? Score0 = 0 Score1 = 2.67 + ∞ = ∞ Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 21 / 27
  • 22. Comparison of closed and non-closed patterns Figure: TPR vs FPR for closed and non-closed patterns Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 22 / 27
  • 23. Experiments and results Figure: TPR-FPR for classification via gapless prefix-based patterns Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 23 / 27
  • 24. Interesting patterns (women) ( {work, separation}, {marriage}, {children}, {education} , [inf , 0.006]) ( {separation, partner}, {marriage} , [inf , 0.006]) ( {work, separation}, {marriage}, {children} , [inf , 0.008]) ( {work, separation}, {marriage} , [inf , 0.009]) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 24 / 27
  • 25. Interesting patterns (men) ( {education}, {marriage}, {work}, {children}, {separation} , [10.6, 0.006]) ( {education}, {marriage}, {work}, {children} , [12.7, 0.007]) ( {educ}, {work}, {part}, {mar}, {sep}, {ch} , [10.6, 0.006]) Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 25 / 27
  • 26. Conclusion 1 We have studied several pattern mining techniques for demographic sequences including pattern-based classification in particular. 2 We have fitted existing approaches for sequence mining of a special type (gapless and prefix-based ones). 3 The results for different demographic groups (classes) have been obtained and interpreted. 4 In particular, a classifier based on emerging sequences and pattern structures has been proposed. Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 26 / 27
  • 27. Conclusion Thank you! Questions? Ignatov et al. (HSE) Classification of Demographic Sequences IDP 2016 27 / 27