Compressed Membership in Automata with
                            Compressed Labels

                                              Markus Lohrey
                                            Christian Mathissen

                                             University of Leipzig


                                              June 16, 2011




Lohrey, Mathissen (University of Leipzig)     Compressed Membership   June 16, 2011   1 / 12
Motivation

  Try to develop algorithms that work directly on compressed data.
  Goal: Beat the straightforward decompress-and-manipulate strategy.
  In this talk: focus on compressed strings.
  Applications:
     ◮    all domains where massive string data arise and have to be
          processed, e.g. bioinformatics
     ◮    large (and highly compressible) strings, which often occur as
          intermediate data structures (e.g. in computational group theory,
          program analysis, verification).




Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
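  A straight-line program (SLP) over the alphabet $\Gamma$ is a sequence of
  definitions $\mathbb{A} = (A_i := \alpha_i)_{1 \le i \le n}$, where either
  $\alpha_i \in \Gamma$ or $\alpha_i = A_j A_k$ for some $j, k < i$.
  Alternatively: a context-free grammar that generates exactly one string.
  We write $\mathrm{val}(\mathbb{A})$ for the unique word generated by $\mathbb{A}$.
  The size of an SLP $\mathbb{A} = (A_i := \alpha_i)_{1 \le i \le n}$ is $|\mathbb{A}| = n$.
  One may have $|\mathrm{val}(\mathbb{A})| = 2^{n-1}$, i.e. exponential in $n$
  (take $A_1 := a$ and $A_i := A_{i-1} A_{i-1}$ for $2 \le i \le n$).
  Thus, an SLP $\mathbb{A}$ can be seen as a compressed representation of $\mathrm{val}(\mathbb{A})$.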
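
  The definition above can be sketched in a few lines of Python. This is only an illustration: the encoding of an SLP as a list whose entries are either a single letter or a pair of earlier 0-based indices, and the name slp_lengths, are assumptions made for this example. The sketch computes the length of every generated word without building any string, which shows how a short SLP can describe an exponentially long word.

```python
def slp_lengths(rules):
    """Return the length of val(A_i) for every rule, without decompressing.

    rules[i] is either a single letter (A_i := a) or a pair (j, k) of
    earlier 0-based indices (A_i := A_j A_k).
    """
    lengths = []
    for i, rhs in enumerate(rules):
        if isinstance(rhs, str):
            lengths.append(1)
        else:
            j, k = rhs
            assert j < i and k < i, "a rule may only refer to earlier rules"
            lengths.append(lengths[j] + lengths[k])
    return lengths


# A_1 := a and A_i := A_{i-1} A_{i-1}: n rules generate a word of length 2^(n-1).
n = 64
doubling = ["a"] + [(i - 1, i - 1) for i in range(1, n)]
print(len(doubling), slp_lengths(doubling)[-1])  # 64 9223372036854775808
```

  With 64 rules the generated word already has 2^63 letters, so writing it out explicitly is not an option.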

  Example: $\mathbb{A} = (A_1 := b,\ A_2 := a,\ A_i := A_{i-1} A_{i-2}$ for $3 \le i \le 7)$.
  Then $\mathrm{val}(\mathbb{A}) = abaababaabaab$:

                                     $A_3 = A_2 A_1 = ab$
                                     $A_4 = A_3 A_2 = aba$
                                     $A_5 = A_4 A_3 = abaab$
                                     $A_6 = A_5 A_4 = abaababa$
                                     $A_7 = A_6 A_5 = abaababaabaab$
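
  The example can be replayed with a small Python sketch. The list encoding of the rules (a letter, or a pair of earlier 0-based indices) and the helper names expand_all and char_at are assumptions made for this illustration. The second helper reads a single letter of the generated word by descending through the rules, so it does not need the decompressed word.

```python
def expand_all(rules):
    """Decompress every rule; only sensible for small examples."""
    vals = []
    for rhs in rules:
        vals.append(rhs if isinstance(rhs, str) else vals[rhs[0]] + vals[rhs[1]])
    return vals


def char_at(rules, pos):
    """Letter at 0-based position pos of the word generated by the last rule."""
    lengths = []
    for rhs in rules:
        lengths.append(1 if isinstance(rhs, str) else lengths[rhs[0]] + lengths[rhs[1]])
    i = len(rules) - 1
    while not isinstance(rules[i], str):
        j, k = rules[i]
        if pos < lengths[j]:
            i = j                 # position lies in the left part
        else:
            pos -= lengths[j]     # position lies in the right part
            i = k
    return rules[i]


# A_1 := b, A_2 := a, A_i := A_{i-1} A_{i-2} for 3 <= i <= 7 (0-based indices).
fib = ["b", "a"] + [(i - 1, i - 2) for i in range(2, 7)]
for number, value in enumerate(expand_all(fib), start=1):
    print(f"val(A_{number}) = {value}")
```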

  Relationship to dictionary-based compression (Rytter):
     ◮    From an SLP $\mathbb{A}$ one can compute $\mathrm{LZ77}(\mathrm{val}(\mathbb{A}))$ in polynomial time.
     ◮    From $\mathrm{LZ77}(w)$ one can compute in polynomial time an SLP $\mathbb{A}$ with
          $\mathrm{val}(\mathbb{A}) = w$.

Algorithms on SLP-compressed strings

  The mother of all algorithms on SLP-compressed strings:


  Plandowski 1994
  The following problem can be solved in polynomial time:
  INPUT: SLPs $\mathbb{A}$, $\mathbb{B}$
  QUESTION: $\mathrm{val}(\mathbb{A}) = \mathrm{val}(\mathbb{B})$?


  Note: The decompress-and-compare strategy does not work here.
  In general, we cannot compute $\mathrm{val}(\mathbb{A})$ and $\mathrm{val}(\mathbb{B})$ in polynomial time,
  since these words can be exponentially longer than the SLPs.

  Plandowski’s algorithm uses combinatorics on words, in particular the
  periodicity lemma of Fine and Wilf.
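
  To give a feeling for comparing SLPs without decompressing them, here is a minimal sketch. It is not Plandowski's algorithm: it swaps in Karp-Rabin-style fingerprints, a different technique that can wrongly report equality when two hashes collide. The rule encoding (a letter, or a pair of earlier 0-based indices), the constants MOD and BASE, and the function names are assumptions made for this example.

```python
MOD = (1 << 61) - 1   # a Mersenne prime, assumed modulus for this sketch
BASE = 1_000_003      # assumed base; a random base would be the usual choice


def fingerprint(rules):
    """Return (length, hash) of the generated word in time polynomial in len(rules)."""
    info = []
    for rhs in rules:
        if isinstance(rhs, str):
            info.append((1, ord(rhs) % MOD))
        else:
            (len_j, hash_j), (len_k, hash_k) = info[rhs[0]], info[rhs[1]]
            # hash(xy) = hash(x) * BASE^|y| + hash(y)  (mod MOD)
            info.append((len_j + len_k,
                         (hash_j * pow(BASE, len_k, MOD) + hash_k) % MOD))
    return info[-1]


def probably_equal(rules_a, rules_b):
    """True if lengths and hashes agree; a collision may cause a false positive."""
    return fingerprint(rules_a) == fingerprint(rules_b)


# Two structurally different SLPs for (ab)^(2^30), a word with 2^31 letters.
slp_a = ["a", "b", (0, 1)] + [(i, i) for i in range(2, 32)]
slp_b = ["a", "b", (1, 0), (0, 2), (3, 1)] + [(i, i) for i in range(4, 33)]
# An SLP for (ba)^(2^30): same length, different word.
slp_c = ["a", "b", (1, 0)] + [(i, i) for i in range(2, 32)]

print(probably_equal(slp_a, slp_b))  # True
print(probably_equal(slp_a, slp_c))  # False
```

  Equal words always receive equal fingerprints, so the sketch never rejects two SLPs for the same word; only the opposite error is possible.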


Improvements of Plandowski’s result


  Gasieniec, Karpinski, Miyazaki, Plandowski, Rytter, Shinohara,
  Takeda (mid 90’s)
  The following problem can be solved in polynomial time
  (fully compressed pattern matching):
  INPUT: SLPs $\mathbb{P}$, $\mathbb{T}$
  QUESTION: Is $\mathrm{val}(\mathbb{P})$ a factor of $\mathrm{val}(\mathbb{T})$, i.e., $\exists x, y : \mathrm{val}(\mathbb{T}) = x\,\mathrm{val}(\mathbb{P})\,y$?


  The best known algorithm has a running time of $O(|\mathbb{P}| \cdot |\mathbb{T}|^2)$
  (Lifshits 2006).




Generalization: Compressed automata
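  A compressed automaton is an ordinary finite automaton in which every
  transition is labelled with an SLP.

  Example: The compressed automaton

  [Diagram: two states, each with a self-loop labelled $\Sigma$, connected by
  a transition labelled $\mathbb{A}$.]

  accepts all words that have $\mathrm{val}(\mathbb{A})$ as a factor.

  The size $|\mathcal{A}|$ of the compressed automaton $\mathcal{A}$
  (with the set $\Delta$ of transition triples) is

      $|\mathcal{A}| = \sum_{(p, \mathbb{A}, q) \in \Delta} |\mathbb{A}|$.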


Compressed membership for compressed automata
  Compressed membership for compressed automata:
  INPUT: A compressed automaton $\mathcal{A}$ and an SLP $\mathbb{B}$.
  QUESTION: $\mathrm{val}(\mathbb{B}) \in L(\mathcal{A})$?
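
  To make the problem statement concrete, the following sketch decides membership by the decompress-and-manipulate baseline: it expands every label and the input word, then searches the reachable (state, position) pairs. It is not an algorithm from this talk and takes exponential time in general, because the expanded words can be exponentially long. The encoding of SLPs as rule lists and of the automaton as a list of (source, label, target) triples is an assumption made for this example; the two-state automaton mirrors the example with val(A) as a factor, here over the alphabet {a, b}.

```python
def expand(rules):
    """Decompress an SLP given as rules: a letter or a pair of earlier indices."""
    vals = []
    for rhs in rules:
        vals.append(rhs if isinstance(rhs, str) else vals[rhs[0]] + vals[rhs[1]])
    return vals[-1]


def automaton_size(transitions):
    """Sum of the sizes of all label SLPs."""
    return sum(len(label) for _, label, _ in transitions)


def accepts(transitions, initial, finals, word_rules):
    """Baseline membership test that decompresses everything first."""
    word = expand(word_rules)
    explicit = [(p, expand(label), q) for p, label, q in transitions]
    seen = {(initial, 0)}
    stack = [(initial, 0)]
    while stack:
        state, pos = stack.pop()
        if pos == len(word) and state in finals:
            return True
        for p, u, q in explicit:
            if p == state and word.startswith(u, pos):
                nxt = (q, pos + len(u))
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return False


label = ["b", "a", (1, 0), (2, 1), (3, 2)]                  # generates abaab
transitions = [(0, ["a"], 0), (0, ["b"], 0), (0, label, 1),
               (1, ["a"], 1), (1, ["b"], 1)]
fib = ["b", "a"] + [(i - 1, i - 2) for i in range(2, 7)]    # abaababaabaab

print(accepts(transitions, 0, {1}, fib))  # True
```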


  Plandowski, Rytter 1999

     ◮    Compressed membership for compressed automata belongs to
          PSPACE.
     ◮    Compressed membership for compressed automata over a unary
          alphabet is NP-complete.


  Plandowski and Rytter conjectured that compressed membership for
  compressed automata is NP-complete (for every alphabet size).
Some combinatorics on words

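  The period of a word $w \in \Sigma^*$ is the smallest number $p \ge 1$ such that
  $w[k + p] = w[k]$ for all $1 \le k \le |w| - p$
  (so $\mathrm{period}(w) = |w|$ if no smaller $p$ works).
  Let $\mathrm{order}(w) = \lfloor |w| / \mathrm{period}(w) \rfloor$.

  Example: Let $w = abbabbab = (abb)^2 ab$.
  Then $\mathrm{period}(w) = 3$ and $\mathrm{order}(w) = 2$.

  For a compressed automaton $\mathcal{A}$, let:

      $\mathrm{order}(\mathcal{A}) = \max\{\mathrm{order}(\mathrm{val}(\mathbb{A})) \mid \mathbb{A} \text{ occurs as a label in } \mathcal{A}\}$
      $\mathrm{period}(\mathcal{A}) = \max\{\mathrm{period}(\mathrm{val}(\mathbb{A})) \mid \mathbb{A} \text{ occurs as a label in } \mathcal{A}\}$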
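
  The two definitions can be checked on explicit (uncompressed) words with a direct Python transcription. This is a quadratic-time sketch for small, non-empty words only; the function names are chosen for this example, and positions are 0-based in the code while the definition above counts from 1.

```python
def period(w):
    """Smallest p >= 1 with w[k + p] == w[k] for all valid k (w non-empty)."""
    n = len(w)
    for p in range(1, n + 1):
        if all(w[k + p] == w[k] for k in range(n - p)):
            return p


def order(w):
    """Floor of |w| / period(w)."""
    return len(w) // period(w)


print(period("abbabbab"), order("abbabbab"))  # 3 2
```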



First main result


  Theorem 1
  For a compressed automaton $\mathcal{A}$ and an SLP $\mathbb{B}$, we can check
  $\mathrm{val}(\mathbb{B}) \in L(\mathcal{A})$ in time polynomial in (i) $|\mathcal{A}|$, (ii) $|\mathbb{B}|$, and (iii) $\mathrm{order}(\mathcal{A})$.


  We use the following combinatorial fact:


     Let $u, v \in \Sigma^*$ and let $p$ be a position in $v$. Then there exist at most
     $\mathrm{order}(u)$ many occurrences of $u$ in $v$ that “touch” position $p$.




Second main result


  Theorem 2
  For a compressed automaton $\mathcal{A}$ and an SLP $\mathbb{B}$, we can check
  $\mathrm{val}(\mathbb{B}) \in L(\mathcal{A})$ nondeterministically in time polynomial in (i) $|\mathcal{A}|$, (ii) $|\mathbb{B}|$,
  and (iii) $\mathrm{period}(\mathcal{A})$.

  This generalizes the NP upper bound in the result of Plandowski & Rytter for a
  unary alphabet (NP-completeness of compressed membership for compressed automata):
  a word $w = a^n$ has period 1!
  The theorem is proven by a reduction to the case of a unary alphabet.




Open problems and a conjecture
  Conjecture 1
  Compressed membership for compressed automata is NP-complete.


  Conjecture 2
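  If $\mathrm{val}(\mathbb{B}) \in L(\mathcal{A})$ for an SLP $\mathbb{B}$ and a compressed automaton $\mathcal{A}$, then there
  exists an accepting run of $\mathcal{A}$ on $\mathrm{val}(\mathbb{B})$ (viewed as a word over the set of
  transition triples of $\mathcal{A}$) that can be generated by an SLP of size
  $\mathrm{poly}(|\mathbb{B}|, |\mathcal{A}|)$.

  Conjecture 2 implies Conjecture 1 (NP-hardness already holds for a unary
  alphabet, so only membership in NP is needed):
     ◮    Guess nondeterministically an SLP $\mathbb{C}$ of polynomial size.
     ◮    Check (in deterministic polynomial time) whether $\mathrm{val}(\mathbb{C})$ is an
          accepting run of $\mathcal{A}$ on $\mathrm{val}(\mathbb{B})$.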
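
  The checking step can be sketched as follows, under stated assumptions. The run is given as an SLP over transition indices, and the sketch verifies bottom-up that consecutive transitions fit together, that the run starts in the initial state and ends in a final state, and that the concatenated labels spell the input word. For the last comparison it swaps in Karp-Rabin-style fingerprints instead of a deterministic equality test, so unlike the check described above it may accept wrongly on a hash collision. All encodings, names, and constants here are hypothetical choices for this example.

```python
MOD = (1 << 61) - 1   # assumed modulus (a Mersenne prime)
BASE = 1_000_003      # assumed base


def fingerprint(rules):
    """(length, hash) of the word generated by a letter/pair rule list."""
    info = []
    for rhs in rules:
        if isinstance(rhs, str):
            info.append((1, ord(rhs) % MOD))
        else:
            (lj, hj), (lk, hk) = info[rhs[0]], info[rhs[1]]
            info.append((lj + lk, (hj * pow(BASE, lk, MOD) + hk) % MOD))
    return info[-1]


def check_run(run_rules, transitions, initial, finals, word_rules):
    """run_rules[i] is ("t", transition index) or a pair of earlier indices."""
    label_fp = [fingerprint(label) for _, label, _ in transitions]
    info = []  # per rule: (source, target, consistent, length, hash)
    for rhs in run_rules:
        if rhs[0] == "t":
            p, _, q = transitions[rhs[1]]
            length, h = label_fp[rhs[1]]
            info.append((p, q, True, length, h))
        else:
            p1, q1, ok1, l1, h1 = info[rhs[0]]
            p2, q2, ok2, l2, h2 = info[rhs[1]]
            info.append((p1, q2, ok1 and ok2 and q1 == p2, l1 + l2,
                         (h1 * pow(BASE, l2, MOD) + h2) % MOD))
    p, q, ok, length, h = info[-1]
    return ok and p == initial and q in finals and (length, h) == fingerprint(word_rules)


label = ["b", "a", (1, 0), (2, 1), (3, 2)]                  # generates abaab
transitions = [(0, ["a"], 0), (0, ["b"], 0), (0, label, 1),
               (1, ["a"], 1), (1, ["b"], 1)]
word = ["b", "a"] + [(i - 1, i - 2) for i in range(2, 7)]   # abaababaabaab
# Run: the abaab transition, then loops at state 1 reading abaabaab.
run = [("t", 2), ("t", 3), ("t", 4), (1, 2), (1, 3), (3, 4), (5, 4), (0, 6)]

print(check_run(run, transitions, 0, {1}, word))  # True
```

  Every step touches each rule of the run SLP once, so the running time is polynomial in the sizes of the SLPs and the automaton.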

From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Csr2011 june16 17_00_lohrey

  • 1. Compressed Membership in Automata with Compressed Labels Markus Lohrey Christian Mathissen University of Leipzig June 16, 2011 Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 1 / 12
  • 6. Motivation
      Try to develop algorithms that work directly on compressed data.
      Goal: beat the straightforward decompress-and-manipulate strategy.
      In this talk: focus on compressed strings.
      Applications:
        ◮ all domains where massive string data arise and have to be processed, e.g. bioinformatics
        ◮ large (and highly compressible) strings often occur as intermediate data structures (e.g. in computational group theory, program analysis, verification)
  • 14. Straight-line programs
      Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition.
      Straight-line programs are a general representation for compressed strings that covers most dictionary-based algorithms.
      A straight-line program (SLP) over the alphabet $\Gamma$ is a sequence of definitions $\mathbb{A} = (A_i := \alpha_i)_{1 \le i \le n}$, where either $\alpha_i \in \Gamma$ or $\alpha_i = A_j A_k$ for some $j, k < i$.
      Alternatively: a context-free grammar that generates exactly one string.
      We write $\mathrm{val}(\mathbb{A})$ for the unique word generated by $\mathbb{A}$.
      The size of an SLP $\mathbb{A} = (A_i := \alpha_i)_{1 \le i \le n}$ is $|\mathbb{A}| = n$.
      The word can be exponentially longer than the SLP: one may have $|\mathrm{val}(\mathbb{A})| = 2^{n-1}$.
      Thus, an SLP $\mathbb{A}$ can be seen as a compressed representation of $\mathrm{val}(\mathbb{A})$.
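
The definition above can be sketched in a few lines of Python. The encoding is an assumption made for illustration only (a list whose entries are either a one-symbol string or a pair of 1-based indices of earlier definitions), and the names `slp_lengths` and `doubling_slp` are hypothetical. The sketch computes the length of every val(A_i) without building any string, and shows that a doubling SLP with n definitions produces a word of length 2^(n-1).

```python
from typing import List, Tuple, Union

# A definition is either a terminal symbol (str) or a pair (j, k) of
# 1-based indices of earlier definitions, meaning A_i := A_j A_k.
Definition = Union[str, Tuple[int, int]]


def slp_lengths(slp: List[Definition]) -> List[int]:
    """Return [|val(A_1)|, ..., |val(A_n)|] without building any string."""
    lengths: List[int] = []
    for i, rhs in enumerate(slp, start=1):
        if isinstance(rhs, str):
            lengths.append(1)
        else:
            j, k = rhs
            if not (1 <= j < i and 1 <= k < i):
                raise ValueError(f"definition {i} must refer to earlier variables")
            lengths.append(lengths[j - 1] + lengths[k - 1])
    return lengths


def doubling_slp(n: int) -> List[Definition]:
    """A_1 := a and A_i := A_{i-1} A_{i-1} for 2 <= i <= n."""
    return ["a"] + [(i - 1, i - 1) for i in range(2, n + 1)]


# 64 definitions describe a word of length 2**63.
print(slp_lengths(doubling_slp(64))[-1])
```

The length computation takes one addition per definition, whereas writing out the word itself would take time proportional to its length.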
  • 20. Straight-line programs
      Example: $\mathbb{A} = (A_1 := b,\ A_2 := a,\ A_i := A_{i-1} A_{i-2} \text{ for } 3 \le i \le 7)$.
      Then $\mathrm{val}(\mathbb{A}) = abaababaabaab$:
        $A_3 = A_2 A_1 = ab$
        $A_4 = A_3 A_2 = aba$
        $A_5 = A_4 A_3 = abaab$
        $A_6 = A_5 A_4 = abaababa$
        $A_7 = A_6 A_5 = abaababaabaab$
      Relationship to dictionary-based compression (Rytter):
        ◮ From an SLP $\mathbb{A}$ one can compute $\mathrm{LZ77}(\mathrm{val}(\mathbb{A}))$ in polynomial time.
        ◮ From $\mathrm{LZ77}(w)$ one can compute in polynomial time an SLP $\mathbb{A}$ with $\mathrm{val}(\mathbb{A}) = w$.
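
The Fibonacci-style example can be reproduced with a small sketch. The list encoding and the helper name `expand_all` are assumptions for illustration. The function decompresses every variable, which is only sensible for small SLPs like this one.

```python
from typing import List, Tuple, Union

# Terminal symbol, or a pair (j, k) of 1-based indices meaning A_i := A_j A_k.
Definition = Union[str, Tuple[int, int]]


def expand_all(slp: List[Definition]) -> List[str]:
    """Return [val(A_1), ..., val(A_n)]; exponential in the worst case."""
    values: List[str] = []
    for rhs in slp:
        if isinstance(rhs, str):
            values.append(rhs)
        else:
            j, k = rhs
            values.append(values[j - 1] + values[k - 1])
    return values


# A_1 := b, A_2 := a, A_i := A_{i-1} A_{i-2} for 3 <= i <= 7.
fib_slp: List[Definition] = ["b", "a"] + [(i - 1, i - 2) for i in range(3, 8)]

for i, value in enumerate(expand_all(fib_slp), start=1):
    print(f"A{i} = {value}")
```

The lengths of the seven values are the Fibonacci numbers 1, 1, 2, 3, 5, 8, 13.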
  • 24. Algorithms on SLP-compressed strings
      The mother of all algorithms on SLP-compressed strings:
      Plandowski 1994: The following problem can be solved in polynomial time:
        INPUT: SLPs $\mathbb{A}$, $\mathbb{B}$
        QUESTION: $\mathrm{val}(\mathbb{A}) = \mathrm{val}(\mathbb{B})$?
      Note: The decompress-and-compare strategy does not work here. The words $\mathrm{val}(\mathbb{A})$ and $\mathrm{val}(\mathbb{B})$ can be exponentially long, so we cannot compute them in polynomial time.
      Plandowski’s algorithm uses combinatorics on words, in particular the periodicity lemma of Fine and Wilf.
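
Although the full words cannot be written out in polynomial time, some information can be read off an SLP directly. The following sketch is not Plandowski's algorithm; it only illustrates, under an assumed list encoding and with hypothetical function names, how a single symbol of val(A) can be found by walking down the definitions, using at most n steps.

```python
from typing import List, Tuple, Union

# Terminal symbol, or a pair (j, k) of 1-based indices meaning A_i := A_j A_k.
Definition = Union[str, Tuple[int, int]]


def slp_lengths(slp: List[Definition]) -> List[int]:
    """Length of val(A_i) for every i, computed bottom-up."""
    lengths: List[int] = []
    for rhs in slp:
        if isinstance(rhs, str):
            lengths.append(1)
        else:
            lengths.append(lengths[rhs[0] - 1] + lengths[rhs[1] - 1])
    return lengths


def symbol_at(slp: List[Definition], pos: int) -> str:
    """Return the symbol at 1-based position pos of val(A_n)."""
    lengths = slp_lengths(slp)
    i = len(slp)
    if not 1 <= pos <= lengths[i - 1]:
        raise IndexError("position outside val(A_n)")
    while True:
        rhs = slp[i - 1]
        if isinstance(rhs, str):
            return rhs
        j, k = rhs
        if pos <= lengths[j - 1]:
            i = j                    # position lies in the left part A_j
        else:
            pos -= lengths[j - 1]    # position lies in the right part A_k
            i = k
```

Comparing two SLP-compressed words symbol by symbol with this routine would still take exponentially many steps in the worst case, which is why equality testing needs a genuinely different idea.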
  • 26. Improvements of Plandowski’s result
      Gąsieniec, Karpinski, Miyazaki, Plandowski, Rytter, Shinohara, Takeda (mid-1990s): The following problem (fully compressed pattern matching) can be solved in polynomial time:
        INPUT: SLPs $\mathbb{P}$, $\mathbb{T}$
        QUESTION: Is $\mathrm{val}(\mathbb{P})$ a factor of $\mathrm{val}(\mathbb{T})$, i.e., $\exists x, y : \mathrm{val}(\mathbb{T}) = x\,\mathrm{val}(\mathbb{P})\,y$?
      The best known algorithm has a running time of $O(|\mathbb{P}| \cdot |\mathbb{T}|^2)$ (Lifshits 2006).
  • 29. Generalization: Compressed automata
      A compressed automaton is an ordinary finite automaton in which every transition is labelled with an SLP.
      Example: The compressed automaton
        [Figure: two states, each with a $\Sigma$-labelled self-loop, connected by a transition labelled $\mathbb{A}$]
      accepts all words that have $\mathrm{val}(\mathbb{A})$ as a factor.
      The size $|\mathcal{A}|$ of the compressed automaton $\mathcal{A}$ (with the set $\Delta$ of transition triples) is $|\mathcal{A}| = \sum_{(p, \mathbb{A}, q) \in \Delta} |\mathbb{A}|$.
  • 32. Compressed membership for compressed automata
      INPUT: A compressed automaton $\mathcal{A}$ and an SLP $\mathbb{B}$.
      QUESTION: $\mathrm{val}(\mathbb{B}) \in L(\mathcal{A})$?
      Plandowski, Rytter 1999:
        ◮ Compressed membership for compressed automata belongs to PSPACE.
        ◮ Compressed membership for compressed automata over a unary alphabet is NP-complete.
      Plandowski and Rytter conjectured that compressed membership for compressed automata is NP-complete (for every alphabet size).
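
For orientation, here is the naive baseline that the results of this talk try to beat: decompress every label and the input word, then simulate the automaton. This is a sketch under assumptions (state names are plain strings, labels are given already decompressed, and `accepts` is a hypothetical name); its running time is proportional to the decompressed lengths and hence exponential in the compressed input size in the worst case.

```python
from typing import Iterable, List, Set, Tuple

# (source state, decompressed label, target state)
Transition = Tuple[str, str, str]


def accepts(
    transitions: Iterable[Transition],
    initial: Iterable[str],
    final: Iterable[str],
    word: str,
) -> bool:
    """Decide membership by tracking the states reachable at each position."""
    transitions = list(transitions)
    if any(label == "" for _, label, _ in transitions):
        raise ValueError("labels must be non-empty (SLP values always are)")
    reachable: List[Set[str]] = [set() for _ in range(len(word) + 1)]
    reachable[0] = set(initial)
    for pos in range(len(word)):
        for p, label, q in transitions:
            # Follow the transition if its whole label matches at pos.
            if p in reachable[pos] and word.startswith(label, pos):
                reachable[pos + len(label)].add(q)
    return bool(reachable[len(word)] & set(final))


# Automaton accepting all words over {a, b} that contain "aba" as a factor.
factor_automaton: List[Transition] = [
    ("p", "a", "p"), ("p", "b", "p"),
    ("p", "aba", "q"),
    ("q", "a", "q"), ("q", "b", "q"),
]
print(accepts(factor_automaton, {"p"}, {"q"}, "bbabab"))
print(accepts(factor_automaton, {"p"}, {"q"}, "bbaabb"))
```

Because labels are non-empty, every transition moves strictly forward in the word, so one left-to-right pass over the positions suffices.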
  • 37. Some combinatorics on words
      The period of the word $w \in \Sigma^*$ is the smallest number $p \ge 1$ such that $w[k + p] = w[k]$ for all $1 \le k \le |w| - p$ ($\mathrm{period}(w) = |w|$ if such a $p$ does not exist).
      Let $\mathrm{order}(w) = \left\lfloor \frac{|w|}{\mathrm{period}(w)} \right\rfloor$.
      Example: Let $w = abbabbab = (abb)^2 ab$. Then $\mathrm{period}(w) = 3$ and $\mathrm{order}(w) = 2$.
      For a compressed automaton $\mathcal{A}$, let:
        $\mathrm{order}(\mathcal{A}) = \max\{\mathrm{order}(\mathrm{val}(\mathbb{A})) \mid \mathbb{A} \text{ occurs as a label in } \mathcal{A}\}$
        $\mathrm{period}(\mathcal{A}) = \max\{\mathrm{period}(\mathrm{val}(\mathbb{A})) \mid \mathbb{A} \text{ occurs as a label in } \mathcal{A}\}$
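
The two definitions translate directly into code for explicitly given (uncompressed) words. This is a straightforward quadratic-time sketch with 0-based indexing, not an algorithm from the talk.

```python
def period(w: str) -> int:
    """Smallest p >= 1 with w[k + p] == w[k] for all valid k; |w| if none is smaller."""
    if not w:
        raise ValueError("period is only defined here for non-empty words")
    n = len(w)
    for p in range(1, n):
        if all(w[k + p] == w[k] for k in range(n - p)):
            return p
    return n


def order(w: str) -> int:
    """floor(|w| / period(w))."""
    return len(w) // period(w)


print(period("abbabbab"), order("abbabbab"))  # 3 2
print(period("aaaaa"), order("aaaaa"))        # 1 5
```

The second call illustrates the two extremes that matter for the main results: a unary word a^n has the smallest possible period, 1, and the largest possible order, n.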
  • 39. First main result
      Theorem 1: For a compressed automaton $\mathcal{A}$ and an SLP $\mathbb{B}$, we can check $\mathrm{val}(\mathbb{B}) \in L(\mathcal{A})$ in time polynomial in (i) $|\mathcal{A}|$, (ii) $|\mathbb{B}|$, and (iii) $\mathrm{order}(\mathcal{A})$.
      We use the following combinatorial fact: Let $u, v \in \Sigma^*$ and let $p$ be a position in $v$. Then there exist at most $\mathrm{order}(u)$ many occurrences of $u$ in $v$ that “touch” position $p$.
  • 43. Second main result
      Theorem 2: For a compressed automaton $\mathcal{A}$ and an SLP $\mathbb{B}$, we can check $\mathrm{val}(\mathbb{B}) \in L(\mathcal{A})$ nondeterministically in time polynomial in (i) $|\mathcal{A}|$, (ii) $|\mathbb{B}|$, and (iii) $\mathrm{period}(\mathcal{A})$.
      Generalizes the result of Plandowski & Rytter for a unary alphabet (NP-completeness of compressed membership for compressed automata).
      A word $w = a^n$ has period 1!
      The theorem is proven by a reduction to the case of a unary alphabet.
  • 46. Open problems and a conjecture
      Conjecture 1: Compressed membership for compressed automata is NP-complete.
      Conjecture 2: If $\mathrm{val}(\mathbb{B}) \in L(\mathcal{A})$ for an SLP $\mathbb{B}$ and a compressed automaton $\mathcal{A}$, then there exists an accepting run of $\mathcal{A}$ on $\mathrm{val}(\mathbb{B})$ (viewed as a word over the set of transition triples of $\mathcal{A}$) that can be generated by an SLP of size $\mathrm{poly}(|\mathbb{B}|, |\mathcal{A}|)$.
      Conjecture 2 implies Conjecture 1:
        ◮ Guess nondeterministically an SLP $\mathbb{C}$ of polynomial size.
        ◮ Check (in deterministic polynomial time) whether $\mathrm{val}(\mathbb{C})$ is an accepting run of $\mathcal{A}$ on $\mathrm{val}(\mathbb{B})$.