Distributed Database Systems

          22-11-2012
You must remember!
You must also remember!
• Relational data languages are based on
  relational algebra
• Relational algebra consists of a set of operators
  on relations, which include:
  – Selection
  – Projection
  – Union
  – Cartesian product
Cartesian Product
• The Cartesian product of two relations R of
  degree k1 and S of degree k2 is the set of
  (k1+k2)-tuples, where each result tuple is a
  concatenation of one tuple of R with one
  tuple of S, for all tuples of R and S (R X S)
• Consider the relations EMP and PAY; EMP X PAY
  is:
Cartesian Product (EMP X PAY)
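As a minimal sketch of the definition above, the following Python snippet builds the Cartesian product of two small relations. The tuples are hypothetical sample data, not the instance shown on the slide, and `cartesian_product` is an illustrative helper name.

```python
from itertools import product

# Hypothetical sample tuples; only the relation names come from the slides.
EMP = [("E1", "Elect. Eng."), ("E2", "Programmer")]    # degree k1 = 2
PAY = [("Elect. Eng.", 40000), ("Programmer", 24000)]  # degree k2 = 2


def cartesian_product(r, s):
    """Concatenate every tuple of r with every tuple of s."""
    return [t_r + t_s for t_r, t_s in product(r, s)]


emp_x_pay = cartesian_product(EMP, PAY)
for row in emp_x_pay:
    print(row)
```

Every result tuple has k1 + k2 attributes, and the result holds |EMP| * |PAY| tuples, whether or not the TITLE values match.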
Joins
• Join is a derivative of Cartesian Product
• There are various forms of joins
  – Join
     • Inner join
           – Theta join
           – Equi-join
     • Outer join
           – Left join
           – Right join
           – Full join
  – Semi join
Theta Join
• Consider the theta-join of relations EMP and
  ASG over the join predicate
  EMP.ENO = ASG.ENO
Equi-Join
• This example demonstrates a special case of
  theta-join called equi-join, in which the join
  predicate contains only equality comparisons
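The relationship between the two joins can be sketched in Python as follows. This is an illustration under assumptions: relations are lists of dictionaries, the sample tuples and the PNO attribute are hypothetical, and `theta_join` is a made-up helper that takes the comparison operator as a parameter.

```python
import operator


def theta_join(r, s, attr_r, op, attr_s):
    """Return the pairs (t_r, t_s) for which op(t_r[attr_r], t_s[attr_s]) holds."""
    return [(t_r, t_s) for t_r in r for t_s in s if op(t_r[attr_r], t_s[attr_s])]


# Hypothetical sample tuples.
EMP = [{"ENO": "E1", "TITLE": "Elect. Eng."}, {"ENO": "E2", "TITLE": "Programmer"}]
ASG = [{"ENO": "E1", "PNO": "P1"}, {"ENO": "E2", "PNO": "P1"}, {"ENO": "E2", "PNO": "P2"}]

# Equi-join: the comparison operator is equality (EMP.ENO = ASG.ENO).
equi = theta_join(EMP, ASG, "ENO", operator.eq, "ENO")

# Any other comparison operator still gives a theta-join, for example inequality.
non_equi = theta_join(EMP, ASG, "ENO", operator.ne, "ENO")
```

Both results are subsets of the Cartesian product, which is why the join is described as a derivative of it.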
Semi-Join
• The semi-join of relation R, defined over the
  set of attributes A, by relation S, defined over
  the set of attributes B, is the subset of the
  tuples of R that participate in the join of R
  with S
• The advantage of semi-join is that it decreases
  the number of tuples that need to be handled
  to form the join
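A minimal Python sketch of the semi-join, restricted to an equality predicate on one attribute. The sample tuples (including the unmatched title) are hypothetical, and `semi_join` is an illustrative helper name.

```python
def semi_join(r, s, attr_r, attr_s):
    """Return the tuples of r that have at least one join partner in s."""
    partner_values = {t[attr_s] for t in s}
    return [t for t in r if t[attr_r] in partner_values]


# Hypothetical sample tuples.
EMP = [
    {"ENO": "E1", "TITLE": "Elect. Eng."},
    {"ENO": "E2", "TITLE": "Programmer"},
    {"ENO": "E3", "TITLE": "Intern"},  # no PAY tuple with this title
]
PAY = [
    {"TITLE": "Elect. Eng.", "SAL": 40000},
    {"TITLE": "Programmer", "SAL": 24000},
]

emp_semijoin_pay = semi_join(EMP, PAY, "TITLE", "TITLE")
```

The result contains only attributes of EMP, and E3 is dropped because it would not participate in the join.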
Semi-Join
• In centralized database systems, this is
  important because it usually results in a
  decreased number of secondary storage
  accesses by making better use of the memory.
• It is even more important in distributed
  databases since it usually reduces the amount
  of data that needs to be transmitted between
  sites in order to evaluate a query.
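The transmission argument can be illustrated with a rough count. The sketch below assumes R and S live at different sites and compares shipping all of R with first shipping the join values of S and then shipping only the matching part of R. The data, the function names, and the cost measure (a crude count of items sent, ignoring their sizes) are all assumptions made for illustration.

```python
def items_shipped_plain(r):
    """Ship every tuple of R to the site that holds S."""
    return len(r)


def items_shipped_semijoin(r, s, attr):
    """Ship S's distinct join values to R's site, then ship only R semi-join S."""
    join_values = {t[attr] for t in s}
    reduced_r = [t for t in r if t[attr] in join_values]
    return len(join_values) + len(reduced_r)


# Hypothetical data: 1000 R tuples spread evenly over 10 titles, 2 titles in S.
R = [{"ENO": f"E{i}", "TITLE": f"T{i % 10}"} for i in range(1000)]
S = [{"TITLE": "T0", "SAL": 40000}, {"TITLE": "T1", "SAL": 24000}]

plain = items_shipped_plain(R)
with_semijoin = items_shipped_semijoin(R, S, "TITLE")
print(plain, with_semijoin)
```

The saving depends on how selective the semi-join is; if almost every tuple of R has a partner in S, the extra round trip can cost more than it saves, which is why the slide says "usually".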
Semi-Join
• To demonstrate the difference between join
  and semi-join, let's consider the semi-join of
  EMP with PAY over the predicate EMP.TITLE =
  PAY.TITLE, that is,
  \( \mathrm{EMP} \ltimes_{\mathrm{EMP.TITLE} = \mathrm{PAY.TITLE}} \mathrm{PAY} \)
Derived Horizontal Fragmentation
• A derived horizontal fragmentation is defined
  on a member relation of a link according to a
  selection operation specified on its owner
• It is important to remember two points
  – First, the link between the owner and the member
    relations is defined as an equi-join
  – Second, an equi-join can be implemented by
    means of semi-joins
Derived Horizontal Fragmentation
• Accordingly, given a link L where owner(L) = S
  and member(L) = R, the derived horizontal
  fragments of R are defined as:

  \[ R_i = R \ltimes S_i, \quad 1 \le i \le w \]

• Where w is the maximum number of
  fragments that will be defined on R, and

  \[ S_i = \sigma_{F_i}(S) \]

  where \( F_i \) is the formula according to which
  the primary horizontal fragment \( S_i \) is defined
Derived Horizontal Fragmentation
• To carry out a derived horizontal
  fragmentation, three inputs are needed:
  – The set of partitions of the owner relation (PAY1,
    PAY2)
  – The member relation
  – The set of semi-join predicates between the
    owner and member (EMP.TITLE = PAY.TITLE)
Example
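• Consider L1, where owner(L1) = PAY and
  member(L1) = EMP
• We can group engineers into two groups
  according to their salary: those making less
  than or equal to $30,000, and those making
  more than $30,000
• The two fragments EMP1 and EMP2 are
  defined as:

  \[ \mathrm{EMP}_1 = \mathrm{EMP} \ltimes \mathrm{PAY}_1, \qquad \mathrm{EMP}_2 = \mathrm{EMP} \ltimes \mathrm{PAY}_2 \]

  where

  \[ \mathrm{PAY}_1 = \sigma_{\mathrm{SAL} \le 30000}(\mathrm{PAY}), \qquad \mathrm{PAY}_2 = \sigma_{\mathrm{SAL} > 30000}(\mathrm{PAY}) \]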
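The fragmentation of EMP by salary group can be sketched in Python as follows. The three inputs are the owner fragments (PAY1, PAY2), the member relation (EMP), and the semi-join predicate on TITLE. The tuples are hypothetical sample data, and `select` and `semi_join` are illustrative helper names.

```python
def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def semi_join(r, s, attr):
    """Keep the tuples of r whose attr value also occurs in s."""
    values = {t[attr] for t in s}
    return [t for t in r if t[attr] in values]


# Hypothetical sample instances.
PAY = [
    {"TITLE": "Elect. Eng.", "SAL": 40000},
    {"TITLE": "Syst. Anal.", "SAL": 34000},
    {"TITLE": "Mech. Eng.", "SAL": 27000},
    {"TITLE": "Programmer", "SAL": 24000},
]
EMP = [
    {"ENO": "E1", "TITLE": "Elect. Eng."},
    {"ENO": "E2", "TITLE": "Syst. Anal."},
    {"ENO": "E3", "TITLE": "Mech. Eng."},
    {"ENO": "E4", "TITLE": "Programmer"},
]

# Primary horizontal fragments of the owner relation.
PAY1 = select(PAY, lambda t: t["SAL"] <= 30000)
PAY2 = select(PAY, lambda t: t["SAL"] > 30000)

# Derived horizontal fragments of the member relation.
EMP1 = semi_join(EMP, PAY1, "TITLE")
EMP2 = semi_join(EMP, PAY2, "TITLE")
```

With this sample data, EMP1 holds the employees whose title pays at most $30,000 and EMP2 holds the rest.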
Example
• The result of this fragmentation is depicted as:
Derived Horizontal Fragmentation
• One potential complication needs
  attention
• If a database schema has two links into
  a relation R, there can be more than one
  possible derived horizontal fragmentation of R
• The choice of candidate fragmentation is
  based on two criteria
  – The fragmentation with better join characteristics
  – The fragmentation used in more applications
The fragmentation used in more Applications
• The choice is quite straightforward if we take into
  consideration the frequency with which
  applications access some data
• If possible, facilitate the accesses of the heavy
  users so that their total impact on system
  performance is minimized
The Fragmentation with better join characteristics
• Consider the last example. The effect of this
  fragmentation is that the join of the EMP and
  PAY relations to answer the query is assisted
  – By performing it on smaller relations
  – By potentially performing joins in parallel
The Fragmentation with better join characteristics
• The first point is obvious: the fragments of EMP
  are smaller than EMP itself
• Therefore, it will be faster to join any fragment of
  PAY with any fragment of EMP than to work with
  the relations themselves
• The second point, however, is more important and
  is at the heart of distributed databases
• If, besides executing a number of queries at
  different sites, we can parallelize execution of one
  join query, the response time or throughput of
  the system can be expected to improve
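The idea of parallelizing one join query can be sketched as below. Threads stand in for sites here, which is only an analogy: the point is that each fragment pair is joined independently and the partial results are then combined. The fragment contents are hypothetical sample data and `join` is an illustrative helper.

```python
from concurrent.futures import ThreadPoolExecutor


def join(emp, pay):
    """Equi-join on TITLE; each result tuple merges the two input tuples."""
    return [{**e, **p} for e in emp for p in pay if e["TITLE"] == p["TITLE"]]


# Hypothetical fragments, assumed to be co-located pairwise (EMP1 with PAY1, ...).
PAY1 = [{"TITLE": "Mech. Eng.", "SAL": 27000}, {"TITLE": "Programmer", "SAL": 24000}]
PAY2 = [{"TITLE": "Elect. Eng.", "SAL": 40000}, {"TITLE": "Syst. Anal.", "SAL": 34000}]
EMP1 = [{"ENO": "E3", "TITLE": "Mech. Eng."}, {"ENO": "E4", "TITLE": "Programmer"}]
EMP2 = [{"ENO": "E1", "TITLE": "Elect. Eng."}, {"ENO": "E2", "TITLE": "Syst. Anal."}]

fragment_pairs = [(EMP1, PAY1), (EMP2, PAY2)]

# Each pair is joined by its own worker; no worker needs another pair's data.
with ThreadPoolExecutor() as pool:
    partial_results = list(pool.map(lambda pair: join(*pair), fragment_pairs))

parallel_result = [t for part in partial_results for t in part]

# Reference: the same join computed on the unfragmented relations.
global_result = join(EMP1 + EMP2, PAY1 + PAY2)
```

The two results contain the same tuples because no EMP1 tuple joins with a PAY2 tuple and vice versa.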
The Fragmentation with better join characteristics
• In the case of joins, this is possible under certain
  circumstances
• Consider the join graph between the fragments of EMP
  and PAY: there is only one link coming into or going out of
  a fragment
• Such a join graph is called a simple graph
• The advantage of a design where the join relationship
  between fragments is simple is that the member and
  owner of a link can be allocated to one site and the joins
  between different pairs of fragments can proceed
  independently and in parallel
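Whether a join graph is simple can be checked mechanically. The sketch below assumes the graph is given as a list of (owner fragment, member fragment) links, which is a representation chosen for illustration; `is_simple` is a hypothetical helper name.

```python
from collections import Counter


def is_simple(links):
    """True if every fragment takes part in exactly one link."""
    degree = Counter()
    for owner_fragment, member_fragment in links:
        degree[owner_fragment] += 1
        degree[member_fragment] += 1
    return all(count == 1 for count in degree.values())


# Join graph of the PAY/EMP example: one link per fragment.
pay_emp_links = [("PAY1", "EMP1"), ("PAY2", "EMP2")]

# A hypothetical graph that is not simple: PAY1 links to two member fragments.
other_links = [("PAY1", "EMP1"), ("PAY1", "EMP2")]

print(is_simple(pay_emp_links), is_simple(other_links))
```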
The Fragmentation with better join characteristics
• Unfortunately, obtaining simple join graphs may
  not always be possible
• In that case the next desirable alternative is to
  have a design that results in a partitioned join
  graph
• A partitioned graph consists of two or more
  subgraphs with no links between them
• Fragments so obtained may not be distributed for
  parallel execution as easily as those obtained via
  simple join graphs, but the allocation is still
  possible
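A partitioned join graph can be recognized by counting its connected subgraphs. The sketch below again assumes the graph is a list of (owner fragment, member fragment) links; the fragment names are hypothetical and `subgraphs` is an illustrative helper.

```python
def subgraphs(links):
    """Return the connected subgraphs of a join graph as sets of fragment names."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    visited, parts = set(), []
    for start in neighbours:
        if start in visited:
            continue
        # Depth-first traversal collecting everything reachable from start.
        stack, part = [start], set()
        while stack:
            node = stack.pop()
            if node in part:
                continue
            part.add(node)
            stack.extend(neighbours[node] - part)
        visited |= part
        parts.append(part)
    return parts


# Not simple (S1 and S2 each have two links) but partitioned into two subgraphs.
partitioned = [("S1", "R1"), ("S1", "R2"), ("S2", "R3"), ("S2", "R4")]

# Neither simple nor partitioned: R2 ties everything into one subgraph.
connected = [("S1", "R1"), ("S1", "R2"), ("S2", "R2")]

print(len(subgraphs(partitioned)), len(subgraphs(connected)))
```

Each subgraph can be allocated and joined independently of the others, even though the fragments inside one subgraph cannot be split further.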
The Fragmentation with better join characteristics
• Let us continue with the distribution design of the database
  we started before
• We already decided on the fragmentation of relation EMP
  according to the fragmentation of PAY
• Let's now consider ASG. Assume that there are two
  applications
   – The first application finds the names of engineers who work at
     certain places. It runs on all three sites and accesses the
     information about the engineers who work on local projects with
     higher probability than those of projects at other locations
   – At each administrative site where employee records are
     managed, users would like to access the responsibilities on the
     projects that these employees work on and learn how long they will
     work on those projects
The Fragmentation with better join characteristics
• The first application results in a fragmentation
  of ASG according to the fragments PROJ1,
  PROJ3, PROJ4 and PROJ6 of PROJ obtained
  before
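The Fragmentation with better join characteristics
• Therefore, the derived fragmentation of ASG
  according to {PROJ1, PROJ3, PROJ4, PROJ6} is
  defined as:

  \[ \mathrm{ASG}_1 = \mathrm{ASG} \ltimes \mathrm{PROJ}_1, \quad \mathrm{ASG}_2 = \mathrm{ASG} \ltimes \mathrm{PROJ}_3, \quad \mathrm{ASG}_3 = \mathrm{ASG} \ltimes \mathrm{PROJ}_4, \quad \mathrm{ASG}_4 = \mathrm{ASG} \ltimes \mathrm{PROJ}_6 \]

• The fragment instances are: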
The Fragmentation with better join characteristics
• The second query can be specified in SQL as:



• Where i = 1 or i = 2, depending on the site where
  the query is issued
• The derived fragmentation of ASG according
  to the fragmentation of EMP is defined as:

  \[ \mathrm{ASG}_1 = \mathrm{ASG} \ltimes \mathrm{EMP}_1, \qquad \mathrm{ASG}_2 = \mathrm{ASG} \ltimes \mathrm{EMP}_2 \]

The Fragmentation with better join characteristics
• The example demonstrates two things:
  – Derived fragmentation may follow a chain where
    one relation is fragmented as a result of another
    one's design and it, in turn, causes the
    fragmentation of another relation
      (PAY -> EMP -> ASG)
  – Typically, there will be more than one candidate
    fragmentation for a relation (ASG). The final choice
    of the fragmentation scheme may be a decision
    problem addressed during allocation
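The chain PAY -> EMP -> ASG can be sketched end to end in Python. The instances are hypothetical sample data (including the PNO attribute), and the helper names are illustrative; the salary threshold is the one used for PAY in the example.

```python
def select(relation, predicate):
    """Relational selection."""
    return [t for t in relation if predicate(t)]


def semi_join(r, s, attr):
    """Keep the tuples of r whose attr value also occurs in s."""
    values = {t[attr] for t in s}
    return [t for t in r if t[attr] in values]


# Hypothetical sample instances.
PAY = [
    {"TITLE": "Elect. Eng.", "SAL": 40000},
    {"TITLE": "Mech. Eng.", "SAL": 27000},
]
EMP = [
    {"ENO": "E1", "TITLE": "Elect. Eng."},
    {"ENO": "E3", "TITLE": "Mech. Eng."},
]
ASG = [
    {"ENO": "E1", "PNO": "P1"},
    {"ENO": "E3", "PNO": "P3"},
    {"ENO": "E3", "PNO": "P4"},
]

# Step 1: primary fragmentation of PAY on salary.
pay_fragments = [
    select(PAY, lambda t: t["SAL"] <= 30000),
    select(PAY, lambda t: t["SAL"] > 30000),
]
# Step 2: EMP is fragmented according to PAY (semi-join on TITLE).
emp_fragments = [semi_join(EMP, pay_i, "TITLE") for pay_i in pay_fragments]
# Step 3: ASG is fragmented according to EMP (semi-join on ENO).
asg_fragments = [semi_join(ASG, emp_i, "ENO") for emp_i in emp_fragments]
```

A change to the salary predicates in step 1 propagates through both later steps, which is what makes the design a chain.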
Checking of Correctness
• We should now check the fragmentation
  algorithms discussed so far with respect to
  three correctness criteria
  – Completeness
  – Reconstruction
  – Disjointness
Completeness
• The completeness of a primary horizontal
  fragmentation is based on the selection
  predicate used
• As long as the selection predicates are
  complete, the resulting fragmentation is
  guaranteed to be complete as well
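One way to test a set of selection predicates against a concrete instance is to look for tuples that no predicate accepts. This is only a sketch: the PAY tuples are hypothetical, `uncovered_tuples` is an illustrative helper, and passing the check on one instance does not prove completeness for all instances.

```python
def uncovered_tuples(relation, predicates):
    """Tuples that satisfy none of the predicates and would be lost by fragmenting."""
    return [t for t in relation if not any(p(t) for p in predicates)]


# Hypothetical sample instance, including a boundary salary.
PAY = [
    {"TITLE": "Elect. Eng.", "SAL": 40000},
    {"TITLE": "Technician", "SAL": 30000},
    {"TITLE": "Programmer", "SAL": 24000},
]

complete_predicates = [lambda t: t["SAL"] <= 30000, lambda t: t["SAL"] > 30000]

# A deliberately flawed set: a salary of exactly 30000 matches neither predicate.
incomplete_predicates = [lambda t: t["SAL"] < 30000, lambda t: t["SAL"] > 30000]

print(uncovered_tuples(PAY, complete_predicates))
print(uncovered_tuples(PAY, incomplete_predicates))
```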
Completeness
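• The completeness of a derived horizontal
  fragmentation is somewhat more difficult to define
• Let R be the member relation of a link whose owner is
  relation S, and let A be the join attribute between R and S.
  Then for each tuple t of a fragment R_i there should be a
  tuple t' of the owner fragment S_i such that t[A] = t'[A]
• For example, there should be no ASG tuple which has
  a project number that is not also contained in PROJ.
  This rule is known as referential integrity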
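The referential integrity condition can be checked by searching for member tuples without an owner tuple. The instances below are hypothetical (P9 is deliberately missing from PROJ), and `dangling_tuples` is an illustrative helper name.

```python
def dangling_tuples(member, owner, attr):
    """Member tuples whose attr value does not occur in the owner relation."""
    owner_values = {t[attr] for t in owner}
    return [t for t in member if t[attr] not in owner_values]


# Hypothetical sample instances.
PROJ = [{"PNO": "P1"}, {"PNO": "P2"}]
ASG = [
    {"ENO": "E1", "PNO": "P1"},
    {"ENO": "E2", "PNO": "P9"},  # violates referential integrity
]

lost = dangling_tuples(ASG, PROJ, "PNO")
print(lost)
```

A dangling tuple joins with no owner fragment, so it would appear in no derived fragment and the fragmentation would be incomplete.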
Reconstruction
• Reconstruction of a global relation from its
  fragments is performed by the union operator
  in both the primary and the derived horizontal
  fragmentation
• Thus for a relation R with fragmentation
  \( F_R = \{R_1, R_2, \ldots, R_w\} \):

  \[ R = \bigcup R_i, \quad \forall R_i \in F_R \]
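Reconstruction by union can be sketched as follows. Relations are lists of dictionaries, so tuples are compared by value through a hashable key; the sample fragments are hypothetical, and `union` and `same_relation` are illustrative helpers.

```python
def as_key(t):
    """Hashable, order-independent representation of a tuple (dictionary)."""
    return tuple(sorted(t.items()))


def union(fragments):
    """Set union of horizontal fragments, dropping duplicate tuples."""
    seen, result = set(), []
    for fragment in fragments:
        for t in fragment:
            if as_key(t) not in seen:
                seen.add(as_key(t))
                result.append(t)
    return result


def same_relation(a, b):
    """True if a and b contain the same set of tuples."""
    return {as_key(t) for t in a} == {as_key(t) for t in b}


# Hypothetical relation and its two horizontal fragments.
EMP = [
    {"ENO": "E1", "TITLE": "Elect. Eng."},
    {"ENO": "E3", "TITLE": "Mech. Eng."},
    {"ENO": "E4", "TITLE": "Programmer"},
]
EMP1 = [EMP[1], EMP[2]]
EMP2 = [EMP[0]]

reconstructed = union([EMP1, EMP2])
```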
Disjointness
• It is easier to establish disjointness of
  fragmentation for primary than for derived
  horizontal fragmentation
• In PHF, disjointness is guaranteed as long as
  the minterm predicates determining the
  fragmentation are mutually exclusive
Example
• In derived fragmentation, however, there is a
  semi-join involved that adds considerable
  complexity
• Disjointness can be guaranteed if the join graph is
  simple; otherwise it is necessary to investigate
  actual tuple values
• In general we do not want a tuple of a member
  relation to join with two or more tuples of the
  owner relation when these tuples are in different
  fragments of the owner
Example
• In fragmenting relation PAY, the minterm predicates are M =
  {m1, m2}, where
   m1: SAL<=30000
   m2: SAL>30000
• Since m1 and m2 are mutually exclusive, the fragmentation
  of PAY is disjoint
• For relation EMP, however, we require that
   – Each engineer has a single title
   – Each title has a single salary value associated with it
• Since these two rules follow from the semantics of the
  database, the fragmentation of EMP with respect to PAY is
  also disjoint
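The role of the two semantic rules can be illustrated by breaking one of them. The sketch below derives EMP fragments from PAY twice: once with one salary per title, and once with a hypothetical PAY instance in which a title has two salaries that fall into different PAY fragments. All tuples and helper names are assumptions made for illustration.

```python
def select(relation, predicate):
    """Relational selection."""
    return [t for t in relation if predicate(t)]


def semi_join(r, s, attr):
    """Keep the tuples of r whose attr value also occurs in s."""
    values = {t[attr] for t in s}
    return [t for t in r if t[attr] in values]


def derive_emp_fragments(emp, pay):
    """EMP fragments derived from PAY split by m1: SAL<=30000 and m2: SAL>30000."""
    pay1 = select(pay, lambda t: t["SAL"] <= 30000)
    pay2 = select(pay, lambda t: t["SAL"] > 30000)
    return [semi_join(emp, pay1, "TITLE"), semi_join(emp, pay2, "TITLE")]


def overlapping_keys(fragments, key):
    """Key values of tuples that occur in more than one fragment."""
    seen, overlap = set(), set()
    for fragment in fragments:
        keys = {t[key] for t in fragment}
        overlap |= keys & seen
        seen |= keys
    return overlap


EMP = [{"ENO": "E1", "TITLE": "Elect. Eng."}, {"ENO": "E4", "TITLE": "Programmer"}]

# Each title has a single salary: the derived fragments are disjoint.
PAY_OK = [{"TITLE": "Elect. Eng.", "SAL": 40000}, {"TITLE": "Programmer", "SAL": 24000}]

# Rule broken: "Programmer" has two salaries, one in each PAY fragment.
PAY_BAD = PAY_OK + [{"TITLE": "Programmer", "SAL": 32000}]

ok_overlap = overlapping_keys(derive_emp_fragments(EMP, PAY_OK), "ENO")
bad_overlap = overlapping_keys(derive_emp_fragments(EMP, PAY_BAD), "ENO")
```

In the second case E4 joins with owner tuples in both PAY fragments, so it lands in both EMP fragments and disjointness is lost.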
