Bounds for Overlapping
Interval Join on MapReduce
Foto N. Afrati1, Shlomi Dolev2,
Shantanu Sharma2, and Jeffrey D. Ullman3
1 National Technical University of Athens, Greece
2 Ben-Gurion University of the Negev, Israel
3 Stanford University, USA
2nd Algorithms and Systems for MapReduce and Beyond (BeyondMR)
Brussels, Belgium (27 March 2015)
Outline
• Introduction
– Interval and Overlapping Intervals
– Interval Join
– Reducer capacity and Mapping Schema
• Goal of Mapping Schema and Our Contribution
• Unit-Length and Equally-Spaced Intervals
• Variable-Length and Equally-Spaced Intervals
• Conclusion
Introduction
• Interval
– A pair [starting time, ending time]
– A (time) interval $i$ is represented by a pair of times $[T_s^i, T_e^i]$ with $T_s^i < T_e^i$, where $T_s^i$ and $T_e^i$ are the starting point and the ending point of interval $i$, respectively
– Examples:
• My talk ($T_s^i$ = 10am, $T_e^i$ = 10:30am)
• A phase of a project, a class of a professor
Introduction
• Overlapping Intervals
– Two intervals, say interval i and interval j, are called overlapping intervals if their intersection is nonempty
[Figure: a pair of non-overlapping intervals and a pair of overlapping intervals i and j. Example of overlapping intervals: Talk (10am to 10:35am) and Coffee break (10:30am to 11am)]
Introduction
• Overlapping Interval Join: an example

Employee
EmpID | Name | Duration
e1 | U | 1-Apr – 1-June
e2 | V | 1-May – 1-July
e3 | W | 1-Apr – 1-July
e4 | X | 1-Mar – 1-June
e5 | Y | 1-Mar – 1-Aug

Project
Phase | Duration
Requirement Analysis (RA) | 1-Mar – 1-May
Design (D) | 1-Apr – 1-June
Coding (C) | 1-May – 1-Aug

[Figure: timeline from 1-Mar to 1-Aug showing the Project phases RA, D, and C and the Employee intervals e1 to e5]
Find all the employees that are involved in the RA phase of the project
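The example query can be sketched in a few lines of Python. This is only an illustration on a single machine, not the MapReduce algorithm from the talk. The year 2015 is an assumption (the slide gives only day and month), and the names `overlaps` and `employees_in_phase` are hypothetical. The `closed` flag is also an assumption of this sketch: it decides whether two intervals that share only an endpoint count as overlapping.

```python
from datetime import date

YEAR = 2015  # assumption: the slide lists only day and month

# Project phases and employee durations, copied from the two tables.
phases = {
    "RA": (date(YEAR, 3, 1), date(YEAR, 5, 1)),
    "D": (date(YEAR, 4, 1), date(YEAR, 6, 1)),
    "C": (date(YEAR, 5, 1), date(YEAR, 8, 1)),
}
employees = {
    "e1": (date(YEAR, 4, 1), date(YEAR, 6, 1)),
    "e2": (date(YEAR, 5, 1), date(YEAR, 7, 1)),
    "e3": (date(YEAR, 4, 1), date(YEAR, 7, 1)),
    "e4": (date(YEAR, 3, 1), date(YEAR, 6, 1)),
    "e5": (date(YEAR, 3, 1), date(YEAR, 8, 1)),
}


def overlaps(a, b, closed=True):
    """True if intervals a and b (start, end) have a nonempty intersection."""
    latest_start = max(a[0], b[0])
    earliest_end = min(a[1], b[1])
    if closed:
        return latest_start <= earliest_end  # a shared endpoint counts
    return latest_start < earliest_end       # a shared endpoint does not count


def employees_in_phase(phase, closed=True):
    """Names of all employees whose duration overlaps the given phase."""
    target = phases[phase]
    return sorted(
        name for name, duration in employees.items()
        if overlaps(duration, target, closed)
    )


print(employees_in_phase("RA"))                # ['e1', 'e2', 'e3', 'e4', 'e5']
print(employees_in_phase("RA", closed=False))  # ['e1', 'e3', 'e4', 'e5']
```

With closed intervals, e2 qualifies because it starts on 1-May, the day the RA phase ends. If endpoints are excluded, e2 drops out.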
Introduction
• Reducer capacity
– An upper bound on the total number of intervals that are assigned to a reducer
– Example
• The reducer capacity may be the size of the main memory of the processors on which the reducers run
• Communication cost
– The total amount of data to be transferred from the map phase to the reduce phase
– There is a tradeoff between the reducer capacity and the communication cost
Introduction
Mapping schema for interval join
An assignment of the set of intervals to some given reducers that satisfies two constraints:
– Respect the reducer capacity
• The total number of intervals assigned to a reducer must be less than or equal to the reducer capacity
– Assignment of inputs
• For every output, the two corresponding overlapping intervals must be assigned to at least one reducer in common
[Figure: intervals I1, I2, and I3 assigned to reducers]
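The two constraints of a mapping schema can be checked mechanically. The following Python sketch is an illustration under stated assumptions, not code from the talk. The function name `is_valid_mapping_schema`, the dictionary layout (reducer id to set of interval names), and the closed-interval overlap test are all hypothetical choices made for this example.

```python
def overlaps(a, b):
    """Closed intervals (start, end) overlap if their intersection is nonempty."""
    return max(a[0], b[0]) <= min(a[1], b[1])


def is_valid_mapping_schema(xs, ys, assignment, q):
    """Check both constraints of a mapping schema for the join of X and Y.

    xs, ys:      {interval name: (start, end)} for relations X and Y
    assignment:  {reducer id: set of interval names sent to that reducer}
    q:           reducer capacity (maximum number of intervals per reducer)
    """
    # Constraint 1: no reducer receives more than q intervals.
    if any(len(names) > q for names in assignment.values()):
        return False
    # Constraint 2: every overlapping pair (x, y) meets at some reducer.
    for x_name, x in xs.items():
        for y_name, y in ys.items():
            if overlaps(x, y) and not any(
                x_name in names and y_name in names
                for names in assignment.values()
            ):
                return False
    return True


xs = {"x1": (0, 2), "x2": (5, 6)}
ys = {"y1": (1, 3), "y2": (7, 8)}
good = {0: {"x1", "y1"}, 1: {"x2", "y2"}}
bad = {0: {"x1"}, 1: {"y1", "x2", "y2"}}

print(is_valid_mapping_schema(xs, ys, good, q=2))  # True
print(is_valid_mapping_schema(xs, ys, good, q=1))  # False: capacity exceeded
print(is_valid_mapping_schema(xs, ys, bad, q=3))   # False: x1 and y1 never meet
```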
State-of-the-Art
• B. Chawda, H. Gupta, S. Negi, T.A. Faruquie, L.V.
Subramaniam, and M.K. Mohania, “Processing Interval
Joins On Map-Reduce,” EDBT, 2014.
• Presents MapReduce-based 2-way and multiway interval join algorithms for overlapping intervals
• Does not take the reducer capacity into account
• No analysis of a lower bound on the replication of individual intervals
• No analysis of the replication rate of the algorithms offered therein
Goal of Mapping Schema
• Interval join problem
– Assign all the intervals that share at least one common point of time to at least one reducer in common for finding outputs
Our Contribution
• An algorithm for variable-length intervals that can start at any time
– Before this, we consider two simple cases and provide an algorithm for each:
• Unit-length and equally-spaced intervals
• Variable-length and equally-spaced intervals
• For all the algorithms, the upper bound on the replication rate almost matches the lower bound
Unit-Length and Equally-Spaced Intervals
• Relations X and Y of n intervals each
• No interval starts before 0 or after k
• Hence, the spacing between the starting points of two successive intervals is $\frac{k}{n} < 1$
[Figure: relations X and Y on a time axis from 0 to 2.25; n = 9 and k = 2.25, so spacing = 0.25]
Unit-Length and Equally-Spaced Intervals: Algorithm
• Divide the time range from 0 to k into equal-sized partitions of length w (say P partitions are created)
• Arrange P reducers
• Assign all intervals of X that exist in a partition $p_i$ to the ith reducer
• Assign all intervals of Y that have their starting point or ending point in partition $p_i$ to the ith reducer
[Figure: relations X and Y on a time axis from 0 to 2.25 (n = 9 and k = 2.25), divided into five partitions]
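The assignment steps above can be sketched in Python as follows. This is a minimal single-machine illustration, not the authors' implementation. The names `partition_of` and `assign_to_reducers` are hypothetical, and two details are assumptions of the sketch: partition i covers the half-open range [i·w, (i+1)·w), and any point past the last partition boundary (for example, the ending point of an interval that starts near k) is clamped into the last partition. The sample data (starting points 0, 0.25, ..., 2.0 and w = 0.5) only loosely follows the figure.

```python
import math
from collections import defaultdict


def partition_of(t, w, num_partitions):
    """Index of the partition holding time t, clamped to the last partition."""
    return min(math.floor(t / w), num_partitions - 1)


def assign_to_reducers(xs, ys, w, k):
    """Return {reducer index: set of interval names} for relations X and Y."""
    num_partitions = math.ceil(k / w)
    reducers = defaultdict(set)
    # X: send the interval to every partition it exists in.
    for name, (start, end) in xs.items():
        first = partition_of(start, w, num_partitions)
        last = partition_of(end, w, num_partitions)
        for p in range(first, last + 1):
            reducers[p].add(name)
    # Y: send the interval only to the partitions of its two endpoints.
    for name, (start, end) in ys.items():
        reducers[partition_of(start, w, num_partitions)].add(name)
        reducers[partition_of(end, w, num_partitions)].add(name)
    return dict(reducers)


# Unit-length intervals with spacing 0.25 (assumed starting points 0 .. 2.0).
xs = {f"x{i}": (0.25 * i, 0.25 * i + 1.0) for i in range(9)}
ys = {f"y{i}": (0.25 * i, 0.25 * i + 1.0) for i in range(9)}
reducers = assign_to_reducers(xs, ys, w=0.5, k=2.25)

for index in sorted(reducers):
    print(index, sorted(reducers[index]))
```

Because all intervals have the same length, any interval of Y that overlaps an interval of X has its starting point or its ending point inside that interval of X, so the pair meets at one of the reducers that X was sent to.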
Unit-Length and Equally-Spaced Intervals
• Does the algorithm work?
• Consider $q = \frac{3wn}{k} + \frac{n}{k} + 2$
• q: the reducer capacity
• w: the length of a partition
• n: the total number of intervals in a relation
• k: the last starting point of an interval
• Count how many intervals lie in a partition; if the count is less than or equal to q, then we have a solution and the algorithm works
Unit-Length and Equally-Spaced Intervals
• Does the algorithm work?
– Count 1: How many intervals of Y overlap with an interval of X in a partition of length w?
• The spacing is k/n (that is, n/k starting points per unit of time), so at most 2wn/k intervals of Y can overlap with an interval of X
– Count 2: How many intervals can have their starting points after the starting point of $x_i$ and before the ending point of $x_i$?
• Intervals of X after the starting point of $x_i$ = wn/k
• Intervals of X before the starting point of $x_i$ = n/k
– Count 3: Do not forget to count $x_i$ itself and an identical interval of Y, i.e., $y_i$
[Figure: relations X and Y on a time axis from 0 to 2.25 (n = 9 and k = 2.25), divided into five partitions]
Unit-Length and Equally-Spaced Intervals
• Does the algorithm work?
– Total number of intervals in a partition = Count 1 + Count 2 + Count 3 = $\frac{2wn}{k} + \frac{wn}{k} + \frac{n}{k} + 2 = q$
– OK. The algorithm works
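The sum of the three counts can be checked with exact arithmetic. The sketch below is a worked example only: n = 9 and k = 2.25 come from the figure, while the partition length w = 0.5 is an assumed value and the function names are hypothetical.

```python
from fractions import Fraction


def unit_length_counts(w, n, k):
    """Return (Count 1, Count 2, Count 3) for a partition of length w."""
    density = Fraction(n) / Fraction(k)   # n/k starting points per unit of time
    count1 = 2 * w * density              # intervals of Y: 2wn/k
    count2 = w * density + density        # intervals of X: wn/k + n/k
    count3 = 2                            # x_i itself and the identical y_i
    return count1, count2, count3


def reducer_capacity(w, n, k):
    """q = 3wn/k + n/k + 2, as defined on the slide."""
    return 3 * w * n / k + n / k + 2


w = Fraction(1, 2)   # assumed partition length
n = 9                # from the figure
k = Fraction(9, 4)   # 2.25, from the figure

counts = unit_length_counts(w, n, k)
print(counts)                      # (Fraction(4, 1), Fraction(6, 1), 2)
print(sum(counts))                 # 12
print(reducer_capacity(w, n, k))   # 12
```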
Variable-Length and Equally-Spaced Intervals
• Two types of intervals
– Big and small intervals
– Different length intervals
Variable-Length and Equally-Spaced Intervals
• Big and small intervals
– All the intervals of X are of length $l_{min}$
– All the intervals of Y are of length $l_{max}$
– The previous algorithm will work here too
– Note that an interval of X will be replicated to several reducers, while an interval of Y will be replicated to at most two reducers
[Figure: relations X and Y on a time axis from 0 to 4.2; n = 6 and spacing = 0.7]
Variable-Length and Equally-Spaced Intervals
• Variable-length intervals: A general case
– All the restrictions regarding the length of an interval and the spacing between two intervals are removed
– Intervals can begin at some time greater than or equal to 0 and end by time T
– S: the total length of the intervals in one relation
[Figure: relations X and Y on a time axis marked 0, s, s+1, s+2, s+3, T]
Variable-Length and Equally-Spaced Intervals
• Variable-length intervals: A general case
– Algorithm
• Divide the time range into $\frac{T}{w}$ equal-sized partitions
• Arrange $\frac{T}{w}$ reducers
• Follow the same procedure as in the previous algorithm
– i.e., assign all the intervals of X that belong to the ith partition to the ith reducer, and assign each interval of Y to the reducers corresponding to its starting and ending points (at most two reducers)
[Figure: relations X and Y on a time axis marked 0, s, s+1, s+2, s+3, T]
Variable-Length and Equally-Spaced Intervals
• Variable-length intervals: A general case
– Does the algorithm work?
– Consider $q = \frac{3nw + S}{T}$
– Count the average number of intervals of X and Y sent to a reducer; if it is less than or equal to the reducer capacity, then the algorithm will work
Variable-Length and Equally-Spaced Intervals
• Variable-length intervals: A general case
– Count 1: Average number of intervals of Y received by a reducer
• $\frac{\text{Replication} \times \text{Total number of inputs}}{\text{Total number of reducers}}$
– An interval of Y is sent to at most 2 reducers (Replication)
– There are $\frac{T}{w}$ reducers and n intervals in Y
• Average number of intervals of Y received by a reducer = $\frac{2n}{T/w}$
Variable-Length and Equally-Spaced Intervals
• Variable-length intervals: A general case
– Count 2: Average number of intervals of X received by a reducer
• $\frac{\text{Replication} \times \text{Total number of inputs}}{\text{Total number of reducers}}$
– Average length of intervals is S/n
– An interval of X is sent to at most $1 + \frac{S}{nw}$ reducers, where $\frac{S}{nw}$ is the average length divided by how much length a reducer can hold
– There are $\frac{T}{w}$ reducers and n intervals in X
• Average number of intervals of X received by a reducer = $\frac{(1 + S/(nw))\,n}{T/w}$
Variable-Length and Equally-Spaced Intervals
• Variable-length intervals: A general case
– Does the algorithm work?
– Total number of intervals that a reducer receives = Count 1 + Count 2 = $\frac{2nw}{T} + \frac{(1 + S/(nw))\,wn}{T} = \frac{3wn + S}{T} = q$
– The algorithm works
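The same kind of arithmetic check works for the general case. In the sketch below the values T = 10, w = 2, n = 20, and S = 30 are made up purely for illustration, and the function names are hypothetical. It only confirms that the two average counts add up to q for these inputs.

```python
from fractions import Fraction


def average_counts(n, w, S, T):
    """Average number of Y and X intervals received by one reducer."""
    num_reducers = Fraction(T) / Fraction(w)                 # T/w reducers
    count1 = 2 * n / num_reducers                            # Y: replication 2
    replication_x = 1 + Fraction(S) / (n * Fraction(w))      # X: 1 + S/(nw)
    count2 = replication_x * n / num_reducers
    return count1, count2


def reducer_capacity(n, w, S, T):
    """q = (3nw + S) / T, as defined on the slide."""
    return Fraction(3 * n * w + S) / Fraction(T)


# Illustrative values only, not taken from the slides.
n, w, S, T = 20, 2, 30, 10

count1, count2 = average_counts(n, w, S, T)
print(count1, count2)                  # 8 7
print(count1 + count2)                 # 15
print(reducer_capacity(n, w, S, T))    # 15
```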
Conclusion
• An investigation for good MapReduce algorithms for
the problem of finding pairs of overlapping intervals
• Algorithms for:
– Unit-sized and equally-spaced intervals
• Lower bounds on the replication rate = 2 or 2q
n
k
• Upper bounds on the replication rate =
3
qT−S
S
2
– Big-small and equally-spaced intervals
• Lower bounds on the replication rate = 2 or 2q
lmin
s
• Upper bounds on the replication rate =
3
qT−S
S
2
– A general case for variable length intervals
• Upper bounds on the replication rate =
3
qT−S
S
2
Proofs of lower and upper bounds on the replication rate are given in the paper
Foto Afrati1, Shlomi Dolev2, Shantanu Sharma2, and
Jeffrey D. Ullman3
1 School of Electrical and Computing Engineering, National Technical
University of Athens, Greece
afrati@softlab.ece.ntua.gr
2 Department of Computer Science, Ben-Gurion University of the
Negev, Israel
{dolev,sharmas}@cs.bgu.ac.il
3 Department of Computer Science, Stanford University, USA
ullman@cs.stanford.edu
Presentation is available at
http://www.cs.bgu.ac.il/~sharmas/publication.html

Weitere ähnliche Inhalte

Was ist angesagt?

Simple queuingmodelspdf
Simple queuingmodelspdfSimple queuingmodelspdf
Simple queuingmodelspdf
Ankit Katiyar
 
Convolution discrete and continuous time-difference equaion and system proper...
Convolution discrete and continuous time-difference equaion and system proper...Convolution discrete and continuous time-difference equaion and system proper...
Convolution discrete and continuous time-difference equaion and system proper...
Vinod Sharma
 
Lecture1 Intro To Signa
Lecture1 Intro To SignaLecture1 Intro To Signa
Lecture1 Intro To Signa
babak danyal
 

Was ist angesagt? (20)

Simple queuingmodelspdf
Simple queuingmodelspdfSimple queuingmodelspdf
Simple queuingmodelspdf
 
Lecture2 Signal and Systems
Lecture2 Signal and SystemsLecture2 Signal and Systems
Lecture2 Signal and Systems
 
2.time domain analysis of lti systems
2.time domain analysis of lti systems2.time domain analysis of lti systems
2.time domain analysis of lti systems
 
Discrete Time Fourier Transform
Discrete Time Fourier TransformDiscrete Time Fourier Transform
Discrete Time Fourier Transform
 
Frequency Estimation
Frequency EstimationFrequency Estimation
Frequency Estimation
 
Signal and System, CT Signal DT Signal, Signal Processing(amplitude and time ...
Signal and System, CT Signal DT Signal, Signal Processing(amplitude and time ...Signal and System, CT Signal DT Signal, Signal Processing(amplitude and time ...
Signal and System, CT Signal DT Signal, Signal Processing(amplitude and time ...
 
3.Frequency Domain Representation of Signals and Systems
3.Frequency Domain Representation of Signals and Systems3.Frequency Domain Representation of Signals and Systems
3.Frequency Domain Representation of Signals and Systems
 
Convolution discrete and continuous time-difference equaion and system proper...
Convolution discrete and continuous time-difference equaion and system proper...Convolution discrete and continuous time-difference equaion and system proper...
Convolution discrete and continuous time-difference equaion and system proper...
 
Fpw chapter 4 - digital ctrl of dynamic systems
Fpw chapter 4 - digital ctrl of dynamic systemsFpw chapter 4 - digital ctrl of dynamic systems
Fpw chapter 4 - digital ctrl of dynamic systems
 
Digital signal System
Digital signal SystemDigital signal System
Digital signal System
 
DSP_FOEHU - MATLAB 04 - The Discrete Fourier Transform (DFT)
DSP_FOEHU - MATLAB 04 - The Discrete Fourier Transform (DFT)DSP_FOEHU - MATLAB 04 - The Discrete Fourier Transform (DFT)
DSP_FOEHU - MATLAB 04 - The Discrete Fourier Transform (DFT)
 
Circuit model of ESD waveforms
Circuit model of ESD waveformsCircuit model of ESD waveforms
Circuit model of ESD waveforms
 
modeling of system electromechanical, Armature Controlled D.C Motor -Reduced ...
modeling of system electromechanical, Armature Controlled D.C Motor -Reduced ...modeling of system electromechanical, Armature Controlled D.C Motor -Reduced ...
modeling of system electromechanical, Armature Controlled D.C Motor -Reduced ...
 
Lecture4 Signal and Systems
Lecture4  Signal and SystemsLecture4  Signal and Systems
Lecture4 Signal and Systems
 
Lecture5 Signal and Systems
Lecture5 Signal and SystemsLecture5 Signal and Systems
Lecture5 Signal and Systems
 
Lecture1 Intro To Signa
Lecture1 Intro To SignaLecture1 Intro To Signa
Lecture1 Intro To Signa
 
5. convolution and correlation of discrete time signals
5. convolution and correlation of discrete time signals 5. convolution and correlation of discrete time signals
5. convolution and correlation of discrete time signals
 
1.introduction to signals
1.introduction to signals1.introduction to signals
1.introduction to signals
 
Dc3 t1
Dc3 t1Dc3 t1
Dc3 t1
 
Z Transform
Z TransformZ Transform
Z Transform
 

Ähnlich wie Bounds for overlapping interval join on MapReduce

Queuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depthQueuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depth
IdcIdk1
 
CSP UNIT 2 AIML.ppt
CSP UNIT 2 AIML.pptCSP UNIT 2 AIML.ppt
CSP UNIT 2 AIML.ppt
ssuser6e2b26
 
IntroductionTopological sorting is a common operation performed .docx
IntroductionTopological sorting is a common operation performed .docxIntroductionTopological sorting is a common operation performed .docx
IntroductionTopological sorting is a common operation performed .docx
mariuse18nolet
 
Applications of graphs
Applications of graphsApplications of graphs
Applications of graphs
Tech_MX
 
Chap4 slides
Chap4 slidesChap4 slides
Chap4 slides
Sheena Jose
 
Chap4 slides
Chap4 slidesChap4 slides
Chap4 slides
Jothish DL
 

Ähnlich wie Bounds for overlapping interval join on MapReduce (20)

Queuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depthQueuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depth
 
NS 6141 - Physical quantities.pptx
NS 6141 - Physical quantities.pptxNS 6141 - Physical quantities.pptx
NS 6141 - Physical quantities.pptx
 
3D routing algorithm for sensor network in e-health
3D routing algorithm for sensor network in e-health3D routing algorithm for sensor network in e-health
3D routing algorithm for sensor network in e-health
 
09 placement
09 placement09 placement
09 placement
 
Ec 2401 wireless communication unit 2
Ec 2401 wireless communication   unit 2Ec 2401 wireless communication   unit 2
Ec 2401 wireless communication unit 2
 
CSP UNIT 2 AIML.ppt
CSP UNIT 2 AIML.pptCSP UNIT 2 AIML.ppt
CSP UNIT 2 AIML.ppt
 
IntroductionTopological sorting is a common operation performed .docx
IntroductionTopological sorting is a common operation performed .docxIntroductionTopological sorting is a common operation performed .docx
IntroductionTopological sorting is a common operation performed .docx
 
Chap4 slides
Chap4 slidesChap4 slides
Chap4 slides
 
Wavelet Based Image Compression Using FPGA
Wavelet Based Image Compression Using FPGAWavelet Based Image Compression Using FPGA
Wavelet Based Image Compression Using FPGA
 
chap4_slides.ppt
chap4_slides.pptchap4_slides.ppt
chap4_slides.ppt
 
Applications of graphs
Applications of graphsApplications of graphs
Applications of graphs
 
Chap4 slides
Chap4 slidesChap4 slides
Chap4 slides
 
Chap4 slides
Chap4 slidesChap4 slides
Chap4 slides
 
Chap4 slides
Chap4 slidesChap4 slides
Chap4 slides
 
Chap4 slides
Chap4 slidesChap4 slides
Chap4 slides
 
SP_BEE2143_C1.pptx
SP_BEE2143_C1.pptxSP_BEE2143_C1.pptx
SP_BEE2143_C1.pptx
 
5. AREAS AND VOLUMES (SUR) 3140601 GTU
5. AREAS AND VOLUMES (SUR) 3140601 GTU5. AREAS AND VOLUMES (SUR) 3140601 GTU
5. AREAS AND VOLUMES (SUR) 3140601 GTU
 
Lecture24
Lecture24Lecture24
Lecture24
 
Chromatography
Chromatography Chromatography
Chromatography
 
Advanced cosine measures for collaborative filtering
Advanced cosine measures for collaborative filteringAdvanced cosine measures for collaborative filtering
Advanced cosine measures for collaborative filtering
 

Mehr von Shantanu Sharma

Secure and Privacy-Preserving Big-Data Processing
Secure and Privacy-Preserving Big-Data ProcessingSecure and Privacy-Preserving Big-Data Processing
Secure and Privacy-Preserving Big-Data Processing
Shantanu Sharma
 

Mehr von Shantanu Sharma (10)

Secure and Privacy-Preserving Big-Data Processing
Secure and Privacy-Preserving Big-Data ProcessingSecure and Privacy-Preserving Big-Data Processing
Secure and Privacy-Preserving Big-Data Processing
 
OBSCURE: Information Theoretic Oblivious and Verifiable Aggregation Queries
OBSCURE: Information Theoretic Oblivious and Verifiable Aggregation QueriesOBSCURE: Information Theoretic Oblivious and Verifiable Aggregation Queries
OBSCURE: Information Theoretic Oblivious and Verifiable Aggregation Queries
 
Verifiable Round-Robin Scheme for Smart Homes (CODASPY 2019)
Verifiable Round-Robin Scheme for Smart Homes (CODASPY 2019)Verifiable Round-Robin Scheme for Smart Homes (CODASPY 2019)
Verifiable Round-Robin Scheme for Smart Homes (CODASPY 2019)
 
Partitioned Data Security on Outsourced Sensitive and Non-sensitive Data -- I...
Partitioned Data Security on Outsourced Sensitive and Non-sensitive Data -- I...Partitioned Data Security on Outsourced Sensitive and Non-sensitive Data -- I...
Partitioned Data Security on Outsourced Sensitive and Non-sensitive Data -- I...
 
Private and secure secret shared map reduce
Private and secure secret shared map reducePrivate and secure secret shared map reduce
Private and secure secret shared map reduce
 
A Survey on 5G: The Next Generation of Mobile Communication
A Survey on 5G: The Next Generation of Mobile CommunicationA Survey on 5G: The Next Generation of Mobile Communication
A Survey on 5G: The Next Generation of Mobile Communication
 
Meta-MapReduce- A Technique for Reducing Communication in MapReduce Computations
Meta-MapReduce- A Technique for Reducing Communication in MapReduce ComputationsMeta-MapReduce- A Technique for Reducing Communication in MapReduce Computations
Meta-MapReduce- A Technique for Reducing Communication in MapReduce Computations
 
On Detecting Termination in Cognitive Radio Networks
On Detecting Termination in Cognitive Radio NetworksOn Detecting Termination in Cognitive Radio Networks
On Detecting Termination in Cognitive Radio Networks
 
Assignment of Different-Sized Inputs in MapReduce
Assignment of Different-Sized Inputs in MapReduceAssignment of Different-Sized Inputs in MapReduce
Assignment of Different-Sized Inputs in MapReduce
 
Self-Stabilizing End-to-End Communication in Bounded Capacity, Omitting, D...
Self-Stabilizing End-to-End Communication in Bounded Capacity, Omitting, D...Self-Stabilizing End-to-End Communication in Bounded Capacity, Omitting, D...
Self-Stabilizing End-to-End Communication in Bounded Capacity, Omitting, D...
 

KĂźrzlich hochgeladen

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
JoseMangaJr1
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
amitlee9823
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
SUHANI PANDEY
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 

KĂźrzlich hochgeladen (20)

Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 

Bounds for overlapping interval join on MapReduce

  • 1. Bounds for Overlapping Interval Join on MapReduce Foto N. Afrati1, Shlomi Dolev2, Shantanu Sharma2, and Jeffrey D. Ullman3 1 National Technical University of Athens, Greece 2 Ben-Gurion University of the Negev, Israel 3 Stanford University, USA 2nd Algorithms and Systems for MapReduce and Beyond (BeyondMR) Brussels, Belgium (27 March 2015)
  • 2. Outline • Introduction • Goal of Mapping Schema and Our Contribution • Unit-Length and Equally-Spaced Intervals • Variable-Length and Equally-Spaced Intervals • Conclusion 2
  • 3. Outline • Introduction – Interval and Overlapping Intervals – Interval Join – Reducer capacity and Mapping Schema • Goal of Mapping Schema and Our Contribution • Unit-Length and Equally-Spaced Intervals • Variable-Length and Equally-Spaced Intervals • Conclusion 3
  • 4. • Interval – A pair [starting time , ending time] – A (time) interval, i, is represented by a pair of times [Ts i , Te i ], Ts i < Te i , where Ts i and Te i show the starting- point and the ending-point of the interval i, respectively – Example: • My talk, • a phase of a project, a class of a professor Introduction 4 Ts i = 10am Talk Te i = 10:30am
  • 5. • Overlapping Intervals – Two intervals, say interval i and interval j are called overlapping intervals if the intersection of both the interval is nonempty Introduction 5Non-overlapping intervalsOverlapping intervals i j Overlapping intervals Talk Coffee break 10am 10:35am 10:30am 11am
  • 6. Introduction 6 EmpID Name Duration 𝑒1 U 1-Apr –1-June 𝑒2 V 1-May –1-July 𝑒3 W 1-Apr –1-July 𝑒4 X 1-Mar –1-June 𝑒5 Y 1-Mar –1-Aug Phase Duration Requirement Analysis (RA) 1-Mar – 1-May Design (D) 1-Apr – 1-June Coding (C) 1-May –1-Aug 1-Mar 1-Apr 1-May 1-June 1-July 1-Aug Project Employee Project Employee RA D C 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 • Overlapping Interval Join: an example Find all the employee that are involved in RA phase of the project
  • 7. • Reducer capacity – An upper bound on the total number of intervals that are assigned to the reducer – Example • Reducer capacity to be the size of the main memory of the processors on which reducers run • Communication cost – Total amount of data to be transferred from the map phase to reduce phase – Tradeoff between the reducer capacity and communication cost Introduction 7
  • 8. Introduction Mapping schema for interval join An assignment of the set of intervals to some given reducers, such that – Respect the reducer capacity • The total number of intervals assigned to a reducer must be less than or equal to the reducer capacity – Assignment of inputs • For every output, it is required to assign every two corrosponding overlapping corrossponding intervals to at least one reducer in common 8Reducer I1 I2 I3 Reducer Reducer Reducer I1 I2 I3I1 I2 I3
  • 9. State-of-the-Art • B. Chawda, H. Gupta, S. Negi, T.A. Faruquie, L.V. Subramaniam, and M.K. Mohania, “Processing Interval Joins On Map-Reduce,” EDBT, 2014. • MapReduce-based 2-way and multiway interval join algorithms of overlapping intervals • Not regarding the reducer capacity • No analysis of a lower bound on replication of individual intervals • No analysis of the replication rate of the algorithms offered therein 9
  • 10. Outline • Introduction • Goal of Mapping Schema and Our Contribution • Unit-Length and Equally-Spaced Intervals • Variable-Length and Equally-Spaced Intervals • Conclusion 10
  • 11. • Interval join problem – Assign all the intervals that share at least one common point of time to at least one reduce in common for finding outputs Goal of Mapping Schema 11
  • 12. • An algorithm for variable-length intervals that can start at any time – Before this, we consider two simple cases of • Unit-length and equally-spaced intervals and provide algorithm • Variable-length and equally-spaced intervals and provide algorithm • All the algorithms achieve almost matching upper bound on the replication rate to the lower bound Our Contribution 12
  • 13. Outline • Introduction • Goal of Mapping Schema and Our Contribution • Unit-Length and Equally-Spaced Intervals • Variable-Length and Equally-Spaced Intervals • Conclusion 13
  • 14. Unit-Length and Equally-Spaced Intervals • Relations X and Y of n intervals each • No interval starts before 0 or after k • Hence, the spacing between the starting points of two successive intervals is $k/n < 1$ [Figure: relations X and Y on a time axis from 0 to 2.25; n = 9 and k = 2.25, so spacing = 0.25]
  • 15. Unit-Length and Equally-Spaced Intervals: Algorithm • Divide the time range from 0 to k into P equal-sized partitions of length w • Arrange P reducers • Assign every interval of X that intersects partition $p_i$ to the i-th reducer • Assign every interval of Y that has its starting point or ending point in partition $p_i$ to the i-th reducer [Figure: the example with n = 9 and k = 2.25, with the time axis divided into partitions 1 to 5]
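The assignment rule above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the helper names (`partition_of`, `assign`, `all_pairs_meet`) are hypothetical, a time point t is placed in partition floor(t / w), partitions extend as far as the ending points reach, and w = 0.5 is an arbitrary choice. Exact fractions are used to avoid floating-point boundary effects.

```python
from collections import defaultdict
from fractions import Fraction
from math import floor

def partition_of(t, w):
    """Index of the partition [i*w, (i+1)*w) that contains time point t."""
    return floor(t / w)

def assign(xs, ys, w):
    """Return reducer index -> set of interval names, following the slide's rule."""
    reducers = defaultdict(set)
    for name, (start, end) in xs.items():
        # An X interval goes to every partition it intersects.
        for i in range(partition_of(start, w), partition_of(end, w) + 1):
            reducers[i].add(name)
    for name, (start, end) in ys.items():
        # A Y interval goes only to the partitions of its two end points.
        for i in {partition_of(start, w), partition_of(end, w)}:
            reducers[i].add(name)
    return reducers

def all_pairs_meet(xs, ys, reducers):
    """True if every overlapping (x, y) pair shares at least one reducer."""
    for xn, (x_start, x_end) in xs.items():
        for yn, (y_start, y_end) in ys.items():
            if x_start <= y_end and y_start <= x_end:
                if not any(xn in r and yn in r for r in reducers.values()):
                    return False
    return True

# Example from the slide: n = 9 unit-length intervals, spacing 0.25.
step = Fraction(1, 4)
X = {f"x{i}": (i * step, i * step + 1) for i in range(9)}
Y = {f"y{i}": (i * step, i * step + 1) for i in range(9)}
reducers = assign(X, Y, Fraction(1, 2))  # w = 0.5 is an assumed value
print(all_pairs_meet(X, Y, reducers))
```

The rule is sufficient for equal-length intervals because, whenever x and y overlap, the starting point or the ending point of y lies inside x, so the partition holding that point also receives x.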
  • 16. Unit-Length and Equally-Spaced Intervals • Does the algorithm work? • Consider $q = \frac{3wn}{k} + \frac{n}{k} + 2$ – q: the reducer capacity – w: the length of a partition – n: the total number of intervals in a relation – k: the last starting point of an interval • Count how many intervals lie in a partition; if the count is less than or equal to q, then we have a solution and the algorithm works
  • 17. Unit-Length and Equally-Spaced Intervals • Does the algorithm work? – Count 1: How many intervals of Y overlap with an interval of X in a partition of length w? • The spacing is $k/n$, so at most $\frac{2wn}{k}$ intervals of Y can overlap with an interval of X – Count 2: How many intervals of X in the partition start after or before the starting point of $x_i$? • Intervals of X that start after the starting point of $x_i$: $\frac{wn}{k}$ • Intervals of X that start before the starting point of $x_i$: $\frac{n}{k}$ – Count 3: Do not forget to count $x_i$ itself and the identical interval of Y, i.e., $y_i$ [Figure: the example with n = 9 and k = 2.25, with the time axis divided into partitions 1 to 5]
  • 18. Unit-Length and Equally-Spaced Intervals • Does the algorithm work? – Total number of intervals in a partition = Count 1 + Count 2 + Count 3 = $\frac{2wn}{k} + \frac{wn}{k} + \frac{n}{k} + 2 = \frac{3wn}{k} + \frac{n}{k} + 2 = q$ – So the algorithm works
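As a worked instance of the count, take the example used in the slides ($n = 9$, $k = 2.25$) and assume a partition length of $w = 0.5$. The value of $w$ is chosen here only for illustration and does not come from the slides.

```latex
\begin{align*}
\text{Count 1} &= \frac{2wn}{k} = \frac{2 \cdot 0.5 \cdot 9}{2.25} = 4 \\
\text{Count 2} &= \frac{wn}{k} + \frac{n}{k} = \frac{0.5 \cdot 9}{2.25} + \frac{9}{2.25} = 2 + 4 = 6 \\
\text{Count 3} &= 2 \\
q &= \frac{3wn}{k} + \frac{n}{k} + 2 = 6 + 4 + 2 = 12
\end{align*}
```

Under these assumed values, a reducer capacity of 12 intervals is enough for the algorithm to work.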
  • 20. Variable-Length and Equally-Spaced Intervals • Two types of intervals – Big and small intervals – Different-length intervals
  • 21. Variable-Length and Equally-Spaced Intervals • Big and small intervals – All the intervals of X are of length $l_{\min}$ – All the intervals of Y are of length $l_{\max}$ – The previous algorithm works here too – Note that an interval of X will be replicated to several reducers, while an interval of Y will be replicated to at most two reducers [Figure: relations X and Y on a time axis from 0 to 4.2; n = 6 and spacing = 0.7]
  • 22. Variable-Length and Equally-Spaced Intervals • Variable-length intervals: A general case – All restrictions on the length of an interval and on the spacing between two intervals are removed – Intervals can begin at any time greater than or equal to 0 and must end by time T – S: the total length of the intervals in one relation [Figure: relations X and Y on a time axis from 0 to T]
  • 23. Variable-Length and Equally-Spaced Intervals • Variable-length intervals: A general case – Algorithm • Divide the time range into $T/w$ equal-sized partitions • Arrange $T/w$ reducers • Follow the same procedure as in the previous algorithm, i.e., assign every interval of X that belongs to the i-th partition to the i-th reducer, and assign every interval of Y to the reducers corresponding to its starting and ending points (at most two reducers) [Figure: relations X and Y on a time axis from 0 to T]
  • 24. Variable-Length and Equally-Spaced Intervals • Variable-length intervals: A general case – Does the algorithm work? – Consider $q = \frac{3nw + S}{T}$ – Count the average number of intervals of X and Y sent to a reducer; if it is less than or equal to the reducer capacity, then the algorithm works
  • 25. Variable-Length and Equally-Spaced Intervals • Variable-length intervals: A general case – Count 1: Average number of intervals of Y received by a reducer = $\frac{\text{replication} \times \text{total number of inputs}}{\text{total number of reducers}}$ • An interval of Y is sent to at most 2 reducers (the replication) • There are $T/w$ reducers and n intervals in Y • Average number of intervals of Y received by a reducer = $\frac{2n}{T/w} = \frac{2nw}{T}$
  • 26. Variable-Length and Equally-Spaced Intervals • Variable-length intervals: A general case – Count 2: Average number of intervals of X received by a reducer = $\frac{\text{replication} \times \text{total number of inputs}}{\text{total number of reducers}}$ • The average length of an interval is $S/n$ • An interval of X is sent to at most $1 + \frac{S}{nw}$ reducers (the average length divided by the length that one reducer can hold) • There are $T/w$ reducers and n intervals in X • Average number of intervals of X received by a reducer = $\frac{(1 + S/(nw))\,n}{T/w}$
  • 27. Variable-Length and Equally-Spaced Intervals • Variable-length intervals: A general case – Does the algorithm work? – Total number of intervals that a reducer receives = Count 1 + Count 2 = $\frac{2nw}{T} + \frac{(1 + S/(nw))\,wn}{T} = \frac{3wn + S}{T} = q$ – So the algorithm works
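The algebra behind this total can be checked with exact arithmetic. The following Python sketch only evaluates the formulas stated on the slides; the function names and the sample values of n, w, S, and T are hypothetical and chosen for illustration. The helper `partition_length` solves $q = \frac{3nw + S}{T}$ for w, which is one way to pick a partition length for a given reducer capacity.

```python
from fractions import Fraction

def average_counts(n, w, S, T):
    """Average number of Y and X intervals per reducer (Count 1 and Count 2)."""
    num_reducers = T / w
    count_y = 2 * n / num_reducers                      # replication of Y is at most 2
    count_x = (1 + S / (n * w)) * n / num_reducers      # replication of X is 1 + S/(nw)
    return count_y, count_x

def capacity(n, w, S, T):
    """Reducer capacity q = (3nw + S) / T from the slides."""
    return (3 * n * w + S) / T

def partition_length(q, n, S, T):
    """Solve q = (3nw + S) / T for the partition length w."""
    return (q * T - S) / (3 * n)

# Hypothetical values, not taken from the paper.
n, w, S, T = 6, Fraction(7, 10), Fraction(6), Fraction(21, 5)
count_y, count_x = average_counts(n, w, S, T)
print(count_y + count_x == capacity(n, w, S, T))
```

Using `Fraction` rather than floats keeps the equality test exact.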
  • 29. Conclusion • An investigation of good MapReduce algorithms for the problem of finding pairs of overlapping intervals • Algorithms for: – Unit-sized and equally-spaced intervals • Lower bounds on the replication rate = 2 or 2q n k • Upper bounds on the replication rate = 3 qT−S S 2 – Big-small and equally-spaced intervals • Lower bounds on the replication rate = 2 or 2q lmin s • Upper bounds on the replication rate = 3 qT−S S 2 – A general case for variable-length intervals • Upper bounds on the replication rate = 3 qT−S S 2 • Proofs of the lower and upper bounds on the replication rate are given in the paper
  • 30. Foto Afrati1, Shlomi Dolev2, Shantanu Sharma2, and Jeffrey D. Ullman3 1 School of Electrical and Computing Engineering, National Technical University of Athens, Greece afrati@softlab.ece.ntua.gr 2 Department of Computer Science, Ben-Gurion University of the Negev, Israel {dolev,sharmas}@cs.bgu.ac.il 3 Department of Computer Science, Stanford University, USA ullman@cs.stanford.edu Presentation is available at http://www.cs.bgu.ac.il/~sharmas/publication.html