This document discusses different techniques for merging files in revision control systems. It begins by introducing the concept of merging as reconciling multiple changes made to files. It then discusses external sorting techniques that can handle large amounts of data. The main merging techniques covered are two-way merging, three-way merging, and k-way merging. Two-way merging considers differences between two files alone, while three-way merging also looks at the original parent file. Three-way merging is generally more reliable with less need for user intervention. K-way merging uses a tournament sort algorithm to merge an arbitrary number of files.
2. INTRODUCTION
Merging, in revision control, is a fundamental operation that
reconciles multiple changes made to a revision-controlled collection
of files.
Most often, it is necessary when a file is modified by two people on
two different computers at the same time.
When two branches are merged, the result is a single collection of
files that contains both sets of changes.
External sorting may be used.
3. EXTERNAL SORTING
External sorting is a class of sorting algorithms that can handle
massive amounts of data.
It is required when the data being sorted does not fit into the main
memory of a computing device and must instead reside in slower
external memory.
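External sorting typically uses a sort-merge strategy.
In the sort phase, chunks of data small enough to fit in main memory
are read, sorted, and written out to temporary subfiles.
In the merge phase, the sorted subfiles are combined into a single
larger file.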
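The sort-merge strategy can be illustrated with a minimal Python sketch. It assumes the input is a stream of integers and that `chunk_size` stands in for the amount of data that fits in main memory; the function name `external_sort` and the temporary-file layout are hypothetical choices made for this example, not part of the original material.

```python
import heapq
import os
import tempfile


def external_sort(values, chunk_size):
    """Sort integers via sorted runs on disk (illustrative sketch only)."""
    run_paths = []

    def write_run(chunk):
        # Sort phase: sort one in-memory chunk and write it to a temporary file.
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as f:
            for v in sorted(chunk):
                f.write(f"{v}\n")
        run_paths.append(path)

    chunk = []
    for v in values:
        chunk.append(v)
        if len(chunk) == chunk_size:
            write_run(chunk)
            chunk = []
    if chunk:
        write_run(chunk)

    # Merge phase: stream all sorted runs through a k-way merge.
    files = [open(p) for p in run_paths]
    try:
        runs = [(int(line) for line in f) for f in files]
        # A real external sort would stream this to an output file;
        # the result is collected in a list here only to keep the sketch short.
        return list(heapq.merge(*runs))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
```

For example, `external_sort([5, 3, 9, 1, 4, 8, 2], chunk_size=3)` writes three sorted runs and merges them back into one sorted sequence.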
5. TWO WAY MERGE
A two-way merge performs an automated difference
analysis between a file 'A' and a file 'B'.
This method considers only the differences between the two
files to conduct the merge and makes a "best-guess"
analysis to generate the resulting merge.
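A rough sketch of a two-way merge, using Python's standard `difflib` module for the difference analysis. The "best-guess" policy here (keep lines that appear in only one file, flag lines that differ in both) is an assumption chosen for illustration; the function name `two_way_merge` and the `"CONFLICT"` marker are hypothetical.

```python
import difflib


def two_way_merge(a, b):
    """Merge two lists of lines without a common ancestor (illustrative sketch)."""
    merged = []
    matcher = difflib.SequenceMatcher(None, a, b)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(a[i1:i2])
        elif tag == "delete":
            # Present only in A. Without a parent we cannot tell whether A
            # added these lines or B removed them; the guess is to keep them.
            merged.extend(a[i1:i2])
        elif tag == "insert":
            # Present only in B; same guess as above.
            merged.extend(b[j1:j2])
        else:
            # "replace": both files differ here, so a user must decide.
            merged.append(("CONFLICT", a[i1:i2], b[j1:j2]))
    return merged
```

Every guess in this sketch is a place where the result may be wrong, since two files alone do not reveal which side made a change.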
9. DISADVANTAGE OF TWO WAY MERGE
This type of merge is usually the most error-prone.
It requires user intervention to verify, and sometimes
correct, the result of the merge.
10. THREE WAY MERGE
A three-way merge is performed after an automated difference
analysis between a file 'A' and a file 'B' while also considering
the origin, or parent, of both files.
This type of merge is more likely to be usable in revision
control systems, which can guarantee that such a parent exists
and is known.
The merge tool examines the differences and patterns
appearing in the changes between both files as well as the
parent.
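The decision rule behind a three-way merge can be sketched in a few lines of Python. This is a deliberately simplified illustration: it assumes the parent and both files have the same number of lines and are already aligned line by line, which real merge tools do not require. The name `three_way_merge` and the conflict representation are hypothetical.

```python
def three_way_merge(parent, a, b):
    """Merge aligned line lists using their common parent (illustrative sketch).

    Returns (merged_lines, conflict_indices).
    """
    if not (len(parent) == len(a) == len(b)):
        raise ValueError("this sketch assumes equally long, aligned inputs")

    merged = []
    conflicts = []
    for i, (p, x, y) in enumerate(zip(parent, a, b)):
        if x == y:
            # Both sides agree (unchanged, or changed identically).
            merged.append(x)
        elif x == p:
            # Only B changed this line, so take B's version.
            merged.append(y)
        elif y == p:
            # Only A changed this line, so take A's version.
            merged.append(x)
        else:
            # Both sides changed the line differently: needs a user decision.
            merged.append(("CONFLICT", x, y))
            conflicts.append(i)
    return merged, conflicts
```

Because the parent shows which side changed each line, only lines changed differently on both sides are left for the user.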
12. ADVANTAGES OF THREE WAY MERGE
This merge is generally the more reliable of the two and has
performed well in practice.
It also requires the least amount of user intervention.
In many cases it requires no intervention at all, making
the process eligible for task automation.
13. K-WAY MERGE ALGORITHM
Let there be two arrays:
• An array of k lists and
• An array of k index values corresponding to the current
element in each of the k lists, respectively.
Main loop of the K-Way Merge algorithm:
• Find the minimum current item, minItem, among the k lists
• Process minItem (output it to the output list)
• For i=0 until i=k-1 (in increments of 1)
    If the current item of list i is equal to minItem then
    advance list i.
• Go back to the first step, until every list is exhausted.
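The main loop above can be written out as a short Python sketch. It assumes each of the k lists is already sorted and that the loop stops once every list is exhausted, a termination condition added here as an assumption. The function name `k_way_merge` is hypothetical. Note that, following the steps as stated, an item that is current in several lists at once is output only once.

```python
def k_way_merge(lists):
    """Merge k sorted lists following the main loop described above (sketch)."""
    k = len(lists)
    index = [0] * k  # index[i] is the position of the current element of list i
    output = []

    while True:
        # Lists that still have a current element.
        active = [i for i in range(k) if index[i] < len(lists[i])]
        if not active:
            break  # assumed termination: all lists exhausted

        # Find the minimum current item among the active lists.
        min_list = min(active, key=lambda i: lists[i][index[i]])
        min_item = lists[min_list][index[min_list]]

        # Process minItem by appending it to the output list.
        output.append(min_item)

        # Advance every list whose current item equals minItem.
        for i in range(k):
            if index[i] < len(lists[i]) and lists[i][index[i]] == min_item:
                index[i] += 1

    return output
```

This version scans all k current items on every pass, so each output item costs O(k) comparisons; a heap or tournament tree over the current items would reduce that to O(log k).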
17. PERFORMANCE FACTORS
The number of records to be sorted.
The size of the records.
The number of storage devices used.
The distribution of those devices on the available I/O
channels.
The distribution of key values in the input files.