2. Compression
the process of encoding information so that it
takes fewer bits to represent than in its
original form.
3. Lossless Compression
• data compression technique that reduces
the size of a file without sacrificing any
original data
• all information from the file is still
there; nothing is deleted
• lets you recreate the original file exactly
• suitable for: text and computer code
• example: ZIP archiving technology
(WinZip & PKZIP)
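A minimal sketch of the "recreate the original file exactly" property, assuming Python's standard zlib module as a stand-in for a ZIP tool (zlib implements DEFLATE, the method ZIP archives most commonly use). The sample text is invented for illustration.

```python
import zlib

# Illustrative input only: repetitive text compresses well.
original = b"to be or not to be, that is the question. " * 20

compressed = zlib.compress(original)    # DEFLATE-based lossless compression
restored = zlib.decompress(compressed)  # reverse the process

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

The round trip returns every byte of the input, which is what makes the method suitable for text and computer code.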
5. Lossless Compression
pro. yields an exact duplicate of the original file
con. compression ratio is usually modest
Compression ratio is the ratio of the size or rate of the
original data to the size or rate of the compressed data.
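The definition above can be sketched as a one-line calculation. The function name and the file sizes below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Size of the original data divided by size of the compressed data."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return original_size / compressed_size

# Hypothetical example: a 10,000,000-byte file compressed to 4,000,000 bytes.
ratio = compression_ratio(10_000_000, 4_000_000)
print(f"{ratio}:1")  # 2.5:1
```

A ratio above 1 means the compressed data is smaller than the original; a ratio of 1 means nothing was gained.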
6. Lossless Compression
Principle of Lossless Compression Algorithms
any non-random file contains redundant (duplicated)
information, which can be condensed using statistical
modeling techniques that estimate the probability
of a character or phrase appearing.
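As a rough illustration of such a statistical model, the sketch below estimates the probability of each character from its frequency and then computes the Shannon entropy, which is the average number of bits per character an ideal coder would need under that model. This is the simplest possible model (single characters, no phrases) and the function names are assumptions made for this example.

```python
import math
from collections import Counter

def char_probabilities(text: str) -> dict:
    """Estimate each character's probability from its frequency (text must be non-empty)."""
    counts = Counter(text)
    total = len(text)
    return {ch: n / total for ch, n in counts.items()}

def entropy_bits_per_char(text: str) -> float:
    """Average bits per character needed under the frequency model above."""
    return -sum(p * math.log2(p) for p in char_probabilities(text).values())

print(char_probabilities("aaab"))     # {'a': 0.75, 'b': 0.25}
print(entropy_bits_per_char("abab"))  # 1.0
```

The more skewed the probabilities, the lower the entropy and the more room there is for compression.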
9. Run-Length Encoding
compression technique that replaces runs
of two or more of the same character with a
number which represents the length of the run,
followed by the original character; single
characters are coded as runs of 1.
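The scheme described above can be sketched as follows. This version stores each run as a (length, character) pair rather than a flat string, an assumption made here so that input containing digits stays unambiguous; the function names are illustrative.

```python
from itertools import groupby

def rle_encode(text: str) -> list:
    """Replace each run of identical characters with a (run length, character) pair."""
    return [(len(list(group)), char) for char, group in groupby(text)]

def rle_decode(pairs) -> str:
    """Expand (run length, character) pairs back into the original text."""
    return "".join(char * count for count, char in pairs)

pairs = rle_encode("AAAABBBCCD")
print(pairs)                                 # [(4, 'A'), (3, 'B'), (2, 'C'), (1, 'D')]
print("".join(f"{n}{c}" for n, c in pairs))  # 4A3B2C1D
print(rle_decode(pairs))                     # AAAABBBCCD
```

Note that the single "D" is coded as a run of 1, so text with few repeats can grow rather than shrink.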
11. Burrows-Wheeler Transform
o technique invented in 1994 that reversibly
transforms a block of input data so that
identical characters tend to be grouped into
longer runs.
o does not perform any compression itself;
it simply rearranges the input so that it can be
coded more efficiently by a Run-Length Encoder
or other secondary compression technique.
12. Burrows-Wheeler Transform (cont’d)
Algorithm
I. Create a string array.
II. Generate all possible rotations of the input
string, storing each in the array.
III. Sort the array alphabetically (lexicographically).
IV. Return the last column of the array.
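The four steps above can be sketched directly. Two assumptions are added here that the steps do not state: the input ends with a unique "$" end marker so the transform can be reversed, and sorting uses Python's default code-point order (in which "$" sorts before letters). The naive inverse is included only to show reversibility; it is not an efficient implementation.

```python
def bwt(text: str) -> str:
    """Naive Burrows-Wheeler Transform following steps I-IV."""
    # I and II: build an array holding every rotation of the input.
    rotations = [text[i:] + text[:i] for i in range(len(text))]
    # III: sort the rotations.
    rotations.sort()
    # IV: return the last column (the last character of each row).
    return "".join(row[-1] for row in rotations)

def inverse_bwt(last_column: str, end_marker: str = "$") -> str:
    """Naive inverse: rebuild the sorted table one column at a time."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        # Prepend the last column to every row, then re-sort.
        table = sorted(last_column[i] + table[i] for i in range(len(last_column)))
    # The original string is the row that ends with the end marker.
    return next(row for row in table if row.endswith(end_marker))

transformed = bwt("banana$")
print(transformed)               # annb$aa
print(inverse_bwt(transformed))  # banana$
```

In the output, the two "n" characters and the final two "a" characters sit next to each other, which is the grouping a Run-Length Encoder can exploit.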