1. Data Representation, Data
Compression & Encryption
Group Members:
TAY LEONG PING B031110105
NG SING HAN B031110101
NGOH KYE LIAN B031110024
WONG LAM SHEN B031110044
TAN CHING TING B031110241
3. Data Representation
Data representation is generally how information is conceived,
manipulated, and recorded. The term can also be defined as
the form in which data and information are kept in a certain
environment. How data is stored varies from one environment
to another, with each environment having its own set of rules
and standards.
4. Data Representation
Data Representation refers to the methods used internally
to represent information stored in a computer. Computers
store lots of different types of information:
numbers
text
graphics of many varieties (stills, video, animation)
sound
5. Data Representation
The problem is that a file containing the bytes 108, 97,
110 would read as “lan” on an ASCII system, but as
“%/>” on an EBCDIC system.
In ASCII, the value 108 means the character 'l'.
In EBCDIC, the value 108 means the character '%'.
6. ASCII - American Standard Code for Information
Interchange, represents English on all microcomputers
and most minicomputers.
EBCDIC - Extended Binary Coded Decimal Interchange
Code, represents English on IBM mainframes.
Shift-JIS - represents Japanese characters.
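The byte-value example can be checked with a short Python sketch. It assumes that Python's `cp037` codec stands in for EBCDIC; EBCDIC exists in several code pages, and the choice of code page 037 is an assumption, not something these slides specify.

```python
# Decode the same three bytes under two different character encodings.
data = bytes([108, 97, 110])

as_ascii = data.decode("ascii")    # ASCII interpretation
as_ebcdic = data.decode("cp037")   # cp037: one EBCDIC code page (assumed variant)

print(as_ascii)   # lan
print(as_ebcdic)  # %/>
```

The bytes are identical in both cases; only the agreed representation changes what they mean.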
7. Data Representation
Data representations include:
ASN.1 (Abstract Syntax Notation One) - an ISO
standard
XDR (External Data Representation)
- used with SunRPC
8. ASN.1
Abstract Syntax Notation One (ASN.1) is a standard and notation
that describes rules and structures for
representing, encoding, transmitting, and decoding data. It
consists of two parts:
1. An abstract syntax that describes data structures in an
unambiguous way. It uses “integers”, “character strings”,
and “structures” rather than bits and bytes.
2. A transfer syntax that describes the bit stream
encoding of ASN.1 data objects.
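To make the split between abstract syntax and transfer syntax concrete, here is a minimal sketch that turns one abstract INTEGER value into bytes. It assumes the Basic Encoding Rules (BER), one of the transfer syntaxes defined for ASN.1. The function name `ber_encode_integer` is made up for illustration, and only short-form lengths are handled.

```python
def ber_encode_integer(value: int) -> bytes:
    """Encode an int as a BER INTEGER: tag, length, value."""
    # Smallest number of bytes that holds the value in two's complement.
    magnitude = value if value >= 0 else ~value
    length = magnitude.bit_length() // 8 + 1
    if length > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    body = value.to_bytes(length, "big", signed=True)
    return bytes([0x02, length]) + body  # 0x02 is the universal INTEGER tag

print(ber_encode_integer(5).hex())    # 020105
print(ber_encode_integer(128).hex())  # 02020080
```

The tag byte travels with the value, so a receiver can tell that the field is an INTEGER without any prior agreement.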
10. ASN.1
Example of ASN.1's abstract syntax:
Student ::= SEQUENCE {
name [0] IMPLICIT OCTET STRING OPTIONAL,
grad [1] IMPLICIT BOOLEAN DEFAULT FALSE,
gpa [2] IMPLICIT REAL OPTIONAL,
id [3] IMPLICIT INTEGER,
bday [4] IMPLICIT OCTET STRING OPTIONAL
}
11. Current Uses of ASN.1
Audio & Video over the Internet: AT&T, Intel, IBM, Microsoft, 3COM
Electronic Commerce: American Express, GTE, MasterCard, VISA
Telephony: AT&T, MCI, Motorola, Nokia, Sprint
Aviation: FAA, ICAO
Manufacturing: Ford, Mercedes Benz, Mitsubishi
Network Management: Bull, Compaq, Hewlett-Packard, Sun
Routers: Bay Networks, Cisco, Racal, Xyplex
12. External Data Representation
(XDR)
External Data Representation (XDR) is much simpler than
ASN.1, but less powerful. For instance:
1. XDR uses implicit typing. Communicating peers must
know the type of any exchanged data. In contrast,
ASN.1 uses explicit typing; it includes type information
as part of the transfer syntax.
2. In XDR, all data is transferred in units of 4 bytes.
Numbers are transferred in network order, most
significant byte first.
14. XDR
3. Strings consist of a 4 byte length, followed by the data
(padded, if necessary, to a multiple of 4 bytes).
4. Defined types include: integer, enumeration, boolean,
floating point, fixed length array, structures, plus
others.
One advantage that XDR has over ASN.1 is that current
implementations of ASN.1 execute significantly slower than
XDR.
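The XDR rules listed here (4-byte units, most significant byte first, length-prefixed strings) can be sketched with Python's `struct` module. This is an illustration under those stated rules, not a complete XDR library; the helper names `xdr_int` and `xdr_string` are hypothetical.

```python
import struct

def xdr_int(value: int) -> bytes:
    """Signed 32-bit integer in network order (most significant byte first)."""
    return struct.pack(">i", value)

def xdr_string(data: bytes) -> bytes:
    """4-byte length, the data bytes, then zero padding to a 4-byte boundary."""
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding

print(xdr_int(1).hex())            # 00000001
print(xdr_string(b"hello").hex())  # 0000000568656c6c6f000000
```

Nothing in the output says "this is an integer" or "this is a string", which is what implicit typing means: the receiver must already know what to expect.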
15. Multipurpose Internet Mail Extensions
(MIME)
The message “£100 is about €150” could
become
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=ISO-8859-15
MIME-Version: 1.0
=A3100 is about =A4150
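The encoded body can be reproduced with Python's standard `quopri` module. The sketch assumes the second currency symbol in the message is the euro sign, which ISO-8859-15 places at byte 0xA4 (the pound sign is 0xA3).

```python
import quopri

message = "£100 is about €150"
raw = message.encode("iso-8859-15")   # £ -> 0xA3, € -> 0xA4 in this charset
encoded = quopri.encodestring(raw)    # non-ASCII bytes become =XX escapes

print(encoded)  # b'=A3100 is about =A4150'

# Decoding reverses both steps and recovers the original text.
decoded = quopri.decodestring(encoded).decode("iso-8859-15")
print(decoded == message)  # True
```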
18. Data Compression
Data compression is the art of reducing the
number of bits needed to store or transmit data.
Compression can be either lossless or lossy.
19. Lossless Compression – involves no loss of information. If data
have been losslessly compressed, the original data can be recovered
exactly from the compressed data. It is generally used for applications that
cannot tolerate any difference between the original and reconstructed
data.
Lossy Compression – involves some loss of information; data that
have been compressed using lossy techniques generally cannot be
recovered or reconstructed exactly. In return for accepting this
distortion in reconstruction, lossy techniques can generally obtain much higher
compression ratios than are possible with lossless compression.
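A small sketch of the difference, using `zlib` for the lossless case. The lossy step here is a toy (it discards the low three bits of every sample) and the sample data is made up; real lossy codecs are far more selective about what they throw away.

```python
import zlib

samples = bytes([10, 12, 11, 13, 50, 52, 51, 53] * 100)

# Lossless: decompression returns exactly the original bytes.
packed = zlib.compress(samples)
restored = zlib.decompress(packed)
print(restored == samples)  # True

# Lossy (toy): keep only multiples of 8, discarding the low 3 bits.
coarse = bytes((s // 8) * 8 for s in samples)
packed_lossy = zlib.compress(coarse)

# Sizes are printed rather than assumed; they depend on the data.
print(len(samples), len(packed), len(packed_lossy))
print(coarse == samples)  # False: the discarded detail cannot be recovered
```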
20. Steps of Data Compression
The compression of still images, audio and video data streams:
1. Picture preparation – generates an appropriate digital
representation of the information in the medium being
compressed.
2. Picture processing – is the first step that makes use of the
various compression algorithms.
3. Quantization – Values determined in the previous step cannot
and should not be processed with full exactness; instead they are
quantized according to a specific resolution and characteristic
curve.
4. Entropy encoding – with a sequential data stream of individual
bits and bytes, different techniques are used to perform a final,
lossless compression.
21. Steps of Data Compression
Major steps of image compression; these can also be applied to audio and video data:
Uncompressed Picture → Picture Preparation → Picture Processing → Quantization → Entropy Coding → Compressed Picture
22. Image Compression
Image compression aims to represent images with less data in order
to save storage costs or transmission time.
It is possible to reduce file size to 10% of the original without
noticeable loss in quality.
Image compression can be lossless or lossy.
23. Image Compression
Lossless
- Image quality is not reduced.
Use in: artificial images that contain sharp-edged lines such as
technical drawings, textual graphics, comics, maps or logos.
Methods: run-length encoding (RLE), entropy coding (Huffman
coding) and dictionary coders (LZW).
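Run-length encoding is the simplest of these methods to show in code. The sketch below treats one image row as a list of pixel values (the sample row is made up) and stores each run as a (value, count) pair.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of equal values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]

row = [255, 255, 255, 255, 0, 0, 255, 255, 255]
encoded = rle_encode(row)
print(encoded)                     # [(255, 4), (0, 2), (255, 3)]
print(rle_decode(encoded) == row)  # True
```

Long runs of identical pixels are common in drawings and logos, which is why this kind of image suits RLE.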
24. Image Compression
Lossy
- Reduces image quality. The original image cannot be recovered
exactly because some information is lost.
Use in: natural images such as photos of landscapes
Methods: discrete cosine transform (DCT, used in JPEG) or
wavelet transform (used in JPEG 2000), color quantization
25. Comparison of graphics file formats
FORMAT | FILE EXTENSION | TYPE OF COMPRESSION | METHODS | USAGE
BMP (bitmap) | .bmp | Considerably compressible with lossless methods | ZIP | Used to store bitmap digital images
JPEG (Joint Photographic Experts Group) | .jpg, .jpeg, .jpe | Lossy; Lossless | Lossy: Discrete Cosine Transform (DCT) & Chroma Subsampling; Lossless: Run-Length Encoding (RLE) | For natural images
GIF (Graphics Interchange Format) | .gif, .giff, .gfa | Lossless | LZW (Lempel-Ziv-Welch) | For artificial images (sharp-edged lines and few colors); supports animation
PNG (Portable Network Graphics) | .png | Lossless | DEFLATE | Better compression & features than GIF, but doesn't support animation
TIFF (Tagged Image File Format) | .tiff, .tif | Lossless | RLE / LZW / DEFLATE / ZIP | Flexible file format; can store multiple images in a single file
JPEG2000 | .jp2, .j2c, .jpc, .j2k, .jpx | Lossy & Lossless | Discrete Wavelet Transform (DWT) | Better image quality than JPEG (up to 20%); not widely used because of some patent issues
26. Block Diagram of JPEG Compression
Source image → DCT → Quantization → Encoding → Compressed image
DCT: transformation coding performed using the Discrete Cosine Transform (DCT)
Quantization: quantization of all DCT coefficients (a lossy process)
Encoding: Huffman coding and arithmetic coding as entropy encoding methods
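The DCT and quantization stages can be sketched in a few lines. This is a simplification under stated assumptions: JPEG works on 8×8 blocks with a two-dimensional DCT and a table of step sizes, while the sketch uses a one-dimensional orthonormal DCT on 8 made-up samples and a single assumed step size of 10. The entropy encoding stage is left out.

```python
import math

N = 8  # samples per block (JPEG uses 8x8 blocks; this sketch is 1-D)

def scale(k):
    return math.sqrt((1 if k == 0 else 2) / N)

def dct(x):
    """Orthonormal DCT-II of N samples."""
    return [scale(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n in range(N))
            for k in range(N)]

def idct(coeffs):
    """Inverse of dct()."""
    return [sum(scale(k) * coeffs[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]

row = [52, 55, 61, 66, 70, 61, 64, 73]   # made-up pixel values
step = 10                                # assumed quantization step size

quantized = [round(c / step) for c in dct(row)]               # the lossy step
restored = [round(v) for v in idct([q * step for q in quantized])]

print(quantized)  # small integers; high-frequency entries tend toward zero
print(restored)   # close to, but generally not equal to, the original row
```

The transform on its own is reversible; the information is lost when the coefficients are rounded to multiples of the step size.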
27. Audio Compression
A form of data compression designed to reduce the size of
audio files
Audio compression can be lossless or lossy
Audio compression algorithms are typically referred to as
audio codecs.
28. Audio Compression
Lossless - allows one to preserve an exact
copy of one's audio files
Usage: For archival purposes, editing, audio
quality.
Codecs:
Free Lossless Audio Codec (FLAC)
Apple Lossless
MPEG-4 ALS
Monkey's Audio
Lossless Predictive Audio Compression (LPAC)
Lossless Transform Audio Compression (LTAC)
29. Audio Compression
Lossy - makes irreversible changes but achieves far greater
compression; uses psychoacoustics to recognize that not all
data in an audio stream can be perceived by the human
auditory system.
Usage: distribution of streaming audio, or interactive
applications
Codecs:
MP2 - MPEG-1 Layer 2 audio codec
MP3 - MPEG-1 Layer 3 audio codec
MPC - Musepack
Vorbis - Ogg Vorbis
AAC - Advanced Audio Coding (MPEG-2 and MPEG-4)
WMA - Windows Media Audio
AC3 - AC-3 or Dolby Digital A/52
30. Moving Picture Experts Group
(MPEG)
MPEG is an ISO/IEC working group, established in 1988 to
develop standards for digital audio and video formats.
MPEG-1
Designed for up to 1.5 Mbit/sec
Standard for the compression of moving pictures and audio. Its most
popular part is audio Layer 3 of MPEG-1 (MP3). MPEG-1 is the standard of
compression for VideoCD.
MPEG-2
Designed for between 1.5 and 15 Mbit/sec
Standard on which Digital Television set top boxes and DVD
compression is based. Designed for the compression and
transmission of digital broadcast television
31. MPEG (cont.)
MPEG-4
Integrates several different audio components into one
standard: speech compression, perceptually based coders,
text-to-speech, and MIDI. MPEG-4 AAC (Advanced Audio
Coding) is similar to the MPEG-2 AAC standard, with some
minor changes.
MPEG-7 (under development) - also called the Multimedia
Content Description Interface. In terms of audio, it facilitates
the representation of and search for sound content. Example
application supported by MPEG-7: automatic speech
recognition (ASR).
32. MPEG Audio Encoding
Uncompressed Audio Signal → Division in 32 Frequency Bands → Quantization → (if applicable) Entropy Encoding → Compressed Audio Data
Psychoacoustic Model: controls the Quantization step
34. Audio Compression Format-MP3
Played by almost every portable digital audio device and many
DVD players, MP3 is still hard to go past if looking for maximum
compatibility for your files.
Although you can get much better compression from other formats, hard disks
and blank CDs are cheap enough to justify the extra file size.
Stereo imaging is not terrific and encoding quality differs from
one software package to another.
Compression: 5.
Quality: 7.
Compatibility: 10.
Overall: 7.5.
35. Audio Compression Format-WMA
Windows Media Audio is Microsoft's contribution to high quality,
lossy audio compression. Like most other new formats, it
outperforms MP3 in terms of quality and compression, particularly
at lower bitrates.
WMA is probably the format of choice for streaming at low
bandwidths. Like MP3, however, the stereo imaging is not very
accurate.
WMA tends to overcompensate for its high compression with what
is often called 'overbrightness'.
Compression: 8.
Quality: 7.
Compatibility: 9.
Overall: 8.
36. Audio Compression Format- Ogg Vorbis
Ogg Vorbis is a project attempting to replace all proprietary audio formats with an
open standard freeware codec. Version one was released in this past
fortnight and has been demonstrated to be of very high quality, and it
outperforms MP3 by a long shot.
At low bitrates it doesn't compete with WMA, and at high bitrates it
falls short of MPC. Given that it is a work in progress, however, it has
strong potential to become a widely used audio codec.
Some portable device manufacturers are promising to support Ogg
Vorbis in future software releases.
Compression: 8.
Quality: 7.
Compatibility: 6.
Overall: 7.
37. Video Compression
Storing and transmitting uncompressed raw video is not an
efficient technique because it needs large amounts of
storage and bandwidth.
DVD, DSS, and internet video all use digital data, which takes a
lot of space to store and large bandwidth to transmit.
Video compression techniques are used to compress the data
for these applications
→ less storage space and less bandwidth to transmit data.
38. Video Compression
Videos are sequences of images displayed at a high rate. Each of
these images is called a frame.
The human eye cannot notice small changes in the frames, such as a
slight difference in color.
Therefore, video compression standards do not require the
encoding of all the details and some of the less important video
details are lost. This is because lossy compression is used due to
its ability to get very high compression ratios.
Typically 30 frames are displayed on the screen every second.
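To put numbers on the storage and bandwidth cost of raw video at 30 frames per second, the arithmetic can be worked through for one example. The 640×480 resolution and 3 bytes per pixel are assumptions chosen for illustration; the slides do not state a resolution.

```python
width, height = 640, 480   # assumed frame size in pixels
bytes_per_pixel = 3        # assumed 24-bit color
frames_per_second = 30     # typical frame rate

bytes_per_frame = width * height * bytes_per_pixel
bytes_per_second = bytes_per_frame * frames_per_second
megabits_per_second = bytes_per_second * 8 / 1e6

print(bytes_per_frame)      # 921600
print(bytes_per_second)     # 27648000
print(megabits_per_second)  # 221.184
```

Under these assumptions, one second of uncompressed video needs about 27.6 MB, which is why compression is applied before storage or transmission.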
39. Video Compression Process
1. The encoder starts by encoding the first frame using a still image
compression method.
2. It then encodes each successive frame by identifying the
differences between the frame and its predecessor, and
encoding these differences. If the frame is very different from
its predecessor, it is coded independently of any other
frame.
3. In the video compression literature, a frame that is coded using
its predecessor is called an inter frame (or just inter), while a
frame that is coded independently is called an intra frame (or
just intra).
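The inter/intra decision can be sketched as follows. Frames are modeled as flat lists of pixel values, intra frames are stored raw instead of being passed through a still image compressor, and "very different" is taken to mean "more than `threshold` pixels changed". All three simplifications, and the function names, are assumptions made for illustration.

```python
def encode(frames, threshold):
    """Label each frame intra (stored whole) or inter (stored as differences)."""
    encoded, previous = [], None
    for frame in frames:
        if previous is None:
            encoded.append(("intra", list(frame)))       # first frame
        else:
            diff = [a - b for a, b in zip(frame, previous)]
            changed = sum(1 for d in diff if d != 0)
            if changed > threshold:
                encoded.append(("intra", list(frame)))   # too different
            else:
                encoded.append(("inter", diff))          # differences only
        previous = frame
    return encoded

def decode(encoded):
    """Rebuild frames; an inter frame is added onto the previous frame."""
    frames = []
    for kind, payload in encoded:
        if kind == "intra":
            frames.append(list(payload))
        else:
            frames.append([p + d for p, d in zip(frames[-1], payload)])
    return frames

frames = [[10, 10, 10, 10], [10, 11, 10, 10], [90, 80, 70, 60]]
encoded = encode(frames, threshold=2)
print([kind for kind, _ in encoded])  # ['intra', 'inter', 'intra']
print(decode(encoded) == frames)      # True
```

The inter frame here is mostly zeros, which is what makes the differences cheaper to encode than the full frame.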
41. Video Compression Formats
The ISO/IEC, or International Organization for Standardization
and the International Electrotechnical Commission, have a group
called the Moving Picture Experts Group, or MPEG. MPEG is
responsible for the familiar compression formats MPEG-1,
MPEG-2 and MPEG-4.
The ITU-T standardizes formats for the International
Telecommunication Union, a United Nations organization. Some
popular ITU-T compression formats include the H.261 and H.264
formats.
There are other compression formats, such as Intel Indeo and
RealVideo (based on the ITU-T H.263 codec), AVI, DivX,
Quicktime, Windows Media Video (WMV).
43. Encryption
• To carry sensitive information, a system must be able to
assure privacy.
• As the number of attacks increases and as the public
Internet is used to transmit private data, it is
increasingly difficult to protect information.
• One way to safeguard data from attacks is encrypting
the data.
• In practice, encryption is suitably done in the presentation
layer, in addition to the transport and physical layers.
44. Encryption
Encryption – the conversion of data into a form,
called a ciphertext, that cannot be easily
understood by unauthorized people.
Decryption – the process of converting encrypted
data back into its original form, so it can be
understood.
46. Basic Terms and Concepts
Cryptography – The science of encrypting or hiding secrets
Cryptosystem – a system that disguises messages so that only selected
people can see through the disguise.
Cryptanalysis – The science of decrypting messages or breaking
codes and ciphers
Key – a value that is used by an algorithm to encrypt and decrypt
a message.
Cipher – an encryption/decryption algorithm tool that is used to
create encrypted/decrypted text
47. Encryption/Decryption Keys
Symmetric Keys – Also called secret key encryption. It uses a
single key to encrypt and decrypt the message. This means the
person encrypting the message must give that key to the recipient
before they can decrypt it.
Eg.: Data Encryption Standard (DES), Triple DES (3DES),
Advanced Encryption Standard (AES)
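The defining property of symmetric encryption, one shared key for both directions, can be shown with a toy cipher. The sketch below XORs the message with a random key of the same length. It is not DES, 3DES, or AES (Python's standard library does not include those), and it is meant only to illustrate the single-key idea.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte."""
    return bytes(d ^ k for d, k in zip(data, key))

message = b"meet at noon"
key = secrets.token_bytes(len(message))   # the shared secret key

ciphertext = xor_bytes(message, key)      # sender encrypts with the key
recovered = xor_bytes(ciphertext, key)    # recipient decrypts with the same key

print(recovered == message)  # True
```

The recipient can only recover the message if the key has already been delivered by some other secure route, which is the key distribution problem described above.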
48. Asymmetric Keys - Also called public key encryption. It uses two
different keys: a public key to encrypt the message, and a
private key to decrypt it. The public key can only be used to
encrypt the message and the private key can only be used to
decrypt it.
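A toy example of a public/private key pair, using textbook RSA with very small primes. RSA is chosen here as one well-known public key scheme; the slides do not name a specific algorithm. The numbers are far too small to be secure, and real systems add padding on top of this arithmetic.

```python
# Toy RSA key generation with tiny primes (illustration only, not secure).
p, q = 61, 53
n = p * q                    # 3233, part of both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent, 2753 (needs Python 3.8+)

def encrypt(m: int) -> int:
    """Anyone holding the public key (n, e) can encrypt."""
    return pow(m, e, n)

def decrypt(c: int) -> int:
    """Only the holder of the private key (n, d) can decrypt."""
    return pow(c, d, n)

ciphertext = encrypt(65)
print(ciphertext)           # 2790
print(decrypt(ciphertext))  # 65
```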
49. How Encryption Protects
Confidentiality - Allow only authorized users to access information.
Authentication - Verify who the sender was and trust the sender is
who they claim to be.
Integrity - Trust the information has not been altered
Nonrepudiation - Ensure that the sender or receiver cannot deny
that a message was sent or received.
Access Control - Restrict availability to information.
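Integrity and authentication are often provided by a keyed hash attached to the message. The sketch below uses HMAC-SHA256 from Python's standard library. HMAC is a message authentication code, not an encryption algorithm, so this is a related technique swapped in for illustration; the key and messages are hypothetical.

```python
import hashlib
import hmac

key = b"shared-secret-key"   # hypothetical key known to sender and receiver

def make_tag(message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    """Check the tag in constant time."""
    return hmac.compare_digest(make_tag(message), tag)

message = b"pay 100 to Alice"
tag = make_tag(message)

print(verify(message, tag))              # True: message is unaltered
print(verify(b"pay 900 to Alice", tag))  # False: message was changed
```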
50. Advantages of Encryption
If a file is encrypted, the device that stores it doesn’t
need to be secure: because the data is encrypted and
secure, the means of storing or transporting it doesn’t
need to be secured, which saves you money on extra
protection software.
Having the data encrypted takes away the pain and
worry associated with data breaches and the
protection of intellectual property.
Encryption keeps data from snoopers without
compromising systems or storage devices.
51. Disadvantages of Encryption
The disadvantages include the complexity of computer encryption, its
usually expensive cost, the ability for it to be easily changed, and its
inability to organize the data that has been encoded. Even though the data
doesn’t need to be protected anymore because of the encryption, it instead
puts a lot of pressure on IT employees.
If you forget your passphrase and/or keyfile, there is almost
no chance of recovering your data.
Encryption takes a lot of processing, energy, and computing power.
This means that even though the data is protected, the overall
performance of the computer could drop.
Encryption won’t prevent hackers or viruses, and it may also make
it hard to use the encrypted file, as some restrictions may have
been placed on it.