Soundpres

Sound: Audio and Music

B.Sc. (Hons) Multimedia Computing Media Technologies

Agenda

 History and overview of sound systems
 Digitised Sound
 Sound Formats
 Sound for Multimedia
 Software: Sound Editors


Earliest recorded sounds
 Thomas Edison’s voice (1890’s)
 Au Clair de la Lune
 Vocal Scale
(Made by Scott, 1860)

How does it work?

A diagonal movement provides either the LEFT or the RIGHT channel.

A left-right movement provides the SUM of the L and R channels.

An up-down movement provides the DIFFERENCE between the L and R channels.
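
The sum and difference relationship above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and a real cutting head or cartridge applies scaling factors that are ignored here.

```python
def encode_groove(left, right):
    """Return (lateral, vertical) movement: the sum and difference of L and R."""
    return left + right, left - right


def decode_groove(lateral, vertical):
    """Recover (left, right) from the sum and difference signals."""
    return (lateral + vertical) / 2, (lateral - vertical) / 2


lateral, vertical = encode_groove(0.5, 0.25)
print(lateral, vertical)                 # sum and difference
print(decode_groove(lateral, vertical))  # back to the original L and R
```

A mono signal (L equal to R) produces no up-down movement at all, which is why a stereo record still plays on a mono pickup.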

What’s in a CD?

A. A polycarbonate disc layer has the data
encoded by using ‘pits’.

C. A reflective layer reflects the laser back.

E. A lacquer layer is used to prevent oxidation

G. Artwork is screen printed on the top of the
disc.

I. A laser beam reads the polycarbonate disc, is
reflected back, and read by the player.

Encoding format on an audio CD (1) (Courtesy Wikipedia)

 The smallest entity in a CD is called a frame.
 A frame consists of 33 bytes and contains six complete 16-bit
stereo samples (2 bytes × 2 channels × six samples: equals 24
bytes). The other nine bytes consist of eight Cross-Interleaved
Reed-Solomon Coding (CIRC) error correction bytes and one
subcode byte, used for control and display.
 Each byte is translated into a 14-bit word using Eight-to-
Fourteen Modulation, which alternates with 3-bit merging
words. In total there are 33 × (14 + 3) = 561 bits.
 A 27-bit unique synchronization word is added, so that the
number of bits in a frame totals 588 (of which only 192 bits are
music).
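
The frame arithmetic above can be checked with a short Python sketch. The constant names are made up for illustration; the numbers are the ones quoted on the slide.

```python
# Byte layout of one CD frame, using the figures quoted above.
BYTES_PER_SAMPLE = 2
CHANNELS = 2
SAMPLES_PER_FRAME = 6
AUDIO_BYTES = BYTES_PER_SAMPLE * CHANNELS * SAMPLES_PER_FRAME  # 24
CIRC_BYTES = 8      # error correction
SUBCODE_BYTES = 1   # control and display
FRAME_BYTES = AUDIO_BYTES + CIRC_BYTES + SUBCODE_BYTES         # 33

# Each byte becomes a 14-bit EFM word followed by 3 merging bits.
EFM_BITS = 14
MERGING_BITS = 3
SYNC_BITS = 27
CHANNEL_BITS = FRAME_BYTES * (EFM_BITS + MERGING_BITS) + SYNC_BITS  # 588

AUDIO_BITS = AUDIO_BYTES * 8  # 192 bits of music per frame

print(FRAME_BYTES, CHANNEL_BITS, AUDIO_BITS)
print(f"music share of a frame: {AUDIO_BITS / CHANNEL_BITS:.1%}")
```

Roughly a third of the bits on the disc carry music; the rest is modulation, error correction and synchronisation overhead.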

Encoding format on an audio CD (2) (Courtesy Wikipedia)

 These 588-bit frames are in turn grouped into sectors.
 Each sector contains 98 frames, totaling 98 × 24 = 2352 bytes
of music.
 The CD is played at a speed of 75 sectors per second, which
results in 176,400 bytes per second.
 Divided by 2 channels and 2 bytes per sample, this results in a
sample rate of 44,100 samples per second.
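
The same chain of multiplications, from frame to sample rate, as a small Python check (constant names are illustrative only):

```python
AUDIO_BYTES_PER_FRAME = 24
FRAMES_PER_SECTOR = 98
SECTORS_PER_SECOND = 75
CHANNELS = 2
BYTES_PER_SAMPLE = 2

sector_bytes = FRAMES_PER_SECTOR * AUDIO_BYTES_PER_FRAME        # 2352
bytes_per_second = sector_bytes * SECTORS_PER_SECOND            # 176400
sample_rate = bytes_per_second // (CHANNELS * BYTES_PER_SAMPLE)  # 44100

print(sector_bytes, bytes_per_second, sample_rate)
```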

Announced January 27th 2010…

Audio vs. Music

 Computer sounds can be digital (e.g. .mp3) or
synthesised (e.g. MIDI)
 Digital sound is referred to as Audio
 Synthesised sound is referred to as Music
 Digital sounds are recordings of real sounds
 Synthesized sounds are programmed
reproductions of sounds based on algorithms
and hardware tone generators.


Digitized Sound
 Digital sound involves sampling, which means
measuring the signal at regular intervals and
encoding each measurement as ones and zeros.
 The computer converts an analogue data
source to a digital data stream with an analog-
to-digital converter.
 These converters are devices that exist on the
computer’s sound cards/modules, and are
controlled by the software you use to process
the sound: Sound Forge on the PC or SoundEdit
on the Mac, for example.
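
As a rough sketch of what an analog-to-digital converter does, the Python below samples a sine wave at regular intervals and rounds each value to a signed integer code. The function name and parameters are assumptions for illustration; a real converter works on a voltage, not on a formula.

```python
import math


def sample_and_quantize(freq_hz, sample_rate, bits, duration_s):
    """Sample a unit-amplitude sine wave and quantize it to signed integers."""
    max_code = 2 ** (bits - 1) - 1           # e.g. 127 for 8-bit samples
    count = round(sample_rate * duration_s)  # number of samples to take
    return [
        round(math.sin(2 * math.pi * freq_hz * i / sample_rate) * max_code)
        for i in range(count)
    ]


# One cycle of a 1 kHz tone, sampled at 8 kHz with 8-bit resolution.
samples = sample_and_quantize(1000, 8000, 8, 0.001)
print(samples)
```

Raising the sample rate gives more values per cycle; raising the bit depth gives finer steps between them.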


What is the optimum sample rate for an
audio / video channel?

The sample rate for the transmission of data through a
noisy channel of restricted bandwidth was established by
Shannon and Weaver in 1948…

Signal of amplitude 21

Plus noise of amplitude 3

Results in a noisy
signal of amplitude 24.

How fast must we sample to accurately
represent the signal?

Sample rate and resolution?
 There is no point in sampling at a frequency higher than
twice the bandwidth, that is, at 2 * W samples per second,
where W is the bandwidth of the channel.
 Bandwidth could be thought of as the frequency response
of the channel.

 There is no point in sampling at a resolution greater than
the maximum error. In this case, the error is the noise,
which is 3 units in 24 total units.
 This gives us a resolution (R) of 3 in 24 == 1 in 8 == 3 bits.
 So R = 3 = log2((S+N)/N), where S+N is the noisy signal (24)
and N is the noise (3) (log2 of 8 is 3)
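
Both limits can be written as one-line functions. This is a minimal sketch using the slide's own numbers (signal 21, noise 3); the function names are hypothetical.

```python
import math


def max_useful_sample_rate(bandwidth_hz):
    """Sampling faster than twice the bandwidth adds no information."""
    return 2 * bandwidth_hz


def useful_resolution_bits(signal_amplitude, noise_amplitude):
    """Bits per sample worth keeping, given the noise as the maximum error."""
    noisy_signal = signal_amplitude + noise_amplitude
    return math.log2(noisy_signal / noise_amplitude)


print(max_useful_sample_rate(5000))   # samples per second for a 5 kHz channel
print(useful_resolution_bits(21, 3))  # bits per sample for the example above
```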

Nearly there…
 Let’s call the data rate R (bits / sec)…
 Since there are 2 * W samples per second,
each carrying log2((S+N)/N) bits,
 R = 2 * W * log2((S+N)/N)
 However, it’s more useful to measure power rather than
amplitude (cables restrict power, not amplitude), and
 Power is proportional to the square of the amplitude,
so the amplitude ratio is the square root of the power
ratio. With S and N now measured as powers…

We’re there…

R = 2 * W * log2(sqrt((S+N)/N)) = W * log2((S+N)/N)

An example…
 Let’s say we want to transmit human speech with a
bandwidth of 5 kHz and a distortion (error) rate of
0.1% (i.e. 1 in 1000, taken as a power ratio)
 R = 2 * 5,000 * log2(sqrt((1000+1)/1)) ≈ 50,000 bits/sec
 If we can tolerate an error of 0.4% (1000:4), then:
 R ≈ 40,000 bits / sec.
 So, a small increase in error (distortion) rate has
allowed a 20% reduction in the required data rate of the
channel (hence cost).
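
The two figures in this example can be reproduced in Python. The sketch assumes the power form of the formula, R = W * log2((S+N)/N), which is the form that yields the rounded 50,000 and 40,000 bits/sec values; the function name is hypothetical.

```python
import math


def data_rate(bandwidth_hz, signal_power, noise_power):
    """Data rate in bits/sec for a channel, with S and N given as powers."""
    return bandwidth_hz * math.log2((signal_power + noise_power) / noise_power)


strict = data_rate(5000, 1000, 1)    # error of 1 in 1000
relaxed = data_rate(5000, 1000, 4)   # error of 4 in 1000

print(round(strict), round(relaxed))
print(f"reduction: {1 - relaxed / strict:.1%}")
```

Because the rate depends on the logarithm of the ratio, quadrupling the tolerated error only removes two bits per second per hertz of bandwidth.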

Synthesized Sound and MIDI

 MIDI = Musical Instrument Digital Interface
 Synthesized sound isn't digitally recorded; it's a
mathematical reproduction of a sound based
on a description. Synthesizers use hardware
and algorithms to generate sounds on-the-fly
from a description of the desired sound.
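
To make "generated from a description" concrete, here is a minimal Python sketch that turns a MIDI note number into a pitch and then into samples. It assumes the common equal-temperament convention in which note 69 is A at 440 Hz, and it uses a plain sine wave where a real synthesizer would use far richer tone generation.

```python
import math


def midi_note_to_hz(note):
    """Equal-tempered pitch, assuming MIDI note 69 is A at 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)


def synthesize(note, sample_rate=44100, duration_s=0.01):
    """Generate sine-wave samples for a note described only by its number."""
    freq = midi_note_to_hz(note)
    count = round(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(count)]


print(midi_note_to_hz(69))            # concert A
print(round(midi_note_to_hz(60), 2))  # middle C
print(len(synthesize(60)))            # samples produced for 10 ms
```

The description (one note number) is a few bytes, while the sound it produces is thousands of samples per second, which is why MIDI files are so small.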


Sound Formats
 Audio
 .WAV (Developed by IBM and Microsoft):
uncompressed digital samples.
 .AU

 .AIFF (Audio Interchange File Format)

 .MP3 (MPEG)

 .SND (Mac)

 Music
 .MID (Musical Instrument Digital Interface)

 and other proprietary formats


.mp3 (courtesy Wikipedia)

 The use in MP3 of a lossy compression algorithm is
designed to greatly reduce the amount of data required
to represent the audio recording and still sound like a
faithful reproduction of the original uncompressed
audio for most listeners.
 An MP3 file that is created using the setting of 128
kbits/s will result in a file that is about 1/11th the size of
the CD file created from the original audio source. An
MP3 file can also be constructed at higher or lower bit
rates, with higher or lower resulting quality.
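
The "about 1/11th" figure follows from the CD data rate. A quick Python check, assuming CD audio at 44,100 samples per second, 16 bits and 2 channels (variable names are illustrative):

```python
SAMPLE_RATE = 44100
BITS_PER_SAMPLE = 16
CHANNELS = 2

cd_bits_per_second = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS  # 1,411,200
mp3_bits_per_second = 128_000

ratio = cd_bits_per_second / mp3_bits_per_second
print(cd_bits_per_second, round(ratio, 3))
```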

.mp3 (contd.)
 The compression works by reducing the accuracy of
certain parts of sound that are deemed beyond the
auditory resolution ability of most people.
 This method is commonly referred to as perceptual
coding. It internally provides a representation of sound
within a short-term time/frequency analysis window, by
using psychoacoustic models to discard or reduce the
precision of components less audible to human hearing,
and recording the remaining information in an efficient
manner (think about a handclap in a quiet room as
opposed to in a noisy street, where it’s masked by the noise).

Finally…
 Filters are used to split a signal into 32 bands, and a
masking level for each band is computed.
 Signals that fall below the threshold can be
discarded.
 Further compression is done, including resampling
and lowering the bit rate of parts that don’t need full
precision.
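
The thresholding step can be illustrated with a toy Python sketch. It is not the MP3 algorithm: the band levels and masking thresholds below are invented numbers, four bands stand in for the 32, and the function name is hypothetical. It only shows the idea of dropping bands that fall below their masking level.

```python
def keep_audible_bands(band_levels_db, masking_thresholds_db):
    """Return indices of bands whose level reaches their masking threshold."""
    return [
        index
        for index, (level, threshold) in enumerate(
            zip(band_levels_db, masking_thresholds_db)
        )
        if level >= threshold
    ]


# Hypothetical levels (dB) for four bands and their computed thresholds.
levels = [60.0, 35.0, 20.0, 48.0]
thresholds = [40.0, 40.0, 25.0, 30.0]

print(keep_audible_bands(levels, thresholds))
```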

Sound for Multimedia
Generate sounds for multimedia scenarios
including:
 Sound effects

 Ambient sound for mood and effect

 Virtual Juke Box

 Voice-overs


Software: Sound Editors
 Sound Editors - edit waveforms of
digitised sounds
 Trim
 Apply effects - e.g. reverb, echo,
equalisation (EQ)
 Insert silence
 Fade-in, fade-out
 Combine sounds
 Adjust / Normalise volume (gain)
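
Several of these editing operations are simple transformations on a list of samples. The Python below is a minimal sketch, assuming samples are floats between -1.0 and 1.0; the function names are hypothetical and do not correspond to any particular editor.

```python
def trim(samples, start, end):
    """Keep only the samples from start up to (not including) end."""
    return samples[start:end]


def insert_silence(samples, index, count):
    """Insert count zero-valued samples at the given position."""
    return samples[:index] + [0.0] * count + samples[index:]


def fade_in(samples, length):
    """Ramp the gain linearly from 0 to 1 over the first length samples."""
    return [s * min(1.0, i / length) for i, s in enumerate(samples)]


def normalise(samples, peak=1.0):
    """Scale all samples so the loudest one reaches the requested peak."""
    loudest = max(abs(s) for s in samples)
    if loudest == 0:
        return list(samples)
    gain = peak / loudest
    return [s * gain for s in samples]


print(normalise([0.1, -0.5, 0.25]))
print(fade_in([1.0, 1.0, 1.0, 1.0], 2))
```

A fade-out is the same ramp applied in reverse, and combining two sounds amounts to adding their samples position by position.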


Software: Sound Sequencers
 Synthesized sounds played back via pre-
recorded or live MIDI note event data -
velocity, pitch, etc.
 Sound cards may also have prerecorded digital
samples stored on them.
 These 'samples' can then be played by
sequencing software that will play a number of
tracks simultaneously in an ensemble.
 Examples include Cakewalk, Steinberg's
Cubase, and Logic Pro.
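
The idea of several tracks of note events played together can be sketched as a data structure. This is an assumption-laden illustration: `NoteEvent` and `merge_tracks` are hypothetical names, not part of any sequencer's API, and real MIDI data carries more fields (channel, note-off, and so on).

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class NoteEvent:
    time_beats: float  # when the note starts
    pitch: int         # MIDI note number, e.g. 60 for middle C
    velocity: int      # how hard the note is struck, 0 to 127


def merge_tracks(*tracks):
    """Interleave time-sorted tracks into a single playback order."""
    return list(heapq.merge(*tracks))


bass = [NoteEvent(0.0, 36, 100), NoteEvent(2.0, 43, 90)]
melody = [NoteEvent(0.0, 60, 80), NoteEvent(1.0, 64, 80), NoteEvent(2.0, 67, 85)]

for event in merge_tracks(bass, melody):
    print(event)
```

Each track must already be sorted by time for the merge to give a correct ordering.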


Hardware: Creative X-Fi
•24-bit Analog-to-Digital conversion of analog inputs at 96kHz sample rate

•24-bit Digital-to-Analog conversion of digital sources at 96kHz to analog 7.1
speaker output

•24-bit Digital-to-Analog conversion of stereo digital sources at 192kHz to stereo
output

•ASIO (Audio Stream Input / Output) 2.0 support at 16-bit/44.1kHz, 16-bit/48kHz,
24-bit/44.1kHz, 24-bit/48kHz and 24-bit/96kHz with direct monitoring

•16-bit to 24-bit recording sampling rates: 8, 11.025, 16, 22.05, 24, 32, 44.1, 48
and 96kHz

•Digital SPDIF interface support with 24bit/96kHz quality format

•Signal-to-Noise ratio: 109dB

•Total Harmonic Distortion + Noise at 1kHz = 0.004%

Please read…
 Chapman & Chapman Chapter 8 (in
your bundle if you have it)
