3. Today’s digital audio systems have evolved from technology first developed for the telecommunications industry, which in turn used ideas dating back to the early 1930s.
Introduction
4. Towards the end of the 1960s, digital techniques began to offer substantial benefits over analogue setups in broadcast transmission and distribution systems.
Analogue land lines were becoming expensive and quality was relatively poor over long distances.
Introduction
8. • We have found that there are two basic characteristics of sound: amplitude (level) and frequency (time).
• There are also two characteristics of digital audio: sampling (time) and quantisation (level).
• In order to create a digital representation of audio, the analogue audio signal is sampled. The microphone translates the sound into an electrical voltage, which in turn goes to an analogue-to-digital converter (ADC).
An Introduction to Sampling
10. • This states that the sampling rate must be at least twice the highest frequency to be recorded. Anything less and you will start to lose quality.
• Upper limit of human hearing – 20 kHz; therefore 2 x 20 kHz = 40 kHz (44.1 kHz allows a margin of error and compatibility with video systems).
• 44.1 kHz = 44,100 samples per second, i.e. a sample every 1/44,100 of a second.
The Nyquist Theorem
11. A signal
sampled twice per cycle
has enough information
to be reconstructed
The Nyquist Theorem
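The Nyquist arithmetic above can be sketched in a few lines of Python (the helper name is ours, for illustration only):

```python
# Nyquist: the sampling rate must be at least twice the highest frequency.
def min_sample_rate(highest_freq_hz):
    """Minimum sampling rate needed to capture highest_freq_hz."""
    return 2 * highest_freq_hz

limit_of_hearing = 20_000                 # 20 kHz, upper limit of human hearing
print(min_sample_rate(limit_of_hearing))  # 40000 -> CD uses 44100 for margin

cd_rate = 44_100
print(1 / cd_rate)                        # interval between samples: ~0.0000227 s
```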
17. • Frequencies don’t just stop at 20 kHz because we humans can’t hear above that.
• We need to restrict the frequency range in a sampled audio system to comply with the Nyquist theorem.
• This implies that some severe audio filtering is required to restrict the upper frequencies being input to the sampling process, as well as to remove the unwanted image frequencies from the output signal.
• If our sampling system operated at a 40 kHz sampling rate and the original audio input happened to contain a signal at 30 kHz (which would be inaudible to humans, but could still be present), the lower image of that signal would appear at 10 kHz (40 kHz - 30 kHz).
• Not only would this be clearly audible, it would also be impossible to separate the unwanted image frequency from the wanted audio band. This effect, where an unwanted signal appears in the wrong place, is called aliasing.
Anti Aliasing Filter
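The 40 kHz / 30 kHz example can be checked with a small sketch (Python; the helper name is our own):

```python
def alias_frequency(f_in, f_s):
    """Frequency at which a tone of f_in Hz appears after sampling at f_s Hz.
    Folds f_in into the 0 .. f_s/2 band (below the Nyquist frequency)."""
    f = f_in % f_s          # wrap into one sampling-rate period
    if f > f_s / 2:
        f = f_s - f         # fold the image back below Nyquist
    return f

print(alias_frequency(30_000, 40_000))  # 10000: the 30 kHz input aliases to 10 kHz
print(alias_frequency(15_000, 40_000))  # 15000: already below Nyquist, unchanged
```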
18. Anti Aliasing Filter
[Figure: amplitude (-1 to 1) against time, showing the original waveform, the sample points, and the new waveform drawn through them.]
If the sampling rate is too low the resultant signal will show a curve that is not representative of the original. This is called ALIASING.
19. Anti Aliasing Filter
[Figure: level against frequency, showing the audio signal to be sampled and the filter’s cut-off.]
Anti-aliasing filter – a low-pass filter must be introduced before digitisation to alleviate aliasing. An anti-aliasing filter is used to limit the frequency range of an analogue signal prior to A/D conversion so that the maximum frequency doesn’t exceed half of the sampling rate.
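Purely for illustration, a one-pole low-pass filter in Python shows the idea of attenuating high frequencies before sampling; a real anti-aliasing filter needs a far steeper roll-off just below half the sampling rate:

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """A minimal one-pole low-pass filter - only a sketch, not a real
    anti-aliasing design."""
    a = math.exp(-2 * math.pi * cutoff_hz / sample_rate)  # feedback coefficient
    out, y = [], 0.0
    for x in samples:
        y = (1 - a) * x + a * y   # mix new input with previous output
        out.append(y)
    return out

# A 20 kHz tone sampled at 48 kHz is strongly attenuated by a 1 kHz cutoff,
# while a 100 Hz tone passes almost untouched.
fs = 48_000
def tone(f):
    return [math.sin(2 * math.pi * f * n / fs) for n in range(fs)]

print(max(abs(v) for v in one_pole_lowpass(tone(100), 1_000, fs)[fs // 2:]))
print(max(abs(v) for v in one_pole_lowpass(tone(20_000), 1_000, fs)[fs // 2:]))
```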
21. PCM - Pulse Code Modulation
• The vast majority of digital audio systems encode the numbers which represent the original audio
waveform as binary data, by means of a process known as Pulse Code Modulation (PCM).
• The name PCM stems from the fact that it involves the 'modulation' or changing of the state of a medium (e.g. the voltage within a circuit) according to coded blocks of data in the form of a string of binary digits.
• Both WAV and AIFF (Audio Interchange File Format) files use Linear Pulse Code Modulation (LPCM).
• When stored raw (uncompressed), these two formats are lossless audio.
22. 367 = 3 hundreds + 6 tens + 7 units
In decimal, once a count of nine is exceeded, another digit is required. In binary, each additional position to the left of the first digit increases the weight by a factor of two – units, twos, fours, eights, sixteens and so forth.
Decimal Example
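The same positional idea can be illustrated in Python (the binary form of 367 is our own worked example, not from the slides):

```python
# 367 in decimal: 3 hundreds + 6 tens + 7 units.
# In binary, each position doubles: units, twos, fours, eights, ...
value = 0b101101111                    # 367 written in binary
print(value)                           # 367

# Weight of each bit position, right to left:
weights = [2 ** i for i in range(9)]   # [1, 2, 4, 8, 16, 32, 64, 128, 256]
print(weights)
print(bin(367))                        # '0b101101111'
```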
24. In digital audio, the purpose of binary numbers is to express the values of samples which
represent analogue sound velocity or pressure waveforms.
Bit Resolution
26. The waveform is sampled at the frequency of the sample rate. Each section is held - its voltage analysed - and then assigned the binary word that is closest to it.
[Figure: a sampled waveform with each held voltage shown as a step.]
Sample and Hold Circuit
28. Example: a held voltage of 1.778v falls between the quantising levels 2.0v (1110111) and 1.5v (1110110).
The voltage is rounded off to the closest voltage that has a binary word, here 2.0v.
The rounding-off process is known as quantising.
Sample and Hold Circuit
35. If the voltage is exactly at the halfway mark, proper rounding off becomes impossible. The half
value is kicked back to the Least Significant Bit (LSB) instead of allowing it to corrupt the value of
a more important bit that contains more information.
Quantise
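The rounding-off step can be sketched in Python (the 0.5 V step size matches the 2.0 V / 1.5 V example above but is otherwise an assumption):

```python
def quantise(voltage, step=0.5):
    """Round a sampled voltage to the nearest quantising level.
    step is the voltage represented by one LSB (assumed here).
    Note: Python's round() uses banker's rounding at exact halfway
    values, unlike the LSB rule described on the slide."""
    return round(voltage / step) * step

print(quantise(1.778))   # 2.0 -> the nearest 0.5 V level, as in the slide
print(quantise(1.6))     # 1.5
```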
36. With a greater bit depth, a larger number of intervals is used to cover the same range and the quantising error is reduced.
Quantise
38. The bit resolution gives us our dynamic range (amplitude). The higher the bit resolution, the greater the dynamic range.
bits: 1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
dB:   6 12 18 24 30 36 42 48 54 60 66 72 78 84 90 96
Each additional bit reduces quantisation noise by 6 dB, i.e. by a factor of 2
Quantise
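The 6 dB-per-bit rule of thumb from the table as a one-liner (Python, illustrative only):

```python
def dynamic_range_db(bits):
    """Rule of thumb from the table: each bit adds roughly 6 dB."""
    return 6 * bits

print(dynamic_range_db(16))   # 96 dB for CD-quality 16-bit audio
print(dynamic_range_db(24))   # 144 dB for 24-bit audio
```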
40. Multiplexer
• Most digital audio recording and transmission is a serial process (data is processed as a single
stream of information).
• Since the output of an ADC is parallel, it must be converted to serial for storage and transmission.
• This is the duty of the multiplexer. This multiple-input circuit accepts parallel data words (the number of bits equals the number of parallel connections) and outputs the data one bit at a time, serially, to form a continuous bit stream.
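A software sketch of the parallel-to-serial idea (Python; MSB-first order and the function name are our assumptions):

```python
def multiplex(words, bits=8):
    """Serialise parallel data words into one continuous bit stream,
    MSB first - a sketch of what a multiplexer does."""
    stream = []
    for word in words:
        for i in range(bits - 1, -1, -1):   # walk from MSB down to LSB
            stream.append((word >> i) & 1)
    return stream

print(multiplex([0b10110001]))  # [1, 0, 1, 1, 0, 0, 0, 1]
```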
42. • We know about quantising error. How can it be fixed? Through a process called dithering. Dithering is the process of deliberately adding noise to the signal to counteract quantising error.
• Dither is essentially a very small amount of white noise, which is deliberately added to the analogue audio
signal as it enters the A-D converter.
• Random noise numbers are calculated so that a different digital number (noise) is added to every sample.
• This is known as non-subtractive dither because the dither noise becomes a permanent part of the signal.
The dither noise is below the amplitude of the LSB.
• When a signal begins to flow in, the dither noise is pushed above the LSB floor.
• The noise can be heard if only the LSB has signal. But fortunately, as more bits kick in with increased
amplitude the dither becomes lost in the mix.
Dither
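A minimal sketch of non-subtractive dither (Python; plain uniform white noise is used here for simplicity, and the step size of one LSB is an assumption):

```python
import random

def dither_and_quantise(sample, step=1.0):
    """Add low-level noise (within one LSB) before quantising -
    a sketch of non-subtractive dither. step is the LSB size."""
    noise = random.uniform(-step / 2, step / 2)   # simple white-noise dither
    return round((sample + noise) / step) * step

# Averaged over many samples, the dithered output tracks the true level
# instead of snapping to the same quantising interval every time:
random.seed(1)
outs = [dither_and_quantise(0.3) for _ in range(10_000)]
print(sum(outs) / len(outs))   # close to 0.3 rather than always 0.0
```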
43. Setting levels
Adjust (massage) the input signal until it reaches maximum level into the digital domain and then
adjust the recorder’s channel output fader to achieve the desired monitor level.
POW-r #1: Uses a special dithering curve to minimize quantization noise.
POW-r #2: (Noise Shaping): Uses additional noise shaping over a wide frequency range, which can
extend the dynamic range of the bounced file by 5 to 10 dB.
POW-r #3: (Noise Shaping): Uses additional, optimized noise shaping, which can extend the dynamic
range by 20 dB within the 2 to 4 kHz range—the range the human ear is most sensitive to.
Note: Noise Shaping minimizes the side effects caused by bit reduction. It does this by moving the
quantization noise spectrum to the frequency range above 10 kHz—the range the human ear is least
sensitive to. Technically, this process is known as spectral displacement.
Dither
45. Bit Rate
Bit rate (not to be confused with bit depth) describes the amount of audio data used per second.
Bit rate is derived from the bit depth and sample rate.
To find the number of bits per second:
bits per sample x samples per second
CD quality = 16 x 44,100 = 705.6 kbps (per audio stream)
Stereo = 705.6 x 2 = 1411.2 kbps
http://www.youtube.com/watch?v=HqHIOA-Fcuw
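The calculation as code (Python; the helper name is our own):

```python
def bit_rate_kbps(bit_depth, sample_rate, channels=1):
    """Uncompressed PCM bit rate: bits per sample x samples per second."""
    return bit_depth * sample_rate * channels / 1000

print(bit_rate_kbps(16, 44_100))               # 705.6 kbps per audio stream
print(bit_rate_kbps(16, 44_100, channels=2))   # 1411.2 kbps for stereo CD
```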
47. • At its simplest, then, the clock identifies when each sample should be recorded
or replayed — and we call that a 'word clock’.
• Without doubt, the major difference between high-end and budget converters is
the quality, stability, and consistency of the internal clock circuitry — the part
that determines when a sample is taken. If this reference clock is not particularly
stable then the interval between samples — which should be absolutely precise
— will vary. This problem is known as jitter, and it affects many different aspects
of digital audio systems, but is of particularly critical importance in A-D and,
arguably to a less critical extent, D-A conversion.
Digital Clocking
48. Pro Tools Sync set up window showing Clock Source
Digital Clocking
The idea of representing an analogue waveform by using numbers was first patented in 1937.
Problems with tape:
Hiss on original tape – caused by magnetic particles that are out of line, i.e. some of the particles do not get correctly magnetised. This is amplified when recording more than one track.
Wow and flutter of the tape media – wow denotes a slow, cyclic variation in pitch. It also occurs with vinyl if the disc is warped or the centre hole is up to 5 mil off-centre (check out the end of ‘19th Nervous Breakdown’ by The Rolling Stones). Flutter can be encountered with servo-controlled direct-drive motors and also due to the record/mat interface.
Degradation of tape over time – rust (oxide) particles become flaky with time and fall off.
Linear access – to get from the introduction to the ending, you must go through all of your verses and choruses.
Maintenance – regular cleaning and adjustment.
Each number in the binary sequence is called a bit – a contraction of BInary digiT.
Bits are grouped together to form ‘words’. An 8-bit word (8 bits together) forms a byte. This was settled on through trial and error, due to the limitations of early computer designs, and has been around for fifty years.
Eight bits together can represent any of 256 different values (2x2x2x2x2x2x2x2, written as 2^8 for ease).
The bit on the extreme right, representing units, is called the least significant bit (or LSB).
The bit on the extreme left, representing the highest multiplier, is called the most significant bit (or MSB).
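A small Python illustration of words, the LSB, and the MSB (the example word is our own):

```python
word = 0b11101101          # an 8-bit word (one byte)
print(word)                # 237

lsb = word & 1             # rightmost bit, weight 1
msb = (word >> 7) & 1      # leftmost bit, weight 128
print(lsb, msb)            # 1 1

print(2 ** 8)              # 256 distinct values from eight bits (0-255)
```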
In a 3-bit quantising scale, a small number of quantising intervals covers the analogue voltage range, making the maximum quantising error quite large. The second sample in the picture will be assigned the value 010, for example, the corresponding voltage of which is somewhat higher than that of the sample.
During D/A conversion the binary sample values from (b) would be turned into pulses with the amplitude shown in (c) where many samples have been forced to the same level owing to quantising.