Presentation – Victoria Sloyan – 07/07/2010

                            Archiving Digital Audio Files

Introduction

Hello and welcome to my presentation “Archiving digital audio files”.

I am the trainee for the futureArch Project, which aims to expand the Bodleian’s
capability to archive and preserve born-digital material. As part of this we have been
developing a procedure for extracting digital files from deposited media such as
floppy disks and moving them into our secure repository. This procedure involves the
use of forensic imaging equipment to create bit-by-bit duplicates of files, but it is only
suitable for data files. Therefore my project brief was to research an effective way of
extracting and preserving audio files.

Audio material requires its own method for three reasons:
   1. Unlike data disks, audio CDs do not hold the audio data within a file system,
      and so forensic imaging kits have difficulty creating an image file.
   2. Audio files have different metadata. Some of the metadata fields are the same,
      such as ‘title’ and ‘creation date’, but other metadata is specific to audio files,
      such as ‘duration’ and ‘file format’.
   3. The final difference involves delivery to users: the way you access a word
      document is different from the way you access a sound recording.

Therefore, the overall aim of this project can be divided into three objectives:
   1. First is selecting the most appropriate format to store the audio in. There are
       many formats available, such as MP3, FLAC and WAVE, so the first decision
       to be made was which to use.
   2. Secondly, since imaging audio disks does not work I had to find a way to
       extract the files from their original media and move them onto the secure
       server.
   3. Finally, I had to devise an effective way of delivering digital audio to users.


Selecting a format

The most important consideration when choosing a format is finding one which is
uncompressed. Basically, formats fall into three categories. Uncompressed formats
are, as the name suggests, uncompressed, which means sound and silence are encoded
at the same bit rate. Lossless compression shrinks the file size without reducing
the sound quality: redundant data, such as silence, is stored more compactly and the
original audio can be rebuilt exactly. Lossy compression permanently discards part of
the audio data. This can significantly reduce the file size, and often the reduction in
quality is unnoticeable by ear, but the original data cannot be recovered. For archiving
purposes you want an exact replica of the original data in the simplest possible form,
therefore an uncompressed format must be used.
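
To illustrate what lossless means in practice, here is a minimal sketch in Python. It assumes the general-purpose zlib compressor as a stand-in for a real audio codec such as FLAC, and it uses random bytes as a crude substitute for dense sound, so the sizes it prints are only indicative. The function names are made up for this example.

```python
import os
import zlib

SAMPLE_RATE = 44_100   # samples per second
BYTES_PER_SAMPLE = 2   # 16-bit samples
CHANNELS = 2           # stereo


def pcm_bytes(seconds, silent):
    """Return raw bytes the size of CD-quality audio: zeros for silence,
    random bytes as a crude stand-in for dense, hard-to-compress sound."""
    size = seconds * SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS
    return bytes(size) if silent else os.urandom(size)


def lossless_round_trip(data):
    """Compress then decompress; return (compressed size, identical?)."""
    packed = zlib.compress(data)
    return len(packed), zlib.decompress(packed) == data


silence = pcm_bytes(1, silent=True)
noise = pcm_bytes(1, silent=False)

for label, data in (("silence", silence), ("noise", noise)):
    size, identical = lossless_round_trip(data)
    print(f"{label}: {len(data)} -> {size} bytes, identical: {identical}")
```

In both cases the decoded bytes are identical to the input, which is the defining property of lossless compression; only the amount of space saved differs.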

The second consideration is to find an open format in order to aid accessibility,
because an openly documented format means you do not have to worry about licensing
and legal issues. Also, a standardised format is preferred, as this tends to mean it
has been thoroughly reviewed and is more likely to be longer lasting.

After evaluating all the formats I concluded the best one to use is WAVE. Crucially it
is an uncompressed format. It is also open and standardised, and it is recommended by
the International Association of Sound and Audiovisual Archives, the Library of
Congress and the British Library Sound Archive.


Capture audio and extract metadata

So, once the format was decided I needed a way of extracting files from a CD. The
best way is one you may well be familiar with: ripping, although the technical term
for it is ‘digital audio extraction’. However, normal ripping, like that done by
Windows Media Player, is what is known as ‘fast ripping’, whereas for archiving we
wanted a ‘secure ripper’. The main difference between the two is that secure rippers
perform various validation tests to ensure maximum accuracy.

There are quite a few secure ripping programmes available but the one I decided to
use is Exact Audio Copy. It is well regarded by audio professionals, it works with
Windows OS, and it is free to download and rips to WAVE.

Here is a screen shot of EAC.
   • Down the left-hand side you can see the available ripping options including
        ripping as an MP3 and burning to a CD, but for my project I was only interested
        in the first icon – ripping as a WAVE file.
   • Along the toolbar there are various options to further specify the format. For
        instance you can state whether the recording is ripped in mono or stereo.
   • You can also specify the sample rate to use, although the maximum sample
        rate available in EAC for WAVE files is 44.1 kHz. This is because Red Book Audio
        states that audio on CDs should be recorded at a sample rate of 44.1 kHz with
        a 16-bit depth, thus there is little benefit to ripping audio at a higher rate.
        Moreover, from an archiving perspective, the British Library stipulates that
        audio transferred from one medium to another should retain the same sample
        rate.
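
As a rough sketch of how the Red Book values could be checked after ripping, the following Python reads the format fields from a WAVE file using the standard library's wave module and compares them with 44.1 kHz, 16 bit, stereo. The function names and the in-memory test file are assumptions for illustration; this was not part of the project workflow.

```python
import io
import wave

# Red Book CD audio parameters: stereo, 16-bit (2 bytes), 44.1 kHz.
RED_BOOK = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 44_100}


def wave_parameters(source):
    """Read the format fields from a WAVE file (path or file-like object)."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }


def matches_red_book(source):
    """True if the file has the same format as CD audio."""
    return wave_parameters(source) == RED_BOOK


def make_test_wave(sample_rate):
    """Build one second of silent 16-bit stereo audio in memory (demo only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(sample_rate * 2 * 2))
    buffer.seek(0)
    return buffer


print(matches_red_book(make_test_wave(44_100)))
print(matches_red_book(make_test_wave(48_000)))
```

For a real rip you would pass the path of the ripped file to `matches_red_book` instead of the in-memory test file.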

Once EAC has ripped the files it will produce a report recording the rip result. Under
each track it will say either OK or Finished. If it says Finished you know the rip was
achieved but the resulting file is not identical to the original. This could occur for
several reasons, the most common being if the disk is dirty or scratched.
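
One independent way to double-check a rip, sketched here in Python, is to rip the same track twice and compare a checksum of the audio data. This is an assumption on my part about how such a check could be done; it is not how EAC itself validates a rip, and the function names are invented for the example.

```python
import hashlib
import io
import wave


def audio_digest(source):
    """SHA-256 of the audio frames only, ignoring the WAVE header."""
    digest = hashlib.sha256()
    with wave.open(source, "rb") as wav:
        while True:
            frames = wav.readframes(65_536)
            if not frames:
                break
            digest.update(frames)
    return digest.hexdigest()


def rips_match(first, second):
    """True if two WAVE files (paths or file objects) hold identical audio."""
    return audio_digest(first) == audio_digest(second)


def demo_wave(frames):
    """Wrap raw frames in an in-memory CD-format WAVE file (demo only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(44_100)
        wav.writeframes(frames)
    buffer.seek(0)
    return buffer


clean = bytes(4000)                              # 1000 silent stereo frames
damaged = bytes(3996) + b"\x01\x00\x00\x00"      # last frame differs

print(rips_match(demo_wave(clean), demo_wave(clean)))
print(rips_match(demo_wave(clean), demo_wave(damaged)))
```

Hashing only the frames means two rips still match if their headers differ in some harmless way, while a single changed sample is enough to make the digests differ.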

Once the disk has been ripped, metadata needs recording in a spreadsheet. This is
based on the futureArch project’s metadata spreadsheet for digital files, but has been
modified to suit audio material. Here you can see all the fields that need to be
completed. The ones in bold are ones that have been added for audio files.
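
Some of the audio fields can be read straight from the file rather than typed in by hand. The sketch below, in Python, pulls the duration from a WAVE file and writes one spreadsheet row as CSV. The column names and the title and date values are placeholders of my own, since the real spreadsheet fields are only shown on the slide.

```python
import csv
import io
import wave

# Hypothetical column names; the real spreadsheet has more fields.
FIELDS = ["title", "creation_date", "file_format", "duration_seconds"]


def audio_metadata(source, title, creation_date):
    """Build one metadata row; duration is read from the WAVE file itself."""
    with wave.open(source, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    return {
        "title": title,
        "creation_date": creation_date,
        "file_format": "WAVE",
        "duration_seconds": round(duration, 2),
    }


def to_csv(rows):
    """Serialise metadata rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def silent_wave(seconds):
    """Create a silent CD-format WAVE file in memory (demo only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(44_100)
        wav.writeframes(bytes(seconds * 44_100 * 2 * 2))
    buffer.seek(0)
    return buffer


row = audio_metadata(silent_wave(3), "Example recording", "2010-01-01")
print(to_csv([row]))
```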


Delivering audio

Archives exist not only to preserve material, but also to make it available to
researchers. Therefore, audio files need to be delivered in an efficient and effective
way. There are two potential problems with audio files. Firstly, they can be extremely
large, particularly if they are in an uncompressed format, and secondly, the quality of
the recording can be quite poor. So, in order to combat these two issues I created two
versions of each file: I already had the master file, so I processed this to create a
processed WAVE file and an optimised MP3 file. The table illustrates the intended
use for each derivative. MP3 files are lossy; therefore the file size is significantly
reduced.
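
To give a feel for the difference in size, here is a small worked calculation in Python. The WAVE figure follows from the CD audio parameters (44.1 kHz, 16 bit, stereo); the MP3 bitrate of 128 kbps is an assumed value for illustration only, and file headers and tags are ignored.

```python
def wave_bytes(seconds, sample_rate=44_100, bit_depth=16, channels=2):
    """Size of uncompressed PCM audio: every second costs the same."""
    return seconds * sample_rate * (bit_depth // 8) * channels


def mp3_bytes(seconds, bitrate_kbps=128):
    """Size of a constant-bitrate MP3 (128 kbps is an assumed example rate)."""
    return seconds * bitrate_kbps * 1000 // 8


one_hour = 60 * 60
print(wave_bytes(one_hour))   # 635040000 bytes, roughly 635 MB
print(mp3_bytes(one_hour))    # 57600000 bytes, roughly 58 MB
print(round(wave_bytes(one_hour) / mp3_bytes(one_hour), 1))  # 11.0
```

Under these assumptions an hour of audio is about eleven times smaller as an MP3 than as a master WAVE file.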

Processing was done using Audacity, which looks like this. The processing done to a
file will depend on its content and quality, but the tools I most often used were:
     • Silencing and cutting to trim the beginnings and ends of recordings and
         remove long pauses.
     • Noise removal tool to either remove or reduce the volume of background
         noise, like high-pitched hissing on poor-quality recordings.
It is very important to record every change that is made to the master file, so I created
a Process History Spreadsheet to record these changes. This includes the name of the
original file (22cd), all actions done to it in detail (such as two second noise cut at
22:54) and the name of the resulting file (22cd_mp3).
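
A process history of this kind can also be kept as a plain CSV file. The following is a minimal sketch in Python using the example entry above; the column names are my own guesses at suitable headings, not the actual headings of the Process History Spreadsheet.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class ProcessStep:
    """One change made to a master file (hypothetical column names)."""
    original_file: str
    action: str
    resulting_file: str


def history_csv(steps):
    """Serialise the recorded steps as CSV text with a header line."""
    out = io.StringIO()
    names = [field.name for field in fields(ProcessStep)]
    writer = csv.DictWriter(out, fieldnames=names)
    writer.writeheader()
    for step in steps:
        writer.writerow(asdict(step))
    return out.getvalue()


steps = [ProcessStep("22cd", "two second noise cut at 22:54", "22cd_mp3")]
print(history_csv(steps))
```

Recording one row per action keeps the history in the order the changes were made, so a derivative can be traced back to its master step by step.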

After processing was finished each file was exported from Audacity, first in the
WAVE format and then in the MP3 format.

Once the files are processed the MP3 versions and possibly the processed WAVE
files can be made available to listen to in the reading room. The master WAVE files
will be stored in the repository and will not be touched by users. All digital material,
both data and audio, will be accessed via a specific laptop in the reading room. This
laptop will have a specially designed interface similar in feel to an internet browser
and audio will be streamed, so accessing audio will be a similar experience to using
something like MySpace.


Conclusion

So, to sum up very briefly, if we go back to the three aims you can see I’ve pretty
much answered them:
    1. The best format to use is WAVE
    2. The way to capture audio is by securely ripping it
    3. Audio will be processed and compressed and will be accessed by streaming
        the files through a self-contained interface within the reading room.
