Data storage
1. AN INTRODUCTION TO
FILE STRUCTURES
Author:
Bhagyashree Shetty
1MS10IS027
Department of ISE
MSRIT, Bangalore
2. INTRODUCTION
File structures arose as a consequence of storage devices.
The important factors of file structures are:
Structuring a file.
Searching a file.
Processing a file.
Sorting a file.
The following pages explain the concepts of structuring and
searching a file using only the "diagrammatic approach of
teaching", wherein the aspects of file structures are depicted
pictorially.
They begin with data storage devices, where the whole idea
began, and then cover various topics of file structures regarding
structuring and searching, using figures.
4. FILE STRUCTURE
FILE STRUCTURES: MANAGING DATA IN FILES
    METHODS FOR STRUCTURING
    METHODS FOR SEARCHING
5. GOAL OF FILE STRUCTURE RELATED TO SPACE
SPACE
    AVOID REDUNDANT DATA
    COMPRESS DATA
    DEFRAGMENTATION
6. GOAL OF FILE STRUCTURE RELATED TO TIME
TIME
    GET DATA IN ONE ACCESS
    GET DATA IN MINIMUM ACCESSES
    GET RELATED DATA AT ONCE
7. FIRST ASPECT OF FS:
STRUCTURING DATA IN FILES
METHODS:
HIERARCHY OF DATA IN FILES
    FIELD AND RECORD
    PACKING AND BUFFERING
DATA COMPRESSION IN FILES
    CHANGE OF NOTATION
    RUN LENGTH ENCODING
    MORSE CODING
    HUFFMAN CODING
9. FIELDS AND RECORD ORGANIZATION
FIELDS
    1. FIXED LENGTH
    2. LENGTH INDICATOR
    3. DELIMITED
    4. SELF DESCRIBING
RECORD
    1. FIXED LENGTH
    2. FIXED COUNT
    3. DELIMITED
    4. INDEX
10. ORGANIZATION EXAMPLES
FIELDS
    1. 1MS10IS027 ANJU****** CSE
    2. 101MS10IS02704ANJU03CSE
    3. 1MS10IS027$ANJU$CSE 1MS10IS028$ANNA$CSE……
    4. USN=1MS10IS027NAME=ANJU BRANCH=CSE
RECORDS
    1. Rec 1 | rec2 | rec3
    2. Record 1 | Record 2 | Record 3
    3. 171MS10IS027$ANJU$CSE17
    4. 1MS10IS027$ANJU$cse$ $01#ims10IS024$asha$
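The four field organizations in the examples above can be sketched in a few lines of Python. This is only an illustration, not code from the deck: the function names, the 10-character field width, the `*` padding character and the two-digit length prefix are assumptions modelled on the slide's sample strings.

```python
# Hypothetical helpers illustrating the four field organizations.
FIELDS = [("USN", "1MS10IS027"), ("NAME", "ANJU"), ("BRANCH", "CSE")]

def fixed_length(values, width=10, pad="*"):
    # 1. Fixed length: pad every field to the same width (assumed: 10, '*').
    return "".join(v.ljust(width, pad) for v in values)

def length_indicator(values):
    # 2. Length indicator: prefix each field with its length as two digits.
    return "".join(f"{len(v):02d}{v}" for v in values)

def delimited(values, sep="$"):
    # 3. Delimited: separate fields with a character absent from the data.
    return sep.join(values)

def self_describing(pairs):
    # 4. Self describing: store every field as name=value.
    return " ".join(f"{name}={value}" for name, value in pairs)

values = [value for _, value in FIELDS]
print(fixed_length(values))
print(length_indicator(values))
print(delimited(values))
print(self_describing(FIELDS))
```

Fixed-length fields waste space on padding but allow direct positioning; the other three spend a few bytes on bookkeeping to store variable-length data compactly.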
11. DATA COMPRESSION IN FILES
Using different notation - use look up table
    Eg: Bihar-BR Goa-GA Kerala-KL Orissa-OR
Run length encoding - compress repeated sequence of letters
    Eg: TCGAAAAAGTCTC
    Compressed: TCG#05AGTCTC
Morse coding - using symbols (dots and dashes) to represent data
    Eg: A .-  B -...  C -.-.  D -..  E .  F ..-.  G --.  H ....  I ..  J .---
Huffman coding - using greedy technique
    Eg: Algorithm will be followed
Optimizing data storage
    Eg: Reuse deleted space and moving entire file to space available
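The run-length example above (TCGAAAAAGTCTC becoming TCG#05AGTCTC) can be reproduced with the sketch below. It assumes, beyond what the slide states, that `#` never occurs in the data, that the count is always two decimal digits (so runs are capped at 99), and that only runs of four or more characters are worth encoding, because a coded run always occupies four characters.

```python
def rle_encode(text, marker="#", min_run=4):
    out = []
    i = 0
    while i < len(text):
        j = i
        # Advance j to the end of the current run (at most 99 characters).
        while j < len(text) and text[j] == text[i] and j - i < 99:
            j += 1
        run = j - i
        if run >= min_run:
            out.append(f"{marker}{run:02d}{text[i]}")  # marker, count, letter
        else:
            out.append(text[i] * run)  # short runs are copied unchanged
        i = j
    return "".join(out)

def rle_decode(data, marker="#"):
    out = []
    i = 0
    while i < len(data):
        if data[i] == marker:
            count = int(data[i + 1:i + 3])
            out.append(data[i + 3] * count)
            i += 4
        else:
            out.append(data[i])
            i += 1
    return "".join(out)

encoded = rle_encode("TCGAAAAAGTCTC")
print(encoded)              # TCG#05AGTCTC
print(rle_decode(encoded))  # TCGAAAAAGTCTC
```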
12. SECOND ASPECT OF FS:
SEARCHING DATA IN FILES
METHOD USED: INDEXING
INDEXING
    PRIMARY INDEXING
    SECONDARY INDEXING
13. PRIMARY INDEX
INDEX = KEY + ADDRESS
IF KEY -> PRIMARY KEY, THEN INDEX-> PRIMARY INDEX
OPERATIONS PERFORMED ON INDEXES
CREATING NEW INDEX LIST
File: Record, Record, ...
INDEX LIST
Key  Address
12   50
13   100
16   300
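Creating an index list of key and address pairs might look like the sketch below. It is an illustration under stated assumptions: the function name is hypothetical, records are given as (key, payload) pairs in file order, and a record's address is taken to be the byte offset at which its payload starts.

```python
def build_primary_index(records):
    """Return a key-sorted list of (key, address) pairs.

    `records` is a list of (key, payload_bytes) in file order; the address
    of a record is the byte offset of its first byte (an assumption).
    """
    index = []
    offset = 0
    for key, payload in records:
        index.append((key, offset))
        offset += len(payload)
    index.sort()  # ordered by key so the index can be binary-searched
    return index

# Made-up records: key 13 is stored first, then 12, then 16.
records = [(13, b"x" * 50), (12, b"y" * 50), (16, b"z" * 200)]
index = build_primary_index(records)
print(index)  # [(12, 50), (13, 0), (16, 100)]
```

The file itself stays in arrival order; only the much smaller index is kept sorted.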
14. ADDING AN ENTRY TO INDEX LIST
STEP 1: NEW RECORD IS ADDED TO THE FILE
STEP 2: KEY AND ADDRESS OF NEW RECORD ARE INSERTED INTO THE INDEX LIST
INDEX LIST
Key  Address
12   50
13   100
15   200   <- KEY AND ADDRESS OF NEW RECORD
16   300   <- PUSH DOWN
DELETING AN ENTRY FROM INDEX LIST
STEP 1: RECORD TO BE DELETED IS IDENTIFIED IN THE FILE
STEP 2: ITS ENTRY IS REMOVED FROM THE INDEX LIST (PUSH DOWN)
INDEX LIST
Key  Address
12   50
13   100
15   200
16   300
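Adding and deleting entries in a key-sorted index list can be sketched with Python's standard `bisect` module. The function names are hypothetical and the index is assumed to be an in-memory list of (key, address) tuples; inserting shifts the later entries down by one slot, and deleting closes the gap again.

```python
import bisect

def add_entry(index, key, address):
    # Locate the slot that keeps the list sorted by key.
    pos = bisect.bisect_left(index, (key,))
    if pos < len(index) and index[pos][0] == key:
        raise KeyError(f"duplicate primary key: {key}")
    index.insert(pos, (key, address))  # later entries are pushed down

def delete_entry(index, key):
    pos = bisect.bisect_left(index, (key,))
    if pos == len(index) or index[pos][0] != key:
        raise KeyError(key)
    del index[pos]  # later entries move up to close the gap

index = [(12, 50), (13, 100), (16, 300)]
add_entry(index, 15, 200)
print(index)  # [(12, 50), (13, 100), (15, 200), (16, 300)]
delete_entry(index, 13)
print(index)  # [(12, 50), (15, 200), (16, 300)]
```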
15. EDITING AN ENTRY IN INDEX LIST
WHEN KEY IS MODIFIED (DELETION FOLLOWED BY INSERTION)
File: RECORD is replaced by RECORD WITH NEW VALUE
INDEX LIST (RECORD IS DELETED, PUSH DOWN)
Key  Address
12   50
13   100   <- RECORD IS DELETED
15   200
16   300
INDEX LIST (RECORD WITH NEW VALUE IS INSERTED)
Key  Address
12   50
13   150   <- RECORD WITH NEW VALUE IS INSERTED
15   200
16   300
SEARCHING PRIMARY INDEX
1. SEARCH KEY IN THE INDEX LIST
2. EXTRACT ADDRESS OF REQD KEY
3. GO TO THE ADDRESS IN THE FILE
Key  Address
12   50
13   150
15   200
16   300
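The three search steps (search key, extract address, go to the address) can be sketched as follows. The helper names, the fixed 10-byte record size and the sample offsets are assumptions for illustration; `io.BytesIO` stands in for the data file.

```python
import bisect
import io

def find_address(index, key):
    # Steps 1 and 2: binary-search the sorted index, extract the address.
    pos = bisect.bisect_left(index, (key,))
    if pos < len(index) and index[pos][0] == key:
        return index[pos][1]
    return None

def read_record(f, index, key, size):
    address = find_address(index, key)
    if address is None:
        return None
    f.seek(address)      # Step 3: go to the address in the file
    return f.read(size)

# Four made-up 10-byte records and their index entries.
data = io.BytesIO(b"REC-12....REC-13....REC-15....REC-16....")
index = [(12, 0), (13, 10), (15, 20), (16, 30)]
print(read_record(data, index, 15, 10))  # b'REC-15....'
print(read_record(data, index, 14, 10))  # None
```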
16. SECONDARY INDEX
IF KEY -> NON PRIMARY KEY, THEN INDEX -> SECONDARY INDEX
CREATING NEW INDEX LIST
METHOD 1: INITIAL SECONDARY INDEX
File: Record, Record, ...
KEY    ADDRESS  ADRS
BT123  100      100
BT266  222      222
IT205  652      652
BT347  900      900
CS111  1200     1000
PROBLEM: MULTIPLE REFERENCES AGAINST A SINGLE NAME
METHOD 2: IMPROVED SECONDARY INDEX
ALI  IT205
BEN  BT347 BT266 BT123
17. PROBLEM WITH METHOD 2: NO PLACE TO ACCOMMODATE MORE THAN 3
RECORDS WITH SAME NAME
METHOD 3: IMPROVED SECONDARY INDEX USING LINKED LIST
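The slide's diagram for Method 3 did not survive, so the sketch below shows one common way to build a linked-list secondary index, not necessarily the layout the author drew. All names are hypothetical: `heads` maps each name to the position of its first node, and every node holds a primary key plus the position of the next node for the same name (-1 ends the chain). The key BT999 is a made-up fourth reference, added to show that the chain is not limited to three entries.

```python
heads = {}   # name -> position of the first node in `nodes`
nodes = []   # each node is [primary_key, next_position]

def add_reference(name, primary_key):
    # Link the new node in front of the name's existing chain.
    nodes.append([primary_key, heads.get(name, -1)])
    heads[name] = len(nodes) - 1

def references(name):
    result = []
    pos = heads.get(name, -1)
    while pos != -1:
        primary_key, pos = nodes[pos]
        result.append(primary_key)
    return result

for name, key in [("BEN", "BT123"), ("BEN", "BT266"), ("ALI", "IT205"),
                  ("BEN", "BT347"), ("BEN", "BT999")]:  # BT999 is invented
    add_reference(name, key)

print(references("BEN"))  # ['BT999', 'BT347', 'BT266', 'BT123']
print(references("ALI"))  # ['IT205']
```

Because each name owns a chain of any length, the fixed-slot limit of Method 2 no longer applies.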
18. ADDING AN ENTRY TO INDEX LIST
NEW RECORD (ID, ADDR, NAME) IS ADDED TO THE FILE
PRIMARY INDEX
KEY    ADDRESS
BT123  100
BT266  222
IT205  652
IS222  700    <- NEW ENTRY, PUSH DOWN
BT347  900
CS111  1200
SECONDARY INDEX
KEY  ID
BEN  BT123
BEN  BT266
ALI  IT205
SIA  IS222   <- NEW ENTRY, PUSH DOWN
BEN  BT347
TIA  CS111
Modification of data in a secondary index is the same as in a primary index: the old
entry is first deleted and then the new one is inserted in the same place.
19. SEARCHING THE SECONDARY INDEX
1. SEARCH FOR NAME BEN
2. EXTRACT NAME BEN
3. READ NEXT NAME
4. IF NEXT NAME = NAME, REPEAT FROM STEP 2
5. IF NEXT NAME != NAME, DISPLAY MESSAGE AND QUIT
KEY  ID
BEN  BT123
BEN  BT266
ALI  IT205
SIA  IS222
BEN  BT347
TIA  CS111
This method is followed because a secondary key is not unique.
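The search loop above can be sketched as follows. Two assumptions go beyond the slide: the secondary index is kept sorted by name, so that all entries for one name are adjacent and "read next" finds every match, and the primary index is simplified to a dictionary from ID to address.

```python
import bisect

# Secondary index: (name, ID) pairs, sorted by name (an assumption).
secondary = sorted([("BEN", "BT123"), ("BEN", "BT266"), ("ALI", "IT205"),
                    ("SIA", "IS222"), ("BEN", "BT347"), ("TIA", "CS111")])
# Primary index simplified to a dict: ID -> address.
primary = {"BT123": 100, "BT266": 222, "IT205": 652,
           "IS222": 700, "BT347": 900, "CS111": 1200}

def search_by_name(name):
    matches = []
    pos = bisect.bisect_left(secondary, (name,))  # first entry for the name
    # Keep reading the next entry while its name equals the search name.
    while pos < len(secondary) and secondary[pos][0] == name:
        record_id = secondary[pos][1]
        matches.append((record_id, primary[record_id]))
        pos += 1
    return matches  # an empty list means no record has this name

print(search_by_name("BEN"))  # [('BT123', 100), ('BT266', 222), ('BT347', 900)]
print(search_by_name("ZOE"))  # []
```

The secondary index yields IDs rather than addresses, so each match costs one extra lookup in the primary index.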
20. CONCLUSION
The file structures approach to storing and
managing data has been depicted here;
ultimately, its usability and the
environment determine where it
finds its place.