Data storage

AN INTRODUCTION TO
FILE STRUCTURES
Author:
Bhagyashree Shetty
1MS10IS027
Department of ISE
MSRIT, Bangalore

1

INTRODUCTION

File structures emerged as a consequence of the idea of storage devices.
The important aspects of file structures are:
 Structuring a file.
 Searching a file.
 Processing a file.
 Sorting a file.

The following pages explain the concepts of structuring and
searching a file using only the "diagrammatic approach of
teaching", wherein the aspects of file structures are depicted
pictorially.
They begin with data storage devices, where the whole idea began,
and then cover the topics of file structures that concern
structuring and searching, using figures.

2

DATA STORAGE DEVICES

OLD METHODS OF STORAGE          MODERN METHODS OF STORAGE

3

FILE STRUCTURE

FILE STRUCTURES: MANAGING DATA IN FILES
    METHODS FOR STRUCTURING
    METHODS FOR SEARCHING

4

GOAL OF FILE STRUCTURE RELATED TO SPACE

SPACE
    AVOID REDUNDANT DATA
    DEFRAGMENTATION
    COMPRESS DATA

5

GOAL OF FILE STRUCTURE RELATED TO TIME

TIME
    GET DATA IN ONE ACCESS
    GET RELATED DATA AT ONCE
    GET DATA IN MINIMUM ACCESSES

6

FIRST ASPECT OF FS:
STRUCTURING DATA IN FILES
METHODS:

- HIERARCHY OF DATA IN FILES
    FIELD AND RECORD

- PACKING AND BUFFERING

- DATA COMPRESSION IN FILES
    CHANGE OF NOTATION
    RUN LENGTH ENCODING
    MORSE CODING
    HUFFMAN CODING

7

HIERARCHY OF DATA IN FILES

I/O

BUFFER

RECORDS

FIELDS

8

FIELDS AND RECORD ORGANIZATION

FIELDS                      RECORD
1. FIXED LENGTH             1. FIXED LENGTH
2. LENGTH INDICATOR         2. FIXED COUNT
3. DELIMITED                3. DELIMITED
4. SELF DESCRIBING          4. INDEX

9

ORGANIZATION EXAMPLES

FIELDS
1  1MS10IS027 ANJU****** CSE
2  101MS10IS02704ANJU03CSE
3  1MS10IS027$ANJU$CSE 1MS10IS028$ANNA$CSE……
4  USN=1MS10IS027 NAME=ANJU BRANCH=CSE

RECORDS
1  Rec 1 | rec2 | rec3
2  Record 1 | Record 2 | Record 3
3  171MS10IS027$ANJU$CSE17
4  1MS10IS027$ANJU$cse$
   $01#ims10IS024$asha$
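
The four field organizations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the field widths (10, 10, 3), the `*` padding character and the two-digit length prefix are choices made here to match the slide's sample strings, not something the slide prescribes.

```python
# Hypothetical helpers; widths, pad character and names are assumptions.
FIELD_WIDTHS = (10, 10, 3)                  # assumed widths for USN, NAME, BRANCH
FIELD_NAMES = ("USN", "NAME", "BRANCH")


def pack_fixed(values, widths=FIELD_WIDTHS, pad="*"):
    """1. Fixed length: pad every field to its agreed width."""
    return "".join(v.ljust(w, pad) for v, w in zip(values, widths))


def unpack_fixed(text, widths=FIELD_WIDTHS, pad="*"):
    """Cut the text at the agreed widths and strip the padding."""
    fields, pos = [], 0
    for w in widths:
        fields.append(text[pos:pos + w].rstrip(pad))
        pos += w
    return fields


def pack_length_prefixed(values):
    """2. Length indicator: a two-digit length precedes each field."""
    return "".join(f"{len(v):02d}{v}" for v in values)


def unpack_length_prefixed(text):
    """Read a two-digit length, then that many characters, until the end."""
    fields, pos = [], 0
    while pos < len(text):
        n = int(text[pos:pos + 2])
        fields.append(text[pos + 2:pos + 2 + n])
        pos += 2 + n
    return fields


def pack_delimited(values, delim="$"):
    """3. Delimited: a reserved character separates the fields."""
    return delim.join(values)


def pack_self_describing(values, names=FIELD_NAMES):
    """4. Self describing: each field carries its own name."""
    return " ".join(f"{n}={v}" for n, v in zip(names, values))


record = ["1MS10IS027", "ANJU", "CSE"]
print(pack_fixed(record))            # 1MS10IS027ANJU******CSE
print(pack_length_prefixed(record))  # 101MS10IS02704ANJU03CSE
print(pack_delimited(record))        # 1MS10IS027$ANJU$CSE
print(pack_self_describing(record))  # USN=1MS10IS027 NAME=ANJU BRANCH=CSE
```

Fixed-length fields waste space on padding but allow direct slicing; the other three forms save space at the cost of scanning the record to find a field.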

10

DATA COMPRESSION IN FILES

- Using different notation: use a look-up table.
  Eg: Bihar-BR, Goa-GA, Kerala-KL, Orissa-OR

- Run length encoding: compress repeated sequences of letters.
  Eg: TCGAAAAAGTCTC
  Compressed: TCG#05AGTCTC

- Morse coding: using symbols (dots and dashes) to represent data.
  Eg: A .-   B -...   C -.-.   D -..   E .   F ..-.   G --.   H ....   I ..   J .---

- Huffman coding: using greedy technique.
  Eg: Algorithm will be followed

- Optimizing data storage.
  Eg: Reuse deleted space and move the entire file to the space available

11
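
The run length encoding example (TCGAAAAAGTCTC compressed to TCG#05AGTCTC) can be reproduced with the sketch below. It assumes, as the slide's example suggests, that a run is written as a `#` marker, a two-digit count and the repeated letter. The `min_run` threshold of 4 and the requirement that `#` never occurs in the data are assumptions made here.

```python
def rle_encode(text, marker="#", min_run=4):
    """Replace each run of min_run or more equal characters by
    marker + two-digit count + character (assumed format)."""
    out, i = [], 0
    while i < len(text):
        j = i
        # Extend the run, capped at 99 so the count fits in two digits.
        while j < len(text) and text[j] == text[i] and j - i < 99:
            j += 1
        run = j - i
        if run >= min_run:
            out.append(f"{marker}{run:02d}{text[i]}")
        else:
            out.append(text[i] * run)    # short runs are cheaper left as they are
        i = j
    return "".join(out)


def rle_decode(text, marker="#"):
    """Expand every marker + count + character group back into a run."""
    out, i = [], 0
    while i < len(text):
        if text[i] == marker:
            count = int(text[i + 1:i + 3])
            out.append(text[i + 3] * count)
            i += 4
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(rle_encode("TCGAAAAAGTCTC"))   # TCG#05AGTCTC
print(rle_decode("TCG#05AGTCTC"))    # TCGAAAAAGTCTC
```

An encoded run always takes four characters, so runs shorter than four are copied unchanged; otherwise the "compressed" text could grow.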

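The greedy technique behind Huffman coding can be sketched as follows: the two least frequent subtrees are merged repeatedly until one tree remains, so frequent symbols end up with short codes. The function names and the dictionary-based tree representation are illustrative choices made here, not taken from the slides.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a prefix code greedily by merging the two lightest subtrees."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:                       # a single symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie_breaker = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # lightest subtree
        w2, _, right = heapq.heappop(heap)   # second lightest subtree
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie_breaker, merged))
        tie_breaker += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Walk the bit string; a prefix code makes every match unambiguous."""
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)


text = "TCGAAAAAGTCTC"
codes = huffman_codes(text)
bits = encode(text, codes)
print(codes)
print(decode(bits, codes) == text)
```

The exact bit patterns depend on how ties between equal weights are broken, so only the round trip is checked here, not a particular code table.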
SECOND ASPECT OF FS:
SEARCHING DATA IN FILES
METHOD USED: INDEXING

INDEXING
    PRIMARY INDEXING
    SECONDARY INDEXING

12

PRIMARY INDEX
- INDEX = KEY + ADDRESS
IF KEY -> PRIMARY KEY, THEN INDEX -> PRIMARY INDEX

OPERATIONS PERFORMED ON INDEXES
- CREATING NEW INDEX LIST

File: ..Record … ..Record ……….

INDEX LIST
Key   Address
12    50
13    100
16    300
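
A minimal sketch of creating an index list: the file is scanned once and each record's key is stored with the byte offset (address) where the record starts. The record layout is an assumption made for illustration: one `$`-delimited record per line, with the primary key as the first field. `build_primary_index` is a hypothetical helper name.

```python
import io


def build_primary_index(data_file, delim=b"$"):
    """Scan the file once and return a sorted list of (key, address) pairs."""
    index = []
    data_file.seek(0)
    while True:
        address = data_file.tell()           # byte offset where this record starts
        line = data_file.readline()
        if not line:                         # end of file
            break
        key = line.split(delim, 1)[0].decode()
        index.append((key, address))
    index.sort()                             # keep the list ordered by key
    return index


# An in-memory stand-in for a data file (assumed layout).
data = io.BytesIO(b"1MS10IS027$ANJU$CSE\n1MS10IS028$ANNA$CSE\n")
print(build_primary_index(data))   # [('1MS10IS027', 0), ('1MS10IS028', 20)]
```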

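13

- ADDING AN ENTRY TO INDEX LIST
STEP 1: the NEW RECORD is added to the File.

File: ..Record … ..Record ……….

STEP 2: the KEY AND ADDRESS OF the NEW RECORD are inserted into the INDEX LIST, and the entries after them are pushed down.

INDEX LIST
Key   Address
12    50
13    100
15    200   <- KEY AND ADDRESS OF NEW RECORD
16    300   <- PUSH DOWN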
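
The insertion step can be sketched with Python's `bisect` module, assuming the index list is held in memory as a list of (key, address) tuples sorted by key. `add_entry` is a hypothetical helper; rejecting duplicate keys is an assumption that follows from the key being a primary key.

```python
import bisect


def add_entry(index, key, address):
    """Insert (key, address) so that the list stays sorted by key."""
    # (key,) sorts before any (key, address), so this finds the first
    # entry whose key is >= the new key.
    position = bisect.bisect_left(index, (key,))
    if position < len(index) and index[position][0] == key:
        raise KeyError(f"primary key {key!r} already indexed")
    index.insert(position, (key, address))   # later entries are pushed down


index = [(12, 50), (13, 100), (16, 300)]
add_entry(index, 15, 200)
print(index)   # [(12, 50), (13, 100), (15, 200), (16, 300)]
```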

DELETING AN ENTRY FROM INDEX LIST
STEP 1: the RECORD TO BE DELETED is located in the File.

File: ..Record … ..Record ……….

STEP 2: the key and address of that record are removed from the INDEX LIST.

INDEX LIST
Key   Address
12    50
13    100
15    200
16    300   <- PUSH DOWN
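
Deletion from an in-memory index list can be sketched in the same style. The list of (key, address) tuples sorted by key and the helper name `delete_entry` are assumptions made for illustration; the slide does not say which entry is removed, so key 13 is used here only as an example.

```python
import bisect


def delete_entry(index, key):
    """Remove the entry for key from a sorted index and return its address."""
    position = bisect.bisect_left(index, (key,))
    if position == len(index) or index[position][0] != key:
        raise KeyError(key)
    _, address = index.pop(position)     # the list closes up around the removed entry
    return address


index = [(12, 50), (13, 100), (15, 200), (16, 300)]
print(delete_entry(index, 13))   # 100
print(index)                     # [(12, 50), (15, 200), (16, 300)]
```

The returned address tells the caller which slot in the data file has become free.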

14

- EDITING AN ENTRY IN INDEX LIST
WHEN KEY IS MODIFIED (DELETION FOLLOWED BY INSERTION)

1. RECORD IS DELETED

File: RECORD ..Record … ..Record ……….

INDEX LIST
Key   Address
12    50
13    100
15    200
16    300   <- PUSH DOWN

2. RECORD WITH NEW VALUE IS INSERTED

File: RECORD WITH NEW VALUE ..Record … ..Record ……….

INDEX LIST
Key   Address
12    50
13    150
15    200
16    300
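
Editing as "deletion followed by insertion" can be sketched as below, again assuming an in-memory list of (key, address) tuples sorted by key. `update_entry` is a hypothetical helper; the example reuses the slide's values, where the entry for key 13 moves from address 100 to 150.

```python
import bisect


def update_entry(index, old_key, new_key, new_address):
    """Delete the entry for old_key, then insert (new_key, new_address)."""
    position = bisect.bisect_left(index, (old_key,))
    if position == len(index) or index[position][0] != old_key:
        raise KeyError(old_key)
    del index[position]                              # deletion ...
    bisect.insort(index, (new_key, new_address))     # ... followed by insertion


index = [(12, 50), (13, 100), (15, 200), (16, 300)]
update_entry(index, 13, 13, 150)
print(index)   # [(12, 50), (13, 150), (15, 200), (16, 300)]
```

Reinserting instead of overwriting in place keeps the list sorted even when the new key belongs at a different position.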

SEARCHING PRIMARY INDEX
1. SEARCH the index list for the KEY.
2. EXTRACT the ADDRESS OF the REQUIRED KEY.
3. GO TO THE ADDRESS in the File.

Key   Address
12    50
13    150
15    200
16    300

File: ..Record … ..Record ……….
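
The three search steps (search the key, extract the address, go to the address) can be sketched as follows. The sketch assumes a sorted in-memory index and a data file with one `$`-delimited record per line; `find_record` is a hypothetical helper and the sample records are borrowed from the earlier organization examples.

```python
import bisect
import io


def find_record(index, data_file, key):
    """Binary-search the index for key, seek to its address, read one record."""
    position = bisect.bisect_left(index, (key,))        # SEARCH KEY
    if position == len(index) or index[position][0] != key:
        return None                                     # key is not indexed
    _, address = index[position]                        # EXTRACT ADDRESS
    data_file.seek(address)                             # GO TO THE ADDRESS
    return data_file.readline().rstrip(b"\n").decode()


data = io.BytesIO(b"1MS10IS027$ANJU$CSE\n1MS10IS028$ANNA$CSE\n")
index = [("1MS10IS027", 0), ("1MS10IS028", 20)]
print(find_record(index, data, "1MS10IS028"))   # 1MS10IS028$ANNA$CSE
```

Only the small index is searched; the data file is touched once, with a single seek.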
15

SECONDARY INDEX
- IF KEY -> NON PRIMARY KEY, THEN INDEX -> SECONDARY INDEX

CREATING NEW INDEX LIST
METHOD 1:

File: ..Record … ..Record ……….

INITIAL SECONDARY INDEX
KEY     ADDRESS   ADRS
BT123   100       100
BT266   222       222
IT205   652       652
BT347   900       900
CS111   1200      1000

PROBLEM: NUMBER OF REFERENCES AGAINST A SINGLE NAME
METHOD 2: IMPROVED SECONDARY INDEX

ALI IT205
BEN BT347 BT266 BT123
16

PROBLEM WITH METHOD 2: NO PLACE TO ACCOMMODATE MORE THAN 3
RECORDS WITH THE SAME NAME
METHOD 3: IMPROVED SECONDARY INDEX USING LINKED LIST
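
One way to picture METHOD 3 is the sketch below: each name in the secondary index points to the head of a linked list of primary keys, so any number of records can share a name. The class name `InvertedList`, the use of list positions as links and `-1` as the end-of-list marker are assumptions made for illustration; the slide's own figure is not available.

```python
class InvertedList:
    """Secondary index whose entries head a linked list of primary keys."""

    def __init__(self):
        self.heads = {}      # secondary key -> position of first node
        self.nodes = []      # each node is [primary_key, next_position]

    def add(self, secondary_key, primary_key):
        # Link the new node in at the front of the list for this key.
        head = self.heads.get(secondary_key, -1)
        self.nodes.append([primary_key, head])
        self.heads[secondary_key] = len(self.nodes) - 1

    def lookup(self, secondary_key):
        """Follow the links and return every primary key filed under the name."""
        keys, position = [], self.heads.get(secondary_key, -1)
        while position != -1:                # -1 marks the end of the list
            primary_key, position = self.nodes[position]
            keys.append(primary_key)
        return keys


names = InvertedList()
for name, record_id in [("BEN", "BT123"), ("BEN", "BT266"),
                        ("ALI", "IT205"), ("BEN", "BT347")]:
    names.add(name, record_id)
print(names.lookup("BEN"))   # ['BT347', 'BT266', 'BT123']
print(names.lookup("ALI"))   # ['IT205']
```

Adding a fourth or fifth BEN only appends one node, which removes the fixed limit of METHOD 2.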

17

- ADDING AN ENTRY TO INDEX LIST

NEW RECORD in File: ID, ADDR, NAME

File: ..Record … ..Record ……….

KEY     ADDRESS
BT123   100
BT266   222
IT205   652
IS222   700    <- PUSH DOWN
BT347   900
CS111   1200

KEY   ID
BEN   BT123
BEN   BT266
ALI   IT205
SIA   IS222    <- PUSH DOWN
BEN   BT347
TIA   CS111

Modification of data in a secondary index is the same as in a primary index: the old
entry is first deleted, and then the new one is inserted in the same place.
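
Adding a record therefore touches both lists: the ID and address go into the primary index, and the name and ID go into the secondary index. The sketch below assumes both lists are kept sorted in memory, which is an assumption of this example (the tables on the slide are not shown in sorted order). `add_record` is a hypothetical helper.

```python
import bisect


def add_record(primary, secondary, record_id, address, name):
    """Insert the new record into both index lists, keeping each one sorted."""
    bisect.insort(primary, (record_id, address))    # ID -> address in the file
    bisect.insort(secondary, (name, record_id))     # NAME -> ID, duplicates allowed


primary = sorted([("BT123", 100), ("BT266", 222), ("IT205", 652),
                  ("BT347", 900), ("CS111", 1200)])
secondary = sorted([("BEN", "BT123"), ("BEN", "BT266"), ("ALI", "IT205"),
                    ("BEN", "BT347"), ("TIA", "CS111")])
add_record(primary, secondary, "IS222", 700, "SIA")
print(primary)
print(secondary)
```

The secondary index stores the ID rather than the file address, so moving a record in the file only requires updating the primary index.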

18

SEARCHING THE SECONDARY INDEX

1. SEARCH FOR NAME (BEN) in the secondary index.
2. EXTRACT ID of the matching entry.
3. READ NEXT NAME.
4. IF NEXT NAME = NAME, repeat from step 2.
5. IF NEXT NAME != NAME, DISPLAY MESSAGE AND QUIT.

KEY   ID
BEN   BT123
BEN   BT266
ALI   IT205
SIA   IS222
BEN   BT347
TIA   CS111

This method is followed because the keys in a secondary index are not unique.
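
The loop above can be sketched as follows. The sketch assumes the secondary index is a list of (name, ID) pairs sorted by name, so that entries with the same name sit next to each other; that ordering is an assumption of this example. `find_ids` is a hypothetical helper.

```python
import bisect


def find_ids(secondary, name):
    """Collect every ID filed under name in a sorted list of (name, ID) pairs."""
    ids = []
    position = bisect.bisect_left(secondary, (name,))    # SEARCH FOR NAME
    while position < len(secondary) and secondary[position][0] == name:
        ids.append(secondary[position][1])               # EXTRACT ID
        position += 1                                    # READ NEXT
    return ids                                           # next name differs: quit


secondary = sorted([("BEN", "BT123"), ("BEN", "BT266"), ("ALI", "IT205"),
                    ("SIA", "IS222"), ("BEN", "BT347"), ("TIA", "CS111")])
print(find_ids(secondary, "BEN"))   # ['BT123', 'BT266', 'BT347']
print(find_ids(secondary, "ZOE"))   # []
```

Each ID returned here would then be looked up in the primary index to reach the record itself.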

19

CONCLUSION
The file structures approach to storing and
managing data has been depicted here.
Ultimately, where each structure finds its place
depends on its usability and on the
environment in which it is used.

20
