1) Databases organize and store data efficiently using a storage hierarchy including cache, main memory, magnetic disks, optical disks, and tapes. Magnetic disks are commonly used secondary storage.
2) Indexing and file structures like B+ trees are data structures that allow efficient retrieval of records from database files based on indexed attributes. B+ trees in particular provide fast traversal and searching through a balanced tree structure.
3) RAID (Redundant Array of Independent Disks) uses multiple disks together to provide increased performance, redundancy, or both through techniques like disk striping and mirroring.
2. Storage And File Structure
Why do we need to know about storage/file structure?
Many database technologies are developed to utilize the storage architecture/hierarchy
Data in the database needs to be organized and stored/retrieved efficiently
3. Storage Hierarchy
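Levels, from fastest and highest unit price to slowest and lowest unit price: Cache, Memory, Flash Memory, Magnetic Disk, Optical Disk, Magnetic Tape
Primary storage (volatile): Cache, Memory
Secondary storage (non-volatile): Flash Memory, Magnetic Disk
Tertiary storage (non-volatile): Optical Disk, Magnetic Tape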
4. Primary Storage (Volatile)
Cache
Speed: 7 to 20 ns (1 nanosecond = 10^-9 seconds)
Capacity:
A typical PC level 2 cache is 64 KB to 2 MB.
Within processors, level 1 cache usually ranges in size from 8 KB to 64 KB.
Main memory
Speed: 10s to 100s of nanoseconds
Capacity:
Up to a few gigabytes widely used currently (per-byte costs have decreased by roughly a factor of 2 every 2 to 3 years)
5. Secondary Storage (Non-volatile)
Flash memory
Speed: Read speed is similar to main memory; write is much slower
Capacity: 32 MB to 512 MB currently
Forms: SmartMedia, memory stick, secure digital, BIOS
Cost: roughly the same as main memory
Magnetic disk
Capacities: up to roughly 100 GB currently, growing constantly and rapidly with technology improvements.
6. Tertiary Storage (Non-volatile)
Yan Huang - CSCI5330 Database Implementation - Storage and File Structure, 1/14/2005
Optical storage
CD-ROM (640 MB) and DVD (4.7 to 17 GB) are the most popular forms
Writable forms: CD-RW, DVD-RW, and DVD-RAM
Reads and writes are slower than with magnetic disk
Juke-box systems, with large numbers of removable disks, a few drives, and a mechanism for automatic loading/unloading of disks, are available for storing large volumes of data
7. Indexing:
* Indexing in database systems is similar to what we see in books.
* Indexing is a data structure technique to efficiently retrieve records from the database files based on some attributes on which the indexing has been done.
* Indexing is defined based on its indexing attributes. Indexing can be of the following types:
* Primary Index - Primary index is defined on an ordered data file. The data file is ordered on a key field. The key field is generally the primary key of the relation.
8. Indexing Types:
Secondary Index - Secondary index may be generated from a field which is a candidate key and has a unique value in every record, or a non-key with duplicate values.
Clustering Index - Clustering index is defined on an ordered data file. The data file is ordered on a non-key field.
Ordered indexing is of two types: dense index and sparse index.
Dense Index - In a dense index, there is an index record for every search key value in the database. This makes searching faster but requires more space to store the index records themselves. Index records contain a search key value and a pointer to the actual record on the disk.
10. Sparse Index:
In a sparse index, index records are not created for every search key. An index record here contains a search key and an actual pointer to the data on the disk.
To search for a record, we first follow the index record to reach the actual location of the data.
If the data we are looking for is not where we directly reach by following the index, then the system performs a sequential search from there until the desired data is found.
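The sparse-index search described here can be sketched in a few lines of Python. This is only an illustration: the `blocks` layout, the block size of three records, and the function name `sparse_lookup` are assumptions made for the example, not part of the slides.

```python
import bisect

# Hypothetical data file: records sorted by key, grouped into blocks of 3.
blocks = [
    [(10, "a"), (20, "b"), (30, "c")],
    [(40, "d"), (50, "e"), (60, "f")],
    [(70, "g"), (80, "h"), (90, "i")],
]

# Sparse index: one entry per block, holding the first key of that block.
sparse_keys = [block[0][0] for block in blocks]


def sparse_lookup(key):
    """Return the value stored under key, or None if it is absent."""
    # Find the last index entry whose key is <= the search key.
    pos = bisect.bisect_right(sparse_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every indexed key
    # Sequential search from the location the index entry points to.
    for k, value in blocks[pos]:
        if k == key:
            return value
        if k > key:
            break  # passed the place where key would have been
    return None
```

A dense index would instead hold one entry per key, trading extra index space for the ability to skip the sequential step.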
12. Multilevel Indexing:
Index records comprise search-key values and data pointers.
A multilevel index is stored on the disk along with the actual database files.
As the size of the database grows, so does the size of the indices.
If a single-level index is used, then a large index cannot be kept in memory, which leads to multiple disk accesses.
A multilevel index breaks the index down into several smaller indices in order to make the outermost level so small that it can be saved in a single disk block.
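To get a feel for how few levels are needed, the following sketch counts index levels until the outermost one fits in a single block. The function name and the figures used in the test (one million index entries, 100 entries per block) are illustrative assumptions, not numbers from the slides.

```python
import math


def index_levels(num_index_entries, entries_per_block):
    """Number of index levels needed until the outermost level fits in one block."""
    levels = 1
    blocks = math.ceil(num_index_entries / entries_per_block)
    while blocks > 1:
        # Build one more level with one entry per block of the level below.
        blocks = math.ceil(blocks / entries_per_block)
        levels += 1
    return levels
```

Under these assumptions, one million entries at 100 entries per block need three levels: 10,000 blocks, then 100 blocks, then a single block.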
14. Disk:
Hard disk drives are the most common secondary storage devices in present computer systems. These are called magnetic disks because they use magnetization to store information.
Hard disks are formatted in a well-defined order to store data efficiently. A hard disk platter has many concentric circles on it, called tracks. Every track is further divided into sectors. A sector on a hard disk typically stores 512 bytes of data.
16. Disk Subsystem
Disk interface standards families
• ATA (AT Attachment) range of standards
• SCSI (Small Computer System Interface) range of standards
17. Disk Speed
Seek time (milliseconds)
Rotation time/latency (milliseconds)
Data-transfer rate (4-8 MB/sec)
Typical numbers:
16,000 tracks per platter
200 to 400 sectors per track
512 bytes per sector
4-16 KB per block
5,400 to 15,000 rpm
Access time = seek time + latency
Discuss ways to improve disk reading speed
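The access-time formula can be tried out with the typical numbers listed on this slide. The sketch below assumes that average rotational latency is half of one full rotation, a common rule of thumb that the slide does not state; the function names and the 4 ms seek time used in the test are hypothetical.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency in ms, assumed to be half of one rotation."""
    ms_per_rotation = 60_000 / rpm
    return ms_per_rotation / 2


def access_time_ms(seek_ms, rpm):
    """Access time = seek time + (average) rotational latency."""
    return seek_ms + avg_rotational_latency_ms(rpm)


def track_capacity_bytes(sectors_per_track, bytes_per_sector=512):
    """Raw capacity of one track."""
    return sectors_per_track * bytes_per_sector
```

At 15,000 rpm one rotation takes 4 ms, so the assumed average latency is 2 ms; at 5,400 rpm it is about 5.6 ms. This is why a faster spindle helps read speed even when seek time is unchanged.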
18. Redundant Array of Independent Disks
RAID, or Redundant Array of Independent Disks, is a technology to connect multiple secondary storage devices and use them as a single storage medium.
RAID consists of an array of disks in which multiple disks are connected together to achieve different goals. RAID levels define the use of disk arrays.
19. RAID 0
In this level, a striped array of disks is implemented. The data is broken down into blocks and the blocks are distributed among the disks. Each disk receives a block of data to write/read in parallel. This enhances the speed and performance of the storage device. There is no parity or backup in level 0.
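A minimal sketch of block-level striping, assuming simple round-robin placement of logical blocks across the disks. The function name `stripe_location` is hypothetical.

```python
def stripe_location(block_number, num_disks):
    """Map a logical block to (disk index, block offset on that disk)."""
    return block_number % num_disks, block_number // num_disks
```

With four disks, logical blocks 0 to 3 land on disks 0 to 3 at offset 0, and block 4 wraps around to disk 0 at offset 1, so consecutive blocks can be read or written in parallel.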
20. RAID 1
RAID 1 uses mirroring techniques. When data is sent to a RAID controller, it sends a copy of the data to all the disks in the array. RAID level 1 is also called mirroring and provides 100% redundancy in case of a failure.
21. RAID 2
RAID 2 records Error Correction Code (ECC) using a Hamming code for its data, striped on different disks. As in level 0 the data is striped, but here each data bit in a word is recorded on a separate disk, and the ECC codes of the data words are stored on a different set of disks. Due to its complex structure and high cost, RAID 2 is not commercially available.
22. RAID 3
RAID 3 stripes the data onto multiple disks. The parity bit generated for each data word is stored on a different disk. This technique allows the array to survive a single disk failure.
23. RAID 4
In this level, an entire block of data is written onto data
disks and then the parity is generated and stored on a different
disk. Note that level 3 uses byte-level striping, whereas level 4
uses block-level striping. Both level 3 and level 4 require at
least three disks to implement RAID.
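Parity in levels 3 and 4 is typically a bitwise XOR across the data disks, which is what makes a single lost disk recoverable. The sketch below illustrates that idea on three tiny byte blocks; the helper names and the sample data are hypothetical.

```python
from functools import reduce


def parity_block(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors and the parity."""
    return parity_block(surviving_blocks + [parity])


# Three data blocks, as if each were stored on a separate data disk.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = parity_block(data)

# Simulate losing the second disk and rebuilding its block.
recovered = rebuild_missing([data[0], data[2]], parity)
```

Because XOR is its own inverse, XOR-ing the surviving blocks with the parity block yields exactly the missing block.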
24. RAID 5
RAID 5 writes whole data blocks onto different disks, but the parity blocks generated for each data block stripe are distributed among all the disks rather than stored on a dedicated parity disk.
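One way to picture distributed parity is to rotate the parity position from stripe to stripe. The rotation rule below is just one possible layout chosen for illustration; real controllers use several different layouts, and the function names are hypothetical.

```python
def raid5_parity_disk(stripe, num_disks):
    """Disk holding the parity block for a stripe (assumed rotation rule)."""
    return (num_disks - 1 - stripe) % num_disks


def raid5_stripe_layout(stripe, num_disks):
    """Mark each disk in the stripe as data ("D") or parity ("P")."""
    p = raid5_parity_disk(stripe, num_disks)
    return ["P" if disk == p else "D" for disk in range(num_disks)]
```

With four disks, parity sits on disk 3 for stripe 0, disk 2 for stripe 1, and so on, so no single disk becomes the bottleneck for parity writes.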
25. RAID 6
RAID 6 is an extension of level 5. In this level, two
independent parities are generated and stored in distributed fashion
among multiple disks. Two parities provide additional fault
tolerance. This level requires at least four disk drives to implement
RAID.
26. File Organization
File Organization defines how file records are mapped onto disk blocks. There are four types of File Organization for organizing file records:
27. Heap File Organization
When a file is created using Heap File Organization, the
Operating System allocates memory area to that file without
any further accounting details.
File records can be placed anywhere in that memory
area.
It is the responsibility of the software to manage the
records.
Heap File does not support any ordering, sequencing, or
indexing on its own.
28. Sequential File Organization
Every file record contains a data field (attribute) to uniquely
identify that record.
In sequential file organization, records are placed in the file
in some sequential order based on the unique key field or search
key.
Practically, it is not possible to store all the records
sequentially in physical form.
29. Hash File Organization
Hash File Organization uses Hash function computation
on some fields of the records.
The output of the hash function determines the location
of disk block where the records are to be placed.
30. Clustered File Organization
Clustered file organization is not considered good
for large databases. In this mechanism, related records from
one or more relations are kept in the same disk block, that
is, the ordering of records is not based on primary key or
search key.
31. Hashing:
Hashing uses hash functions with search keys as parameters to generate the address of a data record.
Bucket: A hash file stores data in bucket format. A bucket is considered a unit of storage. A bucket typically stores one complete disk block, which in turn can store one or more records.
Hash function: A hash function, h, is a mapping function that maps the set of all search keys K to the addresses where the actual records are placed. It is a function from search keys to bucket addresses.
32. B+ Tree:
B+ tree is a (key, value) storage method in a tree-like structure. A B+ tree has one root, any number of levels of intermediary nodes (usually one), and leaf nodes. All the actual records are stored in the leaf nodes. Intermediary nodes hold only keys and pointers leading toward the leaf nodes; they do not hold any data. The number of children a node may have is bounded by the order of the tree. This is the basis of any B+ tree.
33. STRUCTURE OF B+ TREE
A B+ tree index is a multilevel index, but it has a structure that differs from that of the multilevel index-sequential file.
The bucket structure is used only if the search key does not form a primary key and if the file is not sorted in search-key order.
34. QUERIES ON B+ TREE
To process a query using a B+ tree, for example to find all the records with a search-key value of k, we follow pointers from the root down to the leaf node that would contain k.
35. Leaf nodes must have between 2 and 4 values (ceil((n-1)/2) and n-1, with n = 5).
Non-leaf nodes other than root must have between 3 and 5 children (ceil(n/2) and n, with n = 5).
Root must have at least 2 children.
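The occupancy limits above follow directly from n, so they can be computed for any node size. This small sketch simply evaluates the two ceiling expressions; the function name and dictionary keys are hypothetical.

```python
import math


def bplus_bounds(n):
    """Occupancy limits for a B+ tree whose nodes hold up to n pointers."""
    return {
        "leaf_values": (math.ceil((n - 1) / 2), n - 1),
        "nonleaf_children": (math.ceil(n / 2), n),
        "root_min_children": 2,
    }
```

For n = 5 this reproduces the ranges on the slide: 2 to 4 values per leaf and 3 to 5 children per non-leaf node.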
36. UPDATES ON B+ TREES
INSERTION
If the search-key value already appears in the leaf node, we add the new record to the file and, if necessary, a pointer to the bucket.
37. DELETION
Using the same technique as for lookup, we find the record to be deleted and remove it from the file. The search-key value is removed from the leaf node if there is no bucket associated with that search-key value or if the bucket becomes empty as a result of the deletion.
38. B+TREE FILE ORGANIZATION
In a B+ tree file organization, the leaf nodes of the tree store records instead of storing pointers to records. Since records are usually larger than pointers, the maximum number of records that can be stored in a leaf node is less than the number of pointers in a non-leaf node.
39. MAIN GOALS OF B+ TREE:
Sorted intermediary and leaf nodes
Since it is a balanced search tree, the keys in all nodes are kept sorted.
Fast traversal and quick search
Any record should be fetched very quickly. This is achieved by keeping the tree balanced, with all leaf nodes at the same distance from the root.
No overflow pages
B+ tree allows all the intermediary and leaf nodes to be partially filled; the fill percentage is defined while designing the B+ tree. In the example, the intermediary node with 108 is in underflow, and the leaf nodes are not partially filled, hence they are in overflow. An ideal B+ tree should have no overflow or underflow except at the root node.
40. Definition of a B-tree
A B-tree of order m is an m-way tree (i.e., a tree where each node may have up to m children) in which:
1 - The number of keys in each non-leaf node is one less than the number of its children, and these keys partition the keys in the children in the fashion of a search tree.
2 - All leaves are on the same level.
3 - All non-leaf nodes except the root have at least ceil(m/2) children.
4 - The root is either a leaf node, or it has from two to m children.
5 - A leaf node contains no more than m - 1 keys.
The number m should always be odd.
41. An example B-Tree
B-Trees 41
Root: 26
Level 1: [6 12] [42 51 62]
Leaves: [1 2 4] [7 8] [13 15 18 25] [27 29] [45 46 48] [53 55 60] [64 70 90]
A B-tree of order 5 containing 26 items
Note that all the leaves are at the same level
42. Constructing a B-tree
Suppose we start with an empty B-tree and keys arrive in the following order: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
We want to construct a B-tree of order 5
The first four items go into the root:
1 2 8 12
To put the fifth item in the root would violate condition 5
Therefore, when 25 arrives, pick the middle key to make a new root
43. Inserting into a B-Tree
Attempt to insert the new key into a leaf
If this would result in that leaf becoming too big, split the leaf
into two, promoting the middle key to the leaf’s parent
If this would result in the parent becoming too big, split the
parent into two, promoting the middle key
This strategy might have to be repeated all the way to the top
If necessary, the root is split in two and the middle key is
promoted to a new root, making the tree one level higher
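The insertion strategy above can be sketched as a small recursive routine. This is a minimal illustration rather than a full B-tree: it assumes distinct integer keys, keeps everything in memory, and omits deletion. The class and method names are hypothetical; the key sequence is the one used in the construction example.

```python
import bisect


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # an empty list marks a leaf

    @property
    def is_leaf(self):
        return not self.children


class BTree:
    def __init__(self, order=5):
        self.order = order  # max children per node; max keys is order - 1
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:
            # The root itself was split: grow the tree by one level.
            middle, right = split
            self.root = Node([middle], [self.root, right])

    def _insert(self, node, key):
        pos = bisect.bisect_left(node.keys, key)
        if node.is_leaf:
            node.keys.insert(pos, key)
        else:
            split = self._insert(node.children[pos], key)
            if split is not None:
                # A child was split: take in the promoted middle key.
                middle, right = split
                node.keys.insert(pos, middle)
                node.children.insert(pos + 1, right)
        if len(node.keys) <= self.order - 1:
            return None  # node is not too big, nothing to promote
        # Node is too big: split it in two and promote the middle key.
        mid = len(node.keys) // 2
        middle = node.keys[mid]
        right = Node(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return middle, right


tree = BTree(order=5)
for k in [1, 12, 8, 2, 25, 5, 14, 28, 17, 7, 52, 16, 48, 68,
          3, 26, 29, 53, 55, 45]:
    tree.insert(k)
```

After the fifth key (25) arrives, the root splits around the middle key 8, as described in the construction example. After all twenty keys, this sketch ends with 17 alone in the root above two internal nodes.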
44. Removal from a B-tree
During insertion, the key always goes into a leaf. For deletion
we wish to remove from a leaf. There are three possible ways
we can do this:
1 - If the key is already in a leaf node, and removing it doesn’t
cause that leaf node to have too few keys, then simply remove
the key to be deleted.
2 - If the key is not in a leaf then it is guaranteed (by the nature
of a B-tree) that its predecessor or successor will be in a leaf --
in this case we can delete the key and promote the predecessor
or successor key to the non-leaf deleted key’s position.
45. Analysis of B-Trees
The maximum number of items in a B-tree of order m and
height h:
root: m - 1
level 1: m(m - 1)
level 2: m^2(m - 1)
. . .
level h: m^h(m - 1)
So, the total number of items is
(1 + m + m^2 + m^3 + ... + m^h)(m - 1) = [(m^(h+1) - 1)/(m - 1)](m - 1) = m^(h+1) - 1
When m = 5 and h = 2 this gives 5^3 - 1 = 124
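The closed form can be checked numerically by summing the per-level maxima directly. The function name `max_items` is hypothetical.

```python
def max_items(m, h):
    """Sum the maximum number of items on each level 0..h of an order-m B-tree."""
    per_level = [(m ** level) * (m - 1) for level in range(h + 1)]
    return sum(per_level)
```

For m = 5 and h = 2 the levels contribute 4 + 20 + 100 = 124 items, matching 5^3 - 1.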
46. Static Hashing:
In static hashing, when a search-key value is provided, the hash function always computes the same address.
For example, if a mod-4 hash function is used, it generates only 4 values (0 to 3).
The output address is always the same for a given key under that function.
The number of buckets provided remains unchanged at all times.
Operation
Insertion: When a record is to be entered using static hashing, the hash function h computes the bucket address for search key K, where the record will be stored.
47. Bucket address = h(K)
Search: When a record needs to be retrieved, the same hash function can be used to retrieve the address of the bucket where the data is stored.
Delete: This is simply a search followed by a deletion operation.
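The three operations can be sketched with a fixed list of buckets and a mod-4 hash function. This is an in-memory illustration that assumes integer search keys; the class name `StaticHashFile` and its methods are hypothetical.

```python
class StaticHashFile:
    def __init__(self, num_buckets=4):
        self.num_buckets = num_buckets  # fixed for the lifetime of the file
        self.buckets = [[] for _ in range(num_buckets)]

    def h(self, key):
        """Bucket address = h(K); here a simple mod hash on integer keys."""
        return key % self.num_buckets

    def insert(self, key, record):
        self.buckets[self.h(key)].append((key, record))

    def search(self, key):
        """Return all records stored under key."""
        return [rec for k, rec in self.buckets[self.h(key)] if k == key]

    def delete(self, key):
        """Search for the key's bucket, then remove matching records."""
        bucket = self.buckets[self.h(key)]
        before = len(bucket)
        bucket[:] = [(k, rec) for k, rec in bucket if k != key]
        return before - len(bucket)
```

Keys 10 and 14 both hash to bucket 2 here, which shows why a bucket has to be able to hold more than one record.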
49. Dynamic Hashing
The problem with static hashing is that it does not expand or shrink dynamically as the size of the database grows or shrinks.
Dynamic hashing provides a mechanism in which data buckets are added and removed dynamically and on demand.
Dynamic hashing is also known as extended hashing; a widely used form is called extendible hashing.
The hash function, in dynamic hashing, is made to produce a large number of values, of which only a few are used initially.
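The idea of producing many hash values but using only a few can be sketched by taking a 32-bit hash and looking at only some of its bits. The choice of CRC32 and of the low-order bits is an assumption made for illustration (textbook extendible hashing often uses a prefix of the hash instead), and the function names are hypothetical.

```python
import zlib


def full_hash(key):
    """32-bit hash value; CRC32 is used only because it is deterministic."""
    return zlib.crc32(str(key).encode("utf-8"))


def bucket_index(key, bits_in_use):
    """Use only the low-order bits_in_use bits of the full hash value."""
    return full_hash(key) & ((1 << bits_in_use) - 1)
```

With 2 bits in use there are 4 bucket addresses. Moving to 3 bits doubles the address space, and a key that was in bucket b either stays in b or moves to b + 4, so growth never requires rehashing every record with a new function.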
51. Multiple-Key Access
Use multiple indices for certain types of queries.
Example:
select ID
from instructor
where dept_name = 'Finance' and salary = 80000
Possible strategies for processing the query using indices on single attributes:
52. Multiple Key Access
1. Use index on dept_name to find instructors with department name Finance; test salary = 80000.
2. Use index on salary to find instructors with a salary of $80000; test dept_name = 'Finance'.
3. Use dept_name index to find pointers to all records pertaining to the "Finance" department. Similarly use index on salary. Take intersection of both sets of pointers obtained.
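Strategy 3 can be sketched with two single-attribute indices that map a value to a set of record pointers. The sample `instructor` rows are made up for illustration, list positions stand in for record pointers, and `build_index` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical instructor relation; list positions act as record pointers.
instructor = [
    {"ID": 1, "dept_name": "Finance", "salary": 80000},
    {"ID": 2, "dept_name": "Finance", "salary": 90000},
    {"ID": 3, "dept_name": "Physics", "salary": 80000},
    {"ID": 4, "dept_name": "Finance", "salary": 80000},
]


def build_index(rows, attribute):
    """Map each attribute value to the set of pointers of matching rows."""
    index = defaultdict(set)
    for pointer, row in enumerate(rows):
        index[row[attribute]].add(pointer)
    return index


dept_index = build_index(instructor, "dept_name")
salary_index = build_index(instructor, "salary")

# Intersect the two pointer sets, then fetch only the matching records.
matches = dept_index["Finance"] & salary_index[80000]
result_ids = sorted(instructor[p]["ID"] for p in matches)
```

Only the records in the intersection are fetched, which helps when each condition alone matches many records but few satisfy both.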