raghu@theoracletrainer.com

www.theoracletrainer.com
Introduction to
Database Management Systems
(DBMS)
Database Management
System (DBMS)
Definitions:
Data: Known facts that can be recorded and that have implicit meaning
Database: A collection of related data
Ex. the names, telephone numbers and addresses of all the people you know
Database Management System: A computerized record-keeping system

DBMS (Contd.)


Goals of a Database Management System:






To provide an efficient as well as convenient environment for accessing data in a database
To enforce information security: database security, concurrency control, crash recovery

It is a general purpose facility for:


Defining database



Constructing database



Manipulating database

Benefits of database approach








Redundancy can be reduced
Inconsistency can be avoided
Data can be shared
Standards can be enforced
Security restrictions can be applied
Integrity can be maintained
Data independence can be provided

DBMS Functions







Data Definition
Data Manipulation
Data Security and Integrity
Data Recovery and Concurrency
Data Dictionary
Performance

Database System
[Figure: Users send application programs/queries to the database system. Inside it, the DBMS software consists of software to process queries/programs and software to access stored data, which works on the stored data definition (meta-data) and the stored database.]
Data Model




A set of concepts used to describe the structure of a database
By structure, we mean the data types, relationships, and constraints that should hold for the data
Categories of Data Models
Conceptual
Physical
Representational
Database Architecture
External level (individual user views)
Conceptual level (community user view)
Internal level (storage view)
Database
An example of the three levels
External View 1: SNo, FName, LName, Age, Salary
External View 2: SNo, LName, BranchNo
Conceptual View: SNo, FName, LName, Age, Salary, BranchNo
Internal View:
struct STAFF {
    int staffNo;
    int branchNo;
    char fName[15];
    char lName[15];
    struct date dateOfBirth;
    float salary;
    struct STAFF *next; /* pointer to next Staff record */
};
index staffNo; index branchNo; /* define indexes for staff */
Schema




Schema: Description of data in terms of a data model
Three-level DB Architecture defines the following schemas:
External Schema (or sub-schema)
  Written using external DDL
Conceptual Schema (or schema)
  Written using conceptual DDL
Internal Schema
  Written using internal DDL or storage structure definition
Data Independence


The ability to change the schema at one level of a database system without needing to change the schema at the next higher level

Logical data independence: Refers to the immunity of the external schemas to changes in the conceptual schema, e.g., adding a new record or field

Physical data independence: Refers to the immunity of the conceptual schema to changes in the internal schema, e.g., adding a new index should not void existing ones
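The two kinds of independence can be sketched with Python's standard `sqlite3` module. This is only an illustration under assumptions: the `Staff` table, the `StaffNames` view and the index name are hypothetical, and a SQL view stands in for an external schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual schema (hypothetical table for illustration).
conn.execute("CREATE TABLE Staff (staffNo INTEGER PRIMARY KEY, lName TEXT, salary REAL)")
conn.execute("INSERT INTO Staff VALUES (1, 'Jones', 30000.0)")

# External schema: a view that exposes only part of the conceptual schema.
conn.execute("CREATE VIEW StaffNames AS SELECT staffNo, lName FROM Staff")
before = conn.execute("SELECT * FROM StaffNames").fetchall()

# Conceptual-level change: add a new field (logical data independence).
conn.execute("ALTER TABLE Staff ADD COLUMN branchNo INTEGER")
# Internal-level change: add a new index (physical data independence).
conn.execute("CREATE INDEX idx_staff_lname ON Staff (lName)")

after = conn.execute("SELECT * FROM StaffNames").fetchall()
print(before == after)  # the view's result is unaffected by either change
```

The query against the view is written once and keeps returning the same rows, which is the practical meaning of "immunity" in the definitions above.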

TYPES OF DATABASE MODELS
HIERARCHICAL
NETWORK
RELATIONAL (TABLE, ROW, COLUMN, VALUE)
DATABASE DESIGN PHASES
DATA ANALYSIS
Entities - Attributes - Relationships - Integrity Rules

LOGICAL DESIGN
Tables - Columns - Primary Keys - Foreign Keys

PHYSICAL DESIGN
DDL for Tablespaces, Tables, Indexes

Introduction to
Relational Databases:
RDBMS
Some Important Terms


Relation : a table



Tuple : a row in a table



Attribute : a Column in a table



Degree : number of attributes



Cardinality : number of tuples



Primary Key : a unique identifier for the table



Domain : a pool of values from which specific attributes
of specific relations draw their values
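As a small sketch of degree and cardinality, the snippet below loads the deck's `Employees` relation into an in-memory SQLite database and counts its attributes and tuples. Using `sqlite3` and `cursor.description` for this is an assumption made for illustration, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn))")
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)", [
    ("123-22-3666", "Attishoo", 48),
    ("231-31-5368", "Smiley", 22),
    ("131-24-3650", "Smethurst", 35),
])

cursor = conn.execute("SELECT * FROM Employees")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)
print(degree, cardinality)
```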

Keys


Key



Super Key



Candidate Keys





Primary Key
Alternate Key

Secondary Keys

Keys and Referential Integrity
Enrolled
sid    cid          grade
53666  carnatic101  C
53688  reggae203    B
53650  topology112  A
53666  history105   B
sid in Enrolled is a foreign key referring to sid of the Student relation

Student
sid    name   login       age  gpa
53666  Jones  Jones@cs    18   3.4
53688  Smith  Smith@eecs  18   3.2
53650  Smith  Smith@math  19   3.8
sid is the primary key of Student
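A minimal sketch of referential integrity, using the Student and Enrolled data from this slide in SQLite. The composite primary key `(sid, cid)` on Enrolled and the helper `try_enroll` are assumptions added for the example. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("""CREATE TABLE Student (
    sid INTEGER PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL)""")
conn.execute("""CREATE TABLE Enrolled (
    sid INTEGER, cid TEXT, grade TEXT,
    PRIMARY KEY (sid, cid),                     -- assumed key, not stated on the slide
    FOREIGN KEY (sid) REFERENCES Student (sid))""")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?, ?)", [
    (53666, "Jones", "Jones@cs", 18, 3.4),
    (53688, "Smith", "Smith@eecs", 18, 3.2),
    (53650, "Smith", "Smith@math", 19, 3.8),
])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)", [
    (53666, "carnatic101", "C"), (53688, "reggae203", "B"),
    (53650, "topology112", "A"), (53666, "history105", "B"),
])

def try_enroll(sid, cid, grade):
    """Hypothetical helper: return True if the row is accepted, False if rejected."""
    try:
        conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", (sid, cid, grade))
        return True
    except sqlite3.IntegrityError:
        return False

orphan_ok = try_enroll(99999, "history105", "A")  # no such student: rejected
valid_ok = try_enroll(53688, "history105", "A")   # existing student: accepted
print(orphan_ok, valid_ok)
```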
Conceptual Design Using the Entity-Relationship Model
Overview of Database Design


Conceptual design : (ER Model is used at this
stage.)



Schema Refinement : (Normalization)



Physical Database Design and Tuning

Design Phases…
Requirements Collection & Analysis
Data Requirements
Functional Requirements: User Defined Operations, Data Flow Diagrams, Sequence Diagrams, Scenarios

Conceptual Design
Entity Types, Constraints, Relationships
No Implementation Details.

Logical Design
Ensures the Design Meets the Requirements
Data Model Mapping – Type of Database is identified

Physical Design
Internal Storage Structures / Access Paths / File Organizations
E-R Modeling


Entity: anything that exists and is distinguishable
Entity Set: a group of similar entities
Attribute: properties that describe an entity
Relationship: an association between entities
Notations
ENTITY TYPE ( REGULAR )

WEAK ENTITY TYPE
RELATIONSHIP TYPE

WEAK RELATIONSHIP TYPE

Entity
[Figure: entity Employee with attributes ssn, name, lot]

Entity Set
SSN          NAME       LOT
123-22-3666  Attishoo   48
231-31-5368  Smiley     22
131-24-3650  Smethurst  35

CREATE TABLE Employees
(ssn CHAR(11),
name CHAR(20),
lot INTEGER,
PRIMARY KEY (ssn))
Types of Relationships
1:1  student (1) is issued (1) ID card
1:M  students (M) enrol in (1) course
M:M  students (M) take (M) tests
ER Model
[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are linked by the relationship Works_in (since). Employee is also related to itself through Reports_To, in the roles supervisor and subordinate.]
ER Model (Contd.)
Works_In
SSN          DID  SINCE
123-22-3666  51   1/1/91
123-22-3666  56   3/3/93
231-31-5368  51   2/2/92

CREATE TABLE Works_In(
ssn CHAR(11),
did INTEGER,
since DATE,
PRIMARY KEY (ssn, did),
FOREIGN KEY (ssn)
REFERENCES Employees,
FOREIGN KEY (did)
REFERENCES Departments)
Key Constraints

[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are linked by the relationship Manages (since).]
Key Constraints for Ternary Relationships

[Figure: ER diagram. Ternary relationship Works_in (since) among Employee (ssn, name, lot), Department (did, dname, budget) and Location (address, capacity).]
Participation Constraints
[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are linked by two relationships, Manages (since) and Works_in (since).]
Weak Entities
[Figure: ER diagram. Employee (ssn, name, lot) is linked through the relationship policy (cost) to the weak entity Dependent (pname, age).]
ISA (‘is a’) Hierarchies
[Figure: ER diagram. Employee (ssn, name, lot) is specialized through IsA into Hourly_Emp (Hrly_wages, Hrs_worked) and Contract_Emp (contractid).]
Aggregation
[Figure: ER diagram for aggregation. Entities: Employee (ssn, name, lot), project (pid, pbudget), department (did, dname, budget). Relationships: sponsors, monitors. Other attribute labels: Started on, until.]
Entity vs. Attribute
Works_In does not allow an employee to work in a department for two or more periods (why?)
[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are linked by Works_in, which carries the attributes from and to.]
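One way to see the answer to the "why?" above: a relationship instance is identified by the entities it connects, so the table for Works_In has the key (ssn, did), and a second row for the same employee and department is rejected whatever its dates. The sketch below assumes SQLite and uses the hypothetical column names `from_date` and `to_date`, since `from` and `to` are SQL keywords. Foreign keys are omitted to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Works_In (
    ssn CHAR(11), did INTEGER, from_date DATE, to_date DATE,
    PRIMARY KEY (ssn, did))""")

# First period of employee 123-22-3666 in department 51.
conn.execute("INSERT INTO Works_In VALUES ('123-22-3666', 51, '1991-01-01', '1992-12-31')")

try:
    # Second period for the same (ssn, did) pair violates the primary key.
    conn.execute("INSERT INTO Works_In VALUES ('123-22-3666', 51, '1995-06-01', '1996-05-31')")
    second_period_allowed = True
except sqlite3.IntegrityError:
    second_period_allowed = False

print(second_period_allowed)
```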
Entity vs. Attribute (Contd.)

[Figure: ER diagram. Works_in is now a ternary relationship among Employee (ssn, name, lot), Department (did, dname, budget) and Duration (from, to).]
Entity vs. Relationship

[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are linked by manages, which carries the attributes since and DB.]
DB = Dbudget
Entity vs. Relationship
[Figure: ER diagram. Employee (ssn, name, lot), Department (did, dname, budget) and Mgr_appt (Appt num, since, DBudget) are linked by the relationship manages.]
Binary vs. Ternary Relationships

[Figure: ER diagram. Ternary relationship covers among Employee (ssn, name, lot), Policy (policyid, cost) and Dependent (pname, age).]
Binary vs. Ternary Relationships
Better Design
[Figure: ER diagram. Employee (ssn, name, lot) is linked to Policy (policyid, cost) by purchaser, and Dependent (pname, age) is linked to Policy by Beneficiary.]
Constraints Beyond the ER Model
• Some constraints cannot be captured in ER diagrams:
• Functional dependencies
• Inclusion dependencies
• General constraints

E-R Diagram
[Figure: E-R diagram with entities DEPARTMENT, EMPLOYEE, DEPENDENT, PROJECT, SUPPLIER and PART, and relationships DEPT_EMP, EMP_DEP, PROJ_WORK, PROJ_MGR, SUPP_PART, SUPP_PART_PROJ and PART_STRUCTURE, annotated with 1 and M cardinalities.]
Example to Start with ….


An example database application called COMPANY serves to illustrate the ER model concepts and their use in schema design.
The following requirements were collected from the client.
Analysis…


Company:
Organized into departments. Each department has a name, a number and a manager who manages the department. The company keeps track of the date on which that employee began managing the department. A department may have several locations.
Analysis…




Department:
A department controls a number of projects, each of which has a unique name, a unique number and a single location.
Employee:
Name, Age, Gender, BirthDate, SSN, Address, Salary.
An employee is assigned to one department but may work on several projects, which are not necessarily controlled by the same department. The number of hours per week that an employee works on each project is also tracked.
Analysis….


Keep track of the dependents of each employee for insurance policies: for each dependent we keep the first name, gender, date of birth and relationship to the employee.
DEPARTMENT (Name, Number, {Locations}, Manager, Start Date)
PROJECT (Name, Number, Location, Controlling Department)
EMPLOYEE (Name (Fname, Lname), SSN, Gender, Address, Salary, Birthdate, Department, Supervisor, Workson (Project, Hrs))
DEPENDENT (Employee, Name, Gender, Birthdate, Relationship)
Example …


Manage:






Department and Employee
Partial Participation
Relation Attribute : StartDate.

Works For:



Department and Employee
Total Participation

Example…


Control:
Department, Project
• Partial Participation from Department
• Total Participation from Project
• Control Department is a RKA.

Supervisor:
Employee, Employee
• Partial and Recursive
Example …


Works-On:
Project, Employee
• Total Participation
• Hours Worked is a RKA.

Dependents of:
Employee, Dependent
• Dependent is a weak entity
• Dependent is Total, Employee is Partial.
One Possible mapping of the Problem Statement
[Figure: ER diagram of the COMPANY example. Entities: Department (Name, No, Loc), Employee (Name (Fname, Lname), SSN, Sex, Sal, Address, Bdate), Project (Name, No, Loc), Dependent (Name, Sex, Bdate, Relationship). Relationships: Works For, Manages (Sdate), Controls, Works On (Hours), Supervises, Depend On.]

Schema Refinement
and
Normalization
Normalization and Normal
Forms


Normalization:
• Decomposing a larger, complex table into several smaller, simpler ones.
• Move from a lower normal form to a higher normal form.

Normal Forms:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• *Higher Normal Forms (BCNF, 4NF, 5NF ....)

In practice, 3NF is often good enough.
Why Normal Forms


The first question to ask is whether any
refinement is needed!



If a relation is in a certain normal form (BCNF,
3NF etc.), it is known that certain kinds of
problems are avoided/ minimized. This can be
used to help us decide whether decomposing the
relation will help.

The Evils of Redundancy







Redundancy is at the root of several problems
associated with relational schemas
More seriously, data redundancy causes several
anomalies: insert, update, delete
Wastage of storage.
Main refinement technique: decomposition
(replacing ABCD with, say, AB and BCD, or
ACD and ABD).

Refining an ER Diagram - Before
[Figure: ER diagram. Employee (ssn, name, lot) is linked to Department (did, dname, budget) by Works_in (since).]
Refining an ER Diagram - After

[Figure: ER diagram. Employee (ssn, name) is linked to Department (did, dname, budget, lot) by Works_in (since).]
First Normal Form


A table is in 1NF if every row contains exactly one value for each attribute.
Disallow multivalued attributes, composite attributes and their combinations.
1NF states that the domains of attributes must include only atomic (simple, indivisible) values and that the value of any attribute in a tuple must be a single value from the domain of that attribute.
By definition, any relational table must be in 1NF.
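A rough sketch of what removing a multivalued attribute looks like in practice, using the `{Locations}` attribute of DEPARTMENT from the COMPANY example. The sample values and the helper name `split_multivalued` are hypothetical. The multivalued attribute is moved into its own set of rows keyed by the department number.

```python
def split_multivalued(record, key, multivalued):
    """Split one record with a list-valued attribute into a base record
    plus one row per value, each row carrying the key of the base record."""
    base = {k: v for k, v in record.items() if k != multivalued}
    rows = [{key: record[key], multivalued: value} for value in record[multivalued]]
    return base, rows

# Hypothetical department with two locations (not in 1NF as it stands).
department = {"Number": 5, "Name": "Research", "Locations": ["Bangalore", "Chennai"]}

base, location_rows = split_multivalued(department, "Number", "Locations")
print(base)
print(location_rows)
```

Every value in `base` and in `location_rows` is now atomic, which is the 1NF requirement stated above.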
Functional Dependencies (FDs)


Provide a formal mechanism to express
constraints between attributes



Given a relation R, attribute Y of R is
functionally dependent on the attribute X of R
if & only if each X-value in R has associated
with it precisely one Y-value in R.
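The definition can be checked mechanically against a given set of tuples. The function below is a minimal sketch: `fd_holds` and the sample rows are hypothetical, and a check on sample data can only refute an FD, never prove that it holds for all legal instances of R.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every lhs-value in rows is associated with exactly
    one rhs-value, i.e. the FD lhs -> rhs is not violated by rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True

rows = [
    {"X": 1, "Y": "a"},
    {"X": 2, "Y": "b"},
    {"X": 1, "Y": "a"},
]
print(fd_holds(rows, ["X"], ["Y"]))   # no X-value maps to two Y-values

rows_bad = rows + [{"X": 2, "Y": "c"}]
print(fd_holds(rows_bad, ["X"], ["Y"]))  # X = 2 now maps to both "b" and "c"
```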

Full Dependency


Concept of full functional dependency


An FD X → Y is a full functional dependency if removal of any attribute A from X means that the dependency no longer holds.

Partial Dependency


An FD X → Y is a partial dependency if there is some attribute A ∈ X that can be removed from X and the dependency will still hold.

Example: Constraints on Entity Set
S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

R  W
5  7
8  10
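The tables on this slide can be read as a decomposition: the wide table is replaced by its projections onto (S, N, L, R, H) and (R, W). The sketch below, with variable names chosen for illustration, rebuilds both projections from the slide's data and confirms that joining them on R gives back exactly the original rows, so nothing is lost while each (R, W) pair is stored only once.

```python
# Rows of the original table, columns in the order (S, N, L, R, W, H).
rows = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]

# Projections: drop W from the first table, keep only (R, W) in the second.
emps = {(s, n, l, r, h) for (s, n, l, r, w, h) in rows}
wages = {(r, w) for (s, n, l, r, w, h) in rows}

# Natural join of the two projections on the shared column R.
rejoined = {
    (s, n, l, r, w, h)
    for (s, n, l, r, h) in emps
    for (r2, w) in wages
    if r == r2
}

print(sorted(wages))
print(rejoined == set(rows))
```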
Second Normal Form (2NF)


A relation schema R is in 2NF if:


it is in 1NF and



every non-prime attribute A in R is fully functionally
dependent on the primary key of R.



2NF prohibits partial dependencies.

2NF: An Example


Emp{Eno, Dept, ProjCode, Hours}
Primary key: {Eno, ProjCode}
{Eno} -> {Dept}, {Eno, ProjCode} -> {Hours}

Test of 2NF
{Eno} -> {Dept}: partial dependency.
Emp is in 1NF, but not in 2NF.

Decomposition:
Emp {Eno, Dept}
Proj {Eno, ProjCode, Hours}
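The 2NF test above can be sketched as a few lines of Python. The function name `partial_dependencies` is hypothetical, and the sketch is deliberately simple: it looks only at the FDs as listed (it does not derive implied ones) and treats the primary key as the only candidate key, as the slide does.

```python
def partial_dependencies(key, fds):
    """Return the listed FDs whose left side is a proper subset of the key
    and whose right side contains at least one non-key attribute."""
    return [(lhs, rhs - key) for lhs, rhs in fds if lhs < key and rhs - key]

key = {"Eno", "ProjCode"}
fds = [
    ({"Eno"}, {"Dept"}),
    ({"Eno", "ProjCode"}, {"Hours"}),
]

partial = partial_dependencies(key, fds)
print(partial)            # the FD {Eno} -> {Dept} is reported
print(not partial)        # False here, so Emp is not in 2NF
```

After the decomposition, Emp has key {Eno} and Proj has key {Eno, ProjCode}; neither has a listed FD from a proper subset of its key, so the same check returns an empty list for both.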
Transitive Dependency


An FD X → Y in a relation schema R is a
transitive dependency if


there is a set of attributes Z that is not a subset of
any key of R, and



both X → Z and Z → Y hold.

Third Normal Form


A relation schema R is in 3NF if


It is in 2NF and



No nonprime attribute of R is transitively dependent
on the primary key.



3NF means that each non-key attribute value
in any tuple is truly dependent on the Primary
Key and not even partially on other attributes.



3NF prohibits transitive dependencies.

3NF: An Example


Emp{Eno, Dept, Dept_Head}
Primary key: {Eno}
{Eno} -> {Dept}, {Dept} -> {Dept_Head}

Test of 3NF
{Eno} -> {Dept} -> {Dept_Head}: Transitive dependency.
Emp is in 2NF, but not in 3NF.

Decomposition:
Emp {Eno, Dept}
Dept {Dept, Dept_Head}
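To illustrate what the 3NF decomposition buys, the sketch below builds the two decomposed tables in SQLite with hypothetical sample data (the employee numbers, department names and head names are invented). Changing a department head then touches one row, and every employee of that department sees the new head through a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (Eno INTEGER PRIMARY KEY, Dept TEXT)")
conn.execute("CREATE TABLE Dept (Dept TEXT PRIMARY KEY, Dept_Head TEXT)")

# Hypothetical data: two employees in Sales, one in HR.
conn.executemany("INSERT INTO Emp VALUES (?, ?)", [(1, "Sales"), (2, "Sales"), (3, "HR")])
conn.executemany("INSERT INTO Dept VALUES (?, ?)", [("Sales", "Rao"), ("HR", "Iyer")])

# The head of Sales changes: a single row is updated, so no update anomaly.
updated = conn.execute(
    "UPDATE Dept SET Dept_Head = 'Khan' WHERE Dept = 'Sales'"
).rowcount

heads = conn.execute(
    "SELECT e.Eno, d.Dept_Head FROM Emp AS e JOIN Dept AS d ON e.Dept = d.Dept ORDER BY e.Eno"
).fetchall()
print(updated, heads)
```

In the undecomposed Emp{Eno, Dept, Dept_Head} the same change would have to be repeated on every Sales row, and missing one would leave the data inconsistent.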
Boyce-Codd Normal Form

BCNF was introduced because 3NF does not satisfactorily handle the case of a relation possessing two or more composite or overlapping candidate keys.

BCNF ( Boyce Codd Normal
Form)


A Relation is said to be in Boyce Codd Normal
Form (BCNF) if and only if every determinant
is a candidate key.
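A minimal sketch of a BCNF check based on attribute closure. The names `closure` and `is_bcnf` are hypothetical. The sketch uses the usual textbook formulation, in which the determinant of every non-trivial FD must be a superkey, and it examines only the FDs passed in. The example reuses Emp{Eno, Dept, Dept_Head} from the 3NF slide.

```python
def closure(attrs, fds):
    """Compute the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_bcnf(attributes, fds):
    """True if, for every non-trivial listed FD, the determinant is a superkey."""
    return all(rhs <= lhs or closure(lhs, fds) >= attributes for lhs, rhs in fds)

emp_attrs = {"Eno", "Dept", "Dept_Head"}
emp_fds = [({"Eno"}, {"Dept"}), ({"Dept"}, {"Dept_Head"})]

print(is_bcnf(emp_attrs, emp_fds))                                   # Dept is not a key
print(is_bcnf({"Eno", "Dept"}, [({"Eno"}, {"Dept"})]))               # decomposed Emp
print(is_bcnf({"Dept", "Dept_Head"}, [({"Dept"}, {"Dept_Head"})]))   # decomposed Dept
```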

Good PPT for RDBMS starter

  • 1.
  • 4. Database Management System (DBMS) Definitions:   Data: Known facts that can be recorded and that have implicit meaning Database: Collection of related data   Ex. the names, telephone numbers and addresses of all the people you know Database Management System: A computerized record-keeping system raghu@theoracletrainer.com www.theoracletrainer.com
  • 5. DBMS (Contd.)  Goals of a Database Management System:    To provide an efficient as well as convenient environment for accessing data in a database Enforce information security: database security, concurrence control, crash recovery It is a general purpose facility for:  Defining database  Constructing database  Manipulating database raghu@theoracletrainer.com www.theoracletrainer.com
  • 6. Benefits of database approach        Redundancy can be reduced Inconsistency can be avoided Data can be shared Standards can be enforced Security restrictions can be applied Integrity can be maintained Data independence can be provided raghu@theoracletrainer.com www.theoracletrainer.com
  • 7. DBMS Functions       Data Definition Data Manipulation Data Security and Integrity Data Recovery and Concurrency Data Dictionary Performance raghu@theoracletrainer.com www.theoracletrainer.com
  • 8. Database System Users DATABASE Application Programs/Queries SYSTEM DBMS Software Software to process queries/programs Software to access stored data Stored Data Defn. (META-DATA). raghu@theoracletrainer.com Stored Database www.theoracletrainer.com
  • 9. Data Model   A set of concepts used to desscribe the structure of a database By structure, we mean the data types, relationships, and constraints that should holds for the data Categories of Data Models Conceptual raghu@theoracletrainer.com Physical Representational www.theoracletrainer.com
  • 10. Database Architecture External level (individual user views) Conceptual level (community user view) Internal level (storage view) Database raghu@theoracletrainer.com www.theoracletrainer.com
  • 11. An example of the three levels SNo FName LName Age Salary Conceptual View SNo FName LName Age External View1 SNo LName BranchNo External View2 raghu@theoracletrainer.com Salary BranchNo struct STAFF { Internal int staffNo; View int branchNo; char fName[15]; char lName[15]; struct date dateOfBirth; float salary; struct STAFF *next; /* pointer to next Staff record */ }; index staffNo; index branchNo; /* define indexes for staff */ www.theoracletrainer.com
  • 12. Schema   Schema: Description of data in terms of a data model Three-level DB Architecture defines following schemas:  External Schema (or sub-schema)   Conceptual Schema (or schema)   Written using external DDL Written using conceptual DDL Internal Schema  Written using internal DDL or storage structure definition raghu@theoracletrainer.com www.theoracletrainer.com
  • 13. Data Independence  Change the schema at one level of a database system without a need to change the schema at the next higher level  Logical data independence: Refers to the immunity of the external schemas to changes in the conceptual schema e.g., add new record or field Physical data independence: Refers to the immunity of the conceptual schema to changes in the internal schema e.g., adding new index should not void existing ones www.theoracletrainer.com raghu@theoracletrainer.com 
  • 14. TYPES OF DATABASE MODELS HIERARCHICAL NETWORK COLUMN ROW VALUE TABLE RELATIONAL raghu@theoracletrainer.com www.theoracletrainer.com
  • 15. DATABASE DESIGN PHASES DATA ANALYSIS Entities - Attributes - Relationships - Integrity Rules LOGICAL DESIGN Tables - Columns - Primary Keys - Foreign Keys PHYSICAL DESIGN DDL for Tablespaces, Tables, Indexes raghu@theoracletrainer.com www.theoracletrainer.com
  • 17. Some Important Terms  Relation : a table  Tuple : a row in a table  Attribute : a Column in a table  Degree : number of attributes  Cardinality : number of tuples  Primary Key : a unique identifier for the table  Domain : a pool of values from which specific attributes of specific relations draw their values raghu@theoracletrainer.com www.theoracletrainer.com
  • 18. Keys  Key  Super Key  Candidate Keys    Primary Key Alternate Key Secondary Keys raghu@theoracletrainer.com www.theoracletrainer.com
  • 19. Keys and Referential Integrity Enrolled sid 53666 53688 53650 53666 cid grade carnatic101 C reggae203 B topology112 A history105 B Foreign key referring to sid of STUDENT relation raghu@theoracletrainer.com Student sid name login age gpa 53666 Jones Jones@cs 18 3.4 53688 Smith Smith@eecs 18 3.2 53650 Smith Smith@math 19 3.8 Primary key www.theoracletrainer.com
  • 22. Overview of Database Design  Conceptual design : (ER Model is used at this stage.)  Schema Refinement : (Normalization)  Physical Database Design and Tuning raghu@theoracletrainer.com www.theoracletrainer.com
  • 23. Design Phases… Requirements Collection & Analysis Data Requirements Functional Requirements User Defined Operations Data Flow Diagrams Sequence Diagrams, Scenarios Conceptual Design Entity Types, Constraints , Relationships No Implementation Details. Logical Design Ensures Requirements Meets the Design Data Model Mapping – Type of Database is identified Physical Design Internal Storage Structures / Access Path / File Organizations raghu@theoracletrainer.com www.theoracletrainer.com
  • 24. E-R Modeling: Entity: anything that exists and is distinguishable. Entity Set: a group of similar entities. Attribute: properties that describe an entity. Relationship: an association between entities.
  • 25. Notations [symbol legend: entity type (regular), weak entity type, relationship type, weak relationship type]
  • 26. Entity Attributes [ER diagram: entity Employee with attributes ssn, name, lot]
      Entity set:
        SSN         | NAME      | LOT
        123-22-3666 | Attishoo  | 48
        231-31-5368 | Smiley    | 22
        131-24-3650 | Smethurst | 35
      CREATE TABLE Employees (
          ssn  CHAR(11),
          name CHAR(20),
          lot  INTEGER,
          PRIMARY KEY (ssn))
  • 27. Types of Relationships: 1:1: one student is issued one ID card. 1:M: many students enrol in one course. M:M: many students take many tests.
  • 29. ER Model (Contd.)
      Works_In:
        SSN         | DID | SINCE
        123-22-3666 | 51  | 1/1/91
        123-22-3666 | 56  | 3/3/93
        231-31-5368 | 51  | 2/2/92
      CREATE TABLE Works_In (
          ssn   CHAR(11),
          did   INTEGER,
          since DATE,
          PRIMARY KEY (ssn, did),
          FOREIGN KEY (ssn) REFERENCES Employees,
          FOREIGN KEY (did) REFERENCES Departments)
  • 31. Key Constraints for Ternary Relationships [ER diagram: Employee (ssn, name, lot), Department (did, dname, budget) and Location (address, capacity), connected by the relationship Works_in (since)]
  • 34. ISA ('is a') Hierarchies [ER diagram: Employee (ssn, name, lot) with an IsA hierarchy to Hourly_Emp (Hrly_wages, Hrs_worked) and Contract_Emp (contractid)]
  • 36. Entity vs. Attribute: Works_in does not allow an employee to work in a department for two or more periods (why?) [ER diagram: Employee (ssn, name, lot), Works_in (from, to), Department (did, dname, budget)]
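One way to see the answer to the "why?" on this slide is to map Works_in to a table the way slide 29 does, with (ssn, did) as the primary key. The sketch below is only an illustration under that assumption: it uses SQLite through Python's standard `sqlite3` module, the column names `from_date` and `to_date` stand in for the diagram's `from` and `to` (which are SQL keywords), and the dates are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Works_in mapped as on slide 29: the key is (ssn, did), so the period
# attributes are not part of what identifies a row.
conn.execute("""
    CREATE TABLE Works_In (
        ssn       CHAR(11),
        did       INTEGER,
        from_date DATE,   -- hypothetical name for the 'from' attribute
        to_date   DATE,   -- hypothetical name for the 'to' attribute
        PRIMARY KEY (ssn, did)
    )
""")

# First period of employee 123-22-3666 in department 51 (illustrative dates).
conn.execute(
    "INSERT INTO Works_In VALUES ('123-22-3666', 51, '1991-01-01', '1992-12-31')"
)


def try_second_period():
    """Try to record a second period for the same employee and department."""
    try:
        conn.execute(
            "INSERT INTO Works_In VALUES ('123-22-3666', 51, '1995-06-01', '1996-05-31')"
        )
        return "accepted"
    except sqlite3.IntegrityError:
        # Same (ssn, did) pair as the first row, so the key is violated.
        return "rejected"


second_period = try_second_period()
print(second_period)
```

Because a relationship instance is identified by the participating entities alone, the pair (employee, department) can appear only once, whatever the period values are. Modeling the period as its own entity, as on the next slide, is what lifts that restriction.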
  • 37. Entity vs. Attribute (Contd.) [ER diagram: Employee (ssn, name, lot), Works_in, Department (did, dname, budget), with Duration (from, to) modeled as a separate entity attached to Works_in]
  • 38. Entity vs. Relationship [ER diagram: Employee (ssn, name, lot), relationship manages (since, Dbudget), Department (did, dname, budget)]
  • 39. Entity vs. Relationship [ER diagram: Employee (ssn, name, lot), relationship manages, Department (did, dname, budget), plus Mgr_appt (Appt num, since, DBudget)]
  • 40. Binary vs. Ternary Relationships [ER diagram: ternary relationship covers among Employee (ssn, name, lot), Dependent (pname, age) and Policy (policyid, cost)]
  • 41. Binary vs. Ternary Relationships: Better Design [ER diagram: Employee (ssn, name, lot), Dependent (pname, age), Policy (policyid, cost); binary relationships purchaser (Employee, Policy) and Beneficiary (Dependent, Policy)]
  • 42. Constraints Beyond the ER Model: Some constraints cannot be captured in ER diagrams: functional dependencies, inclusion dependencies, general constraints.
  • 44. Example to Start With: COMPANY is an example database application that serves to illustrate the ER model concepts and their use in schema design. The following requirements were collected from the client.
  • 45. Analysis: Company: The company is organized into departments. Each department has a name, a number and a manager who manages the department. The company keeps track of the date on which that employee began managing the department. A department may have several locations.
  • 46. Analysis: Department: A department controls a number of projects, each of which has a unique name, a number and a single location. Employee: name, age, gender, birth date, SSN, address, salary. An employee is assigned to one department and may work on several projects, which need not be controlled by that department. The number of hours per week that an employee works on each project is also tracked.
  • 47. Analysis: Keep track of the dependents of each employee for insurance policies: for each dependent we keep the first name, gender, date of birth and relationship to the employee.
  • 48. Initial entity types:
      DEPARTMENT (Name, Number, {Locations}, Manager, Start Date)
      PROJECT (Name, Number, Location, Controlling Department)
      EMPLOYEE (Name (Fname, Lname), SSN, Gender, Address, Salary, Birthdate, Department, Supervisor, Workson (Project, Hrs))
      DEPENDENT (Employee, Name, Gender, Birthdate, Relationship)
  • 49. Example: Manage: Department and Employee; partial participation; relationship attribute: StartDate. Works For: Department and Employee; total participation.
  • 50. Example: Control: Department, Project; partial participation from Department; total participation from Project; Control Department is a RKA. Supervisor: Employee, Employee; partial and recursive.
  • 51. Example: Works-On: Project, Employee; total participation; Hours Worked is a RKA. Dependents of: Employee, Dependent; Dependent is a weak entity; participation of Dependent is total, that of Employee is partial.
  • 52. One Possible Mapping of the Problem Statement [ER diagram: Department (Name, No, Loc), Employee (Name (Fname, Lname), SSN, Sex, Address, Sal, Bdate), Project (Name, No, Loc), Dependent (Name, Sex, Bdate, Relationship); relationships Works For, Manages (Sdate), Controls, Works On (Hours), Supervises, Depend On]
  • 58. Normalization and Normal Forms: Normalization: decomposing a larger, complex table into several smaller, simpler ones; moving from a lower normal form to a higher normal form. Normal Forms: First Normal Form (1NF); Second Normal Form (2NF); Third Normal Form (3NF); higher normal forms (BCNF, 4NF, 5NF, ...). In practice, 3NF is often good enough.
  • 59. Why Normal Forms: The first question to ask is whether any refinement is needed! If a relation is in a certain normal form (BCNF, 3NF etc.), it is known that certain kinds of problems are avoided or minimized. This can be used to help us decide whether decomposing the relation will help.
  • 60. The Evils of Redundancy: Redundancy is at the root of several problems associated with relational schemas: it wastes storage and, more seriously, causes insert, update and delete anomalies. Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD).
  • 61. Refining an ER Diagram - Before [ER diagram: Employee (ssn, name, lot), Works_in (since), Department (did, dname, budget)]
  • 62. Refining an ER Diagram - After [ER diagram: Employee (ssn, name), Works_in (since), Department (did, dname, budget, lot)]
  • 63. First Normal Form: A table is in 1NF if every row contains exactly one value for each attribute. 1NF disallows multivalued attributes, composite attributes and their combinations. 1NF states that the domains of attributes must include only atomic (simple, indivisible) values and that the value of any attribute in a tuple must be a single value from the domain of that attribute. By definition, any relational table must be in 1NF.
  • 64. Functional Dependencies (FDs): FDs provide a formal mechanism to express constraints between attributes. Given a relation R, attribute Y of R is functionally dependent on attribute X of R if and only if each X-value in R has associated with it precisely one Y-value in R.
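The definition above can be tested directly against a relation instance. The following is a minimal sketch, not a method taken from the slides: the helper name `fd_holds` is hypothetical, and the rows are the S, N, L, R, W, H values shown on slide 67. Keep in mind that an instance can only refute an FD; whether an FD really holds is a statement about all legal instances and comes from the application's rules.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every lhs-value in `rows` is paired with exactly one rhs-value."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Rows from the slide 67 example (column letters as on the slide).
HOURLY_EMPS = [
    {"S": "123-22-3666", "N": "Attishoo", "L": 48, "R": 8, "W": 10, "H": 40},
    {"S": "231-31-5368", "N": "Smiley", "L": 22, "R": 8, "W": 10, "H": 30},
    {"S": "131-24-3650", "N": "Smethurst", "L": 35, "R": 5, "W": 7, "H": 30},
    {"S": "434-26-3751", "N": "Guldu", "L": 35, "R": 5, "W": 7, "H": 32},
    {"S": "612-67-4134", "N": "Madayan", "L": 35, "R": 8, "W": 10, "H": 40},
]

r_determines_w = fd_holds(HOURLY_EMPS, ["R"], ["W"])  # each R has one W
l_determines_r = fd_holds(HOURLY_EMPS, ["L"], ["R"])  # L = 35 appears with R = 5 and R = 8
print(r_determines_w, l_determines_r)
```

In this instance R → W is not contradicted, while L → R is, because the value L = 35 occurs with two different R values.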
  • 65. Full Dependency: An FD X → Y is a full functional dependency if removal of any attribute A from X means that the dependency no longer holds.
  • 66. Partial Dependency: An FD X → Y is a partial dependency if there is some attribute A ∈ X that can be removed from X and the dependency will still hold.
  • 67. Example: Constraints on Entity Set
      Original relation:
        S           | N         | L  | R | W  | H
        123-22-3666 | Attishoo  | 48 | 8 | 10 | 40
        231-31-5368 | Smiley    | 22 | 8 | 10 | 30
        131-24-3650 | Smethurst | 35 | 5 | 7  | 30
        434-26-3751 | Guldu     | 35 | 5 | 7  | 32
        612-67-4134 | Madayan   | 35 | 8 | 10 | 40
      Decomposed into:
        S           | N         | L  | R | H
        123-22-3666 | Attishoo  | 48 | 8 | 40
        231-31-5368 | Smiley    | 22 | 8 | 30
        131-24-3650 | Smethurst | 35 | 5 | 30
        434-26-3751 | Guldu     | 35 | 5 | 32
        612-67-4134 | Madayan   | 35 | 8 | 40
      and
        R | W
        5 | 7
        8 | 10
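The decomposition on this slide can be reproduced with two projections, and joining the pieces back on R shows that nothing is lost or invented. This is a small sketch using plain Python sets; the variable names are illustrative, and the tuples follow the slide's column order S, N, L, R, W, H.

```python
# Tuples in the slide's column order: (S, N, L, R, W, H).
HOURLY_EMPS = {
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
}

# Projection onto S, N, L, R, H (the W column is dropped).
hourly_emps2 = {(s, n, lot, r, h) for (s, n, lot, r, w, h) in HOURLY_EMPS}

# Projection onto R, W; the set removes the repeated (R, W) pairs.
wages = {(r, w) for (s, n, lot, r, w, h) in HOURLY_EMPS}

# Natural join of the two projections on the shared column R.
rejoined = {
    (s, n, lot, r, w, h)
    for (s, n, lot, r, h) in hourly_emps2
    for (r2, w) in wages
    if r == r2
}

print(sorted(wages))
print(rejoined == HOURLY_EMPS)
```

The join reproduces the original rows because each R value is paired with a single W value in the `wages` projection, so no spurious combinations appear. Each (R, W) pair is now stored once rather than once per employee.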
  • 68. Second Normal Form (2NF): A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is fully functionally dependent on the primary key of R. 2NF prohibits partial dependencies.
  • 69. 2NF: An Example: Emp {Eno, Dept, ProjCode, Hours}. Primary key: {Eno, ProjCode}. FDs: {Eno} -> {Dept}, {Eno, ProjCode} -> {Hours}. Test of 2NF: {Eno} -> {Dept} is a partial dependency, so Emp is in 1NF but not in 2NF. Decomposition: Emp {Eno, Dept}, Proj {Eno, ProjCode, Hours}.
  • 70. Transitive Dependency: An FD X → Y in a relation schema R is a transitive dependency if there is a set of attributes Z that is not a subset of any key of R, and both X → Z and Z → Y hold.
  • 71. Third Normal Form: A relation schema R is in 3NF if it is in 2NF and no nonprime attribute of R is transitively dependent on the primary key. 3NF means that each non-key attribute value in any tuple is truly dependent on the primary key and not even partially on other attributes. 3NF prohibits transitive dependencies.
  • 72. 3NF: An Example: Emp {Eno, Dept, Dept_Head}. Primary key: {Eno}. FDs: {Eno} -> {Dept}, {Dept} -> {Dept_Head}. Test of 3NF: {Eno} -> {Dept} -> {Dept_Head} is a transitive dependency, so Emp is in 2NF but not in 3NF. Decomposition: Emp {Eno, Dept}, Dept {Dept, Dept_Head}.
  • 73. Boyce-Codd Normal Form: The motivation for BCNF is that 3NF does not satisfactorily handle the case of a relation possessing two or more composite or overlapping candidate keys.
  • 74. BCNF (Boyce-Codd Normal Form): A relation is said to be in Boyce-Codd Normal Form (BCNF) if and only if every determinant is a candidate key.
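The BCNF condition can be checked mechanically once the FDs are written down. The sketch below is an illustration under stated assumptions, not the deck's own procedure: it uses the standard attribute-closure computation, and it treats a determinant as acceptable when its closure covers the whole relation (that is, when it is a superkey), which is the usual way the test is stated for non-trivial FDs. The function names `closure` and `bcnf_violations` are hypothetical, and the example relation is Emp from slide 72.

```python
def closure(attrs, fds):
    """All attributes determined by `attrs` under the FDs (list of (lhs, rhs) tuples)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Non-trivial FDs whose determinant does not determine every attribute."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


# Emp {Eno, Dept, Dept_Head} with the FDs from the 3NF example.
EMP = ("Eno", "Dept", "Dept_Head")
EMP_FDS = [(("Eno",), ("Dept",)), (("Dept",), ("Dept_Head",))]
violations = bcnf_violations(EMP, EMP_FDS)
print(violations)

# The two relations produced by the decomposition on slide 72,
# each listed with the FD that applies to it.
emp_ok = bcnf_violations(("Eno", "Dept"), [(("Eno",), ("Dept",))])
dept_ok = bcnf_violations(("Dept", "Dept_Head"), [(("Dept",), ("Dept_Head",))])
print(emp_ok, dept_ok)
```

For the original Emp, the determinant Dept fails the test because it does not determine Eno; after the decomposition, each remaining determinant is a key of its own relation.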

Editor's Notes

  1. Table of Contents 1. Introduction to Database Management Systems (DBMS) (Page : 3-16) 1.1 Database Management System: Definitions 1.2 DBMS 1.3 Benefits of database approach 1.4 DBMS functions 1.5 Database System 1.6 Data Model 1.7 Database Architecture 1.8 An Example of the Three Levels 1.9 Schema 1.10 Data Independence 1.11 Types Of Database Models 1.12 Database Design Phases 2. Introduction to RDBMS (Page : 17-24 ) 2.1 Definition: RDBMS 2.2 Features Of an RDBMS 2.3 Some Important Terms 2.4 Properties of Relations 2.5 Keys 2.6 Referential Integrity 2.10 Summary 3. Relational Algebra(Page : 25-36) 3.1 Relational Query Languages 3.2 Example Instances 3.3 Relational Algebra 3.4 Projection 3.5 Selection 3.6 Union, Intersection, Set Difference 3.7 Cross Product 3.8 Joins 3.9 Equi-Joins 3.10 Division 3.11 Summary 4. Introduction to Query Optimization(Page : 37-43) 4.1 Processing a high-level query 4.2 Techniques for Query Optimization 4.3 Motivating Examples 4.2 Summary
  2. 5. Conceptual Design Using The Entity-Relationship Model (Page : 44-69) 5.1 Overview Of Database Design 5.2 E-R Modeling 5.3 Graphical Representation 5.4 Types Of Relationships 5.5 E-R Diagram: Some Examples 5.6 Summary and Case Studies 6. Schema Refinement and Normalization (Page : 70-95) 6.1 Normalization and Normal Forms 6.2 Why Normal Forms 6.3 The Evils Of Redundancy 6.4 Refining an ER Diagram 6.5 First Normal Form 6.6 Functional Dependencies 6.7 Example: Constraints On Entity Set 6.8 Second Normal Form 6.9 Transitive Dependency 6.10 Third Normal Form 6.11 Boyce Codd Normal Form (BCNF) 6.12 Decomposition of a Relation Scheme 6.13 Lossless Join Decompositions 6.14 Summary and Examples 7. Transaction, Concurrency Control and Recovery (Page : 96-116) 7.1 Transactions 7.2 The ACID Properties 7.3 Why Have Concurrent Processes? 7.4 Schedules 7.5 Serializable Schedules 7.6 Serializability Violations 7.7 Cascading Aborts 7.8 Recoverable Schedules 7.9 Locking: A Technique For Concurrency Control 7.10 Two-Phase Locking 7.11 Handling A Lock Request 7.11 Recovery 7.12 Logging 7.13 Handling the Buffer Pool 7.14 Write-Ahead Logging 7.15 Checkpoints in the System Log 7.16 Summary Bibliographic Reference (Page : 117)
  3. Topics Covered : Database Management System: Definitions DBMS Benefits of database approach DBMS functions Database System Data Model Database Architecture An Example of the Three Levels Schema Data Independence Types Of Database Models Database Design Phases
  4. Modern day Computer-based Information Systems (IS) are capable of serving a variety of complex tasks in a coordinated manner. Such systems handle large volumes of data, multiple users and several applications for activities occurring in a central and/ or distributed environment. The heart of an IS is Database Management. This is because most IS have to handle massive amounts of data. This core module of an IS is called as Database Management System (DBMS). A DBMS provides for storage, retrieval and updation of data in an organized manner. An Example: Consider the situation in a library. Here, we have data corresponding to books, authors, suppliers, borrowers, etc. The total volume of data stored and handled in a library may be quite large. The Library DBMS may require several operations such as issue, return or purchase of books; handle queries relating to book information, borrowing information, etc. Moreover, there are different types of users who operate various stages or activities. For example, a borrower may merely view certain information, whereas an issuer may be allowed to update the status of a book during issue or return. The Library staff may on the other hand add new books, their supplier, price and other information to the database. Each user category has a different access right on both the data as well as the processing capabilities. Multiple users may concurrently operate the Library DBMS performing several tasks at the same time. They may even try to access the same data simultaneously. It is the job of a DBMS to handle the data and its processing in an integrated, coordinated and consistent manner. Finally, the Library DBMS must have mechanisms to handle system failure (e.g., failure of power, disk crash, etc.) so that the database can be recovered to a consistent state.
  5. A database management system (DBMS) is a collection of programs that facilitates the process of defining, constructing and manipulating databases. Defining a database involves specifying the types of data to be stored in the database. Constructing the database is the process of storing the data. Manipulating a database includes querying the database, updating the database and generating reports from the data. A DBMS does the following: Adding new, empty files to the database Inserting new data into existing files Retrieving data from existing files Updating data in existing files Deleting data from existing files Removing existing files, empty or otherwise, from the database
  6. DBMS Functions : Data Definition Data Manipulation Data Security and Integrity Data Recovery and Concurrency Data Dictionary Performance
  7. A database management system is a complex piece of software that usually consists of a number of modules. The DBMS may be considered as an agent that allows communication between the various types of users with the physical database and the operating system without the users being aware of every detail of how it is done. To enable the DBMS to fulfil its tasks, the database management system must maintain information about the data itself that is stored in the system. This information would normally include what data is stored, how it is stored, who has access to what parts of it and so on. The information (data) about the data in a database is called the metadata. In addition to information listed above, some information regarding the use of a database is often collected to monitor the system's performance. This metadata helps management in maintaining an effective and efficient database system. Three broad classes of users Application programmers: Responsible for writing application programs that use the database End users: Interact with the system from workstations or terminals. A given end user can access the database via one of the applications, or can use an interface provided as an integral part of the database system software (such interfaces are also supported by means of applications, of course, but those applications are built-in, not user-written, e.g., query language processor) Database Administrator (DBA): Creates the actual database and implements technical controls needed to enforce various policy decisions. The DBA is also responsible for ensuring that the system operates with adequate performance and for providing a variety of other related technical services
  8. One fundamental characteristic of the database approach is that it provides some level of data abstraction by hiding details of data storage that are not needed by most database users. A data model is the main tool for providing this abstraction. A data model is a set of concepts that can be used to describe the structure of a database. It is a collection of high-level data description constructs that hide many low-level storage details. Categories of Data Models : Many data models have been proposed. We can categorize data models based on the types of concepts they provide to describe the data structure. High Level or conceptual data models: provide concepts that are close to the way many users perceive data. Use concepts such as entities, attributes, and relationships, where Entity represents a real world object (e.g., student, employee) or concepts (e.g., course, company), Attribute represents properties that describes objects (e.g., color, name) while Relationships represent an interaction or links among entities (e.g., works-on, is-a, has, etc.) Low-level or physical data models: provide concepts that describe the details of how data is stored in the computer. Concepts provided by low-level data models are generally meant for computer specialists, not for typical end users. Represent information such as record formats, record orderings, and access paths (structure that makes the search for particular database records efficient i.e. indexing) Representational or implementation: Between above two extremes is a class of representational (or implementation) data models, which provide concepts that may be understood by end users but that are not too far removed from the way data is organized within the computer. Representational data models hide some details of data storage but can be implemented on a computer system in a direct way.
  9. Three important characteristics of the database approach are (a) insulation of programs and data (program-data and program-operation independence), (b) support of multiple user views, and (c) use of a catalog to store the database description. The three-schema architecture was proposed to achieve these characteristics. The three levels of architecture: The goal of the three-schema architecture is to separate the user applications from the physical database. The internal level is the one closest to the physical storage, i.e., it is the one concerned with the way data is physically stored. The external level is the one closest to the user, i.e., it is the one concerned with the way data is viewed by individual users. The conceptual level is a level of indirection between the other two. There will be many distinct external views, each consisting of a more or less abstract representation of some portion of the total database, and there will be one conceptual view, consisting of a similarly abstract representation of the database in its entirety. Likewise there will be precisely one internal view, representing the total database as physically stored.
  10. Mappings The conceptual/internal mapping : defines the correspondence between the conceptual view and the stored database; it specifies how conceptual records and fields are represented at the internal level The external/conceptual mapping : defines the correspondence between a particular external view and the conceptual view
  11. A description of data in terms of a data model is called a schema. The description of a database is called database schema, which is specified during database design and is not expected to change frequently. The Internal View/ Schema : The internal view (or stored database) is a low-level representation of the entire database. The internal view is defined by the internal schema, which defines the various stored record types and specified what indexes exist, how stored fields are represented and what physical sequence the stored records are in, etc. The Conceptual View / Schema : The conceptual view is a representation of the entire content of the database, in a form that is somewhat abstract in comparison with the way in which the data is physically stored. The conceptual view is defined by means of the conceptual schema, which includes definitions of each of the various conceptual record types. The External View / Schema : Each external view is defined by means of an external schema. External schema consists of definitions of each of the various external record types in that external view. There must be a definition of the mapping between the external schema and the underlying conceptual schema.
  12. The three level database architecture allows a clear separation of the information meaning (conceptual view) from the external data representation and from the physical data structure layout. A database system that is able to separate the three different views of data is likely to be flexible and adaptable. This flexibility and adaptability is data independence. Physical data independence: The separation of the conceptual view from the internal view enables us to provide a logical description of the database without the need to specify physical structures. This is often called physical data independence. Logical data independence: Separating the external views from the conceptual view enables us to change the conceptual view without affecting the external views. This separation is sometimes called logical data independence. Functions of the DBA (Database Administrator): Defining the conceptual schema -- conceptual database design Defining the internal schema -- physical database design and define the associated mapping between the internal and conceptual schemas Liaison with users Defining security and integrity rules Defining backup and recovery procedures Monitoring performance and responding to changing requirements
  13. The most well-known record-based models are the relational model, the network model and the hierarchical model. Relational model: In this model, each database item is viewed as a record with attributes. A set of records with similar attributes is called a table. Most of the popular commercial DBMS products like Oracle, Sybase, MySQL, etc. are based on relational model. Network model: represents data as record types. However, unlike the relational model, here we have explicit linkages (expressed in the form of pointers) which relate various records. Each record has a link field corresponding to every relationship which it participates in. IDS (Integrated Data Store) is one of the DBMS product based on network models. Hierarchical Model: represents data as hierarchical tree. This is a special kind of a network model in which the relationship is essentially a tree-like structure, where one parent may have many children but one child can not have more than one parent. The relationship borrower to books in a library system satisfies this condition. One of the popular DBMS based on hierarchical model is Information Management System (IMS) from IBM. Object Oriented model: represents DB in terms of objects, their attributes, and their behaviors.
  14. The four phases in designing any database system are: 1. Formulation of information requirements and analysis phase: This phase is also called the feasibility phase. In this phase, through interviews and a review of all related documents and policies in the organization, the following items are identified: (a) a clear and concise definition of the problem, (b) local dependency lists, (c) local dependency diagrams, (d) the local schema. 2. Logical schema design phase: In this phase the following are performed: (a) consolidation of dependency lists, (b) consolidation of the logical schema. The output of this phase is a logical schema that is independent of all computer hardware and software systems. 3. Implementation design phase: In this phase the logical schema designed in the logical design phase is modified to fit the specific data model, hardware and software system that the designer wants to use. This new schema is called the implementation schema. 4. Physical design phase: In this phase the implementation schema designed in the implementation phase is programmed using the DDL (Data Definition Language) or any other software language available to the programmer.
  15. Topics Covered : Definition: RDBMS Features of an RDBMS Some Important Terms Properties of Relations Keys Referential Integrity Summary
  16. Domain : An attribute of an entity set has a particular value. The set of possible values that a given attribute can have is called its domain. For example, the set of values that the attribute EMPLOYEE.id can assume is a positive integer of 5 digits. Primary Key : A unique identifier for the table (a column or a column combination with the property that at any given time no two rows of the table contain the same value in that column or column combination)
  17. Key: An attribute or set of attributes whose values uniquely identify each entity in an entity set is called a key for that entity set. Super Key: If we add additional attributes to a key, the resulting combination would still uniquely identify an instance of the entity set. Such augmented keys are called super keys. Candidate Keys: A minimal super key, i.e. one from which no attribute can be removed without losing uniqueness, is a candidate key. There may be two or more attributes or combinations of attributes that uniquely identify an instance of an entity set; all of these are candidate keys. Primary Key: In such a case we must decide which of the candidate keys will be used as the primary key. The remaining candidate keys would be considered alternate keys. Secondary Key: A secondary key is an attribute or combination of attributes that may not be a candidate key but that classifies the entity set on a particular characteristic. A case in point is the entity set EMPLOYEE having the attribute department, which identifies by its value all instances of EMPLOYEE who belong to a given department. Any key consisting of a single attribute is called a simple key, while one consisting of a combination of attributes is called a composite key.
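The distinction between super keys and minimal (candidate) keys can be explored on a small instance. The sketch below is only an illustration: the helper names `is_superkey` and `candidate_keys` are hypothetical, and the rows are the three Student tuples from slide 19. An instance can show that a set of attributes is not a key, but it cannot prove that one is; the real keys are declared by the designer from the meaning of the data.

```python
from itertools import combinations

ATTRS = ("sid", "name", "login", "age", "gpa")
STUDENTS = [
    ("53666", "Jones", "Jones@cs", 18, 3.4),
    ("53688", "Smith", "Smith@eecs", 18, 3.2),
    ("53650", "Smith", "Smith@math", 19, 3.8),
]


def is_superkey(attrs, rows=STUDENTS):
    """True if no two rows agree on all of the given attributes."""
    idx = [ATTRS.index(a) for a in attrs]
    projected = {tuple(row[i] for i in idx) for row in rows}
    return len(projected) == len(rows)


def candidate_keys(rows=STUDENTS):
    """Minimal attribute sets that are unique in this particular instance."""
    keys = []
    # Smaller sets are tried first, so any superset of a found key is skipped.
    for size in range(1, len(ATTRS) + 1):
        for combo in combinations(ATTRS, size):
            if is_superkey(combo, rows) and not any(set(k) <= set(combo) for k in keys):
                keys.append(combo)
    return keys


print(is_superkey(("sid", "gpa")))  # unique, but not minimal: sid alone suffices
print(is_superkey(("name",)))       # two students are called Smith
print(candidate_keys())
```

On these three rows the search also reports `gpa` and the pair (`name`, `age`) as minimal unique sets. That is a coincidence of the sample rather than a rule of the application, which is why key choice is a design decision and not something read off the data.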
  18. A set of fields is a key for a relation if: 1. no two distinct tuples can have the same values in all key fields, and 2. this is not true for any subset of the key. If there is more than one key for a relation, one of the keys is chosen (by the DBA) to be the primary key. E.g., sid is a key for Students. (What about name?) The set {sid, gpa} is a superkey. There may be many candidate keys (specified using UNIQUE), one of which is chosen as the primary key. Foreign key: a set of fields in one relation that is used to 'refer' to a tuple in another relation. (It must correspond to the primary key of the second relation.) It is like a 'logical pointer'. E.g., sid is a foreign key referring to Students: Enrolled (sid: string, cid: string, grade: string). If all foreign key constraints are enforced, referential integrity is achieved, i.e., there are no dangling references. Enforcing Referential Integrity: Consider Students and Enrolled; sid in Enrolled is a foreign key that references Students. What should be done if an Enrolled tuple with a non-existent student id is inserted? (Reject it!) What should be done if a Students tuple is deleted? The options are: also delete all Enrolled tuples that refer to it; disallow deletion of a Students tuple that is referred to; set sid in Enrolled tuples that refer to it to a default sid; (in SQL, also) set sid in Enrolled tuples that refer to it to a special value null, denoting 'unknown' or 'inapplicable'. The options are similar if the primary key of a Students tuple is updated.
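The enforcement options described above can be tried out with SQLite through Python's standard `sqlite3` module. This is a minimal sketch under a few assumptions: the column types are chosen for the example, the ON DELETE CASCADE clause picks the first of the listed deletion options, and the student id 99999 is made up to stand for a non-existent student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)
conn.execute("""
    CREATE TABLE Enrolled (
        sid   TEXT,
        cid   TEXT,
        grade TEXT,
        -- Deleting a student also deletes the enrolments that refer to it.
        FOREIGN KEY (sid) REFERENCES Students (sid) ON DELETE CASCADE
    )
""")

conn.execute("INSERT INTO Students VALUES ('53666', 'Jones', 'Jones@cs', 18, 3.4)")
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'carnatic101', 'C')")

# Inserting an Enrolled tuple for a student id that does not exist.
try:
    conn.execute("INSERT INTO Enrolled VALUES ('99999', 'history105', 'B')")
    dangling_insert = "accepted"
except sqlite3.IntegrityError:
    dangling_insert = "rejected"

# Deleting the referenced student cascades to Enrolled.
conn.execute("DELETE FROM Students WHERE sid = '53666'")
remaining = conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]

print(dangling_insert, remaining)
```

Replacing `ON DELETE CASCADE` with `ON DELETE RESTRICT`, `ON DELETE SET DEFAULT` or `ON DELETE SET NULL` would select the other options from the list.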
  19. Summary : A tabular representation of data. Simple and intuitive, currently the most widely used. Integrity constraints can be specified by the DBA, based on application semantics. DBMS checks for violations. – Two important Integrity Constraints: primary and foreign keys – In addition, we always have domain constraints. Powerful and natural query languages exist.
  20. Topics Covered : Database Design E-R Modeling Example E-R Diagrams Summary Case Studies
  21. The database design can be divided into following steps: Requirement Analysis: First of all, we should be clear about what the users want from database, what data to be stored, and operations to be performed. Conceptual Design: The information gathered in the requirements analysis step is used to develop a high level description of the data to be stored in the database. In this step we have to address the following: -What are the entities and relationships in the enterprise? -What information about these entities and relationships should we store in the database? -What are the integrity constraints or business rules that hold? This step is often carried out using the ER model, or a similar high-level model. A database `schema’ in the ER Model can be represented pictorially ( ER diagrams ). Logical Database Design: We must choose a DBMS to implement our database design, and convert the conceptual database design into a database schema in the data model of the chosen DBMS. For example, we can map an ER diagram into a relational database schema. Schema Refinement (Normalization): Check relational schema for redundancies and related anomalies. Physical Database Design and Tuning : Consider typical workloads and further refine the database design.
  22. The basic design process is divided into four phases: 1. Requirements Collection & Analysis: The database designers interview prospective database users to understand and document their data requirements. The result of this step is a concisely written set of user requirements. The functional requirements are specified as well: these are the user-defined operations that will be applied to the database, and they include both retrievals and updates. 2. Conceptual Design: This is a concise description of the data requirements of the users and includes detailed descriptions of the entity types, relationships and constraints, expressed using the concepts provided by the high-level data model. 3. Logical Design: The data model mapping is identified here (RDBMS / DBMS / object model). 4. Physical Design: The internal storage structures, access paths and file organizations for the database files are specified. In parallel with these activities, application programs are designed and implemented as database transactions corresponding to the high-level specifications.
  23. Entity : An Entity is a thing that exists and is distinguishable. For example, each chair is an entity. So is each person and each automobile. Entities can have concrete existence or constitute ideas or concepts. Concepts like love and hate are entities. Entity Set : A group of similar entities forms an entity set. Examples of entity sets are: 1. All Persons 2. All Automobiles 3. All Emotions Attributes : Attributes are the properties that characterize an entity set. For Example, employees of an organization are modeled by the entity set EMPLOYEE. We must include in the model the properties of the employees that may be useful to the organization. Some of these properties are name, address, skill etc. Relationship: It is an association between two or more entities. For example, we may have the relationship that an employee works in a department.
  24. There are two types of entities: regular and weak. A regular (independent) entity does not depend on any other entity for its existence. For example, Employee is a regular entity. A regular entity is depicted using a rectangle. An entity whose existence depends on the existence of another entity is called a weak (or dependent) entity. For example, the dependent of an employee is a weak entity, whose existence depends on the entity Employee. A dependent entity is depicted in a double-lined box, or a darkened rectangle. Similarly, relationships can also be regular or weak.
  25. Entity: Real-world object distinguishable from other objects. It could be an object, place, person, concept or activity about which an enterprise records data. To qualify as an entity, something should – Have an independent existence – Be of interest to us. An entity is described (in the DB) using a set of attributes. Entity Set: A collection of similar entities, e.g. all employees. – All entities in an entity set have the same set of attributes. (Until we consider ISA hierarchies, anyway!) – Each entity set has a key. – Each attribute has a domain. – An entity set can be mapped to a relation easily.
  26. A relationship is defined as an association among entities. For example, there is a relationship between students and courses, which can be named 'enrols in'. A relationship set is an association of entity sets (e.g. Student-Course) while a relationship instance is an association of entity instances (e.g. Ravi-DBMS). An n-ary relationship set R relates n entity sets E1 ... En; each relationship in R involves entities e1 ∈ E1, ..., en ∈ En. The same entity set could participate in different relationship sets, or in different "roles" in the same set. A relationship is depicted by a diamond, with the name of the relationship type. There are three types of relationships: - One-to-one: One student is issued only one card (and vice versa). - One-to-many (or many-to-one): One student can enrol for only one course, but one course can be offered to many students. - Many-to-many: One student can take many tests, and one test can be taken by many students.
  27. In the above figure, we show the relationship set Works_in, in which each relationship indicates a department in which an employee works. The entities are described by a set of attributes and identified by primary keys (PK). Employee: Attributes: ssn, name, lot; PK: ssn. Department: Attributes: did, dname, budget; PK: did. The entity sets that participate in a relationship set need not be distinct; sometimes a relationship might involve two entities in the same entity set. For example, in the Reports_To relationship set, every relationship is of the form (emp1, emp2). An instance of a relationship set is a set of relationships. Intuitively, an instance can be thought of as a 'snapshot' of the relationship set at some instant in time.
  28. Relationship sets can also have descriptive attributes (e.g., the since attribute of Works_in). A relationship must be uniquely identified by the participating entities, without reference to the descriptive attributes. In the Works_in relationship set, for example, each Works_in relationship must be uniquely identified by the combination of employee ssn and department did. Thus, for a given employee-department pair, we cannot have more than one associated since value. In translating a relationship set to a relation, the attributes of the relation must therefore include: - Keys for each participating entity set (as foreign keys). This set of attributes forms a superkey for the relation. - All descriptive attributes.
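The translation rule above can be tried out with a small sketch. It is only an illustration: it uses Python's built-in sqlite3 module, and the sample rows (the ssn, department and dates) are made up for the example. The column names follow the Employee and Department attributes from the earlier slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
CREATE TABLE Employees   (ssn CHAR(11) PRIMARY KEY, name CHAR(20), lot INTEGER);
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname CHAR(20), budget REAL);
CREATE TABLE Works_In (
    ssn   CHAR(11),
    did   INTEGER,
    since DATE,                      -- descriptive attribute, not part of the key
    PRIMARY KEY (ssn, did),          -- the participating entities identify the relationship
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments
);
-- Made-up sample data, for illustration only
INSERT INTO Employees   VALUES ('123-22-3666', 'Ravi', 48);
INSERT INTO Departments VALUES (51, 'Sales', 100000);
INSERT INTO Works_In    VALUES ('123-22-3666', 51, '1991-01-01');
""")

def insert_is_rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# A second 'since' value for the same employee-department pair is refused.
duplicate_pair = insert_is_rejected(
    "INSERT INTO Works_In VALUES ('123-22-3666', 51, '1995-05-05')")
# A relationship that points at a non-existent department is refused as well.
unknown_dept = insert_is_rejected(
    "INSERT INTO Works_In VALUES ('123-22-3666', 99, '1995-05-05')")
```

Both inserts fail: the first because (ssn, did) is the primary key, the second because did is a foreign key into Departments.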
  29. A key constraint between an entity set S and a relationship set restricts instances of the relationship set by requiring that each entity of S participate in at most one relationship. Consider Manages: each department has at most one manager, according to the key constraint on the Manages relationship (in contrast, the Works_In relationship of the earlier slide allows an employee to work in many departments and a department to have many employees). The arrow from Department to Manages indicates that each Department entity appears in at most one Manages relationship in any allowable instance of Manages. Thus, given a Department entity, we can uniquely determine the Manages relationship in which it appears. Translating ER Diagrams with Key Constraints: Map the relationship to a table: note that did is the key now! – Separate tables for Employees and Departments. Since each department has a unique manager, we could instead combine Manages and Departments. Manages table with the key constraint: CREATE TABLE Manages (ssn CHAR(11), did INTEGER, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees, FOREIGN KEY (did) REFERENCES Departments)
  30. Ternary Relationship: A relationship set involving three entity sets is known as a ternary relationship, e.g. a Works_in relationship involving the Employee, Department and Location entity sets. In the above slide, we show a ternary relationship with a key constraint. The key constraint indicates that each employee works in at most one department, and at a single location. Notice that each department can be associated with several employees and locations, and each location can be associated with several departments and employees; however, each employee is associated with a single department and a single location.
  31. The key constraint on Manages tells us that a Department has at most one Manager (indicated by the arrow). Let us now ask: does every department have a manager? If so, this is a participation constraint: the participation of Departments in Manages is said to be total (vs. partial). Total participation is indicated by a dark line between entity and relationship. A participation that is not total is said to be partial, e.g. the participation of Employee in Manages is partial. The participation constraint specifies whether the existence of an entity depends on its being related to another entity via the relationship type. A participation constraint between an entity set S and a relationship set restricts instances of the relationship set by requiring that each entity of S participate in at least one relationship. Every did value in the Department table must appear in a row of the Manages table (with a non-null ssn value!). Similarly, every ssn value in the Employee table must appear in a row of the Works_in table. Participation Constraints in SQL: We can capture participation constraints involving one entity set in a binary relationship, but little else (without resorting to CHECK constraints). CREATE TABLE Dept_Mgr (did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11) NOT NULL, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees ON DELETE NO ACTION)
  32. A weak entity's existence is dependent on another (owner) entity. Hence a weak entity will not have its own key. It can be identified uniquely only by considering the primary key of its owner entity. – Owner entity set and weak entity set must participate in a one-to-many relationship set (1 owner, many weak entities). – Weak entity set must have total participation in this identifying relationship set. Translating Weak Entity Sets: Weak entity set and identifying relationship set are translated into a single table. – When the owner entity is deleted, all owned weak entities must also be deleted. E.g. if the employee quits, any policy owned by the employee is terminated. All the relevant policy and dependent information is also deleted from the database. To indicate that Dependent is a weak entity and Policy is its identifying relationship, we draw both with dark lines. CREATE TABLE Dep_Policy (pname CHAR(20), age INTEGER, cost REAL, ssn CHAR(11) NOT NULL, PRIMARY KEY (pname, ssn), FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE)
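The cascading delete for a weak entity set can be observed in a short sketch. It assumes Python's sqlite3 module and invented sample rows. The Employees table is reduced to two columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE in SQLite
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(20));
CREATE TABLE Dep_Policy (
    pname CHAR(20),
    age   INTEGER,
    cost  REAL,
    ssn   CHAR(11) NOT NULL,          -- total participation in the identifying relationship
    PRIMARY KEY (pname, ssn),         -- partial key plus the owner's key
    FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE
);
-- Invented sample data: two owners, with the dependent name 'Anu' occurring under both
INSERT INTO Employees  VALUES ('111', 'Ravi'), ('222', 'Asha');
INSERT INTO Dep_Policy VALUES ('Anu', 5, 100.0, '111'),
                              ('Anu', 7, 120.0, '222'),
                              ('Raj', 9, 150.0, '111');
""")

# The employee with ssn '111' quits: the owned dependents go with the owner.
conn.execute("DELETE FROM Employees WHERE ssn = '111'")
remaining = conn.execute("SELECT pname, ssn FROM Dep_Policy").fetchall()
```

Note that pname alone does not identify a dependent here ('Anu' appears under two owners); only the pair (pname, ssn) does.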
  33. As in C++, or other programming languages, attributes are inherited. If we declare A ISA B, every A entity is also considered to be a B entity. (Query answers should reflect this: unlike C++!) Overlap constraints: Can Joe be an Hourly_Emp as well as a Contract_Emp entity? (Allowed/disallowed) Covering constraints: Does every Employee entity also have to be an Hourly_Emp or a Contract_Emp entity? (Yes/no) Reasons for using ISA: – To add descriptive attributes specific to a subclass. – To identify entities that participate in a relationship. Translating ISA Hierarchies to Relations: General approach: – 3 relations: Employee, Hourly_Emp and Contract_Emp. Hourly_Emp: Every employee is recorded in Employee. For hourly employees, extra information is recorded in Hourly_Emp (hourly_wages, hours_worked, ssn); we must delete the Hourly_Emp tuple if the referenced Employee tuple is deleted. Queries involving all employees are easy; those involving just Hourly_Emp require a join to get some attributes. Alternative: Just Hourly_Emp and Contract_Emp. – Hourly_Emp: ssn, name, lot, hourly_wages, hours_worked. – Contract_Emp: ssn, name, lot, contractid. – Each employee must be in one of these two subclasses.
  34. Aggregation: Aggregation is meant to represent a relationship between a whole object and its component parts. It is used when we have to model a relationship involving (entity sets and) a relationship set. – Aggregation allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships. – E.g. a Project is sponsored by a Department. This is a simple relationship. An Employee monitors this Sponsorship (and not Project or Department). This is aggregation. – Monitors is mapped to a table like any other relationship set. Aggregation vs. ternary relationship: Can we express relationships involving other relationships without using aggregation? – The use of aggregation vs. a ternary relationship may be guided by certain integrity constraints. – E.g. we can impose a constraint that each sponsorship is monitored by at most one employee (not possible without aggregation).
  35. Conceptual Design Using the ER Model. Design choices: – Should a concept be modelled as an entity or an attribute? – Should a concept be modelled as an entity or a relationship? – Identifying relationships: Binary or ternary? Aggregation? Entity vs. Attribute: Should address be an attribute of Employees or an entity (connected to Employees by a relationship)? This depends upon the use we want to make of address information, and the semantics of the data: If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued). If the structure (city, street, etc.) is important, e.g., we want to retrieve employees in a given city, address must be modelled as an entity (since attribute values are atomic). Otherwise, address can be used as an attribute of Employee.
  36. Similar to the problem of wanting to record several addresses for an employee: we want to record several values of the descriptive attributes for each instance of this relationship. Consider that an employee works in a given department over more than one period. This possibility is ruled out by the ER diagram’s semantics of previous slide. The problem is that we want to record several values for descriptive attributes for each instance of Works_in relationship. We can address this problem by introducing an entity set called Duration, with attributes from and to.
  37. ER diagram OK if a manager gets a separate discretionary budget for each dept. What if a manager gets a discretionary budget that covers all managed depts? – Redundancy of dbudget, which is stored for each dept managed by the manager. – Misleading: suggests dbudget tied to managed dept.
  38. One of the possible designs to resolve the two issues of the previous ER diagram: We model the appointment as an entity set, say Mgr_appt, and use a ternary relationship, say manages, to relate a manager, an appointment, and a department. The dbudget is now associated with the appointment of the employee as manager of a group of departments. The details of an appointment (such as the discretionary budget) are not repeated for each department that is included in the appointment now, although there is still one Manages relationship instance per such Department.
  39. Above figure models a situation in which an employee can own several policies, each policy can be owned by several employees, and each dependent can be covered by several policies. Suppose we have following constraint: Each policy is owned by just 1 employee – Key constraint on Policy would mean policy can only cover 1 dependent!
  40. The key constraints allow us to combine Purchaser with Policy and Beneficiary with Dependent. Participation constraints lead to NOT NULL constraints. CREATE TABLE Policy (policyid INTEGER, cost REAL, ssn CHAR(11) NOT NULL, PRIMARY KEY (policyid), FOREIGN KEY (ssn) REFERENCES Employee ON DELETE CASCADE); CREATE TABLE Dependent (pname CHAR(20), age INTEGER, policyid INTEGER, PRIMARY KEY (pname, policyid), FOREIGN KEY (policyid) REFERENCES Policy ON DELETE CASCADE)
  41. Constraints in the ER Model: – A lot of data semantics can (and should) be captured. – But some constraints cannot be captured in ER diagrams. Need for further refining the schema: – The relational schema obtained from an ER diagram is a good first step. But ER design is subjective and can't express certain constraints, so this relational schema may need refinement. Functional dependencies: – e.g., a dept can't order two distinct parts from the same supplier. We can't express this with respect to the ternary Contracts relationship. – Normalization refines the ER design by considering FDs. Inclusion dependencies: – Special case: Foreign keys (the ER model can express these). – e.g., at least 1 person must report to each manager. (The set of ssn values in Manages must be a subset of the supervisor_ssn values in Reports_To.) Foreign key? Expressible in the ER model? General constraints: – e.g., a manager's discretionary budget is less than 10% of the combined budget of all departments he or she manages.
  42. Regular Entities: Each regular entity type maps into a base relation. The database will thus contain 5 base relations: DEPT, EMP, Supplier, Part and Project; the primary keys for these relations are DEPT#, EMP#, S#, P# and J#. Weak Entities: The relationship from a weak entity type to the entity type on which it depends is of course a many-to-one relationship. However, the foreign key rules for that relationship must be as follows: DELETE CASCADES, UPDATE CASCADES.
  43. An entity type Department has attributes Name, Number, Location, Manager and Manager Start Date. Location is a multivalued attribute. Name and Number are key attributes, since each was specified to be unique. An entity type Project has attributes Name, Number, Location and Controlling Department. Both Name and Number are key attributes. The Employee entity has attributes Name, SSN (Social Security Number), Gender, Birth Date, Salary and Supervisor. Both name and address are composite in nature. Dependent is a weak entity type with attributes SSN, Name of Dependent, Gender, Date of Birth and Relationship (to the employee). NOTE: This design is called the Chen design for identifying entities before implementing the ER diagram.
  44. The number of people who work at a location can be a derived attribute.
  45. Summary of Conceptual Design: Conceptual design follows requirements analysis. – Yields a high-level description of the data to be stored. The ER model is popular for conceptual design. – Constructs are expressive, close to the way people think about their applications. Basic constructs: entities, relationships, and attributes (of entities and relationships). Some additional constructs: weak entities, ISA hierarchies, and aggregation. Note: There are many variations on the ER model. Summary of ER: Several kinds of integrity constraints can be expressed in the ER model: key constraints, participation constraints, and overlap/covering constraints for ISA hierarchies. Some foreign key constraints are also implicit in the definition of a relationship set. – Some of these constraints can be expressed in SQL only if we use general CHECK constraints or assertions. – Some constraints (notably, functional dependencies) cannot be expressed in the ER model. – Constraints play an important role in determining the best database design for an enterprise. ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise. Common choices include: entity vs. attribute, entity vs. relationship, binary or n-ary relationship, whether or not to use ISA hierarchies, and whether or not to use aggregation. Ensuring good database design: the resulting relational schema should be analyzed and refined further. FD information and normalization techniques are especially useful.
  46. Case Studies: 1. Prescriptions-R-X chain. The Prescriptions-R-X chain of pharmacies has offered to give you a free lifetime supply of medicines if you design its database. Given the rising cost of health care, you agree. Here's the information that you gather: Patients are identified by an SSN, and their names, addresses, and ages must be recorded. Doctors are identified by an SSN. For each doctor, the name, specialty, and years of experience must be recorded. Each pharmaceutical company is identified by name and has a phone number. For each drug, the trade name and formula must be recorded. Each drug is sold by a given pharmaceutical company, and the trade name identifies a drug uniquely from among the products of that company. If a pharmaceutical company is deleted, you need not keep track of its products any longer. Each pharmacy has a name, address, and phone number. Every patient has a primary physician. Every doctor has at least one patient. Each pharmacy sells several drugs and has a price for each. A drug could be sold at several pharmacies, and the price could vary from one pharmacy to another. Doctors prescribe drugs for patients. A doctor could prescribe one or more drugs for several patients, and a patient could obtain prescriptions from several doctors. Each prescription has a date and a quantity associated with it. You can assume that if a doctor prescribes the same drug for the same patient more than once, only the last such prescription needs to be stored. Pharmaceutical companies have long-term contracts with pharmacies. A pharmaceutical company can contract with several pharmacies, and a pharmacy can contract with several pharmaceutical companies. For each contract, you have to store a start date, an end date, and the text of the contract. Pharmacies appoint a supervisor for each contract. There must always be a supervisor for each contract, but the contract supervisor can change over the lifetime of the contract. 1. 
Draw an ER diagram that captures the above information. Identify any constraints that are not captured by the ER diagram. 2. How would your design change if each drug must be sold at a fixed price by all pharmacies? 3. How would your design change if the design requirements change as follows: If a doctor prescribes the same drug for the same patient more than once, several such prescriptions may have to be stored.
  47. 2. Dane County Airport. Computer Sciences Department frequent fliers have been complaining to Dane County Airport officials about the poor organization at the airport. As a result, the officials have decided that all information related to the airport should be organized using a DBMS, and you've been hired to design the database. Your first task is to organize the information about all the airplanes that are stationed and maintained at the airport. The relevant information is as follows: Every airplane has a registration number, and each airplane is of a specific model. The airport accommodates a number of airplane models, and each model is identified by a model number (e.g., DC-10) and has a capacity and a weight. A number of technicians work at the airport. You need to store the name, SSN, address, phone number, and salary of each technician. Each technician is an expert on one or more plane model(s), and his or her expertise may overlap with that of other technicians. This information about technicians must also be recorded. Traffic controllers must have an annual medical examination. For each traffic controller, you must store the date of the most recent exam. All airport employees (including technicians) belong to a union. You must store the union membership number of each employee. You can assume that each employee is uniquely identified by the social security number. The airport has a number of tests that are used periodically to ensure that airplanes are still airworthy. Each test has a Federal Aviation Administration (FAA) test number, a name, and a maximum possible score. The FAA requires the airport to keep track of each time that a given airplane is tested by a given technician using a given test. For each testing event, the information needed is the date, the number of hours the technician spent doing the test, and the score that the airplane received on the test. 1. Draw an ER diagram for the airport database. 
Be sure to indicate the various attributes of each entity and relationship set; also specify the key and participation constraints for each relationship set. Specify any necessary overlap and covering constraints as well (in English). 2. The FAA passes a regulation that tests on a plane must be conducted by a technician who is an expert on that model. How would you express this constraint in the ER diagram? If you cannot express it, explain briefly.
  48. 3. University Database: Consider the following information about a university database: Professors have an SSN, a name, an age, a rank, and a research specialty. Projects have a project number, a sponsor name (e.g., NSF), a starting date, an ending date, and a budget. Graduate students have an SSN, a name, an age, and a degree program (e.g., M.S. or Ph.D.). Each project is managed by one professor (known as the project's principal investigator). Each project is worked on by one or more professors (known as the project's co-investigators). Professors can manage and/or work on multiple projects. Each project is worked on by one or more graduate students (known as the project's research assistants). When graduate students work on a project, a professor must supervise their work on the project. Graduate students can work on multiple projects, in which case they will have a (potentially different) supervisor for each one. Departments have a department number, a department name, and a main office. Departments have a professor (known as the chairman) who runs the department. Professors work in one or more departments, and for each department that they work in, a time percentage is associated with their job. Graduate students have one major department in which they are working on their degree. Each graduate student has another, more senior graduate student (known as a student advisor) who advises him or her on what courses to take. Design and draw an ER diagram that captures the information about the university. Use only the basic ER model here, that is, entities, relationships, and attributes. Be sure to indicate any key and participation constraints.
  49. Topics Covered : Normalization and Normal Forms Why Normal Forms The Evils Of Redundancy Refining an ER Diagram First Normal Form Functional Dependencies Example: Constraints On Entity Set Second Normal Form Transitive Dependency Third Normal Form Boyce Codd Normal Form (BCNF) Decomposition of a Relation Scheme Lossless Join Decompositions Summary and Examples
  50. Normalization is a step-by-step decomposition of complex records into simple records. It results in the formation of tables that satisfy certain specified constraints, and represent certain normal forms. Normalization reduces redundancy using the principle of non-loss decomposition. A fully normalized record consists of - A primary key that identifies an entity - A set of attributes that describe the entity. Several normal forms have been identified, the most important and widely used of which are first normal form, second normal form, third normal form, and Boyce-Codd normal form.
  51. In order to produce good database design, we should ask questions like: 1) Does this design ensure that all database operations will be efficiently performed and that the design does not make the DBMS perform expensive consistency checks which could be avoided? 2) Is the information unnecessarily replicated? Unless these issues are properly handled several difficulties like redundancy and loss of information may arise. There are several methods to avoid the above mentioned problems. One such method is database decomposition through normalization, which tries to minimize redundancy and the efforts of checking of constraints and dependencies.
  52. Redundancy problems associated with relational schemas: – redundant storage, insert/ delete/ update anomalies Integrity constraints, in particular functional dependencies, can be used to identify schemas with such problems and to suggest refinements. Decomposition should be used judiciously: – Is there reason to decompose a relation? – What problems (if any) does the decomposition cause?
  53. Consider the above ER diagram, with the Works_in relationship having a key constraint indicating that an employee can work in at most one department. The ER diagram can be translated into two relations: Worker (ssn, name, lot, since, did) and Department (did, dname, budget). – Lots are associated with workers. Suppose all workers in a dept are assigned the same lot: D → L, i.e. did functionally determines lot. This leads to redundancy.
  54. The redundancy in the earlier slide can be fixed by breaking the relation Worker as: Workers (ssn, name, since, did) and Dept_Lots (did, lot). We can fine-tune this: Workers (ssn, name, since, did) and Department (did, dname, budget, lot).
  55. EMP_PROJ = {eno, ename, {pnumber, hours}}, where the nested attribute {pnumber, hours} is multivalued. eno is the primary key. The above relation is not in 1NF. pnumber is the partial key of each nested relation: within each tuple, the nested relation must have unique values of pnumber. Break EMP_PROJ as: EMP_PROJ1(eno, ename) and EMP_PROJ2(eno, pnumber, hours).
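The split into EMP_PROJ1 and EMP_PROJ2 can be sketched in a few lines of Python. The nested rows below are invented sample data, and the function name to_1nf is only a label chosen for this illustration.

```python
def to_1nf(emp_proj):
    """Split nested EMP_PROJ rows into two flat relations."""
    emp_proj1 = set()   # (eno, ename)
    emp_proj2 = set()   # (eno, pnumber, hours)
    for eno, ename, projects in emp_proj:
        emp_proj1.add((eno, ename))
        for pnumber, hours in projects:
            # eno is copied into every flattened row, so (eno, pnumber) becomes the key
            emp_proj2.add((eno, pnumber, hours))
    return emp_proj1, emp_proj2

# Invented sample data: each employee carries a nested list of (pnumber, hours)
nested = [
    (1, "Ravi", [(10, 20.0), (20, 15.0)]),
    (2, "Asha", [(10, 40.0)]),
]
emp_proj1, emp_proj2 = to_1nf(nested)
```

Every value in the two resulting relations is atomic, which is what 1NF asks for.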
  56. Given a relation R, attribute A is functionally dependent on B if each value of B in R is associated with precisely one value of A. We say B functionally determines A and represent it as B → A. This means that there can be no two tuples which have the same value of attribute B and different values of attribute A. An FD is a statement about all allowable relations. – Must be identified based on the semantics of the application. – Given some allowable instance r1 of R, we can check if it violates some FD f, but we cannot tell if f holds over R! K is a candidate key for R means that K → R. – However, K → R does not require K to be minimal! Role of FDs in detecting redundancy: – Consider a relation R with 3 attributes, ABC. No FDs hold: There is no redundancy here. Given A → B: Several tuples could have the same A value, and if so, they'll all have the same B value! Reasoning About FDs: Given some FDs, we can usually infer additional FDs: ssn → did, did → lot implies ssn → lot.
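Both ideas above, checking an instance against an FD and inferring new FDs, fit in a short Python sketch. The function names fd_holds and closure, and the three sample tuples, are assumptions made for this illustration. closure uses the standard attribute-closure procedure.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs (rows are dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value; any other value is a violation
        if seen.setdefault(key, val) != val:
            return False
    return True

def closure(attrs, fds):
    """Attributes determined by attrs under fds, given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Invented instance of Worker, reduced to the attributes of interest
workers = [
    {"ssn": "111", "did": 51, "lot": 48},
    {"ssn": "222", "did": 51, "lot": 48},
    {"ssn": "333", "did": 60, "lot": 22},
]
fds = [({"ssn"}, {"did"}), ({"did"}, {"lot"})]

did_determines_lot = fd_holds(workers, ["did"], ["lot"])   # not violated by this instance
lot_determines_ssn = fd_holds(workers, ["lot"], ["ssn"])   # violated: lot 48 has two ssn values
ssn_closure = closure({"ssn"}, fds)                        # contains lot, so ssn → lot follows
```

A True result from fd_holds only says that this one instance does not violate the FD. Whether the FD holds over R is a question about the application's semantics.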
  57. Full Dependency: An attribute B of a relation R is fully functionally dependent on attribute set A of R if it is functionally dependent on A and not functionally dependent on any proper subset of A. {Eno, Pnumber} → Hours is a full functional dependency: neither Eno → Hours nor Pnumber → Hours holds.
  58. {Eno, Pnumber} → Ename is a partial dependency: Eno → Ename holds.
  59. Consider the relation obtained from Hourly_Emps: – Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked). Notation: We will denote this relation schema by listing the attributes: SNLRWH. – This is really the set of attributes {S, N, L, R, W, H}. – Sometimes, we will refer to all attributes of a relation by using the relation name (e.g., Hourly_Emps for SNLRWH). Some FDs on Hourly_Emps: – ssn is the key: S → SNLRWH – rating determines hrly_wages: R → W. Problems due to R → W: – Update anomaly: Can we change W in just the 1st tuple of SNLRWH? – Insertion anomaly: What if we want to insert an employee and don't know the hourly wage for his rating? – Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5!
  60. General Definition of 2NF: A table is said to be in 2NF when it is in 1NF and every non-prime attribute in the record is functionally dependent upon the whole key, and not just part of the key. The steps for converting a database to 2NF are: - Find and remove attributes that are related to only a part of the key. - Group the removed items in another table. - Assign the new table a key that consists of that part of the old composite key. If a relation is not in 2NF, it can be further normalized into a number of 2NF relations: EP1 (Eno, Pnumber, Hours), EP2 (Eno, Ename), EP3 (Pnumber, Pname, Plocation). EP1, EP2 and EP3 satisfy 2NF.
  61. The data stored in the table Emp{Eno, Dept, ProjCode, Hours} is in 1NF. The primary key here is composite: {Eno, ProjCode}. Not every attribute of this table depends on the whole primary key: Eno + ProjCode functionally determines Hours, but Eno alone functionally determines Dept. Attribute Dept has no dependency on ProjCode. The situation could lead to the following problems: Insertion: The record of an employee cannot be entered until the employee is assigned a project. Updation: For a given employee, the employee code and department are repeated several times. Hence, if an employee is transferred to another department, this change will have to be recorded in every record of the employee. Any omissions will lead to inconsistencies. Deletion: If an employee completes work on a project, the employee's record will be deleted. The information regarding the department the employee belongs to will also be lost. This table should therefore be decomposed without any loss of information as: Emp {Eno, Dept} and Proj {Eno, ProjCode, Hours}.
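That the decomposition into Emp and Proj loses no information can be checked on a sample instance with a small Python sketch. The helper names project and natural_join and the three sample rows are assumptions made for the illustration, not part of the original slides.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each row becomes a tuple of (name, value) pairs."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(left, right):
    """Join two projected relations on the attributes they have in common."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined

# Invented instance of the original 1NF table Emp{Eno, Dept, ProjCode, Hours}
emp_table = [
    {"Eno": 1, "Dept": "Sales", "ProjCode": "P1", "Hours": 10},
    {"Eno": 1, "Dept": "Sales", "ProjCode": "P2", "Hours": 5},
    {"Eno": 2, "Dept": "HR",    "ProjCode": "P1", "Hours": 8},
]

emp  = project(emp_table, ["Eno", "Dept"])               # Dept is now stored once per employee
proj = project(emp_table, ["Eno", "ProjCode", "Hours"])

original = {tuple(sorted(row.items())) for row in emp_table}
lossless = natural_join(emp, proj) == original
```

On this instance the join of the two projections gives back exactly the original rows, and Emp holds one row per employee instead of one per assignment.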
  62. EMP_DEPT (Ename, Eno, Bdate, Addr, Dnumber, Dname, DMgrNo). Eno → DMgrNo is a transitive dependency. The dependency of DMgrNo on the key attribute Eno is transitive via Dnumber because both Eno → Dnumber and Dnumber → DMgrNo hold, and Dnumber is not a subset of the key of EMP_DEPT.
  63. General Definition of 3NF: A relation schema R is in 3NF if, whenever a functional dependency X → A holds in R, either (a) X is a superkey of R or (b) A is a prime attribute of R. Equivalently, R is in 3NF if every nonprime attribute of R is (a) fully functionally dependent on every key of R and (b) non-transitively dependent on every key of R. If 3NF is violated by X → A, one of the following holds: - X is a proper subset of some key K: we store (X, A) pairs redundantly. - X is not a proper subset of any key: there is a chain of FDs K → X → A, which means that we cannot associate an X value with a K value unless we also associate an A value with an X value.
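The 3NF condition can be tested mechanically once the FDs are written down. The sketch below is one possible way to do it in Python, under stated assumptions: the function names are invented for the illustration, only the FDs that are listed explicitly are checked, and the brute-force key search is meant for small schemas. The example relation is EMP_DEPT from the previous slide, reduced to three attributes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the whole relation (brute force)."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            s = set(combo)
            # Smaller keys are found first, so a superset of a known key is not minimal
            if closure(s, fds) == set(attributes) and not any(k <= s for k in keys):
                keys.append(s)
    return keys

def violations_3nf(attributes, fds):
    """Listed FDs X → A where X is not a superkey and A is not a prime attribute."""
    prime = set().union(*candidate_keys(attributes, fds))
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(attributes)
        for a in sorted(rhs - lhs):          # rhs - lhs skips trivial dependencies
            if not is_superkey and a not in prime:
                bad.append((frozenset(lhs), a))
    return bad

# EMP_DEPT reduced to the attributes involved in the transitive dependency
attributes = {"Eno", "Dnumber", "DMgrNo"}
fds = [({"Eno"}, {"Dnumber"}), ({"Dnumber"}, {"DMgrNo"})]

keys = candidate_keys(attributes, fds)
bad = violations_3nf(attributes, fds)
```

Here the only key is {Eno}, and Dnumber → DMgrNo is reported: Dnumber is not a superkey and DMgrNo is not prime. After splitting off DEPT(Dnumber, DMgrNo), the same check on each part reports nothing.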
  64. Consider the table Emp: Emp{Eno, Dept, Dept_Head}. The primary key here is Eno. The attribute Dept is dependent on Eno. The attribute Dept_Head is dependent on Dept. Notice that there is an indirect dependence on the primary key. Emp is in 2NF but not in 3NF because of the transitive dependency of Dept_Head on Eno via Dept. The problems with a dependency of this kind are: Insertion: The department head of a new department that does not have any employees as yet cannot be entered. Updation: For a given department, the particular head's code is repeated several times. Hence, if a department head moves to another department, the changes will have to be made consistently across the table. Deletion: If a particular employee's record is deleted, the information regarding the head of the department may also be lost. The relation is therefore decomposed into the following two relations: Emp{Eno, Dept} and Dept{Dept, Dept_Head}. Emp and Dept are in 3NF. The natural join of Emp and Dept will recover the original Emp table.