Database Design 2009
- 1. Database Design
What is a Database?
A collection of data that is organised in a predictable
structured way
Any organised collection of data in one place can be
considered a database
Examples
filing cabinet
library
floppy disk
©Chisholm Institute
- 2. What is Data?
The heart of the DBMS.
Two kinds
Collection of information that is stored in the
database.
Metadata: information about the database.
Also known as a data dictionary.
Relational Data Model
A relational database is perceived as a
collection of tables.
Each table consists of a series of rows &
columns.
Tables (or relations) are related to each
other by sharing a common
characteristic (e.g. a customer or product
table).
A table yields complete physical data
independence.
- 3. Features of the relational data
model
Logical and physical levels are separated.
Simple to understand. Easy to use.
Powerful nonprocedural (what, not how) language to
access data.
Uniform access to all data.
Rigorous database design principles.
Access paths by matching data values, not by
following fixed links.
Terminology
Relation
A 2-dimensional table of values with these properties:
No duplicate rows
Rows can be in any order
Columns are uniquely named by Attributes
Each cell contains only one value
Employee  Job        Manager
Jack      Secretary  Jill
Jill      Executive  Bozo
Bozo      Director
Lulu      Clerk      Jill
The special value NULL implies that there is no
corresponding value for that cell. This may mean the value does not
apply or that it is unavailable. Entire rows of NULLs are not
allowed.
- 4. Terminology
Tuple
Commonly referred to as a row in a relation.
E.g.:
Jack Clerk Jill
Attribute
• A name given to a column in a relation. Each column must have a
unique attribute. Attributes are often referred to as fields.
Employee Job Manager
Terminology: Domain
A pool of atomic values from which the cells of a given column take
their values. Each attribute has a domain.
Attributes may share domains.
Attribute  Domain       Example values
Employee   Person Name  Tom, Mary, Bozo, Kali, ...
Job        Job Name     Typist, Manager, Clerk, ...
Manager    Person Name  (the same domain as Employee)
An attribute value (a value in a column labelled by the attribute)
must be from the corresponding domain or may be NULL.
- 5. Terminology:Relation Schema
A Relation Schema is a named set of attributes. It refers only to the
structure of a relation. It is derived from the traditional set
notation displayed below:
EMPLOYEE = { Employee, Job, Manager }
This is usually written in a modified form for database purposes:
EMPLOYEE( Employee, Job, Manager )
referring to the table EMPLOYEE with the heading
Employee Job Manager
Terminology:Integrity Constraint and
Domain Constraint
An Integrity Constraint is a condition that prescribes what
values are allowable in a relation. This permits the restriction of the
type of value that can be placed in a particular cell, e.g. only
digits for telephone numbers.
The Domain Constraint is a condition on the allowable values for an
attribute,
e.g. Salary < $60,000
This restricts the salary to be under a set value.
EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director            50,000
Lulu      Clerk      Jill     30,000
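As a rough illustration of a domain constraint, the Salary < $60,000 rule can be written as a CHECK clause. The sketch below uses Python's standard sqlite3 module; the table name `employee`, the column types and the rejected row for "Max" are assumptions made for the example, not part of the slides.

```python
import sqlite3


def build_employee_table() -> sqlite3.Connection:
    """Create an in-memory EMPLOYEE table with a domain constraint on salary."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE employee (
            employee TEXT PRIMARY KEY,
            job      TEXT NOT NULL,
            manager  TEXT,                            -- may be NULL (Bozo has no manager)
            salary   INTEGER CHECK (salary < 60000)   -- the domain constraint
        )
        """
    )
    return conn


def try_insert(conn: sqlite3.Connection, row: tuple) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_employee_table()
accepted = try_insert(conn, ("Jack", "Secretary", "Jill", 25000))
# Hypothetical row: a salary outside the allowed domain.
rejected = try_insert(conn, ("Max", "Director", None, 75000))
```

Here `accepted` is True and `rejected` is False: the database itself refuses values outside the domain, so the rule does not have to be re-checked by every program that writes to the table.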
- 6. Terminology:Key Constraint
A condition that no value of an attribute or set of attributes be
repeated in a relation.
e.g. Employee (the attribute) has only unique values in
EMPLOYEE (the relation).
The following relation violates this constraint:
EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Jack      Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director            50,000
Lulu      Clerk      Jill     30,000
Jack appears twice. This violates the Key Constraint.
Terminology:Key Constraint
An attribute (or set of attributes) to which a key constraint applies is
called a key (or candidate key). Every relation schema must have a key.
EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director   Bozo     50,000
Lulu      Clerk      Jill     30,000
Employee is a key. Another possible key: the combination of Job and
Manager is also unique.
Simple Key and Composite Key:
If a key constraint applies to a set of attributes, it is
called a Composite or Concatenated Key. Otherwise it is a
simple key.
- 7. Terminology:Key Constraint
A key cannot have a NULL value.
For example, if we change the table so that the employee Bozo
does not have a manager, then Job+Manager cannot be a key.
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director            50,000
Lulu      Clerk      Jill     30,000
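The two key rules (no repeated values, no NULLs) can be checked mechanically against sample rows. The following is a minimal sketch, assuming each row is a Python dict and NULL is represented by `None`; the function name `is_key` is made up for the example.

```python
from typing import Optional, Sequence

Row = dict[str, Optional[str]]

# The EMPLOYEE table above, with Bozo's manager left NULL (None).
EMPLOYEE: list[Row] = [
    {"Employee": "Jack", "Job": "Secretary", "Manager": "Bozo"},
    {"Employee": "Kim", "Job": "Secretary", "Manager": "Jill"},
    {"Employee": "Jill", "Job": "Executive", "Manager": "Bozo"},
    {"Employee": "Bozo", "Job": "Director", "Manager": None},
    {"Employee": "Lulu", "Job": "Clerk", "Manager": "Jill"},
]


def is_key(rows: Sequence[Row], attributes: Sequence[str]) -> bool:
    """True if the attribute set has no NULLs and no repeated value combination."""
    seen = set()
    for row in rows:
        value = tuple(row[name] for name in attributes)
        if any(part is None for part in value):
            return False  # a key cannot contain NULL
        if value in seen:
            return False  # repeated value: key constraint violated
        seen.add(value)
    return True
```

With this data, `is_key(EMPLOYEE, ["Employee"])` holds, while `["Job", "Manager"]` fails because of the NULL and `["Job"]` fails because Secretary repeats. A check like this can only show that the current rows are consistent with a key; whether an attribute really is a key is a statement about every row the table may ever hold.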
Terminology:Key Constraint
A primary key is a special preassigned key that can
always be used to uniquely identify tuples. We have to
choose a Primary Key for every Relation. We must consider
all of the Candidate Keys and choose between them.
That Employee is the primary key of EMPLOYEE is usually
written as:
EMPLOYEE( Employee, Job, Manager, Salary )
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director   Bozo     50,000
Lulu      Clerk      Jill     30,000
Here we have chosen the simple key Employee over the concatenated
option of both Job and Manager.
- 8. A Database is more than multiple tables: you
must be able to “relate” them
Cus-code Cus-Name Area-Code Phone Agent-Code
10010 Ramus 615 844-2573 502
10011 Dunne 713 894-1238 501
10012 Smith 615 894-2205 502
10013 Olowaski 615 894-2180 502
10014 Orlando 615 222-1672 501
10015 O’Brian 713 442-3381 503
10016 Brown 615 297-1226 502
10017 Williams 615 290-2556 503
10018 Farris 713 382-7185 501
10019 Smith 615 297-3809 503
The link is through the Agent-Code
Agent-Code Agent-Name Agent-AreaCode Agent-Phone
501 Alby 713 226-1249
502 Hahn 615 882-1244
503 Okon 615 123-5589
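To make the link through Agent-Code concrete, here is a small sketch that looks up each customer's agent by matching the shared value. It uses three of the customer rows above; the variable and function names are assumptions made for the example.

```python
# AGENT table reduced to Agent-Code -> Agent-Name.
AGENTS = {501: "Alby", 502: "Hahn", 503: "Okon"}

# Three CUSTOMER rows as (Cus-code, Cus-Name, Agent-Code).
CUSTOMERS = [
    (10010, "Ramus", 502),
    (10011, "Dunne", 501),
    (10017, "Williams", 503),
]


def customers_with_agents(customers, agents):
    """Relate the two tables by matching the Agent-Code value in each customer row."""
    return [
        (cus_code, cus_name, agents[agent_code])
        for cus_code, cus_name, agent_code in customers
    ]


joined = customers_with_agents(CUSTOMERS, AGENTS)
```

The first joined row is `(10010, "Ramus", "Hahn")`. Nothing in the customer row points at a storage location; the tables are related purely by the matching data value.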
Terminology: Relational Database
A Relational Database is just a set of Relations.
For example
EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director            50,000
Lulu      Clerk      Jill     30,000
JOB
Job        Salary
Secretary  25,000
Executive  40,000
Director   50,000
Clerk      30,000
Which attribute do you think relates these two tables together?
- 9. Terminology:Relational Database Schema
A Relational Database Schema is a set of Relation Schemas, together
with a set of Integrity Constraints.
For example the Relations that you have been looking at
with the headings
EMPLOYEE
Employee Job Manager Salary
JOB
Job Salary
are usually written as
EMPLOYEE(Employee, Job, Manager)
JOB(Job, Salary)
Notice how the Primary Keys, Employee and Job, are underlined.
Terminology :Referential Integrity Constraint
This constraint says that all the values in one column should also
appear in another column.
Look at the tables below. Every entry in the Job column of the EMPLOYEE
table must appear in the Job column of the JOB table.
EMPLOYEE (PK: Employee; FK: Job, Manager)
Employee  Job        Manager
Jack      Secretary  Bozo
Kim       Secretary  Jill
Jill      Executive  Bozo
Bozo      Director
Lulu      Clerk      Jill
JOB (PK: Job)
Job        Salary
Secretary  25,000
Executive  40,000
Director   50,000
Clerk      30,000
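A referential integrity check amounts to asking whether every non-NULL foreign key value appears among the primary key values it refers to. The sketch below is one way to express that; the function name `dangling_values` and the extra "Typist" row are hypothetical.

```python
EMPLOYEE = [
    {"Employee": "Jack", "Job": "Secretary", "Manager": "Bozo"},
    {"Employee": "Kim", "Job": "Secretary", "Manager": "Jill"},
    {"Employee": "Jill", "Job": "Executive", "Manager": "Bozo"},
    {"Employee": "Bozo", "Job": "Director", "Manager": None},
    {"Employee": "Lulu", "Job": "Clerk", "Manager": "Jill"},
]
JOB = [
    {"Job": "Secretary", "Salary": 25000},
    {"Job": "Executive", "Salary": 40000},
    {"Job": "Director", "Salary": 50000},
    {"Job": "Clerk", "Salary": 30000},
]


def dangling_values(child_rows, foreign_key, parent_rows, primary_key):
    """Foreign key values (ignoring NULLs) that have no matching primary key."""
    parent_keys = {row[primary_key] for row in parent_rows}
    child_values = {
        row[foreign_key] for row in child_rows if row[foreign_key] is not None
    }
    return sorted(child_values - parent_keys)


job_violations = dangling_values(EMPLOYEE, "Job", JOB, "Job")
manager_violations = dangling_values(EMPLOYEE, "Manager", EMPLOYEE, "Employee")

# Hypothetical extra employee whose job is missing from JOB.
with_typist = EMPLOYEE + [{"Employee": "Max", "Job": "Typist", "Manager": "Jill"}]
typist_violations = dangling_values(with_typist, "Job", JOB, "Job")
```

Both `job_violations` and `manager_violations` are empty for the tables above, while `typist_violations` is `["Typist"]`. Note that the Manager foreign key refers back to the same table, and that a NULL manager is not a violation.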
- 10. Referential Integrity Constraint
Why does the following relational database violate the
referential integrity constraints?
EMPLOYEE (PK: Employee; FK: Job, Manager)
Employee  Job        Manager
Jack      Secretary  Bozo
Kim       Secretary  Jill
Bozo      Director
Lulu      Clerk      Jill
JOB (PK: Job)
Job       Salary
Director  50,000
Clerk     30,000
In other words, why can’t Employee(Job) be a Foreign Key to
Job(Job), or Employee(Manager) be a Foreign Key to
Employee(Employee)?
Why Use Relational
Databases
Their major advantage is that they minimise the
need to store the same data in a number of
places.
Storing the same data in several places is referred to as data redundancy.
- 11. Example of Data
Redundancy (1)
Example of Data
Redundancy (2)
The names and addresses of all students are
being maintained in three places.
If Owen Money moves house, his address
needs to be updated in three separate
places.
Consider what might happen if he forgot to
let library administration know.
- 12. Example of Data
Redundancy (3)
Example of Data
Redundancy (4)
Data redundancy results in:
wastage of storage space by recording duplicate
information
difficulty in updating information
inaccurate, out-of-date data being maintained
- 13. Other Advantages of Relational
Databases
Flexibility
relationships (links) are not implicitly defined by
the data
Data structures are easily modified
Data can be added, deleted, modified or
queried easily
Summary of Some Common
Relational Terms
Entity - an object (person, place or thing) that we
wish to store data about
Relationship - an association between two entities
Relation - a table of data
Tuple - a row of data in a table
Attribute - a column of data in a table
Primary Key - an attribute (or group of attributes) that
uniquely identifies individual records in a table
Foreign Key - an attribute appearing within a table
that is a primary key in another table (or, as with Manager, in the same table)
- 14. Network Diagrams
Terminology: Network Diagram
Referential Integrity constraints can easily be represented by
arrows FK -> PK. The arrow points from the Foreign Key to the
matching Primary Key.
EMPLOYEE(Employee, Job, Manager)   JOB(Job, Salary)
A relational database schema with referential integrity constraints can
also be represented by a network diagram. A Referential Integrity
Constraint is notated as an arrow labeled by the foreign key. You must
always write the label of the Foreign Key on the arrow. Sometimes the
same attribute has different titles in different tables.
Network Diagram:
EMPLOYEE --Job--> JOB
EMPLOYEE --Manager--> EMPLOYEE
Notice here, the label is Manager and not Employee.
- 15. Personnel Database: Consider the following Tables
PRIOR_JOB
E_NUMBER  PRIOR_TITLE
1001      Junior consultant
1001      Research analyst
1002      Junior consultant
1002      Research analyst
1003      Junior consultant
1004      Summer intern
EXPERTISE
E_NUMBER  SKILL
1001      Stock market
1001      Investments
1002      Stock market
1003      Stock market
1003      Investments
1004      Taxation
1005      Management
ASSIGNMENT
E_NUMBER  P_NUMBER
1001      26713
1002      26713
1003      23760
1003      26511
1004      26511
1004      28765
1005      23760
SKILL
AREA
Stock Market
Taxation
Investments
Management
PROJECT
NAME                  P_NUMBER  MANAGER  ACTUAL_COST  EXPECTED_COST
New billing system    23760     Yates    1000         10000
Common stock issue    28765     Baker    3000         4000
Resolve bad debts     26713     Kanter   2000         1500
New office lease      26511     Yates    5000         5000
Revise documentation  34054     Kanter   100          3000
Entertain new client  87108     Yates    5000         2000
New TV commercial     85005     Baker    10000        8000
EMPLOYEE
NAME    E_NUMBER  DEPARTMENT
Kanter  1111      Finance
Yates   1112      Accounting
Adams   1001      Finance
Baker   1002      Finance
Clarke  1003      Accounting
Dexter  1004      Finance
Early   1005      Accounting
TITLE
E_NUMBER  CURRENT_TITLE
1001      Senior consultant
1002      Senior consultant
1003      Senior consultant
1004      Junior consultant
1005      Junior consultant
Personnel Database Schema
What are the connecting Foreign Keys to Primary Keys?
PROJECT (NAME, P_NUMBER, MANAGER, ACTUAL_COST, EXPECTED_COST)
(MANAGER is not an FK; we will look at this later)
ASSIGNMENT (E_NUMBER, P_NUMBER)
SKILL (AREA)
PRIOR_JOB (E_NUMBER, PRIOR_TITLE)
EXPERTISE (E_NUMBER, SKILL)
TITLE (E_NUMBER, CURRENT_TITLE)
EMPLOYEE (NAME, E_NUMBER, DEPARTMENT)
- 16. Personnel Database Network Diagram
Once you have produced your Schema and identified the Primary and
Foreign Keys you can create the Network Diagram. The Network Diagram
shows each of the tables with their links. Each of the Tables (Relations)
is represented in a rectangle. They are then connected by
arrows that show the FKs pointing to the PKs. The arrow head points
towards the PK, while the FK name written on the arrow is the same as the
attribute of the table that has the FK in it.
[Network diagram: top row SKILL, EMPLOYEE, PROJECT; bottom row EXPERTISE, PRIOR_JOB, TITLE, ASSIGNMENT]
- 17. Summary: Questions
What is a Relational Database?
What is a relation?
What are Constraints?
What is a Schema?
What is a Network Diagram and why is it used?
Summary: Answers
A relational database is based on the relational data model.
It is one or more Relations (Tables) that are related to each other.
A relation is a table composed of rows (tuples) and columns, satisfying 5 properties
• No duplicate rows
• Rows can be in any order
• Columns are uniquely named by Attributes
• Each cell contains only one value
• No null rows.
Constraints are central to the correct modeling of business information. Here we
have seen them limit the set-up of your tables, e.g. the Referential Integrity Constraint.
The Network Diagram is used to navigate complex database structures. It is a
compact way to show the relationships between Relations (Tables).
- 18. Activities
Consider the following relational database
schemas.
Suppliers(suppId, name, street, city, state)
Part(partId, partName, weight, length, composition)
Products(prodId, prodName, department)
Supplies(partId, suppId)
Uses(partId, prodId)
Make reasonable assumptions about the meaning of attributes and
relations, identify the primary and foreign keys and draw a
network diagram showing the relations and foreign keys.
Answer
[Network diagram: top row Supplier, Part, Product; bottom row Supplies, Uses]
- 19. Show the foreign keys on the network diagrams
Orders
Ordnum ordDate custNumb
12489 2/9/91 124
Customer
custNumb custName Address Balance credLim Slsnumber
124 Adams 48 oak st 418.68 500 3
SalesRep
Slsnumber Name address totCom commRate
3 Mary 12 Way 2150 .05
Part
Part Desc onHand IT wehsNumb unitPrice
AX12 Iron 1.4 HW 3 17.95
OrLine
ordNum Part ordNum quotePrice
- 20. Answer
Customer --SlsNumber--> SalesRep
Orders --CustNumb--> Customer
OrLine --ordNum--> Orders
OrLine --Part--> Part
Activities
What problems may arise from this table?
What data redundancies are there?
What changes would you make? (Hint: make
another table.)
What if I wanted to search by surname?
- 21. Activities
What is wrong with this table?
Functional
Dependence FDD
- 22. Functional Dependency
Diagrams
A FUNCTIONAL DEPENDENCY DIAGRAM is a way of
representing the structure of information needed to
support a business or organization
It can easily be converted into a design for a relational
database to support the operations of the business.
Data Analysis and Database
Design Using Functional
Dependency Diagrams
1. The steps of Data Analysis in FDD are:
1.1 Look for Data Elements
1.2 Look for Functional Dependencies
1.3 Represent Functional Dependencies in a
diagram
1.4 Eliminate Redundant Functional
Dependencies
2. Data Design, after we have our final version of the FDD:
2.1 Apply the Synthesis Algorithm
- 23. Starting points for drawing
functional dependency
diagrams
To start the process of constructing our FDD we do the following:
We must understand the data.
We examine forms, reports, data entry and output screens
etc.
We examine sample data.
We consider enterprise (business) rules.
We examine narrative descriptions and conduct interviews.
We apply our own experience and practice, and that of others.
Enterprise Rules
What are Enterprise / Business Rules?
An enterprise rule (in the context of data analysis) is a
statement made by the enterprise (organisation, company,
officer in charge etc.) which constrains data in some way.
Functional dependencies are the most important type of
constraint on data and are often expressed in the form of
enterprise rules.
e.g.
No two employees may have the same employee number.
An order is made by only one customer.
An employee can belong to only one department at a time.
- 24. Drawing FDDs - Data
Elements
We often refer to Data Elements during the FDD process.
A data element is an elementary piece of recorded
information.
Every data element has a unique name.
A data element is either a
Label, e.g. PersonName, Address,
BuildingCode, or
Measurement, e.g. Height, Age, Date
A data element must take values that can be written
down.
Functional Dependency
Diagrams
Using the Method of
Decomposition
[Flow chart: two routes from the Problem to the Database Design]
Method of Decomposition: Sample Data Tables (0NF) -> eliminate repeating groups -> Universal Relation (1NF) -> eliminate part-key dependencies -> 2NF Relations -> eliminate non-key dependencies -> 3NF Relations
Or, the same process using the FDD approach: Attributes & Functional Dependencies -> Functional Dependency Diagram -> Method of Synthesis -> 3NF Relations
- 25. Data Element
Examples
Here are some examples:
PersonName has values Jeff, Jill, Gio, Enid
Address has values 1 John St, 25 Rocky Road
Height has values 171cm, 195cm
Age has values 21, 52, 93, 2
Date has values 20th May 1947, 2nd March 1997
JobName has values Manager, Secretary, Clerk
Manager might not be a data element, but
ManagerName could be. Manager could instead be a value of another
data element, e.g. JobName.
Drawing FDDs Data
Elements
Start drawing the Functional Dependency Diagram by
representing the Data Elements. A Data Element is
represented by its name placed in a box: [ Data Element ]
Every data element must have a unique name in the
functional dependency diagram.
A data element cannot be composed of other data
elements, i.e.
it cannot be broken down into smaller components.
A Data Element is also known as an ATTRIBUTE,
because it generally describes a property of some
thing which we will later call an ENTITY.
- 26. Drawing FDDs –Using
Elements
A Functional Dependency is a relationship between Attributes.
It is shown as an arrow, e.g. A -> B
It means that for every value of A, there is only one value of B.
It reads “A determines B”.
A is called the determinant attribute.
B is called the dependent attribute.
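The definition "for every value of A there is only one value of B" can be tested directly against sample rows. The sketch below is one possible check, assuming rows are Python dicts; the function name `determines` is made up, and the data is the student and family name example used later in these slides.

```python
from typing import Sequence


def determines(rows: Sequence[dict], determinant: Sequence[str], dependent: str) -> bool:
    """True if no determinant value is paired with two different dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in determinant)
        # setdefault stores the first dependent value seen for this key.
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True


STUDENTS = [
    {"Student#": 1, "FamilyName": "Smith"},
    {"Student#": 2, "FamilyName": "Jones"},
    {"Student#": 3, "FamilyName": "Smith"},
    {"Student#": 4, "FamilyName": "Andrews"},
]

forward = determines(STUDENTS, ["Student#"], "FamilyName")
backward = determines(STUDENTS, ["FamilyName"], "Student#")
```

Here `forward` is True and `backward` is False, because Smith is paired with both student 1 and student 3. Sample data can only refute a dependency; whether one truly holds comes from the enterprise rules.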
Data Element Examples
Here are some examples of finding the Data Elements
on a typical form
Surname . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
On a form gives rise to the element
Surname
CREDIT CARD Bankcard Mastercard Visa Other
On a form gives rise to the element
CreditCardType
- 27. Functional Dependency
Examples
Students and their family names
“Each student (identified by student number) has only one
family name”
Student#  FamilyName
1         Smith
2         Jones
3         Smith
4         Andrews
Considering the rule stated above we should be able to
draw an FDD for this. What are the elements of
interest?
FDDs Answer
Student#  FamilyName
1         Smith
2         Jones
3         Smith
4         Andrews
Data elements of interest are Student# and FamilyName.
Student# determines FamilyName
(or FamilyName depends on Student#):
Student# -> FamilyName
Each student has exactly one family name, but the name could be the
name of many students.
So FamilyName does not determine Student#, e.g. “Smith” is the
name of students 1 and 3.
- 28. FDDs Examples
Employees and the departments
they work for.
Department Name: Accounting        Department Name: Sales
Employee Number: 11, 2, 31         Employee Number: 45, 27
Enterprise Rule: “Each employee works in only one department”
In this example the tables represent some interesting data
of the business. We see that employees with the ID numbers 11, 2
and 31 all work in the Accounting Dept and that employees with the
ID numbers 45 and 27 work in the Sales Dept.
Do you think that you could draw an FDD to represent this? Have a
go and then check your answer.
FDD Answers
Employees and the departments
they work for.
Department Name: Accounting        Department Name: Sales
Employee Number: 11, 2, 31         Employee Number: 45, 27
Data elements of interest are Employee# and DeptName.
Employee# -> DeptName
So we could make the following table:
Employee#  DeptName
11         Acc
2          Acc
45         Sales
31         Acc
27         Sales
- 29. FDDs Examples
The quantity of parts held in a warehouse
and their suppliers
“Parts are uniquely identified by part numbers”
“Suppliers are uniquely identified by Supplier Names”
“A part is supplied by only one supplier”
“A part is held in only one quantity”
Part#  SupplierName            QOH
1      Wang Electronics        23
2      Cumberland Enterprises  80
3      Wang Electronics        4
4      Roscoe Pty. Ltd         58
Part# determines SupplierName and Part# determines QOH:
Part# -> SupplierName
Part# -> QOH
Should QOH be a determinant? No, common sense tells us that it is not a reliable
choice. We could have had repeating values.
FDDs Examples
Students and their subjects enrolled.
“Each student is given a unique student number”
“A subject is uniquely identified by its name”
“A student may choose several subjects”
Student#  SubjectName
1         History
1         Geography
1         Mathematics
2         History
2         English
3         Mathematics
3         English
4         French
4         Geography
Data elements of interest are Student# and SubjectName.
There is no functional dependency here.
Student# does not determine SubjectName,
nor does SubjectName determine Student#.
- 30. FDDs Examples
Results obtained by each student for
each subject.
“Each student is given a unique student
number”
“A subject is uniquely identified by its name”
“A student may choose several subjects”
“A student is allocated a result for each subject”
“Each student has only one name.”
Data elements are
Student#, StudentName, SubjectName and Grade
FDDs Examples
Results obtained by each student for each
subject.
Student#  StudentName  SubjectName  Grade
1         Smith        History      A
1         Smith        Geography    B
1         Smith        Mathematics  A
2         Jones        History      C
2         Jones        English      C
3         Smith        English      A
3         Smith        Mathematics  A
4         Andrews      English      D
4         Andrews      French       C
4         Andrews      Geography    C
Try to construct an FDD for this table, considering
the given Business Rules and the Data Elements.
- 31. FDDs Examples
Results obtained by each student
for each subject.
We can see that there is one and only one student name for
each student number, even though there might be more than one
student with the same name. So:
Student# -> StudentName
But the subject grade for any student cannot be determined by
the subject name or the student# by itself. A student can have
many grades depending on the subject. How can we cater for
this?
FDDs Answer
Results obtained by each student
for each subject.
We need to combine the two elements to say that there is
one and only one grade for a student doing a particular
subject. Here then is the complete diagram:
Student# -> StudentName
(Student#, SubjectName) -> Grade
The pair (Student#, SubjectName) is called the Composite Determinant.
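One way to look for a composite determinant is to try every combination of the other columns against the sample rows, smallest first. The sketch below does this by brute force for Grade in the results table; the function names are hypothetical, and the search is only practical for a handful of columns.

```python
from itertools import combinations

COLUMNS = ("Student#", "StudentName", "SubjectName", "Grade")
RESULTS = [
    (1, "Smith", "History", "A"),
    (1, "Smith", "Geography", "B"),
    (1, "Smith", "Mathematics", "A"),
    (2, "Jones", "History", "C"),
    (2, "Jones", "English", "C"),
    (3, "Smith", "English", "A"),
    (3, "Smith", "Mathematics", "A"),
    (4, "Andrews", "English", "D"),
    (4, "Andrews", "French", "C"),
    (4, "Andrews", "Geography", "C"),
]


def determines(rows, lhs, rhs):
    """True if each combination of lhs column values maps to one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def minimal_determinants(rows, columns, dependent):
    """Smallest column sets that determine `dependent` in the sample rows."""
    target = columns.index(dependent)
    candidates = [i for i in range(len(columns)) if i != target]
    found = []
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            if any(set(smaller) <= set(combo) for smaller in found):
                continue  # a subset already works, so combo is not minimal
            if determines(rows, combo, target):
                found.append(combo)
    return [tuple(columns[i] for i in combo) for combo in found]


grade_determinants = minimal_determinants(RESULTS, COLUMNS, "Grade")
```

For this sample the search returns both (Student#, SubjectName) and (StudentName, SubjectName). The second is an accident of the data: the two Smiths happen to have the same Mathematics grade. The enterprise rules say names are not unique, so only (Student#, SubjectName) is a genuine determinant. This is why sample data is used to validate dependencies, never to prove them.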
- 32. FDDs Examples
Customer Orders
Order Part# CustomerName Address
454 12 David Smith 1 John St, Hawthorn
454 23 David Smith 1 John St, Hawthorn
455 32 Emily Jones 45 Grattan St, Parkville
455 49 Emily Jones 45 Grattan St, Parkville
455 54 Emily Jones 45 Grattan St, Parkville
456 12 Mary Ho 44 Park St, Hawthorn
456 54 Mary Ho 44 Park St, Hawthorn
Validating functional dependencies
Using sample data to populate the table, check that there is only one value of the
dependent for each value of the determinant.
FDDs Examples
“Orders are uniquely identified by their order numbers”
“Customers are uniquely identified by their
names”
“A customer has only one address”
“An order belongs to only one customer”
“A part may be ordered only once on each
order”
Order#  Parts Ordered  CustomerName  Address
454     23, 12         David Smith   1 John St, Hawthorn
455     54, 49, 32     Emily Jones   45 Grattan St, Parkville
456     54, 12         Mary Ho       44 Park St, Hawthorn
Order# -> CustomerName
CustomerName -> Address
Part# stands alone, with no arrow in or out.
- 33. FDDs Examples
Employees and their tax file
numbers
“Each employee has a unique employee number”
“Each employee has a unique tax file number”
Employee#  TaxFile#
1          1024-5321
2          3456-3294
3          8246-7106
4          8861-6750
5          1234-4765
Employee# determines TaxFile#:
Employee# -> TaxFile#
TaxFile# determines Employee#:
TaxFile# -> Employee#
Together: Employee# <-> TaxFile# (alternative keys)
Obtain Tutorial 1 from your tutor.
- 34. Functional Dependency
Diagrams
Database Design
Let’s look at the process of converting
the FDD into a schema. We have a 12
step process to do so, that has an
iterative component to it (loop).
The 12 steps are outlined in the next
series of slides.
Functional Dependency Diagram
Preparation
1. Represent each data element as a box.
2. Represent each functional dependency by an arrow.
3. Eliminate augmented dependencies.
4. Eliminate transitive dependencies.
5. Eliminate pseudo-transitive dependencies.
By this stage, intersecting attributes should have been
eliminated.
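Steps 3 to 5 all remove arrows that can already be derived from the others. One standard way to test this, shown here as a sketch rather than as the procedure the slides intend, is to compute the closure of an arrow's determinant using the remaining arrows and drop the arrow if its dependent is still reachable. The two redundant arrows in the example are hypothetical additions to the personnel attributes used later.

```python
FD = tuple[frozenset[str], str]  # (determinant attributes, dependent attribute)


def closure(attributes, fds):
    """All attributes reachable from `attributes` by following the arrows in `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result


def remove_redundant(fds):
    """Drop each arrow whose dependent is still derivable from the remaining arrows."""
    kept = list(fds)
    for fd in list(fds):
        others = [other for other in kept if other != fd]
        lhs, rhs = fd
        if rhs in closure(lhs, others):
            kept = others
    return kept


FDS: list[FD] = [
    (frozenset({"E_NUMBER"}), "DEPARTMENT_NAME"),
    (frozenset({"DEPARTMENT_NAME"}), "LOCATION"),
    (frozenset({"E_NUMBER"}), "LOCATION"),  # transitive (hypothetical extra arrow)
    (frozenset({"E_NUMBER"}), "EMPLOYEE_NAME"),
    (frozenset({"E_NUMBER", "P_NUMBER"}), "EMPLOYEE_NAME"),  # augmented (hypothetical)
]

reduced = remove_redundant(FDS)
```

After the call, `reduced` keeps only E_NUMBER -> DEPARTMENT_NAME, DEPARTMENT_NAME -> LOCATION and E_NUMBER -> EMPLOYEE_NAME. The result can depend on the order in which arrows are examined when several different reductions are possible.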
- 35. Deriving 3NF Schema: Synthesis Algorithm
6. Pick any (unmarked) arrow in the diagram.
7. Follow it back to its source, and write down
the name of the source: S
8. Follow all arrows from the source data item,
and write down the names of their destinations:
S -> A, S -> B, S -> C gives S, A, B, C
S is now the key of a 3NF relation (S, A, B, C).
Synthesis Algorithm: Deriving 3NF Schema
9. Mark all the arrows just processed (S -> A, S -> B, S -> C).
10. If there are any unmarked arrows in the diagram, go back to step 6.
11. Finally, determine the Universal Key. Any attribute which is not
determined by any other attribute (i.e. has no arrow going into it) is part of
the Universal Key, e.g. (U1, U2, U3).
12. If the universal key is not already contained in any of the above relations, make
it into a relation. The universal key is the key of the new relation.
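Steps 6 to 12 can be sketched in a few lines: group the arrows by their source, turn each group into a relation, then add the universal key if no relation already contains it. The function name `synthesize` and the data layout are assumptions made for the example; the attributes and arrows are those of the personnel example worked through in the slides that follow.

```python
def synthesize(attributes, fds):
    """Return a list of (key, relation attributes) pairs from a reduced FDD."""
    # Steps 6-10: gather every destination reachable from each source.
    grouped = {}
    for determinant, dependent in fds:
        grouped.setdefault(determinant, []).append(dependent)
    schema = []
    for determinant, dependents in grouped.items():
        key = tuple(sorted(determinant))
        schema.append((key, key + tuple(dependents)))
    # Step 11: attributes with no incoming arrow form the universal key.
    determined = {dependent for _, dependent in fds}
    universal_key = tuple(name for name in attributes if name not in determined)
    # Step 12: add the universal key as a relation unless one already contains it.
    if not any(set(universal_key) <= set(columns) for _, columns in schema):
        schema.append((universal_key, universal_key))
    return schema


ATTRIBUTES = [
    "E_NUMBER", "P_NUMBER", "PRIOR_TITLE", "SKILL", "EMPLOYEE_NAME",
    "CURRENT_TITLE", "DEPARTMENT_NAME", "LOCATION", "PROJECT_NAME",
    "MANAGER_NUM", "ACTUAL_COST", "EXPECTED_COST", "TIME_SPENT",
]
P = frozenset({"P_NUMBER"})
E = frozenset({"E_NUMBER"})
FDS = [
    (P, "PROJECT_NAME"), (P, "MANAGER_NUM"), (P, "ACTUAL_COST"), (P, "EXPECTED_COST"),
    (E, "EMPLOYEE_NAME"), (E, "CURRENT_TITLE"), (E, "DEPARTMENT_NAME"),
    (frozenset({"DEPARTMENT_NAME"}), "LOCATION"),
    (E | P, "TIME_SPENT"),
]

schema = synthesize(ATTRIBUTES, FDS)
```

This yields five relations: one each keyed on P_NUMBER, E_NUMBER, DEPARTMENT_NAME and (E_NUMBER, P_NUMBER), plus the universal key relation (E_NUMBER, P_NUMBER, PRIOR_TITLE, SKILL). The sketch assumes redundant arrows have already been eliminated in steps 3 to 5.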
- 36. A Fully Worked Example
We will now work from a given set of forms to produce an FDD
then use the 12 steps to produce the Schema. The forms that
follow show the time spent by a particular employee on a
particular project. They contain details of the employee along
with details of the project. In addition they also state the
hours that the employee has spent on any one project to date.
This is important to the FDD. Notice also that the employee
can have many previous titles and have a number of skills. This
also has to be dealt with in the FDD and then later after we
have used the synthesis technique to create the Schema. Have
a good look at the forms on the next 2 slides and try to
develop the FDD yourself.
Personnel Database Forms 1
EMPLOYEE
________________________________________________________________________________________________
NAME   E_NUMBER  DEPARTMENT  LOCATION   CURRENT_TITLE      PRIOR_TITLES       SKILLS
________________________________________________________________________________________________
Adams  1001      Finance     9th Floor  Senior consultant  Junior consultant  Stock market
                                                           Research analyst   Investments
________________________________________________________________________________________________
PROJECTS
________________________________________________________________________________________________
NAME               TIME_SPENT  P_NUMBER  MANAGER  ACTUAL_COST  EXPECTED_COST
________________________________________________________________________________________________
Resolve bad debts  35          26713     Kanter   2000         1500
________________________________________________________________________________________________
We say that this table is in “zero normal form” (0NF).
This is because some cells have multiple values, e.g. PRIOR_TITLES and
SKILLS. The next slide shows forms that demonstrate that an employee
can work on many projects.
- 37. Personnel Database Forms 2
EMPLOYEE
________________________________________________________________________________________________
NAME   E_NUMBER  DEPARTMENT  LOCATION   CURRENT_TITLE      PRIOR_TITLES       SKILLS
________________________________________________________________________________________________
Baker  1002      Finance     9th Floor  Senior consultant  Junior consultant  Stock market
                                                           Research analyst
________________________________________________________________________________________________
PROJECTS
________________________________________________________________________________________________
NAME           TIME_SPENT  P_NUMBER  MANAGER_NUM  ACTUAL_COST  EXPECTED_COST
________________________________________________________________________________________________
Res bad debts  18          26713     Kanter       2000         1500
________________________________________________________________________________________________
EMPLOYEE
________________________________________________________________________________________________
NAME    E_NUMBER  DEPARTMENT  LOCATION   CURRENT_TITLE      PRIOR_TITLES       SKILLS
________________________________________________________________________________________________
Clarke  1003      Accounting  8th Floor  Senior consultant  Junior consultant  Stock market
                                                                               Investments
________________________________________________________________________________________________
PROJECTS
________________________________________________________________________________________________
NAME                TIME_SPENT  P_NUMBER  MANAGER_NUM  ACTUAL_COST  EXPECTED_COST
________________________________________________________________________________________________
New billing system  26          23760     Yates        1000         10000
New office lease    10          26511     Yates        5000         5000
________________________________________________________________________________________________
Personnel Database FD Diagram
From the forms given we can produce the following
FDD:
P_NUMBER -> PROJECT_NAME
P_NUMBER -> MANAGER_NUM
P_NUMBER -> ACTUAL_COST
P_NUMBER -> EXPECTED_COST
(E_NUMBER, P_NUMBER) -> TIME_SPENT
E_NUMBER -> EMPLOYEE_NAME
E_NUMBER -> CURRENT_TITLE
E_NUMBER -> DEPARTMENT_NAME
DEPARTMENT_NAME -> LOCATION
PRIOR_TITLE and SKILL have no arrows.
- 38. Personnel Database FD Diagram -Synthesis
Let us just consider the section of the FDD that
looks at the project number as the determinant:
P_NUMBER -> EXPECTED_COST
P_NUMBER -> PROJECT_NAME
P_NUMBER -> ACTUAL_COST
P_NUMBER -> MANAGER_NUM
By using the synthesis method we can choose an arrow, trace it
back to the source, and gather together all of the attributes that
the source points to. Try this and see if you can create the
schema for this table.
Personnel Database FD Diagram - Synthesis
Again, if we choose another arrow that has not been chosen
before and follow it back to the determinant we find that
DEPARTMENT_NAME is a determinant. Gathering all of the
attributes that it points to we only have the LOCATION
attribute. Hence this is a simple table consisting of
DEPARTMENT_NAME as the Primary Key and LOCATION as
the only other attribute.
DEPARTMENT_NAME -> LOCATION
So the table
DEPT(DEPARTMENT_NAME, LOCATION) is created.
- 39. Personnel Database FD Diagram - Synthesis
E_NUMBER -> EMPLOYEE_NAME
E_NUMBER -> CURRENT_TITLE
E_NUMBER -> DEPARTMENT_NAME
Likewise for the section of the
FDD based around
E_NUMBER, creating the following
table for the employee's details:
EMPLOYEE (EMPLOYEE_NAME, E_NUMBER, DEPARTMENT_NAME, CURRENT_TITLE)
Personnel Database FD Diagram - Synthesis
Here we have a slightly more complicated one. The time spent on the
project is dependent on both the project number and the employee
number, as it is the time spent by a particular employee on a particular
project. This is shown by boxing both of these attributes
together, with the box pointing to TIME_SPENT.
P_NUMBER
TIME_SPENT
E_NUMBER
Try to create the Assignment table for this part
of the FDD. When you think you have it, have a
look at ours and see if you are right.
78
- 40. Personnel Database
FD Diagram - Synthesis
P_NUMBER TIME_SPENT
E_NUMBER
The main difference here is that when choosing the arrow to follow back
to the determinant we find that we have two. This is OK; we just have to
make sure that in the table both of them form the primary key. We have a
composite primary key consisting of P_NUMBER and E_NUMBER. When we
then gather up all of the attributes that they point to together, we get
TIME_SPENT. Hence the table is written as
ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)
See the composite primary key
79
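The gathering step used so far can be sketched in Python. This is only an illustrative sketch, not part of the original slides: the list `FDS` and the function name `synthesise` are assumptions made for the example, and the PROJECT dependencies are left out because the text leaves that table as an exercise.

```python
from collections import defaultdict

# Functional dependencies read off the FDD as (determinant, dependent).
# A determinant is a tuple so that a composite determinant can be expressed.
# The list name FDS is an assumption made for this sketch.
FDS = [
    (("DEPARTMENT_NAME",), "LOCATION"),
    (("E_NUMBER",), "EMPLOYEE_NAME"),
    (("E_NUMBER",), "CURRENT_TITLE"),
    (("E_NUMBER",), "DEPARTMENT_NAME"),
    (("E_NUMBER", "P_NUMBER"), "TIME_SPENT"),
]


def synthesise(fds):
    """Trace every arrow back to its source and gather what the source points to.

    Returns a dict mapping each determinant (the primary key) to the sorted
    list of attributes it determines (the non-key attributes of that table).
    """
    groups = defaultdict(list)
    for determinant, dependent in fds:
        groups[determinant].append(dependent)
    return {key: sorted(attrs) for key, attrs in groups.items()}


relations = synthesise(FDS)
for key, attrs in relations.items():
    print("primary key:", key, "other attributes:", attrs)
```

Each entry corresponds to one table: the determinant becomes the primary key, and a determinant with two attributes becomes a composite primary key.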
Personnel Database FD Diagram - Universal Key
Now, the last part of the synthesis is often forgotten. We must collect up
all of the attributes that do not have arrows pointing into them and place
them in the one table called the Universal Key. Every attribute collected
then becomes part of the composite Primary Key. In this case we have the
following attributes inside the box below. Notice how Skill is there, as it
sits by itself. Nothing is its determinant.
P_NUMBER
PRIOR_TITLE
SKILL E_NUMBER
UK (E_NUMBER, P_NUMBER, PRIOR_TITLE, SKILL)
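As a rough check, the Universal Key can be computed as the set of attributes that never appear on the right-hand side of a dependency. The sketch below assumes the dependencies shown in the FDD for this example; the names `ATTRIBUTES`, `FDS` and `universal_key` are hypothetical and chosen only for illustration.

```python
# All attributes that appear on the Personnel FDD.
ATTRIBUTES = {
    "EXPECTED_COST", "PROJECT_NAME", "ACTUAL_COST", "TIME_SPENT",
    "MANAGER_NUM", "P_NUMBER", "EMPLOYEE_NAME", "PRIOR_TITLE",
    "E_NUMBER", "CURRENT_TITLE", "SKILL", "DEPARTMENT_NAME", "LOCATION",
}

# Functional dependencies as (determinant, dependent) pairs.
FDS = [
    (("P_NUMBER",), "PROJECT_NAME"),
    (("P_NUMBER",), "MANAGER_NUM"),
    (("P_NUMBER",), "ACTUAL_COST"),
    (("P_NUMBER",), "EXPECTED_COST"),
    (("E_NUMBER",), "EMPLOYEE_NAME"),
    (("E_NUMBER",), "CURRENT_TITLE"),
    (("E_NUMBER",), "DEPARTMENT_NAME"),
    (("DEPARTMENT_NAME",), "LOCATION"),
    (("E_NUMBER", "P_NUMBER"), "TIME_SPENT"),
]


def universal_key(attributes, fds):
    """Return the attributes that have no arrow pointing into them."""
    dependents = {dependent for _, dependent in fds}
    return attributes - dependents


uk = universal_key(ATTRIBUTES, FDS)
print(sorted(uk))
```

SKILL and PRIOR_TITLE end up in the result because nothing determines them, which matches the UK table above.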
80
- 41. Foreign Keys
In the Synthesis Algorithm, a foreign key will arise from any
attribute that is:
A. both a determinant and part of another determinant,
OR
B. both a determinant and a dependent.
A. E_NUMBER + P_NUMBER determine TIME_SPENT:
ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)
B. E_NUMBER determines DEPARTMENT_NAME, which determines LOCATION:
EMPLOYEE (E_NUMBER, DEPARTMENT_NAME)
DEPT (DEPARTMENT_NAME, LOCATION)
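The two rules can be applied mechanically to a list of dependencies. The following is a minimal sketch under the assumption that "a determinant" in rule A means a determinant standing on its own; the function name `foreign_keys` and the reduced list `FDS` are hypothetical.

```python
# A reduced set of dependencies from the Personnel FDD, as
# (determinant, dependent) pairs. Determinants are tuples.
FDS = [
    (("P_NUMBER",), "PROJECT_NAME"),
    (("E_NUMBER",), "DEPARTMENT_NAME"),
    (("DEPARTMENT_NAME",), "LOCATION"),
    (("E_NUMBER", "P_NUMBER"), "TIME_SPENT"),
]


def foreign_keys(fds):
    """Return a set of (attribute, rule) pairs, where rule is "A" or "B"."""
    determinants = {determinant for determinant, _ in fds}
    dependents = {dependent for _, dependent in fds}
    # Attributes that are a determinant on their own.
    single = {d[0] for d in determinants if len(d) == 1}

    found = set()
    # Rule A: a determinant that is also part of another (composite) determinant.
    for determinant in determinants:
        if len(determinant) > 1:
            for attribute in determinant:
                if attribute in single:
                    found.add((attribute, "A"))
    # Rule B: a determinant that is also a dependent.
    for attribute in single:
        if attribute in dependents:
            found.add((attribute, "B"))
    return found


print(sorted(foreign_keys(FDS)))
```

Here E_NUMBER and P_NUMBER are found by rule A (they become foreign keys in ASSIGNMENT), and DEPARTMENT_NAME is found by rule B (it becomes a foreign key in EMPLOYEE).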
81
ISA = Is A
In the case of the manager, we say that the set of manager numbers is
contained within the set of employee numbers.
Every MANAGER_NUM value is an E_NUMBER value.
MANAGER_NUM
ISA
E_NUMBER
MANAGER_NUM
EMPLOYEE PROJECT
Gives rise to a new Foreign Key
82
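One way to see the effect of this foreign key is to declare it in a small SQLite database from Python. This is a sketch only: the column types and the sample rows are assumptions made for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE EMPLOYEE (E_NUMBER INTEGER PRIMARY KEY, NAME TEXT)")
conn.execute(
    "CREATE TABLE PROJECT ("
    " P_NUMBER INTEGER PRIMARY KEY,"
    " NAME TEXT,"
    # MANAGER_NUM ISA E_NUMBER: every manager must be an existing employee.
    " MANAGER_NUM INTEGER REFERENCES EMPLOYEE (E_NUMBER))"
)

# Hypothetical sample rows.
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Sample employee')")
conn.execute("INSERT INTO PROJECT VALUES (10, 'Sample project', 1)")  # accepted

try:
    # Employee 99 does not exist, so this manager number breaks the ISA rule.
    conn.execute("INSERT INTO PROJECT VALUES (11, 'Orphan project', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("second insert rejected:", rejected)
```

The second insert is refused because its MANAGER_NUM is not an E_NUMBER value in EMPLOYEE.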
- 42. Personnel Database Schema
Generated by Synthesis
PROJECT (NAME, P_NUMBER, MANAGER_NUM, ACTUAL_COST, EXPECTED_COST)
(The MANAGER_NUM foreign key is a result of MANAGER ISA E_NUMBER.)
ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)
UK (E_NUMBER, P_NUMBER, PRIOR_TITLE, SKILL)
EMPLOYEE (NAME, E_NUMBER, DEPARTMENT, CURRENT_TITLE)
DEPT (DEPARTMENT, LOCATION)
83
Personnel Database Network Diagram
Generated by Synthesis
DEPT
DEPARTMENT_NAME
MANAGER_NUM
EMPLOYEE PROJECT
E_NUMBER P_NUMBER
ASSIGNMENT
E_NUMBER + P_NUMBER
UK
84
- 43. A Fully Worked
Example
We now have to take care of the multi-valued areas such as skills and
prior titles. Our FDD synthesis takes care of everything up to that point.
It converts the FDD to what we call “Third Normal Form”. We know
that an individual can have many skills and many prior titles. They can
also work on many projects. Knowing the employee number will not
tell us one and only one value of the skills that they have. We show
this on the extended FDD with a double arrow notation. The notation
for such a relationship is shown here, where E_NUMBER is a
determinant for many values of SKILL. Consequently, the
representation shown on the next slide can be constructed, giving rise
to the splitting of the UK to form three more relations.
E_NUMBER
SKILL
85
Personnel Database
Multivalued Dependency-Decomposition
Multivalued dependencies (MVDs, drawn as double arrows) run from
E_NUMBER to each of P_NUMBER, PRIOR_TITLE and SKILL.
Employees are associated with Projects, Titles and Skills
independently. There is no direct relationship between
Projects, Titles and Skills.
ASSIGN (E_NUMBER, P_NUMBER)
PRIOR_JOB (E_NUMBER, PRIOR_TITLE)
EXPERTISE (E_NUMBER, SKILL)
Hence we have the three new relations
ASSIGN, PRIOR_JOB and EXPERTISE.
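The effect of this decomposition can be illustrated with a small Python sketch. The sample employee, project numbers, titles and skill below are hypothetical data, and the helper name `project_onto` is an assumption for this example. Because projects, prior titles and skills are independent, the UK table has to hold every combination, while the three smaller relations hold each fact once.

```python
from itertools import product

# Hypothetical facts about one employee (E_NUMBER 1).
projects = {1: [10, 20]}
prior_titles = {1: ["Clerk", "Secretary"]}
skills = {1: ["Typing"]}

# The UK table must contain every combination of the independent facts.
uk = {
    (e, p, t, s)
    for e in projects
    for p, t, s in product(projects[e], prior_titles[e], skills[e])
}


def project_onto(rows, *positions):
    """Relational projection: keep only the given column positions."""
    return {tuple(row[i] for i in positions) for row in rows}


assign = project_onto(uk, 0, 1)      # ASSIGN (E_NUMBER, P_NUMBER)
prior_job = project_onto(uk, 0, 2)   # PRIOR_JOB (E_NUMBER, PRIOR_TITLE)
expertise = project_onto(uk, 0, 3)   # EXPERTISE (E_NUMBER, SKILL)

# Joining the three relations on E_NUMBER rebuilds the UK rows.
rejoined = {
    (e, p, t, s)
    for (e, p) in assign
    for (e2, t) in prior_job if e2 == e
    for (e3, s) in expertise if e3 == e
}

print(len(uk), len(assign), len(prior_job), len(expertise))
print(rejoined == uk)
```

Joining the three relations back on E_NUMBER reproduces the UK rows, so nothing is lost by storing the facts separately.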
86
- 44. Personnel Database FD Diagram with
MVDs and Inclusion
PROJECT_NAME
EXPECTED_COST
MANAGER_NUM
ACTUAL_COST
P_NUMBER
TIME_SPENT
MVD
ISA
EMPLOYEE_NAME
E_NUMBER CURRENT_TITLE
PRIOR_TITLE
MVD
SKILL
DEPARTMENT_NAME LOCATION
87
Final Personnel Database Schema
PROJECT (NAME, P_NUMBER, MANAGER, ACTUAL_COST, EXPECTED_COST)
ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)
PRIOR_JOB (E_NUMBER, PRIOR_TITLE) (decomposed from UK)
EXPERTISE (E_NUMBER, SKILL) (decomposed from UK)
EMPLOYEE (NAME, E_NUMBER, DEPARTMENT, CURRENT_TITLE)
DEPT (DEPARTMENT, LOCATION)
88
- 45. Final Personnel Database Network Diagram
DEPT
DEPARTMENT_NAME
MANAGER_NUM
EMPLOYEE PROJECT
E_NUMBER
E_NUMBER E_NUMBER P_NUMBER
EXPERTISE PRIOR_JOB ASSIGNMENT
89
Personnel Database
FD Diagram - Synthesis
EXPECTED_COST
PROJECT_NAME
ACTUAL_COST
MANAGER P_NUMBER
Choosing any of the arrows and following it back leads you to the
project number (P_NUMBER). This is then the primary key. If you then
gather all of the attributes that P_NUMBER points to and place them in
the brackets, you get the table PROJECT with P_NUMBER as the
primary key.
PROJECT (PROJECT_NAME, P_NUMBER, MANAGER, ACTUAL_COST, EXPECTED_COST)
90
- 46. Role Splitting In Functional
Dependency Diagrams
In a Functional Dependency Diagram any group of
attributes can be related in only one way.
For example, a pair of attributes can be related
by an FD or not.
Sometimes data can be related in more than one way.
For example, a department can have an employee
as its head or as a member.
The member relationship is represented in the
FDD:
E_NUMBER → DEPARTMENT_NAME
But the head relationship is represented in the
FDD:
DEPARTMENT_NAME → E_NUMBER
91
Role Splitting In Functional
Dependency Diagrams
We can choose to split the E_NUMBER attribute into E_NUMBER and
HOD.
But the foreign key constraint that a Head of Department is an Employee
is lost on the FDD.
E_NUMBER DEPARTMENT_NAME
FDD
Synthesis HOD
ISA
NetworkD DEPARTMENT_NAME
EMPLOYEE DEPT
HOD
92
- 47. Role Splitting In FDDs
Alternatively, we can choose to split the
DEPARTMENT_NAME attribute into EMPLOYING_DEPT and
HEADED_DEPT.
But the foreign key constraint that an Employing
Department must be a Headed Department is again lost on
the FDD.
E_NUMBER EMPLOYING_DEPT
FDD
Synthesis HEADED_DEPT ISA
NetworkD
EMPLOYING_DEPT
EMPLOYEE DEPT
E_NUMBER
93
Role Splitting Example
Consider this example. We have the employee
with many skills and prior titles, as before, but we
also have equipment that belongs to a particular
employee, such as a computer and a fax. An
employee can have many different pieces of
equipment. It is worthwhile recognizing them on
the diagram and then decomposing them into
smaller relations as part of the schema.
94
- 48. Suppose each item of
equipment (identified by
SERIAL#) belongs to an
employee.
SERIAL# DESCRIPTION
PRIOR_TITLE
MVDs EMPLOYEE_NAME
SKILL
E_NUMBER CURRENT_TITLE
UK ISA
HOD
DEPARTMENT_NAME LOCATION
• MVDs are not necessarily embodied in the UK.
• It is better to decompose on MVDs first.
• MVDs partition attributes into independent sets.
95
Obtain Tutorial 2 from your tutor.
96
- 49. ENTITY RELATIONSHIP
ANALYSIS
In this area of the course we concentrate on
another modelling technique called Entity
Relationship Modelling (ERM or ER).
The first stage of this process will look at the
following:
ER Data Model and Notation
Strong Entities
Discovering Entities, Attributes
Identifying Entities
Discovering Relationships
97
Critique of FD Analysis
We originally concentrated on the modelling technique
called Functional Dependency Diagrams. They have
the following limitations:
Disadvantages of FDD
Does not represent real world objects, but only
data;
Cannot represent MVDs or specialization;
Cannot represent multiple relationships without
artificial splitting of attributes;
Entities are fragmented during analysis.
98