Database Normalization Research
1. Research Work
Database Normalization
Definition
Normalization is the process of organizing data in a database. This includes creating tables
and establishing relationships between those tables according to rules designed both to
protect the data and to make the database more flexible by eliminating redundancy and
inconsistent dependencies.
History
Normalization was introduced by E. F. Codd, an IBM researcher, in the
early 1970s; he also invented the relational model on which relational databases are built.
Importance
It serves to remove duplication from the database records.
For example, if the name of a person can appear in more than one place (table),
you move the name to a separate table and reference it everywhere else.
This way, if you need to change the person's name later, you only have to change it in one place.
It highlights constraints and dependencies in the data and hence aids
understanding of the nature of the data.
Normalization controls data redundancy to reduce storage requirements
and routine maintenance.
Normalization provides unique identification for records in a database.
Each stage of the normalization process eliminates a particular type of
undesirable dependency.
Normalization permits simple data retrieval in response to reports and
queries.
Third normal form produces a well-designed database that
provides a higher degree of data independence.
Normalization helps define efficient data structures.
Normalized data structures are used for file and database design.
Normalization eliminates unnecessary dependency relationships within a
database file.
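The single-place update described in the first point above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed design: the table names (person, orders, tickets), the column names, and the sample name are all hypothetical and do not come from the notes.

```python
import sqlite3

# In-memory database; all table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders  (order_id  INTEGER PRIMARY KEY,
                          person_id INTEGER REFERENCES person(person_id));
    CREATE TABLE tickets (ticket_id INTEGER PRIMARY KEY,
                          person_id INTEGER REFERENCES person(person_id));
    INSERT INTO person  VALUES (1, 'J. Smith');
    INSERT INTO orders  VALUES (10, 1), (11, 1);
    INSERT INTO tickets VALUES (20, 1);
""")

# The name is stored once, so a single UPDATE is enough.
conn.execute("UPDATE person SET name = 'Jane Smith' WHERE person_id = 1")

# Every table that references the person sees the new name through a join.
order_names = [row[0] for row in conn.execute(
    "SELECT p.name FROM orders o JOIN person p USING (person_id) "
    "ORDER BY o.order_id")]
ticket_names = [row[0] for row in conn.execute(
    "SELECT p.name FROM tickets t JOIN person p USING (person_id) "
    "ORDER BY t.ticket_id")]

print(order_names, ticket_names)
```

Had the name been copied into orders and tickets, the same change would need one UPDATE per table, and missing one of them would leave the data inconsistent.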
2. Stages of Normal Forms
1. First Normal Form
Eliminate repeating groups in individual tables.
Create a separate table for each set of related data.
Identify each set of related data with a primary key.
The first step transforms the preliminary data structures into first
normal form by eliminating any repeating sets of data elements. A relation is said to be in
first normal form if and only if it contains no repeating groups, that is, no attribute holds
repeated values within a single record. Any repeating group of attributes is isolated to
form a new relation. In other words, first normal form (1NF) means that a table has no
multi-valued or composite attributes: each column holds one attribute and each row
holds a single occurrence of the entity.
Tables should have only two dimensions. Since one student has several classes, these classes
should be listed in a separate table. Fields such as Class1, Class2, and Class3 in a student
record are indications of design trouble.
Spreadsheets often use the third dimension, but tables should not. Another way to look at this
problem is as a one-to-many relationship: do not put the one side and the many side in the
same table. Instead, create another table in first normal form by eliminating the repeating group
(Class#), as shown below:
Student#  Advisor  Adv-Room  Class#
1022      Jones    412       101-07
1022      Jones    412       143-01
1022      Jones    412       159-02
4123      Smith    216       201-01
4123      Smith    216       211-02
4123      Smith    216       214-01
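The elimination of the repeating group can be sketched in a few lines of Python. The wide input records (with Class1, Class2, Class3 fields) are an assumption reconstructed from the 1NF table above, and the function name to_first_normal_form is hypothetical.

```python
# Wide records with a repeating group. This layout is an assumption,
# reconstructed from the rows of the 1NF table in the text.
unnormalized = [
    {"Student#": 1022, "Advisor": "Jones", "Adv-Room": 412,
     "Class1": "101-07", "Class2": "143-01", "Class3": "159-02"},
    {"Student#": 4123, "Advisor": "Smith", "Adv-Room": 216,
     "Class1": "201-01", "Class2": "211-02", "Class3": "214-01"},
]


def to_first_normal_form(records, group_prefix="Class"):
    """Return one row per value of the repeating group (hypothetical helper)."""
    rows = []
    for record in records:
        # Columns that are not part of the repeating group are copied as-is.
        fixed = {k: v for k, v in record.items()
                 if not k.startswith(group_prefix)}
        for key, value in record.items():
            if key.startswith(group_prefix) and value is not None:
                rows.append({**fixed, "Class#": value})
    return rows


first_nf = to_first_normal_form(unnormalized)
for row in first_nf:
    print(row)
```

Each output row holds a single Class# value, so adding a fourth class for a student means adding a row rather than a new column.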
2. Second Normal Form
Create separate tables for sets of values that apply to multiple
records.
Relate these tables with a foreign key.
2NF concerns records with concatenated (composite) keys: each non-key attribute is checked for
dependency on the entire key, and any data element that depends on only part of the key is
moved to a new entity.
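Note the multiple Class# values for each Student# value in the above table. Class# is not
functionally dependent on Student# (primary key), so this relationship is not in second normal
form. Put another way, a row is identified only by Student# and Class# together, while Advisor
and Adv-Room depend on Student# alone.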
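The dependency argument above can be checked mechanically against the 1NF rows. The sketch below is only an illustration: the helper name determines is hypothetical, and a check on sample rows can only show that a dependency is not contradicted by the data, not that it holds as a rule.

```python
# Rows of the 1NF table from the text.
COLUMNS = ("Student#", "Advisor", "Adv-Room", "Class#")
first_nf = [
    (1022, "Jones", 412, "101-07"),
    (1022, "Jones", 412, "143-01"),
    (1022, "Jones", 412, "159-02"),
    (4123, "Smith", 216, "201-01"),
    (4123, "Smith", 216, "211-02"),
    (4123, "Smith", 216, "214-01"),
]


def determines(rows, determinant, dependent):
    """True if equal determinant values always carry equal dependent values."""
    index = {name: i for i, name in enumerate(COLUMNS)}
    seen = {}
    for row in rows:
        key = tuple(row[index[c]] for c in determinant)
        value = tuple(row[index[c]] for c in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Student# alone does not determine Class# ...
class_depends = determines(first_nf, ["Student#"], ["Class#"])
# ... but it does determine Advisor and Adv-Room: a partial dependency
# on the composite key (Student#, Class#).
advisor_depends = determines(first_nf, ["Student#"], ["Advisor", "Adv-Room"])

print(class_depends, advisor_depends)
```

The attributes that depend on Student# alone are the ones that 2NF moves into their own table.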
The following two tables demonstrate second normal form:
Students:
Student#  Advisor  Adv-Room
1022      Jones    412
4123      Smith    216
Registration:
Student#  Class#
1022      101-07
1022      143-01
1022      159-02
4123      201-01
4123      211-02
4123      214-01
3. Third Normal Form
Eliminate fields that do not depend on the key.
Every data element in third normal form must be a function of the key. To reach 3NF,
review the structure's non-key data elements and identify any data element that depends
on an attribute other than the key; if there are any, move them to a new entity.
Adv-Room (the advisor's office number) is functionally dependent on the Advisor attribute. The
solution is to move that attribute from the Students table to the Faculty table, as shown below:
Students:
Student#  Advisor
1022      Jones
4123      Smith
Faculty:
Name   Room  Dept
Jones  412   42
Smith  216   42
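As a rough sketch, the two third-normal-form tables above can be created in SQLite and joined to recover the advisor's room for each student. The SQL-friendly names (students, faculty, student_no, and so on) are assumptions chosen for the example; the data values are the ones shown in the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names are hypothetical renderings of the tables in the text.
conn.executescript("""
    CREATE TABLE faculty  (name TEXT PRIMARY KEY, room INTEGER, dept INTEGER);
    CREATE TABLE students (student_no INTEGER PRIMARY KEY,
                           advisor    TEXT REFERENCES faculty(name));
    INSERT INTO faculty  VALUES ('Jones', 412, 42), ('Smith', 216, 42);
    INSERT INTO students VALUES (1022, 'Jones'), (4123, 'Smith');
""")

# The room is no longer stored with the student; it is looked up via the advisor.
rows = conn.execute("""
    SELECT s.student_no, s.advisor, f.room
    FROM students s
    JOIN faculty f ON f.name = s.advisor
    ORDER BY s.student_no
""").fetchall()

print(rows)
```

Because the room is stored once per faculty member, moving an advisor to a new office is a single-row change no matter how many students that advisor has.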
4. Fourth Normal Form
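It deals with data elements that have multi-valued dependencies (when one attribute
determines a set of values of another attribute). A relation is said to be in fourth normal form
(4NF) if and only if every non-trivial multi-valued dependency in it is determined by a key,
so that it is in effect a functional dependency. In the table below, CourseId determines a set
of instructors and, independently, a set of textbooks, so the table is not in 4NF.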
CourseId  Instructor  Textbook
MGS404    Clay        Hansen
MGS404    Clay        Kroenke
MGS404    Drake       Hansen
MGS404    Drake       Kroenke
By placing the multivalued attributes in tables by themselves, we can convert the above to 4NF.
Change to:
COURSE-INST(CourseId, Instructor)
COURSE-TEXT(CourseId, Textbook)
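The split into COURSE-INST and COURSE-TEXT can be sketched with plain Python sets, treating each relation as a set of tuples. The variable names are hypothetical; the data is the course table from the text.

```python
# The original relation: every instructor is paired with every textbook.
course = {
    ("MGS404", "Clay", "Hansen"),
    ("MGS404", "Clay", "Kroenke"),
    ("MGS404", "Drake", "Hansen"),
    ("MGS404", "Drake", "Kroenke"),
}

# Project onto (CourseId, Instructor) and (CourseId, Textbook).
course_inst = {(c, i) for c, i, _ in course}
course_text = {(c, t) for c, _, t in course}

# Natural join of the two projections on CourseId.
rejoined = {(c, i, t)
            for c, i in course_inst
            for c2, t in course_text
            if c == c2}

print(sorted(course_inst))
print(sorted(course_text))
print(rejoined == course)
```

The two projections hold two rows each instead of four, and joining them gives back the original relation, which is what makes the decomposition safe here.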
5. Fifth Normal Form
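Fifth normal form (5NF), also known as project-join normal form (PJNF), is reached when
join dependencies are removed. A join dependency exists when a relation can be separated
into several sub-relations (projections) and rebuilt by joining those sub-relations without losing
or inventing rows. 5NF removes such a dependency, when it is not already implied by the keys,
by storing the sub-relations instead of the combined relation.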
Traveling Salesman  Brand    Product Type
Jack Schneider      Acme     Vacuum Cleaner
Jack Schneider      Acme     Breadbox
Willy Loman         Robusto  Pruning Shears
Willy Loman         Robusto  Vacuum Cleaner
Willy Loman         Robusto  Telescope
Willy Loman         Robusto  Umbrella Stand
Louis Ferguson      Robusto  Vacuum Cleaner
Louis Ferguson      Robusto  Telescope
Louis Ferguson      Acme     Vacuum Cleaner
Louis Ferguson      Acme     Lava Lamp
Louis Ferguson      Nimbus   Tie Rack
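A join dependency can be explored on the table above by projecting it onto its three column pairs and joining the projections back. The sketch below uses plain Python sets; the variable names are hypothetical, and product names are written with consistent capitalization. Note that a successful rejoin on sample rows only shows that the data is compatible with the join dependency; whether the dependency really holds depends on the business rules behind the data.

```python
sales = {
    ("Jack Schneider", "Acme", "Vacuum Cleaner"),
    ("Jack Schneider", "Acme", "Breadbox"),
    ("Willy Loman", "Robusto", "Pruning Shears"),
    ("Willy Loman", "Robusto", "Vacuum Cleaner"),
    ("Willy Loman", "Robusto", "Telescope"),
    ("Willy Loman", "Robusto", "Umbrella Stand"),
    ("Louis Ferguson", "Robusto", "Vacuum Cleaner"),
    ("Louis Ferguson", "Robusto", "Telescope"),
    ("Louis Ferguson", "Acme", "Vacuum Cleaner"),
    ("Louis Ferguson", "Acme", "Lava Lamp"),
    ("Louis Ferguson", "Nimbus", "Tie Rack"),
}

# Three binary projections of the relation.
salesman_brand = {(s, b) for s, b, _ in sales}
salesman_product = {(s, p) for s, _, p in sales}
brand_product = {(b, p) for _, b, p in sales}

# Joining only two projections (on salesman) invents rows that were never there.
two_way = {(s, b, p)
           for s, b in salesman_brand
           for s2, p in salesman_product
           if s == s2}

# Filtering by the third projection completes the three-way join.
three_way = {(s, b, p) for s, b, p in two_way if (b, p) in brand_product}

print(len(sales), len(two_way), len(three_way))
print(three_way == sales)
```

For these rows the two-way join produces spurious combinations (for example Louis Ferguson selling a Robusto Lava Lamp), while the three-way join returns exactly the original rows.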