The Student Room Group

Step-by-step Normalisation

Someone asked about this and I've also seen a number of posts in the past expressing concern about this process.

I'll use an example database (I found the attributes on Google Images) and explain the normalisation process step-by-step.



Scenario: Customers belong to a particular branch, this branch is supervised by a particular manager. Customers purchase stock. Each stock has a particular title and format.

UNF: List all attributes under a primary key. It's usually obvious which primary key to choose.

CustomerID
Name
Address
BranchNo
BranchManager
StockID
Title
Format

1NF: Separate repeating data into another table and assign compound key. Sometimes this can be done in more than one way and sometimes you can even separate repeating data into three tables. Here's a tip: Look carefully at your UNF, identify which attributes are dependant on the same one attribute and then place it in another table.

CustomerID
Name
Address
BranchNo
BranchManager

CustomerID
StockID
Title
Format

Now we have two tables. When you separate the repeating data into another table, you must assign it the same primary key as the table you separated it from, you must also assign it the key for the attribute the repeating data was dependant on, in this case StockID. Together this forms a compound key (CustomerID and StockID).

2NF: Part-key dependencies are removed.

CustomerID
Name
Address
BranchNo
BranchManager

CustomerID
StockID

StockID
Title
Format

A part-key dependency is when the attributes are only dependant on one of the keys in the compound key above. Hence the name 'part-key'. This is separated into another table, with the dependant attribute as primary key, because Title and Format are only dependant on the StockID and not the CustomerID. Since customers can purchase stock, we know which customers have purchased which stock here. The two bottom tables above link to show this.

3NF: Remove inter-key dependencies, identify foreign keys,


CustomerID
Name
Address
BranchNo*

BranchNo
BranchManager

CustomerID
StockID

StockID
Title
Format

An inter-key dependency is when attributes in a table are not dependant on the key, but instead, another normal attribute in the table. Notice how the branch manager was included in the Customer table, this isn't necessary because the branch manager is only dependant on the BranchNo (number). Therefore this is an inter-key dependency and we separate it into another table. An asterix (*) is placed next to BranchNo in the original table it was separated from to identify it as a foreign key. We now know which customers belong to which branches in this way. It is another link. Or more appropriately, another relationship.


---

If you have any comments, questions etc. let me know. If you have any databases you would like normalised or need help with, feel free to post them here.

NOTE: All tutorials and examples of database normalisation in this thread are not permitted to be redistributed for financial purposes and/or plagiarism. You may link to this thread instead.
(edited 8 years ago)
Reply 1
You have not provided a scenario or given any background info on the business. I will therefore make the following assumptions and normalise accordingly:

Β· Each customer is assigned a unique customer number (CustomerNo) that refers to their personal details i.e. name, address, phone etc.

Β· There are many depots, each depot (DepotName) is assigned a unique number to identify which depot it is (DepotNo)

Β· When the customer orders a product, an order is created detailing the quantity, the product, the depot it is being delivered from and which customer ordered it

Β· Each product is assigned a unique number (ProductNo) which references that particular product’s details i.e. name and pricing etc.

UNF: List all attributes under a primary key. You’ve already provided this information. Here it is below:

CustomerOrderNo
CustomerNo
CustomerName
CustomerAddress
CustomerTelNo
DepotNo
DepotName
ProductNo
ProductName
ProductQuantity
ProductPrice

1NF: Separate repeating data into another table and assign compound key. Sometimes this can be done in more than one way and sometimes you can even separate repeating data into three tables. Here's a tip: Look carefully at your UNF, identify which attributes are dependent on the same one attribute and then place it in another table.

CustomerOrderNo
CustomerNo
CustomerName
CustomerAddress
CustomerTelNo
DepotNo
DepotName

CustomerOrderNo
ProductNo
ProductName
ProductQuantity
ProductPrice

Now we have two tables. When you separate the repeating data into another table, you must assign it the same primary key as the table you separated it from, you must also assign it the key for the attribute the repeating data was dependant on, in this case ProductNo. Together this forms a compound key (CustomerOrderNo and ProductNo).

2NF: Part-key dependencies are removed.

CustomerOrderNo
CustomerNo
CustomerName
CustomerAddress
CustomerTelNo
DepotNo
DepotName

CustomerOrderNo
ProductNo
ProductQuantity

ProductNo
ProductName
ProductPrice

A part-key dependency is when the attributes are only dependant on one of the keys in the compound key above. Hence the name 'part-key'. This is separated into another table, with the dependant attribute as primary key, because ProductName and ProductPrice is only dependant on the ProductNo and not the CustomerOrderNo.

3NF: Remove inter-key dependencies, identify foreign keys.

CustomerOrderNo
*CustomerNo
*DepotNo

CustomerNo
CustomerName
CustomerAddress
CustomerTelNo

DepotNo
DepotName

CustomerOrderNo
ProductNo
ProductQuantity

ProductNo
ProductName
ProductPrice

DepotNo
DepotName

An inter-key dependency is when attributes in a table are not dependant on the key, but instead, another normal attribute in the table. In this case, CustomerNo and DepotNo were normal attributes but had several other normal attributes in the same table dependant on them, so they were taken out. The asterix (*) marks it as a foreign key to establish a link/relationship between the tables.

We now know which customer placed what order and can trace the details all the way back to the other tables because of the unique numbers.

NOTE: Something doesn’t seem right, I will have another look at it later.

EDIT: Updated! Hopefully that should be correct.

NOTE: All tutorials and examples of database normalisation in this thread are not permitted to be redistributed for financial purposes and/or plagiarism. You may link to this thread instead.
(edited 8 years ago)
Reply 2
Edited, should be okay now. The reason for the two tables containing CustomerOrderNo is because ProductQuantity is dependant on both the ProductNo (because which product?) and the CustomerOrderNo (because which order?). And you also need to know which customer placed the order and the depot assigned to that order.

In reality, your software/RDBMS/code would automate this and generate it almost simultaneously based on the customer's orders. Both CustomerOrderNo would be the same, however for the compound key the ProductNo could be different allowing for multiple products to be ordered in the same order by the customer. Sorry if that doesn't make sense.

End up with 6 tables.

NOTE: All tutorials and examples of database normalisation in this thread are not permitted to be redistributed for financial purposes and/or plagiarism. You may link to this thread instead.
(edited 8 years ago)
Scenario no 1:
UNF: Course_Summary (Course_Code [Course#], Course_Name, Course_Level, Start_Date, Finish_Date, Course_Cost), {Staff#, L#, Name, Center, C_Phone, Class_Size}, {S#, Name, Phone, Center, L#})

Secnario no 2:
UNF: Course_Booking (Booking_Ref_No, Booking_Date, Student_No, Student_Name, Student_Address, Student_Telephone, Course_Code, Course_Name, Course_Level, Start_Date, Finish_Date, Lecturer_Name, Centre_Name, Course_Cost)

Please Any one help me to Normalize it to 3NF. It's Urgent.

Thank you
Reply 4
Currently im trying to do an assignment question which is assigned to me which is asking me the normalization process. im aware of the 3 normal form. but im stuck with the 2nd and 3rd normalisation form because i feel that im going to get the similar answer. Am i doing something wrong? My UNF and 1NF forms are below:

UNF:

Employee_Id, Employee_Gender, Employee_Position, Employee_Name, Employee_First, Employee_Last, Member_Id, Member_ Name, Member_ First, Member_ Last, Member_ Feedback, Location_Branch_Id, Dvd_Branch_Stock, Branch_Location, Location_Area, DVD_Id, DVD_Title, DVD_Quantity, DVD_Genres, DVD_Released_Date, Rental_ID, Rental_Borrowed_Date, Rental_Due_Date, Rental_Overdue, Rental_Dvd_Title,Rental_Status

1NF:

Employee_Id
Employee_Gender
Employee_Position

Employee_Id
Employee_Name
Employee_First
Employee_Last

Member_Id
Member_ Feedback

Member_Id
Member_ Name
Member_ First
Member_ Last

Location_Branch_Id
Location_Area

Location_Branch_Id
Dvd_Branch_Stock
Branch_Location

DVD_Id,
DVD_Quantity

DVD_Id
DVD_Title
DVD_Genres
DVD_Released_Date


Rental_ID
Rental_Dvd_Title

Rental_ID
Rental_Borrowed_Date
Rental_Due_Date
Rental_Overdue
Rental_Status

2NF:

Employee_Id
Employee_Gender
Employee_Position

Employee_Id
Employee_Name

Employee_Name
Employee_First
Employee_Last

Member_Id
Member_ Feedback

Member_Id
Member_ Name

Member_ Name
Member_ First
Member_ Last

Location_Branch_Id
Location_Area

Location_Branch_Id
Dvd_Branch_Stock

Dvd_Branch_Stock
Branch_Location

DVD_Id,
DVD_Quantity

DVD_Id
DVD_Title

DVD_Title
DVD_Genres
DVD_Released_Date


Rental_ID
Rental_Dvd_Title

Rental_ID
Rental_Borrowed_Date

Rental_Borrowed_Date
Rental_Due_Date
Rental_Overdue
Rental_Status

What should i do for 3NF???



Currently im trying to do an assignment question which is assigned to me which is asking me the normalization process. im aware of the 3 normal form. but im stuck with the 3rd normalisation form because i feel that im going to get the similar answer. Am i doing something wrong? My UNF and 1NF forms are below:

UNF:

Employee_Id, Employee_Gender, Employee_Position, Employee_Name, Employee_First, Employee_Last, Member_Id, Member_ Name, Member_ First, Member_ Last, Member_ Feedback, Location_Branch_Id, Dvd_Branch_Stock, Branch_Location, Location_Area, DVD_Id, DVD_Title, DVD_Quantity, DVD_Genres, DVD_Released_Date, Rental_ID, Rental_Borrowed_Date, Rental_Due_Date, Rental_Overdue, Rental_Dvd_Title,Rental_Status

1NF:

Employee_Id
Employee_Gender
Employee_Position

Employee_Id
Employee_Name
Employee_First
Employee_Last

Member_Id
Member_ Feedback

Member_Id
Member_ Name
Member_ First
Member_ Last

Location_Branch_Id
Location_Area

Location_Branch_Id
Dvd_Branch_Stock
Branch_Location

DVD_Id,
DVD_Quantity

DVD_Id
DVD_Title
DVD_Genres
DVD_Released_Date


Rental_ID
Rental_Dvd_Title

Rental_ID
Rental_Borrowed_Date
Rental_Due_Date
Rental_Overdue
Rental_Status

2NF:

Employee_Id
Employee_Gender
Employee_Position

Employee_Id
Employee_Name

Employee_Name
Employee_First
Employee_Last

Member_Id
Member_ Feedback

Member_Id
Member_ Name

Member_ Name
Member_ First
Member_ Last

Location_Branch_Id
Location_Area

Location_Branch_Id
Dvd_Branch_Stock

Dvd_Branch_Stock
Branch_Location

DVD_Id,
DVD_Quantity

DVD_Id
DVD_Title

DVD_Title
DVD_Genres
DVD_Released_Date


Rental_ID
Rental_Dvd_Title

Rental_ID
Rental_Borrowed_Date

Rental_Borrowed_Date
Rental_Due_Date
Rental_Overdue
Rental_Status

What should i do for 3NF???
Reply 5
@TP046196

Hi guest, welcome to TSR and I hope you do well in your studies. I'm actually on my way to Technical Director soon; just a while longer in senior development and management. Feel free to ask if you have an interest in coding.

Thanks for tying out before reaching out for help. I've recently done normalisation for sever million DBs but very different from what I wrote here and learned a lot on the job mostly.

Let's go over your DB Schema. Comments in a different colour. Excuse any typos, got a damaged finger all bandaged up in an odd way.

Tip: Don't use mix-case, always lower case.

Let's try and get this right - it'll probably be more towards commercial usage rather than school-worth. Ideally, the end-result would be: (table names in bold)

Database name: shop

employee
id
forename
surname
gender
position

member
id
forename
surname
feedback

The employee and member tables provide individual details independently.

You'll need a record of media collection held by the business, so:

media
id
title
genre
release_date
stock
location_id

A query off of the media table can tell us its details, where it's held (assuming 1 location, otherwise can add another record with different location_id - and of course the primary key ID would be unique to each member record).

A member can rent such media. You'll need to keep track of this and audit:

rental
id
member_id
media_id
employee_id
date_borrowed
date_returned
overdue
status

A query off of the rental table tells us which member it was, the media rented, the employee responsible, the dates where applicable and the overdue as a boolean 0/1 avalue. Status is ambiguous here; assuming it's a small varchar of different values acceptable.

Since a member can give more than one feedback, it would be better normalised in another table to keep track of member feedback. For audit purposes, you'd want to know which member, for what, the feedback, when, the seller etc.

As such:

feedback
id
rental_id
feedback

As explained, this tells us which member left the feedback. We can trace the other details (which media, employee etc.) from the rental table from the rental_id.

location
id
location_area

A record of locations, so that we can specify a specific one from the rental table via the location_id.

An id is always unique to each record and helps with referencing when building the query.

The media table would comprise of things like DVD / Blu-Rays / Online subscription of some sort etc. i.e. the media_type.

Now diving further, if we assume that stock is available for a particular media at a given location then we would also need to separate that out into another table with reference back:

stock
id
media_id
location_id

And remove the 'stock' attribute from the media table. If we ever needed to then figure out stock availability then we can check the stock table. Since the stock table also tells us which location, we can remove the location_id from the media table.
So, finally, we end up with the following:

Database name: shop

employee
id
forename
surname
gender
position

member
id
forename
surname

media
id
media_type
title
genre
release_date

rental
id
member_id
media_id
employee_id
date_borrowed
date_returned
overdue
status

feedback
id
rental_id
feedback

location
id
location

stock
id
media_id
location_id

Example SQL and dummy data:








CREATE TABLE employee (
id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
forenmame VARCHAR(100) NOT NULL,
lastname VARCHAR(100) NOT NULL,
gender ENUM('M', 'F', 'O', 'U':wink: COMMENT 'M = Male, F = Female, O = Other, U = Unknown',
position VARCHAR(255) NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE member ( id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT, forename VARCHAR(100) NOT NULL, surname VARCHAR(100) NOT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB;


CREATE TABLE media ( id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT, media_type VARCHAR(50) NOT NULL COMMENT 'E.g. DVD, Blu-Ray etc.', title VARCHAR(255) NOT NULL, genre VARCHAR(100) NOT NULL, release_date DATETIME DEFAULT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB;

CREATE TABLE rental ( id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT, member_id INT(11) UNSIGNED NOT NULL, media_id INT(11) UNSIGNED NOT NULL, employee_id INT(11) UNSIGNED NOT NULL, date_borrowed DATETIME DEFAULT NOW(), date_returned DATETIME DEFAULT NULL, overdue TINYINT(1) DEFAULT 0 COMMENT '1 = Overdue, 0 = Not overdue)', status VARCHAR(255) DEFAULT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB;


CREATE TABLE feedback ( id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT, rental_id INT(11) UNSIGNED NOT NULL, feedback VARCHAR(255) DEFAULT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB; CREATE TABLE location ( id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT, location VARCHAR(255) NOT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB;


CREATE TABLE stock ( id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT, media_id INT(11) UNSIGNED NOT NULL, location_id INT(11) UNSIGNED NOT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB;


INSERT INTO employee (id, forenmame, lastname, gender, position) VALUES (1, 'John', 'Snow', 'M', 'Sales':wink:;

INSERT INTO member (id, forename, surname) VALUES (2, 'Mike', 'Tyson':wink:;

INSERT INTO media (id, media_type, title, genre, release_date) VALUES (6, 'DVD', 'Pirates of The Caribbean', 'Fantasy', '2003-06-28 00:00:00':wink:;

INSERT INTO rental (id, member_id, media_id, employee_id, date_borrowed, date_returned, overdue, status) VALUES (3, 2, 6, 1, '2018-06-21 14:22:13', '2018-06-22 10:14:39', 0, NULL);

INSERT INTO feedback (id, rental_id, feedback) VALUES (4, 3, 'It was great thanks!':wink:;

INSERT INTO location (id, location) VALUES (5, 'Cardiff':wink:; INSERT INTO stock (id, media_id, location_id) VALUES (3, 6, 5);






(edited 5 years ago)
It's not really right to think of 1NF, 2NF etc as stages to something. If you are taught them in a class, for example, the teacher may explain what the differences are, and there are many more normal forms actually.
Given a problem, you would just go to 3NF. Then it is automatically in 1NF and 2NF.
Also, it is not an absolute rule. When I made databases, and when I taught databases, there were many examples where not following these requirements actually made the database simpler and no less efficient. It depends...
That looks good but what if you get a Table like this as when I break it down ?Engineer id PKFirst NameLast NameTreatmentDate QualifiedE1 Mike Wolfe Mechanics14/08/2017E2 Fred Smith Fluids 17/11/2008E2 Fred Smith Auto Cad 29/05/2014E3 Wendy Cod Auto Cad 11/03/2012E4 Neil How Structures 30/09/1998E5 Derek Bold Structures 03/05/2012

Quick Reply

Latest