Why do you need all of this normalization stuff? The main goal is to avoid redundancy in your data. Redundancy can lead to various anomalies when you modify your data. Every fact should be stored only once and you should know where to look for each fact. The normalization process brings order to your filing cabinet. You decide to conform to certain rules where each fact is stored.

Nowadays the go-to normal forms are either the Boyce-Codd normal form (BCNF), which we will cover here today, or the third normal form (3NF)

The Boyce-Codd normal form

A relational schema R is considered to be in Boyce–Codd normal form (BCNF) if, for every one of its dependencies X → Y, one of the following conditions holds true:

  • X → Y is a trivial functional dependency (i.e., Y is a subset of X)
  • X is a superkey for schema R

Informally the Boyce-Codd normal form is expressed as “Each attribute must represent a fact about the key, the whole key, and nothing but the key.

Example

Let’s take a look at this table, with some typical data. The table is not in BCNF.

Author Nationality Book title Genre Number of pages
William Shakespeare English The Comedy of Errors Comedy 100
Markus Winand Austrian SQL Performance Explained Textbook 200
Jeffrey Ullman American A First Course in Database Systems Textbook 500
Jennifer Widom American A First Course in Database Systems Textbook 500

 

The nontrivial functional dependencies in the table are:

author → nationality
book title → genre, number of pages

We can easily see that the only key is the set {author, book title}.

The same data can be stored in a BCNF schema. However, this time we would need three tables.

Author Nationality
William Shakespeare English
Markus Winand Austrian
Jeffrey Ullman American
Jennifer Widom American

 

Book title Genre Number of pages
The Comedy of Errors Comedy 100
SQL Performance Explained Textbook 200
A First Course in Database Systems Textbook 500

 

Author Book title
William Shakespeare The Comedy of Errors
Markus Winand SQL Performance Explained
Jeffrey Ullman A First Course in Database Systems
Jennifer Widom A First Course in Database Systems

 

The functional dependencies for this schema are the same as before:

author → nationality
book title → genre, number of pages

The key of the first table is {author}. The key of the second table is {book title}. The key of the third table is {author, book title}. There are no functional dependencies violating the BCNF rules, so the schema is in Boyce-Codd normal form.

How do you decompose your schema into Boyce-Codd normal form?

To go from non-BCNF normal form to BCNF, you must decompose your table using these two steps.

  1. Find a nontrivial functional dependency X → Y which violates the BCNF condition (where the X is not a superkey)
  2. Split your table in two tables:
    • one with attributes XY (all attributes from the dependency),
    • one with X attributes together with the remaining attributes from the original relation

Then you keep repeating the decomposition process until all of your tables are in BCNF. After sufficient iterations you have a set of tables, each in BCNF, such that the original relation can be reconstructed.

Example

Let’s get back to the example. The functional dependency:

book title → genre, number of pages

is one FD violating the BCNF rules. We split our relation into two relations:

  • the ones in the functional dependency (book title, genre, number of pages)
  • the rest: (book title, author, nationality). Note that the left hand-side of the FD (book title) stays in the relation!

The example data look like this. We select the values of columns from the original relation and we eliminate the duplicate rows.

Book title Genre Number of pages
The Comedy of Errors Comedy 100
SQL Performance Explained Textbook 200
A First Course in Database Systems Textbook 500

 

Author Nationality Book title
William Shakespeare English The Comedy of Errors
Markus Winand Austrian SQL Performance Explained
Jeffrey Ullman American A First Course in Database Systems
Jennifer Widom American A First Course in Database Systems

 

Are we done? Nope. The (book title, genre, number of pages) table is in BCNF. But (book title, author, nationality) isn’t. We have the dependency:

author → nationality

Together with the trivial dependency

book title → book title,

the pair (book title, author) is the key of the relation.

We have to decompose the table one more time. This time we decompose into:

  • columns forming the functional dependency: (author, nationality)
  • the remaining columns: (author, book title)

This time every table is in BCNF.

This step is not entirely accurate. It will work in typical cases though, so for now we’ll keep it simple.

Comments

The Boyce-Codd Normal Form (BCNF)

Leave a Reply

Facebook Page

Facebook Page Loading...