Normalization

Database Fundamentals

If you’re learning about databases, it won’t be long before you hear the term normalization. It sounds a little scary and some texts used to explain the process can be heavy on jargon, but it is actually intuitive.  Understanding this process is one of the cornerstones of understanding how relational databases work. There are 3 stages of normalization that are usually used in standard database application called normal forms. The normalization process is successive; data needs to meet the first normal form in order to be in the second and so on. 

                There are three important reasons to normalize a database. First, to reduce data duplication. Second, to avoid errors when data is changed. Third, is to make queries simpler. We will look at some data that isn’t normalized to get an idea of how the normalization process works.

    When a table isn’t normalized we often have all the information in a single table, like a spreadsheet. Imagine you had a spreadsheet to record customer orders for your outerwear shop. It could look something like this:

    This is typical for spreadsheet data, but it could be difficult to work with. For example, the ‘Items Bought’ column has more than one item per cell. If you had hundreds or thousands of customer orders stored this way it would hard to update the table. If one of your customer’s address changes you’d have to change it for every order to avoid storing incorrect or inconsistent information in your table. That’s where normalization comes in. 

    First Normal Form (1NF)

        •  Each cell in column should contain a single value

        • Each row in a table needs to be unique

To get our orders table into 1NF we need to break out ‘Items Bought’ to a single item per cell. Thus making each row unique, and following the rule that each cell should be single-valued.

     Another thing to mention is that use of a primary key, a column or set of columns used to identify a table row. In this example, the ‘Key’ column was added to ensure that each row is unique and can be easily identified. In this case, we would need to use multiple columns to identify a row, which is called a composite key.    Now that we’ve gotten our table into 1NF we can move onto 2NF, or second normal form. 

    Second Normal Form (2NF)

        • Be in 1NF

        • Single column primary key

To correctly move our table into 2NF we need to partition the table into 2, one for customers and the other for orders, and for each of the tables to have a primary key column.

     In the tables above, we see each has a single column primary key and we can look up which customer made an order using the ‘Customer Number’ column in the Orders table as a foreign key. A foreign key is a column, or set of columns used to link two tables in a database. Now we only need to make an address change in a single table, reducing the chance of creating errors when this data is updated. Now we can move on to 3NF (third normal form).
Third Normal Form (3NF)        • Be in 2 NF        • No transitive functional dependencies
    Ok, so I know that ‘transitive functional dependencies’ sounds a little like jargon, but the explanation will help clear up what this means. A transitive functional dependency is when changing a non-key column could cause a change in other non-key columns. A great way to think about tables is that each one should represent a single real-world object or process. In this step we create the Purchase Type table and the Items tables. Creating these separate tables means we can safely update, delete, and insert data into our tables without introducing anomalies into the database. 

    Most commercial database applications tend to adhere to 3NF, so I’m going to stop here for our purposes. But there are additional normal forms: Boyce Codd Normal Form, 4NF, 5NF and 6 NF. Some of these normal forms may be impractical everyday use, but it’s worth doing some additional reading to get a better understanding. I’ve included some links to additional reading below. Happy learning!

https://www.essentialsql.com/get-ready-to-learn-sql-database-normalization-explained-in-simple-english/
https://www.geeksforgeeks.org/normal-forms-in-dbms/

https://www.studytonight.com/dbms/database-normalization.php

Leave a comment