Database Normalization
A topic I wish to dive into further
Database normalization is organization using tables and relationships that follow rules of design created for DRYness, flexibility, protection, and dependency.
Without normalization, several issues arise. Database size goes up with repetition, and insert/delete/update all become problems.
While rules of design do exist, they are few, and they are called Normal Form, categorized in a handful of ways. Depending on how many rules the developer chooses to follow determines how their database is described.
⁝ First Normal Form ⁝ Second Normal Form ⁝ Third Normal Form ⁝ BCNF (Boyce-Codd) ⁝ Fourth Normal Form ⁝ Fifth Normal Form ⁝ Sixth Normal Form ⁝
First Normal Form:
Don't repeat groups!
All Rows Must Be Unique Identifiable (Primary Key)
Each Cell Must Have Single Value
Each Value Must Be Simplified As Far As Possible
Each Column Should Contain Values of the Same Type
Data Can Be Easily Fetched with SQL Query
In First Normal Form, multiple values in a cell means a new table! First Normal Form, simply put, is a scalable table design. All databases should be able to be called First Normal Form at the very least.
Accessing dependent count of your employee upon the birth of their child should be simple because the Household Members Table contains that information a single time. Accessing your employee's address should be simple because Employees Table contains basic contact information a single time. Accessing your client's contact information should be simple because Clients Contact Table contains that data a single time. And so on and so forth. No repeated entries that need changing
Problem:
Solution:
Below is another example. In the Old Student Table, redundant information existed as Head of Department was identical in many areas, as was branch_tel. When the Head of Department left the school, many cells would have to be changed. Changing the Student Table into two tables to hold specific information lightens the weight of the database and eases the load of the developer. In the new format, the branch value joins the student and the department/branch, as you can see in the second clipshot below, which demonstrates that redundancy is not something that can be completely eliminated but should definitely be minimized as much as possible. Due to connecting the student and branch table with the branch name, updating the values will only need to occur in one place. This process is capitalizing on necessary redundancy in as minimal a way as possible.
A primary key does not have to be a column of numbers if the developer intends to remain in First Normal Form. A composite key, utilizing two columns value inputs, instead. In order to move on to Second Normal Form, however, the column of unique primary keys must exist for singular reference.
Second Normal Form:
Minimize Redundancy!
Be in First Normal Form
No Partial Dependencies
Relationships Via Foreign Keys
Many to many relationship
Third Normal Form:
Relocate Functional Dependencies!
If the developer's database undergoes frequent change, a Third Normal Form layout would be desirable. Outside this circumstance, however, TNF can tend to become bogged down with superfluous tables very quickly.
BCNF:
Also referenced as 3 1/2 Normal Form, Boyce-Codd Normal Form was created to address the weaknesses in Second and Third.
Fourth Normal Form:
Rarely practical
Fifth Normal Form:
(Potential) Sixth Normal Form:
Sources:











