Second Normal Form
In simplest terms, for a database to be in Second Normal Form (2NF), the database must already be in 1NF (you must normalize in order), and every column in a table that is not a key has to relate only to the primary key. The most obvious indication that a database is not 2NF is if multiple records in a table might have the exact same value for a column. As an example, if you listed an editor along with each book, this value would be repeated in several rows of the books table.
Looking at the accounting table (Table 3.5), there are a number of problems. For starters, the client information will not necessarily be particular to any one invoice (a client could be billed several times). Second, the expense information is not tied to the invoices either.
To put this database into 2NF, I'll need to separate out these columns into their own tables, where each value will be represented only once. In fact, normalization could be summarized as the process of creating more and more tables until potential redundancies have been eliminated.
To make a database 2NF compliant:
Identify any fields that do not relate directly to the primary key.
As I stated, all of the client information and expense information are not invoice-particular.
Create new tables accordingly (Figure 3.6
The most logical modification for the existing structure is to make separate Clients
, and Expenses
tables. In my visual representation of the database, I create a box for each table, with the table name as a header and all of its columns (also called its attributes
Figure 3.6. To normalize the database, I must move redundant pieces of informationsuch as the client and expense datato their own tables.
Assign or create new primary keys (Figure 3.7
Figure 3.7. Each table in the database should have its own primary key, whether it's a dummy field such as Client ID or a necessary one such as Invoice Number.
Using the techniques described earlier in the chapter, ensure that each new table has a primary key. Because both the Clients
tables do not have good unique identifiers, I'll create artificial ones: Client ID
and Expense ID
. Arguably, the Client Name
field could be unique and therefore could be the primary key, but it's always best to use integers for this purpose.
Repeat Steps 13.
Since I've created new tables with new primary keys, I should double-check to see if there are any 1NF or 2NF problems. In the example (Figure 3.7
), there is one glaring issue: the Expense Category
field may apply to multiple expenses. Therefore, I'll make a new Expense Categories
table (Figure 3.8
Figure 3.8. The Expense Category field, which was part of Expenses, should be its own table as well.
Create the requisite foreign keys indicating the relationships (Figure 3.9
Figure 3.9. For the new primary keys, I've added corresponding foreign keys and indicated the relationships (both one-to-many).
The final step in achieving 2NF compliance is to incorporate foreign keys and relationships to identify how all of the data and tables are associated. Remember that a primary key in one table will most likely be a foreign key in another. If you find that the primary key in one table is not represented as a foreign key in another, you may have missed something (but not necessarily).
Another way to test for 2NF is to look at the relationships between tables. The ideal is to create one-to-many situations. Tables that have a many-to-many relationship may need to be restructured.
If the 1NF rule calls for a horizontal inspection of your tables, then 2NF calls for a vertical inspection: checking for problematic repetitions within a single column over the course of multiple rows.