Newsletters




Putting Fun in Functional Dependency


Everyone knows and loves the first three normal forms. We go through the process of normalization to remove redundancies in our data structures. But the redundancies we remove have nothing to do with trying to save space. Instead, the desire is to prevent maintenance anomalies. The normalization process involves stepping through the evaluation of normal forms that decompose a data structure into multiple structures based on the needs of these normal forms.

As a quick reminder, First Normal Form (1NF) removes repeating groups; Second Normal Form (2NF) removes partial functional dependencies; and Third Normal Form (3NF) removes transitive dependencies. The resulting structures reflect the semantics of the business; data items are grouped based on having a single value related to each business object; and each business object is represented by a primary key representing its unique identity. The resultant functional dependency-driven structures, in their own way, describe the essentials of the business area being data modeled.

After 3NF, there are two extra normal forms that claim to be tightening things up on 3NF. In other words, if 2NF and 3NF had been thought through properly (and worded correctly), these extra normal forms would never have been necessary. The issue at hand for these special normal forms revolves around composite keys, i.e., composite primary keys and composite alternate keys, which are essentially composite candidate keys.

With 2NF, we have verified that all non-key attributes are functionally dependent on the full primary key of the entity. And in 3NF, we have checked that all non-key attributes are also not dependent on other non-key attributes. But what we have not yet properly considered is the possible functional dependencies between individual attributes of a composite candidate key and attributes of a differing candidate key, or even a non-key attribute. And that is where we now look.

To address these possible anomaly-causing issues are Elementary key normal form (EKNF) and Boyce Codd Normal Form (BCNF). However, the specifics of which normal form addresses what are hard to track down.

Websites flip-flop on their description of which is which. Let’s say the EKNF deals with the case of having two composite candidate keys with at least one overlapping attribute. A classic example is having a course offering entity with two candidate keys: Session Number, Course Number is one candidate key, and Session Number, Course Name is the other.

If there is a functional dependency in which Course Number allows us to determine Course Name, this was not checked and found during our interrogation for 2NF because Course Name and Course Number are both parts of candidate keys, not a non-key attribute. Yes, that’s a data modeling “gotcha” question, because 2NF is only considering non-key attributes, hence, checking candidate key attributes is the blind spot that 2NF missed.

And exactly as we would in 2NF, the EKNF solution is to place Course Number and Course Name into a separate decomposed structure, leaving just Session Number and Course Number in the original structure.

Now for BCNF, we have a slightly different circumstance. This is because here, an attribute within a composite candidate key has a functional dependency on a non-key attribute. Again, since the attribute is part of a candidate key, the transitive dependency was never looked at in the previous forms. However, much of the existing documentation is confusing, as descriptions for these two normal forms are frequently swapped. Additionally, BCNF is often the only one mentioned and assumed to cover both circumstances.

These confusing circumstances make it hard for novice data modelers to feel like they are on solid ground. The good news for everyone is that if one does not have composite keys, one need never worry about these extra normal forms or 2NF, for what it’s worth. That’s data modeling life simplified.


Sponsors