Designing a database is not a magic trick. The design process does not involve trap doors, mirrors, or unseen wires holding things up. Yet database designers and data architects sometimes fail to think ahead as they compose their designs. Rather than checking each item, the designer trusts that whoever supplied the requirement “knew what they were doing.”
Tables and data elements are thrown into a design because someone requested them or because they appeared on a wish list. The architect assumes that the sourcing and justification for these items will emerge over time, as if by magic. But an item’s inclusion on a list is no guarantee that it makes sense. Due diligence is required for everything, regardless of who made the request.
Database designers need to understand every data element included in their designs. This doesn’t mean they can necessarily present a lecture on the meaning and lineage of every data item, but it does mean they have a basic understanding of each item’s meaning and its functional dependencies, that is, which key or other attributes determine its value. Furthermore, the designer should have a basic understanding of how the item will be sourced and updated. Ideally, these thoughts have all been shared and vetted with the subject matter experts involved in the initiative.
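One way to test an assumed functional dependency before committing it to a design is to check it against sample data. The following is a minimal sketch in Python, not a prescribed method; the function name `fd_violations` and the sample rows (`customer_id`, `postal_code`, `city`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fd_violations(rows, determinant, dependent):
    """Return determinant values that map to more than one dependent value.

    An empty result means the sample data is consistent with the assumed
    dependency (determinant -> dependent). It does not prove the rule holds
    in general; only the subject matter experts can confirm that.
    """
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[column] for column in determinant)
        seen[key].add(row[dependent])
    return {key: values for key, values in seen.items() if len(values) > 1}


# Hypothetical sample rows, invented for this example.
sample_rows = [
    {"customer_id": 1, "postal_code": "30301", "city": "Atlanta"},
    {"customer_id": 2, "postal_code": "30301", "city": "Atlanta"},
    {"customer_id": 3, "postal_code": "60601", "city": "Chicago"},
    {"customer_id": 4, "postal_code": "60601", "city": "Evanston"},
]

# Assumption under test: postal_code determines city.
violations = fd_violations(sample_rows, ["postal_code"], "city")
for key, values in violations.items():
    print(f"{key} maps to several values: {sorted(values)}")
```

A non-empty result is a prompt for a conversation with the subject matter experts, not an automatic verdict: the sample may be dirty, or the assumed dependency may simply be wrong.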
Failing to have a basic understanding of the meaning, dependency, and sourcing of any data element can result in bigger problems later. This doesn’t mean that full, complete, and polished definitions of every object must be finished first (although that would be nice). But it does mean that designers and architects should make detailed notes as they go and ensure they understand exactly how each item will be populated and maintained.
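The notes described above can be kept in a form that makes gaps visible. The sketch below assumes a very simple note structure; the class `ElementNote`, its field names, the function `open_questions`, and the example elements are all hypothetical, not part of any standard tool.

```python
from dataclasses import dataclass, fields


@dataclass
class ElementNote:
    """Working notes for one data element (hypothetical structure)."""
    name: str
    meaning: str = ""        # what the element represents
    depends_on: str = ""     # key or attributes that determine its value
    source: str = ""         # where the value will come from
    update_method: str = ""  # how and when the value is maintained


def open_questions(notes):
    """Map each element name to the note fields that are still blank."""
    report = {}
    for note in notes:
        missing = [
            f.name
            for f in fields(note)
            if f.name != "name" and not getattr(note, f.name).strip()
        ]
        if missing:
            report[note.name] = missing
    return report


# Hypothetical design notes, invented for this example.
design_notes = [
    ElementNote("customer_id", "Surrogate key for a customer",
                "customer_id", "CRM export", "nightly load"),
    ElementNote("loyalty_tier", meaning="Marketing tier label",
                depends_on="customer_id"),
    ElementNote("risk_score"),
]

gaps = open_questions(design_notes)
for element, missing in gaps.items():
    print(f"{element}: still unknown -> {', '.join(missing)}")
```

An element such as the hypothetical `risk_score`, with nothing filled in, is exactly the kind of wish-list item that deserves a question before it goes into the design.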
Any documentation created later, such as data dictionaries, mapping documents, and the like, should simply organize thoughts that are already understood. When these steps are not followed, the resulting problems may be small, such as a column that needs to be removed or moved to another location. Or they may be larger, where items central to the design turn out to have no valid source, or to have meanings drastically different from what was assumed earlier.
These bigger issues can result in claims of scope creep as new sources need to be found, new use cases defined, and significant amounts of code thrown away or altered. Entire schemas may need to be reassembled in radically new ways. When the changes become drastic, one falls into the “Can’t-Get-There-From-Here” syndrome: elements of the database design were never properly understood, and that lack of understanding has come back to haunt everyone. The assumed short trip must be rerouted, and the new route may not yet be understood.
These kinds of problems occur when data modelers become isolated from the ETL work and removed from any understanding of the sources, relying on others to worry about those concerns. The further along the development process is when these surprises are uncovered, the greater the damage to cost and resourcing budgets. It’s best to avoid uncovering surprises late in the game, when new sources or new target data structures that were never considered in scope must suddenly be defined. Avoid finding out that a data item means something totally different from what was assumed.
Avoid finding, in the middle of development, that desired elements of the design cannot be supported at all. Avoid the “Can’t-Get-There-From-Here” syndrome by understanding your design as you build it. Otherwise, as a database designer, you are adding to the chaos, not helping to tame it. Approaching the design process any other way is irresponsible. Paying attention to the details can only improve one’s final designs.