Data Dictionary: Lean Is A Reasonable Objective

How much project time is wasted on data-specific misunderstandings? No one really knows, because data misunderstandings and their impact are extremely difficult to track. A data dictionary can certainly help, but it is not a bag of magical pixie dust. Data dictionaries take time to create, need a strategy to get them adopted by teams that are unaccustomed to using them, and require maintenance to stay relevant.

A data dictionary is a table that describes data, and an analyst has quite a bit of freedom in how it is created. The classic representation is a name, description, composition and associated values. One problem I've seen in data dictionaries is a "fill in the template" approach. In one case, someone duplicated whole data sets, changed the names and then submitted the data dictionary as a completed deliverable. It was a classic example of submitting a deliverable being valued more highly than completing a high-quality one.

NAME: Compliance Inspector
DESCRIPTION: One or more individuals who perform compliance inspections, document the results and escalate violations in the appropriate manner.
COMPOSITION: Employee ID, Employee Name, Office Location, Phone, Email and Department ID.
VALUES: Mandatory: Employee ID, Employee Name

In this example, we are describing an entity that will be responsible for entering data into the system for a specific record type, such as a compliance audit. If you are into lean data sets, one thing about the example might jump out at you. Aside from the name and description, nothing about this data element is unique from a data entry standpoint.
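For readers who keep their data dictionary in code rather than a document, the entry above could be sketched as a small record. This is only an illustration: the `DictionaryEntry` class and its `undefined_mandatory` check are hypothetical names, not part of any standard tool, and the check simply confirms that every mandatory attribute is actually listed in the composition.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    """One data dictionary row: name, description, composition, values.

    Hypothetical structure that mirrors the template used in this article.
    """

    name: str
    description: str
    composition: tuple[str, ...]
    mandatory: tuple[str, ...] = ()

    def undefined_mandatory(self) -> list[str]:
        """Return mandatory attributes that are missing from the composition."""
        return [attr for attr in self.mandatory if attr not in self.composition]


compliance_inspector = DictionaryEntry(
    name="Compliance Inspector",
    description=(
        "One or more individuals who perform compliance inspections, "
        "document the results and escalate violations."
    ),
    composition=(
        "Employee ID",
        "Employee Name",
        "Office Location",
        "Phone",
        "Email",
        "Department ID",
    ),
    mandatory=("Employee ID", "Employee Name"),
)

# An empty list means every mandatory attribute is defined in the composition.
print(compliance_inspector.undefined_mandatory())
```

A check like this catches the "fill in the template" failure mode only partly. It verifies that the entry is internally consistent, not that the entry is worth having.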

NAME: Compliance Inspector
DESCRIPTION: One or more individuals who perform compliance inspections, document the results and escalate violations in the appropriate manner.
COMPOSITION: Employee ID, Employee Name, Certification ID, Certification Renewal Date, Office Location, Phone, Email and Department ID.
VALUES: Mandatory: Employee ID, Employee Name

In the first example, we could have dozens of identified entities, but the system wouldn't do anything unique with them. With the addition of the Certification ID and Certification Renewal Date, we do have elements that are unique to this entity. So we could have a General Data Entry entity in the data dictionary that would be used the majority of the time to meet system requirements. All the other similar but unique entities would appear thereafter and support the associated business rules.
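The lean test described here can be sketched as a set comparison: an entity earns its own entry only if its composition adds something to the General Data Entry baseline. The sketch below assumes the baseline has the same composition as the first example, and the "Audit Clerk" entity and both function names are hypothetical, included only to show a redundant entry being flagged.

```python
# Assumed baseline: the composition from the first Compliance Inspector example.
GENERAL_DATA_ENTRY = {
    "Employee ID",
    "Employee Name",
    "Office Location",
    "Phone",
    "Email",
    "Department ID",
}

entities = {
    # Composition from the second example in the article.
    "Compliance Inspector": GENERAL_DATA_ENTRY
    | {"Certification ID", "Certification Renewal Date"},
    # Hypothetical entity that adds nothing beyond the baseline.
    "Audit Clerk": set(GENERAL_DATA_ENTRY),
}


def unique_attributes(composition, base=GENERAL_DATA_ENTRY):
    """Return the attributes an entity has beyond the general baseline."""
    return sorted(set(composition) - set(base))


def redundant_entities(entries, base=GENERAL_DATA_ENTRY):
    """Return names of entities whose composition adds nothing to the baseline."""
    return sorted(
        name
        for name, composition in entries.items()
        if not unique_attributes(composition, base)
    )


print(unique_attributes(entities["Compliance Inspector"]))
print(redundant_entities(entities))
```

Entities flagged as redundant are candidates for folding into General Data Entry, while the rest stay in the dictionary because a business rule depends on their extra attributes.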

© 2016 Dwayne Wright