Cleaning Data For Reports, Imports and Exports

From Dwayne Wright - Certified FileMaker 9 Developer
www.dwaynewright.com
info@dwaynewright.com
TWITTER: dwaynewright

Cleaning data is a term often given to the process of editing data so that it conforms with sets of other data. There are times where you might have to take a good hard look at your data. You see inconstant data entry or places where critical data is needed but it is missing. A few examples I’ve encountered that prompted cleaning data ...

1. You see something wrong in a report like duplicates or extremely high/low values that just do not look correct.

2. Looking at data you imported and seeing that it doesn’t fit snug with your database (like data migration). For example, a lot of last name fields seem to be empty or you see weird characters on some records.

3. Exporting data to another source and getting reports of inconsistencies.

4. Archiving data away for storage or warehousing and you know the data should be cleaned up before hand.

5. Migrating from one major version of FileMaker to another, like upgrading from FileMaker 6 to FileMaker 9. This is a good opportunity to look at the way your data is organized and perhaps make some changes that make more sense.

ABOUT DATA ENTRY PROBLEMS
A classic example is when someone decides it is easier to put the entire person name in the first name field (instead of first name in the first name field and last name in the last name field). This data can be cleaned up fairly easily by targeted use of the Replace command. To help stem the problem from arising again, look to add validation to key fields to make sure the data is entered properly.

Another data entry problem is typos. For example a payment term on an invoice might be Net 30 but the user typed in Net30. In the report, this will show up as two different values but the information behind them is the same. In cases like this, try to use value lists for fields for a more consistent data entry result.

HOW PROBLEMS SHOW UP IN REPORTS
Many times, reports that have sub-summary parts will show data problems. The example of Net 30 vs Net30 discussed above would show up prominently in a sub-summary report using that field. I would say that many times, creating a new report will be your primary spotlight for uncovering underlying data entry problems.

ABOUT IMPORTED DATA SOURCE PROBLEMS
I had a virtual one on one support session the other day and my client brought up the topic of importing data from multiple sources. He said that he was bringing data in from a dozen or so resources and almost none of it was consistent. He asked what do I do in cases like this?

CREATING A DATA WASHTUB
When I get into a situation where I need to scrub a lot of data, I create a file just for that. For example, I did a large migration of a customer that was using NetSuite into a FileMaker based CRM package. I created a file to import all the data into from the NetSuite exports. I then linked that washtub file to my main CRM system and began cleaning the data one table at a time. Every time I did a transformation of the data, I would write a script for it. Before long, I had a large collection of data cleanup scripts. I would then write scripts to load all the data into the FileMaker CRM solution.

Then when it came to do the big migration, I would delete all the data from my CRM system and my washtub file. I would import all the data in again from the source (in this case NetSuite) and run all my data cleaning scripts. After that is done, I would run all my data migration scripts to load the data into my FileMaker CRM solution. Typically, I would do a few dry runs before the final migration. This can take a lot of time but well worth the end result of having current client data cleaned and neatly integrated into their new FileMaker solution.

Here are some links to other posts that might be of interest in regards to this topic...
Introducing Field Data Validation
EXAMPLE: Phone Number Example
Deleting Related Records On The Fly
Date Fix Calculation
Potential Issue: Imported Data And Validation
Help In FileMaker 6 Conversion
Finding Duplicates Not Working
A Before You Export To Do List
Creating A Sub-Summary Report For InBizness SOHO
=
More info about the author and FileMaker in general, contact me at info@dwaynewright.com.

© 2008 - Dwayne Wright - dwaynewright.com

The material on this document is offered AS IS. There is NO REPRESENTATION OR WARRANTY, expressed or implied, nor does any other contributor to this document. WARRANTIES OF MERCHANT ABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE EXPRESSLY DISCLAIMED. Consequential and incidental damages are expressly excluded. FileMaker Pro is the registered trademark of FileMaker Inc.
====================== ADVERTISEMENT ==============================
To check out the online FileMaker Crosswords, please visit /crossword.html
===================================================================