Warehousing Data Explored (lightly)

From Dwayne Wright - Certified FileMaker Developer
WEB: www.dwaynewright.com
EMAIL: info@dwaynewright.com
TWITTER: dwaynewright

There are two types of database implementations but FileMaker succeeds so well at one, the other is often ignored. I’m speaking of operational database and the more reclusive warehouse databases.

Operational databases are used day in and day out to provide up to the minute reliable interactions with information. FileMaker is one of the best databases in the world for this, particularly in small to average size workgroups. Another form of database is the analytical or warehouse database, these types are often used to store large amounts of historical and static data sets. Warehoused data, unlike operational data, is locked and doesn’t undergo editing by users in any way.

Now I recently picked up a client that switched developers and they had a reporting task for me. The previous developer setup a complex report to show year by year sales data for customers and products. The report had a rather complex and intense setup process where the customer would outline what customers to report on, what products to report on and what four possible years they would want to display on the report.

Basically, it was a tab panel based data entry assistant. You make choices on the first tab and then navigate to the next tab. Based upon your first tab selections, you get a filtered set of options on the next tab. This process continues until you reach the last tab and there you can execute the script that runs the report.

This customers comparison report would use a subsummary by customer name, then by supplier and then by the products sold by a supplier. In the body of the report, would be the product id, product name, year 1 sales totals, year 2 sales totals, year 3 sales totals, year 4 sales totals and a grand total for the four select years. The final result was a classic crosstab report that the client found very useful. They wanted a couple more of these reports that showed data in different combinations and for some otherwise ignored business units.

The developers implementation of this data display was brilliant but flawed. I’ve seen this quite a bit lately in the developer community and have written about this before. The developer used a hunt and gather routine in combination with variable arrays to build the data sets. The reports took forever to run but the coding under the hood is quite remarkable. Because the coding under the hood was also largely undocumented and it used multiple subscripts, tweaking this report for my clients needs was about as flexible as a steel girder.

Sometimes the most simple implementation is the best and sometimes asking the client an extra question or two makes all the difference in the world.

These new clients are so nice and I spent a few hours trying to merge my mind into the programming created by the previous developer. I finally reached the conclusion that I would need to call the client and break some bad news to them. Due to the implementation of the previous developer, these tweaks to the report was going to be very expensive. I thought it was a shame because this could probably be a data warehouse implementation. Then I thought, “you dummy” make one of the new reports a warehouse implementation and show it to them!

I called them up and asked them one quick question, “Do your sales for previous years ever change?”. They said no, in fact, they would like for a method to lock older sales records so they cannot be edited by mistake.

So I created a new file for my data warehouse and then needed to make a first pass at populating it with data. I imported a set of records from the sales order line item table that had every unique combination of a client and a product. I then used that data to form a set of relationships back to the sales order line item table to get aggregate sum totals of each combination. Basically, I wanted to get all the totals of each year combination from 2000 to 2009. Then I created a static field and used the Replace command to put the calculated results into a field that could be indexed.

So I would have a record that had a customer id, a product id and static totals of each years matching sales activity.

The final result was a new tab in the customer file that shows a year by year breakdown of every product they have ever purchased and it all uses non-calculated static data. I have one problem with 2009 data. So I run a script from their server that updates the static 2009 totals and the customer is happy with a one day lag on their information. A customer can do standard searches for the data they want and then perform constrain finds on the found set for just about any combination they can think of and the results are immediate.

So warehousing your report data can be a very big boon for your database projects with large sets of historical data!
More info about the author and FileMaker in general, contact me at info@dwaynewright.com.

© 2009 - Dwayne Wright - dwaynewright.com

The material on this document is offered AS IS. There is NO REPRESENTATION OR WARRANTY, expressed or implied, nor does any other contributor to this document. WARRANTIES OF MERCHANT ABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE EXPRESSLY DISCLAIMED. Consequential and incidental damages are expressly excluded. FileMaker Pro is the registered trademark of FileMaker Inc.