A READER ASKS: Duplicate Outgoing Email Records

© 2010 Dwayne Wright - dwaynewright.com
From Dwayne Wright PMP - Certified FileMaker 10, 9 & 8 Developer
EMAIL: info@dwaynewright.com     TWITTER: dwaynewright

A READER ASKS
I was wondering when you would be available to handle the scripting for me to remove the duplicate email addresses. I am using FileMaker Pro 9. I have attached the file (sorry if it is a duplicate) so you can confirm whether you can help me and maybe give an estimate of the cost. Last thing: once we have the script in FileMaker, will we be able to run it again if we import again?


DWAYNE RESPONDS
Back in May of 2008, I uploaded an example file called EXAMPLE: Mark Duplicates Calculation. It was a follow-up to a posting from the month before called EXAMPLE: Flag Duplicates Via A Relationship. This example is a blend of those two and a little bit more. In the example file, you can flip through the records and see the flagged duplicates. You can experiment with other duplicate situations and tweak the relationships and calculations accordingly.

On the first pass, I flagged email records that had the same date and to address. I did this by setting up a multiple predicate relationship, which only matches records when every field in the relationship matches. Then I could perform a find for the flagged duplicate records. When I went through the found records, I saw some false positives. Here is a link for more information about a Multiple Predicate Relationship.
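For readers who think in code, here is a minimal sketch of the multiple predicate idea, written in Python rather than FileMaker. It is not the logic from the example file. The field names (id, to, date, subject) and the sample records are made up for illustration. A record is flagged when an earlier record matches it on every predicate field.

```python
def flag_duplicates(records, key_fields):
    """Return the ids of records that repeat an earlier record
    on every field listed in key_fields (the predicates)."""
    seen_keys = set()
    flagged = set()
    for rec in records:
        # Build one combined key from all predicate fields.
        key = tuple(rec[name] for name in key_fields)
        if key in seen_keys:
            flagged.add(rec["id"])   # a later copy: flag it
        else:
            seen_keys.add(key)       # first occurrence: treat as the original
    return flagged


# Hypothetical sample data, not taken from the reader's file.
emails = [
    {"id": 1, "to": "ann@example.com", "date": "2010-03-01", "subject": "Invoice"},
    {"id": 2, "to": "ann@example.com", "date": "2010-03-01", "subject": "Invoice"},
    {"id": 3, "to": "ann@example.com", "date": "2010-03-01", "subject": "Welcome"},
    {"id": 4, "to": "bob@example.com", "date": "2010-03-01", "subject": "Invoice"},
]

two_field_flags = flag_duplicates(emails, ["to", "date"])
print(sorted(two_field_flags))
```

With only two predicates, record 3 is flagged even though its subject differs from record 1. That is the kind of false positive I ran into.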

FYI... The term False Positive has more implications than I’m aware of. When I searched for this term on Wikipedia, it took me to an article about Type I and Type II statistical errors. A straight search turns up all kinds of diverse meanings, and I encourage you to check it out (if you are interested). In my FileMaker experience, it is generally taken to mean a find that is technically correct but whose resulting found set isn’t what was intended.

I followed up my original duplicate detector by adding one more relationship that includes the sent date as a predicate. This can still return a false positive, because two records can have the same to address, subject and date and still both be originals. For this reason, you can add more predicates to make sure you flag duplicates correctly. Each added predicate is one more thing to process, so this will likely slow things down a little bit, but it may be quite adequate for your situation.
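The effect of adding predicates can be sketched the same way. Again, this is Python rather than FileMaker, and the field names (to, subject, sent_date, body) and records are assumptions for illustration only. The sketch runs the same check twice, once with three predicates and once with a fourth, so you can see the flagged set shrink.

```python
def flag_duplicates(records, key_fields):
    """Return ids of records that match an earlier record on all key_fields."""
    seen_keys = set()
    flagged = []
    for rec in records:
        key = tuple(rec[name] for name in key_fields)
        if key in seen_keys:
            flagged.append(rec["id"])
        else:
            seen_keys.add(key)
    return flagged


# Hypothetical records: 2 is a true copy of 1, while 3 shares the
# to address, subject and sent date but has different content.
emails = [
    {"id": 1, "to": "ann@example.com", "subject": "Report",
     "sent_date": "2010-03-01", "body": "Q1 numbers"},
    {"id": 2, "to": "ann@example.com", "subject": "Report",
     "sent_date": "2010-03-01", "body": "Q1 numbers"},
    {"id": 3, "to": "ann@example.com", "subject": "Report",
     "sent_date": "2010-03-01", "body": "Q2 numbers"},
]

three_predicates = flag_duplicates(emails, ["to", "subject", "sent_date"])
four_predicates = flag_duplicates(emails, ["to", "subject", "sent_date", "body"])

print("three predicates flag:", three_predicates)
print("four predicates flag:", four_predicates)
```

Three predicates still flag record 3, which is an original. The fourth predicate clears it, at the cost of comparing one more field per record.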

An example file can be downloaded by clicking here.