A READER ASKS: Duplicate Outgoing Email Records

From Dwayne Wright - Certified FileMaker 9 Developer
WEB: www.dwaynewright.com
EMAIL: info@dwaynewright.com
TWITTER: dwaynewright

A READER ASKS
I was wondering when you would be available to handle the scripting for me to remove the duplicate email addresses.

I am using FileMaker Pro 9. I have attached the file (sorry if it is a duplicate) so you can confirm whether you can help me and maybe give an estimate of the cost. One last thing: once we have the script in FileMaker, will we be able to run it again if we import again?

-------
DWAYNE RESPONDS
Actually, this was an email that went back and forth a couple of times, and the reader hired me to do one hour's worth of work on their solution. Their database is still in FileMaker 6, so this example file is a different look at the actual implementation we used.

Back in May of 2008, I uploaded an example file called EXAMPLE: Mark Duplicates Calculation. It was a follow-up to a posting from the month before called EXAMPLE: Flag Duplicates Via A Relationship. This example is a blend of those two, plus a little bit more.

Here you can see the multiple predicate relationship I used for flagging the duplicates.

In the example file, you can flip through the records and see flagged duplicates. You can experiment with other duplicate situations and tweak the relationships / calculations accordingly.


On the first pass, I flagged email records that had the same date and the same "to" address. I did this by setting up a multiple predicate relationship, and then I performed a find for the flagged duplicate records. Going through the found records, I saw that there were some false positives. Here is a link for more information about a Multiple Predicate Relationship.
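For readers who think more easily in code than in relationship graphs, here is a rough Python sketch of what that first pass does. It is not the FileMaker implementation from the example file. The field names (id, to_address, sent_date, subject) and the sample records are made up for illustration, and lowercasing the address is my assumption about how the match should behave.

```python
def flag_duplicates(records):
    """Return ids of records that repeat an earlier (to_address, sent_date) pair.

    The first record seen for each pair is treated as the original,
    and every later match is flagged as a suspected duplicate.
    """
    first_seen = {}
    flagged = []
    for rec in records:
        # Assumption: addresses are compared without regard to case or padding.
        key = (rec["to_address"].strip().lower(), rec["sent_date"])
        if key in first_seen:
            flagged.append(rec["id"])
        else:
            first_seen[key] = rec["id"]
    return flagged


# Hypothetical sample data, not taken from the reader's file.
emails = [
    {"id": 1, "to_address": "ann@example.com", "sent_date": "2008-06-02", "subject": "Invoice"},
    {"id": 2, "to_address": "ann@example.com", "sent_date": "2008-06-02", "subject": "Invoice"},
    {"id": 3, "to_address": "ann@example.com", "sent_date": "2008-06-02", "subject": "Meeting notes"},
    {"id": 4, "to_address": "bob@example.com", "sent_date": "2008-06-02", "subject": "Invoice"},
]

flagged_ids = flag_duplicates(emails)
print(flagged_ids)
```

Record 2 is a real duplicate of record 1. Record 3 is flagged too, even though it has a different subject, which is the kind of false positive I ran into when matching on only the date and the "to" address.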

FYI... The term False Positive likely has more implications than I’m aware of. When I searched for this term on Wikipedia, it took me to an article about Type 1 and Type 2 statistical errors. Straight searches turn up all kinds of diverse meanings, and I encourage you to check it out (if you are interested). In my FileMaker experience, it is generally agreed to mean a search that is technically correct but returns a found set that isn’t what was intended.

I followed up my original duplicate detector by adding one more relationship that included the sent date among the predicates used to detect the duplicates. This can still return a false result, because two records can have the same "to" address, subject & date and still both be originals. For this reason, you can add more predicates to make sure you flag duplicates correctly. Each added predicate is one more thing to process, so this will likely slow things down a little bit, but the speed may still be quite adequate for your situation.
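The trade-off between loose and strict matching can be sketched in Python as well. Again, this is only an illustration under my own assumptions and not the relationship setup from the example file. The function name, the field names (including the hypothetical body field) and the sample records are all invented.

```python
def flag_duplicates_on(records, match_fields):
    """Flag records whose values in match_fields repeat those of an earlier record."""
    seen = set()
    flagged = []
    for rec in records:
        # Each field in match_fields plays the role of one predicate.
        key = tuple(rec[field] for field in match_fields)
        if key in seen:
            flagged.append(rec["id"])
        else:
            seen.add(key)
    return flagged


# Hypothetical sample data; "body" is an invented field for this sketch.
emails = [
    {"id": 1, "to_address": "ann@example.com", "sent_date": "2008-06-02",
     "subject": "Invoice", "body": "June invoice attached"},
    {"id": 2, "to_address": "ann@example.com", "sent_date": "2008-06-02",
     "subject": "Invoice", "body": "June invoice attached"},
    {"id": 3, "to_address": "ann@example.com", "sent_date": "2008-06-02",
     "subject": "Meeting notes", "body": "Notes from Monday"},
    {"id": 4, "to_address": "bob@example.com", "sent_date": "2008-06-02",
     "subject": "Invoice", "body": "June invoice attached"},
    {"id": 5, "to_address": "ann@example.com", "sent_date": "2008-06-02",
     "subject": "Invoice", "body": "Corrected invoice attached"},
]

loose = flag_duplicates_on(emails, ["to_address", "sent_date"])
stricter = flag_duplicates_on(emails, ["to_address", "sent_date", "subject"])
strictest = flag_duplicates_on(emails, ["to_address", "sent_date", "subject", "body"])

print(loose, stricter, strictest)
```

Each extra field clears out more false positives. Record 3 drops off once the subject is compared, and record 5 drops off once the body is compared, leaving only record 2 as a true duplicate. The cost is a longer key to build and compare for every record, which mirrors the extra processing each added predicate brings.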

An example file can be downloaded at ...
/blogfiles08/EmailDupDetector.zip

Here are some links to other posts that might be of interest in regards to this topic...
A READER ASKS: Dealing With Duplicates
Working With Unique Or Duplicate Values
-------
For more info about the author and FileMaker in general, contact me at info@dwaynewright.com.

© 2008 - Dwayne Wright - dwaynewright.com

The material in this document is offered AS IS. There is NO REPRESENTATION OR WARRANTY, expressed or implied, nor does any other contributor to this document. WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE EXPRESSLY DISCLAIMED. Consequential and incidental damages are expressly excluded. FileMaker Pro is a registered trademark of FileMaker Inc.
