Uploaded image for project: 'CiviCRM'
  1. CiviCRM
  2. CRM-6813

Duplicate criteria fail to take into account multiple duplicate location records

    Details

    • Type: Patch
    • Status: Done/Fixed
    • Priority: Major
    • Resolution: Fixed/Completed
    • Affects Version/s: 3.2.3
    • Fix Version/s: 3.3.alpha
    • Component/s: Core CiviCRM
    • Labels:
      None

      Description

      CiviCRM's Core Duplicate finding code contains a flaw in the way it builds queries to find duplicates.

      In our case if someone has 4 address records in the same zip code and we are using a duplicate criteria looks at postal code with a weight of 10 for the postal code criteria and the threshold of 40 for the complete criteria. Anyone what come in with that zip code will match to the record. I know 4 addresses in the same zip is an extreme example, but it illustrates the point well.

      In this example a new entry on a Contribution page named Jane Doe in zip code 60640 will match with a John Smith who has 4 address records in the 60640 postal code, despite the fact that they don't match on first name, last name or e-mail. Generally all the the UNIONed queries (see below) should have a DISTINCT after the SELECT, but specifically in this case the third SELECT ( SELECT contact_id id, 10 weight FROM civicrm_address WHERE postal_code = '60640 ') in the query returns one record for each address that is in the 60640 zip code. Which for John Smith has four address, it's counting each one of those 4 records as 10 "points" towards the threshold of 40 points to consider something a dupe.

      Here's the query that CiviCRM submits to the DB when an anonymous user submits Jane Doe's record via a Contribution Page.

      SELECT id id, 10 weight FROM civicrm_contact WHERE SUBSTR(first_name, 1, 4) = SUBSTR('Jane', 1, 4) AND first_name IS NOT NULL
      UNION ALL
      SELECT id id, 10 weight FROM civicrm_contact WHERE last_name = 'Doe'
      UNION ALL
      SELECT contact_id id, 10 weight FROM civicrm_address WHERE postal_code = '60640'
      UNION ALL
      SELECT contact_id id, 10 weight FROM civicrm_email WHERE email = 'jdoe@example.com';

      This is the result of the function in the CRM_Dedupe_BAO_RuleGroup::tableQuery() called by CRM_Dedupe_Finder::dupesByParams called by CRM_Contribute_Form_Contribution_Confirm::postProcess() when the Confirm page is submitted.

      Attached is the sql for the civicrm_dupe_rules for our rules.

        Attachments

          Activity

            People

            • Assignee:
              deepak Deepak Srivastava
              Reporter:
              jalama Jim Taylor
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: