Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 4.7.9
- Fix Version/s: Unscheduled
- Component/s: None
- Labels:
- Versioning Impact: Patch (backwards-compatible bug fixes)
- Documentation Required?: None
- Funding Source: Needs Funding
Description
The following PR introduced a 'limit' on the number of contacts checked for duplicates:
https://github.com/civicrm/civicrm-core/pull/8456
The limit is an option intended to stop very large queries from killing the server.
However, the current interface gives no way to check beyond the specified limit. For example, after checking the first 10 contacts, how do you then check the next 10?
The original UI proposed was:
o Check [ 1000 ] contacts, skipping the first [ 5 ] contacts
o Check the entire database
The 'skipping' field adds an offset. Without it, we would repeatedly check the same first X contacts.
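To illustrate the limit/offset idea, here is a minimal Python sketch. It is not CiviCRM code: the function name `select_contacts` and the in-memory list of contact IDs are assumptions made for illustration, and a real implementation would most likely apply the limit and offset in the database query instead.

```python
from typing import List, Optional


def select_contacts(
    contact_ids: List[int],
    limit: Optional[int] = None,
    offset: int = 0,
) -> List[int]:
    """Return the contacts to check (hypothetical helper).

    limit=None models the 'Check the entire database' option.
    """
    if offset < 0 or (limit is not None and limit < 0):
        raise ValueError("limit and offset must be non-negative")
    end = None if limit is None else offset + limit
    return contact_ids[offset:end]


contacts = list(range(1, 31))  # 30 made-up contact IDs

first_ten = select_contacts(contacts, limit=10)            # IDs 1..10
next_ten = select_contacts(contacts, limit=10, offset=10)  # IDs 11..20
everything = select_contacts(contacts)                     # all 30 IDs
```

With only a limit, every run returns the same first window; the offset is what makes the second call reach contacts 11 to 20.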
As a follow-up, I think this entire approach could use some work. It would be better to anticipate a server-killing query, to check for duplicates across several calls (as a kind of batch process), or to find another way to remove the need for a manually entered setting. I think the process should 'just work': without killing the server, and without any intervention needed from the user.
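The batch idea could look roughly like the Python sketch below. This is only an illustration under assumptions: `batch_windows` is a hypothetical helper, the batch size is an arbitrary value chosen by the code instead of the user, and the sketch says nothing about how each batch would actually be compared against the rest of the database.

```python
from typing import Iterator, Tuple


def batch_windows(total_contacts: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, limit) pairs that together cover every contact once."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for offset in range(0, total_contacts, batch_size):
        # The last batch may be smaller than batch_size.
        yield offset, min(batch_size, total_contacts - offset)


# 25 contacts processed in batches of 10, one call per window.
windows = list(batch_windows(25, 10))
print(windows)  # [(0, 10), (10, 10), (20, 5)]
```

Each pair could drive one smaller dedupe call, so the user never has to enter a limit or offset by hand.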
–
A second suggestion (thanks, Eileen [=) is to remove the '(Duplicate)' label on the dedupe listing and merge screens in favour of more descriptive labels: 'Contact to delete' and 'Contact to keep'. We then need to be careful about labelling the other fields on the dedupe listing screen (e.g. email, address), as they won't necessarily be 'deleted' or 'kept': different types/location values mean that both might be kept if the contacts were batch merged.
–
At the same time as the above, we can tidy up the references in the code: rather than srcID, dstID, mainID, otherID, etc., consistently use 'id_to_keep' or something equally descriptive.