Details
Type: Bug
Status: Done/Fixed
Priority: Minor
Resolution: Fixed/Completed
Affects Version/s: 4.7.12
Fix Version/s: 4.7.14
Component/s: CiviCRM Search
Labels:
Documentation Required?: None
Sprint: 4.7.14 Search
Funding Source: Contributed Code
Description
The quickform search results are ordered by 'exactFirst'. This means that if you have wildcard search enabled and your database contains the orgs
The Donald Duck Federation
Donald Duck Fed
Donald Duck Federation
A Donald Duck Fed
and you type in 'Donald Duck Fed', then the results will be ordered as follows:
Donald Duck Fed
A Donald Duck Fed
Donald Duck Federation
The Donald Duck Federation
i.e. the entry that exactly matches what you entered appears first. This feels pretty useful on first_name & last_name searches and marginally useful on 'name' (sort_name) & email searches.
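To make the ordering concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the actual CiviCRM query: the `contact` table, the `quick_search` helper and the `ORDER BY (sort_name = ?) DESC` expression are assumptions chosen only to reproduce the 'exact match first, then alphabetical' behaviour described above.

```python
import sqlite3

# Hypothetical in-memory table standing in for the contact data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (sort_name TEXT)")
conn.executemany(
    "INSERT INTO contact (sort_name) VALUES (?)",
    [
        ("The Donald Duck Federation",),
        ("Donald Duck Fed",),
        ("Donald Duck Federation",),
        ("A Donald Duck Fed",),
    ],
)


def quick_search(conn, term, limit=10):
    """Wildcard search that lists an exact match before the other matches."""
    rows = conn.execute(
        "SELECT sort_name FROM contact "
        "WHERE sort_name LIKE ? "
        # (sort_name = ?) is 1 for an exact match and 0 otherwise,
        # so sorting it DESC puts the exact match first.
        "ORDER BY (sort_name = ?) DESC, sort_name "
        "LIMIT ?",
        (f"%{term}%", term, limit),
    )
    return [row[0] for row in rows]


results = quick_search(conn, "Donald Duck Fed")
for name in results:
    print(name)
```

The printed order matches the list in the description: the exact match, then the remaining matches alphabetically.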
It also seems much more useful when you have entered a long string than when you have entered a short one, yet it costs much more on a short string.
On a large dataset (10 million records), if the search string on the sort field is 'm' the query takes around 8 seconds, but only a quarter of a second if we remove the part that puts exact matches first. The reason is that ALL records that meet the criteria have to be evaluated before the ordering can be applied, so our limit = 10 is not helping us.
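The effect of the limit can be illustrated with a toy model in Python. This is only a sketch and not how the database engine actually executes the query: it assumes the names can be read in already-sorted order (as from an index), and the function names and sample data are made up for the illustration. The plain search can stop as soon as it has 10 matches, while the exact-first search has to collect every match before it can decide which 10 come first.

```python
def plain_search(sorted_names, term, limit=10):
    """Read names in sorted order and stop once `limit` matches are found."""
    examined = 0
    results = []
    for name in sorted_names:
        examined += 1
        if term in name:
            results.append(name)
            if len(results) == limit:
                break  # the limit lets us stop early
    return results, examined


def exact_first_search(sorted_names, term, limit=10):
    """Collect every match, order exact matches first, then apply the limit."""
    examined = 0
    matches = []
    for name in sorted_names:
        examined += 1
        if term in name:
            matches.append(name)
    # The exact-first ordering can only be applied once all matches are known.
    matches.sort(key=lambda name: (name != term, name))
    return matches[:limit], examined


# Made-up data: 10,000 names, half of which contain the letter 'm'.
names = sorted(
    [f"member {i:05d}" for i in range(5000)]
    + [f"org {i:05d}" for i in range(5000)]
)

plain_rows, plain_examined = plain_search(names, "m")
exact_rows, exact_examined = exact_first_search(names, "m")
print(plain_examined, exact_examined)
```

In this model both searches return the same 10 rows for 'm', but the exact-first version reads the whole list to get them.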
The longer the search string, the smaller the difference. The result is often that we just fire off a bunch of slow queries as we type, which hurt the server without helping usability because they don't come back in time. We also found that when someone entered just a '%', the temp table needed to render the results clogged up the server to a significant degree.
There are some options here:
1) don't do the sort by exact search on display_name or email
2) offer a setting not to do the exact search on display_name or email
3) don't do exact search unless the string is at least 4 letters long
4) add an option for the minimum string length to invoke exact search
5) don't do an exact search for display_name or email unless the string is 5+ letters long; use a limit of 2 for the other fields
I'm somewhat inclined to go with 5 because it gives a more performant option that meets most needs without confusing things with more settings. However, a search for 'Smith' on a large dataset is still a bit slow, and I remain unconvinced that the exactMatch on sort_name adds enough value to justify a performance sacrifice. It would be good to hear whether other people can think of value it adds.
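As a rough sketch of what option 5 could look like, the check below decides whether exact-first ordering is worth applying for a given field and search string. Everything here is hypothetical: the function and constant names are made up, 'a limit of 2' is read as a minimum length of 2 characters, and sort_name is treated the same as display_name, which is an assumption rather than something settled in this ticket.

```python
# Assumption: these fields are the expensive ones and need a longer string.
SLOW_FIELDS = {"display_name", "sort_name", "email"}
MIN_LENGTH_SLOW_FIELDS = 5
# Assumption: "a limit of 2" means a minimum string length of 2.
MIN_LENGTH_OTHER_FIELDS = 2


def use_exact_first(field, term):
    """Return True if exact matches should be ordered first for this search."""
    length = len(term.strip())
    if field in SLOW_FIELDS:
        return length >= MIN_LENGTH_SLOW_FIELDS
    return length >= MIN_LENGTH_OTHER_FIELDS


print(use_exact_first("sort_name", "m"))      # False: too short for a slow field
print(use_exact_first("sort_name", "Smith"))  # True: 5 letters
print(use_exact_first("first_name", "Sm"))    # True: 2 letters on another field
print(use_exact_first("email", "%"))          # False: a lone wildcard is too short
```

Under these thresholds the 'm' and '%' cases above skip the exact-first ordering entirely, while 'Smith' on sort_name still triggers it, which is the remaining slow case mentioned above.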