Details
-
Type: Improvement
-
Status: Done/Fixed
-
Priority: Major
-
Resolution: Won't Fix
-
Affects Version/s: 1.7
-
Fix Version/s: 2.1
-
Component/s: Core CiviCRM
-
Labels:None
Description
Here are some real world queries from a 1.7 installation with ~50k contacts. I have a few suggestions that may speed things up. In the included explains.txt do a search for "filesort" or "temporary" and you'll find several queries that can be improved.
Several relate to civicrm_custom_group which, though small, is queried several times per page load with only slightly different WHERE clauses. We might want to take a look at caching this table into an array, or the like to decrease the number of times this table is queried. Or just some simple indexes might help. Though I could be out to lunch here. Maybe the table is just so small that adding indexes would slow things down.
Here's another particularly bad one:
SELECT DISTINCT(contact_a.id) as contact_id, contact_a.contact_type as contact_type, contact_a.contact_sub_type as `contact_sub_type`, contact_a.sort_name as sort_name, contact_a.display_name as display_name, civicrm_individual.id as individual_id, civicrm_individual.first_name as `first_name`, civicrm_individual.middle_name as `middle_name`, civicrm_individual.last_name as `last_name`, civicrm_individual.birth_date as `birth_date`, gender.value as gender_id, gender.label as gender, civicrm_location.id as location_id, civicrm_address.id as address_id, civicrm_address.street_address as `street_address`, civicrm_address.supplemental_address_1 as `supplemental_address_1`, civicrm_address.supplemental_address_2 as `supplemental_address_2`, civicrm_address.city as `city`, civicrm_address.postal_code_suffix as `postal_code_suffix`, civicrm_address.postal_code as `postal_code`, civicrm_address.geo_code_1 as `geo_code_1`, civicrm_address.geo_code_2 as `geo_code_2`, civicrm_state_province.id as state_province_id, civicrm_state_province.abbreviation as `state_province`, civicrm_state_province.name as state_province_name, civicrm_country.id as country_id, civicrm_country.name as `country`, civicrm_phone.id as phone_id, civicrm_phone.phone_type as phone_type, civicrm_phone.phone as `phone`, civicrm_email.id as email_id, civicrm_email.email as `email`, civicrm_email.on_hold as `on_hold`, civicrm_im.id as im_id, civicrm_im.name as `im`, contact_a.nick_name as `nick_name`, contact_a.legal_identifier as `legal_identifier`, contact_a.external_identifier as `external_identifier`, contact_a.home_URL as `home_URL`, contact_a.preferred_mail_format as `preferred_mail_format` FROM civicrm_contact contact_a LEFT JOIN civicrm_individual ON (contact_a.id = civicrm_individual.contact_id) LEFT JOIN civicrm_location ON (civicrm_location.entity_table = 'civicrm_contact' AND contact_a.id = civicrm_location.entity_id AND civicrm_location.is_primary = 1) LEFT JOIN civicrm_address ON civicrm_location.id = civicrm_address.location_id LEFT JOIN civicrm_state_province ON civicrm_address.state_province_id = civicrm_state_province.id LEFT JOIN civicrm_country ON civicrm_address.country_id = civicrm_country.id LEFT JOIN civicrm_email ON (civicrm_location.id = civicrm_email.location_id AND civicrm_email.is_primary = 1) LEFT JOIN civicrm_phone ON (civicrm_location.id = civicrm_phone.location_id AND civicrm_phone.is_primary = 1) LEFT JOIN civicrm_im ON (civicrm_location.id = civicrm_im.location_id AND civicrm_im.is_primary = 1) LEFT JOIN civicrm_option_group option_group_gender ON (option_group_gender.name = 'gender') LEFT JOIN civicrm_option_value gender ON (civicrm_individual.gender_id = gender.value AND option_group_gender.id = gender.option_group_id) WHERE ( ( ( LOWER(contact_a.sort_name) LIKE 'XXXX%' ) ) ) AND ( 1 ) ORDER BY sort_name asc LIMIT 0, 50;
It does a Using where; Using temporary; Using filesort on the contact table. Ouch. Could we get rid of the DISTINCT? should we be storing a lowercase version of sort_name in a new column? Should we index sort_name?
Count: 10 Time=5.70s (57s) Lock=0.00s (0s) Rows=1.1 (11)
SELECT civicrm_contact.id as contact_id,
civicrm_contact.domain_id as domain_id,
civicrm_contact.hash as hash,
civicrm_contact.contact_type as contact_type,
civicrm_contact.contact_sub_type as contact_sub_type
FROM civicrm_contact
LEFT JOIN civicrm_location ON ( civicrm_location.entity_table = 'S' AND
civicrm_contact.id = civicrm_location.entity_id AND
civicrm_location.is_primary = N )
LEFT JOIN civicrm_email ON ( civicrm_location.id = civicrm_email.location_id AND civicrm_email.is_primary = N )
WHERE civicrm_email.email = 'S' AND civicrm_contact.domain_id = N
Hmmm, it's not returning many rows, why is it taking so long? Should we index all is_primary fields?
Count: 1 Time=2.00s (2s) Lock=0.00s (0s) Rows=1.0 (1), XXXXXXXX[XXXXXXXX]@localhost
SELECT civicrm_email.email as email
FROM civicrm_contact
LEFT JOIN civicrm_location ON ( civicrm_location.entity_table = 'S' AND
civicrm_contact.id = civicrm_location.entity_id AND
civicrm_location.is_primary = N )
LEFT JOIN civicrm_email ON ( civicrm_location.id = civicrm_email.location_id AND
civicrm_email.is_primary = N )
WHERE civicrm_contact.id = N
same deal. Should we try to index is_primary?
I think these are the most important ones (next to the "SELECT * FROM tbl_name;" that I reported in another issue).