Details
-
Type: Improvement
-
Status: Done/Fixed
-
Priority: Major
-
Resolution: Fixed/Completed
-
Affects Version/s: 3.1.3
-
Fix Version/s: 3.1.4
-
Component/s: Core CiviCRM
-
Labels:None
Description
When pulling activity information for a certain contact CiviCRM pulls data from civicrm_activity, civicrm_activity_target and civicrm_activity_assignment (core tables).
I noticed this was slow (~4sec on ~50k records in civcirm_activity_target) and investigated.
Checking the my slow query log showed the following SQL on which I examined timings
SELECT SQL_NO_CACHE COUNT(DISTINCT(civicrm_activity.id)) as count
from civicrm_activity
left join civicrm_activity_target at on
civicrm_activity.id = at.activity_id
left join civicrm_activity_assignment aa on
civicrm_activity.id = aa.activity_id
left join civicrm_option_value on
( civicrm_activity.activity_type_id = civicrm_option_value.value )
left join civicrm_option_group on
civicrm_option_group.id = civicrm_option_value.option_group_id
where ( source_contact_id = 8143 OR at.target_contact_id = 8143 OR aa.assignee_contact_id = 8143 )
and civicrm_option_group.name = 'activity_type'
and (civicrm_option_value.component_id IS NULL OR civicrm_option_value.component_id IN (1, 2, 3, 4, 6, 8))
-------
count |
-------
89 |
-------
1 row in set (4.11 sec)
In an attempt to understand it better I rewrote it with left joins as subqueries and got the following timing
SELECT SQL_NO_CACHE COUNT(DISTINCT(civicrm_activity.id)) as count
FROM civicrm_activity
left join civicrm_option_value on
( civicrm_activity.activity_type_id = civicrm_option_value.value )
left join civicrm_option_group on
civicrm_option_group.id = civicrm_option_value.option_group_id
where ( source_contact_id = 8143
OR civicrm_activity.id IN
(SELECT activity_id FROM civicrm_activity_target WHERE target_contact_id = 8143)
OR civicrm_activity.id IN
(SELECT activity_id FROM civicrm_activity_assignment WHERE assignee_contact_id = 8143)
)
AND civicrm_option_group.name = 'activity_type'
AND (civicrm_option_value.component_id IS NULL OR civicrm_option_value.component_id IN (1, 2, 3, 4, 6, 8))
-------
count |
-------
89 |
-------
1 row in set (0.01 sec)
Seems mysql didn't know how to optimize or couldn't use an index.
I examined indexes on civicrm_activity_assignment and civicrm_activity_target I found they both have indexes that contain activity id and contact id but mysql optimizer is not using them. It seems it could not take the ORed condition and bring it back to the join and choose the better index. So I tried to help it with the following
SELECT SQL_NO_CACHE COUNT(DISTINCT(civicrm_activity.id)) as count
from civicrm_activity
left join civicrm_activity_target at on
( civicrm_activity.id = at.activity_id AND at.target_contact_id = 8143)
left join civicrm_activity_assignment aa on
( civicrm_activity.id = aa.activity_id AND aa.assignee_contact_id = 8143)
left join civicrm_option_value on
( civicrm_activity.activity_type_id = civicrm_option_value.value )
left join civicrm_option_group on
civicrm_option_group.id = civicrm_option_value.option_group_id
where ( source_contact_id = 8143 OR at.target_contact_id = 8143 OR aa.assignee_contact_id = 8143 )
and civicrm_option_group.name = 'activity_type'
and (civicrm_option_value.component_id IS NULL OR civicrm_option_value.component_id IN (1, 2, 3, 4, 6, 8));
-------
count |
-------
89 |
-------
1 row in set (0.01 sec)
So, it seems that mysql really needs to have join spelled out to be as narrow as possible.
P.S. Should this be filed under CiviCRM Profile?