CRM-17454 New speed bump (since last LTS) on deduping

    Details

    • Type: Bug
    • Status: Done/Fixed
    • Priority: Minor
    • Resolution: Fixed/Completed
    • Affects Version/s: 4.6.9
    • Fix Version/s: 4.7
    • Component/s: None
    • Documentation Required?:
      None
    • Funding Source:
      Contributed Code

      Description

      On upgrading to 4.6 I discovered a new query that runs slowly in some circumstances.

      The query is

      SELECT
        table_name,
        column_name
      FROM information_schema.key_column_usage
      WHERE
        referenced_table_schema = database() AND
        referenced_table_name = 'civicrm_contact' AND
        referenced_column_name = 'id';

      and it was introduced in https://issues.civicrm.org/jira/browse/CRM-14500 in order to make the selection of which tables to merge dynamic.

      The fastest this query ran for me was locally at 150 ms (202 contacts, file_per_table = OFF) and the slowest was on a dev server with many databases & file_per_table = ON at 1 min 17 sec. Those extremes were from running the query directly; within the context of CiviCRM I experienced times of up to 17 seconds.

      It seems that running this query opens the db file for every database table the mysql user has permission to access. The slowest results were thus obtained running as the user with access to the most databases on the dev server. However, that was not the full picture, as it was still slower against some databases than others within that server, and this didn't seem to correlate directly with database size. Perhaps file fragmentation is a factor?

      Some information about why it opens each table file is available at http://dev.mysql.com/doc/refman/5.5/en/information-schema-optimization.html.
      Limiting the query to the database in use did not alter the speed - which is consistent with that link.

      I played around with setting innodb_stats_on_metadata = 0 and it seemed to shave some time off - locally from 250 ms to 150 ms, and at the upper extreme from 1 min 17 sec to 1 min 5 sec. It was 'fairly' consistent.
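      For reference, this is the kind of server-side tweak meant above. innodb_stats_on_metadata is a dynamic server variable, so it can be changed at runtime (changing the global value requires the SUPER privilege); this is a sketch, not a recommendation for every site:

```sql
-- Stop InnoDB from recalculating index statistics every time
-- information_schema metadata tables are queried:
SET GLOBAL innodb_stats_on_metadata = OFF;

-- Confirm the current value:
SHOW VARIABLES LIKE 'innodb_stats_on_metadata';

-- To persist across restarts, add to my.cnf under [mysqld]:
--   innodb_stats_on_metadata = 0
```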

      https://www.percona.com/blog/2011/12/23/solving-information_schema-slowness/

      My local was the only setup I tested with file_per_table OFF, as setting it to ON is standard for us, so I don't know whether that setting affects the speed.

      The data derived from the query only needs to be updated as often as a system flush is done - i.e. on upgrades, extension installs, module installs & custom group creation - so some sort of caching makes sense.
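      A rough sketch of the caching idea, assuming the CRM_Utils_Cache layer (which is flushed on a system flush); getContactReferenceColumns() is a hypothetical wrapper around the slow information_schema query, not a real core function:

```php
<?php
// Sketch only: cache the key_column_usage result until the next
// system flush, instead of hitting information_schema on every dedupe.
function getCachedContactReferences() {
  $cache = CRM_Utils_Cache::singleton();
  $refs = $cache->get('contact_reference_columns');
  if ($refs === NULL) {
    // getContactReferenceColumns() is hypothetical - it stands in for
    // the slow query against information_schema.key_column_usage.
    $refs = getContactReferenceColumns();
    // Cache entries are cleared on system flush, so this invalidates
    // on upgrades, extension installs and custom group creation.
    $cache->set('contact_reference_columns', $refs);
  }
  return $refs;
}
```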

        Attachments

          Activity

          Eileen McNaughton added a comment -

          Interestingly (alarmingly) I found an article about how to know if accessing your information schema could crash your DB & the query 'ticks the boxes' (in a bad way).

          http://www.pythian.com/blog/how-to-tell-when-using-information_schema-might-crash-your-database/

          "querying the INFORMATION_SCHEMA database on MySQL can completely lock a busy server for a long time. It can even crash it. It is very dangerous."

          This query met the criteria - EXPLAIN shows: Using where; Open_full_table; Scanned all databases
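          The check referred to above can be reproduced by running EXPLAIN on the query; the Extra column shown in the comment below is the output reported in this ticket and may vary by server:

```sql
-- EXPLAIN on the dedupe query reveals the danger markers from the
-- linked article in the Extra column:
EXPLAIN SELECT table_name, column_name
FROM information_schema.key_column_usage
WHERE referenced_table_schema = database()
  AND referenced_table_name = 'civicrm_contact'
  AND referenced_column_name = 'id';
-- Extra: Using where; Open_full_table; Scanned all databases
```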

          It is not possible to restructure this query to be 'safe' according to those criteria.

          I did witness some 'lost connection' messages while testing (which might be the symptom per https://bugs.mysql.com/bug.php?id=20110)

          We do see very occasional corruption of archive tables (log tables). I can't find any direct hits on searching for a connection between the 'unsafe' query & the archive table issue but it's pause for thought.

          Eileen McNaughton added a comment -

          This link suggests turning metadata off affects more than just query performance - also disk use.

          https://www.percona.com/blog/2013/12/03/innodb_stats_on_metadata-slow-queries-information_schema/

          Tim Otten added a comment -

          Agree caching that data would be good.

          Possible alternative: if the goal is to get a list of table references, maybe use CRM_Core_DAO::getReferencesToTable() which produces a similar list. The only difference is that it reads Civi's metadata rather than SQL, so it would miss out on third-party SQL tables.
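          A sketch of how the suggested alternative might be used in place of the information_schema query; the accessor names on the reference objects are assumed from the CRM_Core_Reference_Interface implementations mentioned below:

```php
<?php
// Sketch only: read the list of tables referencing civicrm_contact
// from Civi's own DAO metadata rather than from information_schema.
$refs = CRM_Core_DAO::getReferencesToTable('civicrm_contact');
foreach ($refs as $ref) {
  // Each entry implements CRM_Core_Reference_Interface
  // (e.g. CRM_Core_Reference_Basic, CRM_Core_Reference_Dynamic).
  echo $ref->getReferenceTable() . '.' . $ref->getReferenceKey() . "\n";
}
```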

          Eileen McNaughton added a comment -

          I'm sold on using that - since there appear to be avoidable AND unavoidable risks in what we are seeing.

          Assuming the function you suggest already looks at custom data, then the only thing missing (IMHO) is a hook to allow other tables to be added in.

          NB the inclusion of non-core tables was NOT a driver for the change that introduced this query - but I think it makes sense that it should be possible to include them

          Coleman Watts added a comment -

          +1 I would see non-civicrm tables as a "nice to have" but really not a priority right now.

          Tim Otten added a comment -

          A. IMHO, there's nothing wrong with using a cache with the current query. For a point-release, it's certainly safer/easier.

          B. +1, non-civicrm tables as "nice to have" but not important.

          C. Drilling into getReferencesToTable():

          1. Good call on custom data - I think it's missing. If you add it, there should be some way to differentiate custom-data refs. (Ex: CRM_Core_Reference_Interface has a few implementations – Basic, Dynamic, OptionValue. Maybe add CRM_Core_Reference_CustomData?)
          2. It also reports dynamic FKs (e.g. "civicrm_entity_tag").
          3. If third-party code implements hook_civicrm_entityTypes and provides a DAO (as in CiviVolunteer and SEPA), then it should already detect those tables. But it might also be good to have a more targeted hook.

          D. A seed to plant in the back of the brain... conceptually we might really want CRM_Core_DAO::findReferences() rather than CRM_Core_DAO::getReferencesToTable(). (At a high level, "merge" == "findReferences + pick + replace".) But in terms of replacing that one query, getReferencesToTable() is a closer match.

          Eileen McNaughton added a comment -

          Funnily enough I had concluded that anything but querying the info schema was safer for a point release - but that's because the web articles about DB crashes (from my first comment) put the wind up me.

          Eileen McNaughton added a comment -

          I've pushed up PRs & tests to address this.

          I couldn't find any references to the entity_types hook so have added one - http://wiki.civicrm.org/confluence/display/CRMDOC/hook_civicrm_entityTypes (feel free to fix the formatting).

          Are we thinking we would extend to

          $entityTypes[] = array(
            'name' => 'VolunteerNeed',
            'class' => 'CRM_Volunteer_DAO_Need',
            'table' => 'civicrm_volunteer_need',
            'references' => array('civicrm_contact' => array('contact_id')),
          );

          or some more detailed variant of the above?

          Eileen McNaughton added a comment -

          I'm closing this against 4.7 - but we are also running it against 4.6 on our sites. There is a noticeable improvement for large sites AND for sites where there are many DBs on one server, as the information schema is the bottleneck.

          Eileen McNaughton added a comment -

          Note this is the 4.6 patch https://github.com/eileenmcnaughton/civicrm-core/commit/fe1be0f66a30cd81cecacafce8dea8db1760e0ea

          I'm going to delete the github branch - but the commit doesn't go anywhere, so I'm storing the link in case we want to add it to 4.6.

            People

            • Assignee:
              Eileen McNaughton
            • Reporter:
              Eileen McNaughton
