CRM-1569 Optimize CiviContribute main page query

    Details

    • Type: Improvement
    • Status: Done/Fixed
    • Priority: Major
    • Resolution: Fixed/Completed
    • Affects Version/s: 1.6
    • Fix Version/s: Unscheduled
    • Component/s: CiviContribute
    • Labels:
      None

      Description

      Because of the way the query is written, the new receive_date index is ignored. If this query isn't optimized, loading CiviContribute could time out.

      Here's the current query:
      SELECT DISTINCT(contact_a.id) as contact_id, contact_a.contact_type as contact_type, contact_a.sort_name as sort_name, contact_a.display_name as display_name, civicrm_contribution.id as contribution_id, civicrm_contribution.receive_date as `receive_date`, civicrm_contribution.non_deductible_amount as `non_deductible_amount`, civicrm_contribution.total_amount as `total_amount`, civicrm_contribution.fee_amount as `fee_amount`, civicrm_contribution.net_amount as `net_amount`, civicrm_contribution.trxn_id as `trxn_id`, civicrm_contribution.invoice_id as `invoice_id`, civicrm_contribution.currency as `currency`, civicrm_contribution.cancel_date as `cancel_date`, civicrm_contribution.cancel_reason as `cancel_reason`, civicrm_contribution.receipt_date as `receipt_date`, civicrm_contribution.thankyou_date as `thankyou_date`, civicrm_contribution.source as `contribution_source`, civicrm_contribution.is_test as `is_test`, civicrm_contribution.contribution_status_id as `contribution_status_id`, civicrm_contribution_type.id as contribution_type_id, civicrm_contribution_type.name as contribution_type, civicrm_contribution_type.accounting_code as `accounting_code`, civicrm_product.id as product_id, civicrm_product.name as `product_name`, civicrm_product.sku as `sku`, civicrm_contribution_product.id as contribution_product_id, civicrm_contribution_product.product_option as `product_option`, civicrm_contribution_product.fulfilled_date as `fulfilled_date`, civicrm_contribution_product.start_date as `contribution_start_date`, civicrm_contribution_product.end_date as `contribution_end_date`, contrib_status.id as status_id, contrib_status.name as `contrib_status`, civicrm_contribution.contribution_recur_id as `contribution_recur_id` FROM civicrm_contact contact_a LEFT JOIN civicrm_contribution ON civicrm_contribution.contact_id = contact_a.id INNER JOIN civicrm_contribution_type ON civicrm_contribution.contribution_type_id = civicrm_contribution_type.id LEFT JOIN civicrm_contribution_product ON civicrm_contribution_product.contribution_id = civicrm_contribution.id LEFT JOIN civicrm_product ON civicrm_contribution_product.product_id =civicrm_product.id LEFT JOIN civicrm_option_group option_group_contrib_status ON (option_group_contrib_status.name = 'contribution_status') LEFT JOIN civicrm_option_value contrib_status ON (civicrm_contribution.contribution_status_id = contrib_status.value AND option_group_contrib_status.id = contrib_status.option_group_id ) WHERE ( LOWER( civicrm_contribution.is_test ) = '0' ) AND ( 1 ) ORDER BY receive_date desc LIMIT 0, 10

        Attachments

          Activity

          [CRM-1569] Optimize CiviContribute main page query
          Donald A. Lobo added a comment -

          any thoughts or suggestions on how to optimize this query? In general LEFT JOIN's are incredibly bad especially for large data sets. Ideally if the count for the above is large (> 100?), a optimization would be to get the contact/contribution id's for the first 10 (or your offset/limit) and then include that in the where clause (i.e. where contact_id IN (XX,YY,ZZ....). I've seen this on a super large data set ( > 2 million records) and am quite convinced that splitting the query is probably the way to go

          i also suspect there is a bug in the above, and the DISTINCT should not be there (it slows down the query, and will return only one contrib per contact!)

          David Strauss added a comment -

          Refactor the query to pull recent contributions from the civicrm_contribution table. Then, run a join on the other tables you need information from. If those joins are too slow, then just don't load the information in the initial query. Run a query to get just the batch of data you need, or run a query for each row in the visible results.

          David Strauss added a comment -

          Yes, the DISTINCT appears to be a bug.

          David Strauss added a comment -

          Because this query appears to be buggy, I'd appreciate pushing the fix to 1.6.

          Donald A. Lobo added a comment -

          just checked and the DISTINCT does not appear to make a difference. so if a contact has made 2 contribs it does appear (at this level, my sql knowledge gets a bit shaky). I need to check why we have it there. will do so later today

          I'd like to push this issue to 1.8 and combine it with the general rearchitecting and optimization of queries (and the left join mess). its a pretty big project and i'm reluctant to add such a big chunk so late in the game for 1.7

          David Greenberg added a comment -

          Deferring search refactoring and related query optimization issues to 2.1.

          Nicholai Burton added a comment -

          My first attempt at helping, please be gentle. David's suggestion best way to optimize the query and the easiest way to (eventually) allow the end user to define the limit without requiring another optimization. My own db is small so the performance difference is negligible for me, but the following will improve performance significantly. I used a temp table because the LIMIT feature isn't supported for subqueries. I also don't see the purpose of DISTINCT contact id in this query, but left it there in case I'm missing some context. I don't know PHP so I'm assuming you can run multiple queries in one go.

          CREATE TEMPORARY TABLE recentcontributions(id int) ENGINE=MEMORY;

          INSERT INTO recentcontributions(id)
          select id from civicrm_contribution
          WHERE ( LOWER( civicrm_contribution.is_test ) = '0' ) AND ( 1 )
          ORDER BY receive_date desc
          LIMIT 0, 10;

          SELECT
          DISTINCT(contact_a.id) as contact_id,
          contact_a.contact_type as contact_type,
          contact_a.sort_name as sort_name,
          contact_a.display_name as display_name,
          civicrm_contribution.id as contribution_id,
          civicrm_contribution.receive_date as `receive_date`,
          civicrm_contribution.non_deductible_amount as `non_deductible_amount`,
          civicrm_contribution.total_amount as `total_amount`,
          civicrm_contribution.fee_amount as `fee_amount`,
          civicrm_contribution.net_amount as `net_amount`,
          civicrm_contribution.trxn_id as `trxn_id`,
          civicrm_contribution.invoice_id as `invoice_id`,
          civicrm_contribution.currency as `currency`,
          civicrm_contribution.cancel_date as `cancel_date`,
          civicrm_contribution.cancel_reason as `cancel_reason`,
          civicrm_contribution.receipt_date as `receipt_date`,
          civicrm_contribution.thankyou_date as `thankyou_date`,
          civicrm_contribution.source as `contribution_source`,
          civicrm_contribution.is_test as `is_test`,
          civicrm_contribution.contribution_status_id as `contribution_status_id`,
          civicrm_contribution_type.id as contribution_type_id,
          civicrm_contribution_type.name as contribution_type,
          civicrm_contribution_type.accounting_code as `accounting_code`,
          civicrm_product.id as product_id,
          civicrm_product.name as `product_name`,
          civicrm_product.sku as `sku`,
          civicrm_contribution_product.id as contribution_product_id,
          civicrm_contribution_product.product_option as `product_option`,
          civicrm_contribution_product.fulfilled_date as `fulfilled_date`,
          civicrm_contribution_product.start_date as `contribution_start_date`,
          civicrm_contribution_product.end_date as `contribution_end_date`,
          contrib_status.id as status_id, contrib_status.name as `contrib_status`,
          civicrm_contribution.contribution_recur_id as `contribution_recur_id`
          FROM civicrm_contact contact_a
          LEFT JOIN civicrm_contribution ON civicrm_contribution.contact_id = contact_a.id
          INNER JOIN civicrm_contribution_type ON civicrm_contribution.contribution_type_id = civicrm_contribution_type.id
          LEFT JOIN civicrm_contribution_product ON civicrm_contribution_product.contribution_id = civicrm_contribution.id
          LEFT JOIN civicrm_product ON civicrm_contribution_product.product_id =civicrm_product.id
          LEFT JOIN civicrm_option_group option_group_contrib_status ON (option_group_contrib_status.name = 'contribution_status')
          LEFT JOIN civicrm_option_value contrib_status ON (civicrm_contribution.contribution_status_id = contrib_status.value AND option_group_contrib_status.id = contrib_status.option_group_id )
          WHERE civicrm_contribution.id in
          (select id from recentcontributions)
          ORDER BY receive_date desc;

          DROP TEMPORARY TABLE recentcontributions;

          Dharmatech added a comment -

          Closing as of 4.0.7, European Code Sprint 12/2011. Issue has been tested and verified as fixed, non-issue or edge case.

            People

            • Assignee:
              Donald A. Lobo
              Reporter:
              David Strauss

              Dates

              • Created:
                Updated:
                Resolved: