Uploaded image for project: 'CiviCRM'
  1. CiviCRM
  2. CRM-21051

Case-activity history is incomplete

    Details

    • Type: Epic
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.6.0, 4.7.0
    • Fix Version/s: None
    • Component/s: CiviCase
    • Labels:
      None
    • Versioning Impact:
      Patch (backwards-compatible bug fixes)
    • Documentation Required?:
      None
    • Funding Source:
      Needs Funding
    • Verified?:
      No

      Description

       

      Background

      CiviCRM's logging functionality developed in three layers:

      • (Metadata Log; MDL) During early development of CiviCRM, it was determined that tracking basic metadata about changes would be useful. This led to the creation of the civicrm_log table, which tracks the timestamp of the change and the identity of the user who made the change. (This was implemented for a handful of entities in the BAO layer and enabled for all installations.)
      • (Case-Activity Log; CAL) During the development of CiviCase, it was determined that tracking the substantive revisions of activities would be important for case-managers/CiviCase-users. This led to the creation of columns civicrm_activity.is_current_revision and civicrm_activity.original_id. Whenever a case-activity is edited, a new record is added to civicrm_activity, and the old one is disabled. (This was implemented in the Activity BAO and enabled for all CiviCase activities.)
      • (Full Log; FUL) Subsequently, it was determined that tracking the substantive revisions of all CiviCRM records would be important for a range of users. This led to the creation of the "logging" and the various log_civicrm_* tables. (This was implemented with a set of auto-generated SQL triggers. It was disabled by default.)

      Problems

      • At some undetermined point (before or during CiviCRM v4.6), the Case-Activity Log broke-down. Changes made in the CiviCase screen, "Manage Case", do not produce additional records in civicrm_activity. However, changes made through APIv3 do.
      • In the wild, we can expect to find deployments with different histories and data quality, such as:
        • (CAL-C) Databases with complete history in civicrm_activity (eg they've run an old version for their entire lifespan)
        • (FUL-C) Databases with complete history in log_civicrm_activity (eg they've enabled logging for their entire lifespan)
        • (CAL-P) Databases with partial history in civicrm_activity (eg they've upgraded at some point; or, they've truncated logs at some point)
        • (FUL-P) Databases with partial history in log_civicrm_activity (eg they've enabled/disabled logging at various times; or, they've truncated logs at some point)
        • (CAL-N) Databases with no substantive history in civicrm_activity (eg they've only used 4.6+)
        • (FUL-N) Databases with no substantive history in log_civicrm_activity (eg they've always had logging disabled; or, they've truncated logs at some point)
        • Databases with some combination: complete-or-partial-or-no history in civicrm_activity and complete-or-partial-or-no history in log_civicrm_activity
        • Databases which don't use CiviCase and don't care about any of this.
      • In the wild, we can expect to find some mix of downstream reports/code/integrations which either:
        • Reads activities without CAL filters (is_current_revision and original_id). They could break if CAL suddenly started to work.
        • Reads activities with CAL filters (is_current_revision and original_id). They could break if the columns were removed completely.
      • The Case-Activity Log is an architectural outlier. Problems in this layer are less likely to be detected because they only apply to a small part of the overall application. Developing test-coverage for both CAL+FUL would be more work than just FUL.

      General Philosophies

      Based on one's general philosophy and capacity, one might pick-and-choose/allocate tasks differently:

      • Resurrection, KISS-style: Let's do the smallest thing possible to re-enable Case-Activity Log. Downstream will handle cleanup/transition/back-filling on their own.
      • Consolidation, KISS-style: Let's do the smallest thing possible to switch over to Full Log. Downstream will handle cleanup/transition/back-filling on their own.
      • Resurrection, Perfectionist-style: Let's re-enable Case-Activity Log and do as much as we can to help downstream with cleanup/transition/back-filling.
      • Consolidation, Perfectionist-style**: Let's switch over to Full Log and do as much as we can to help downstream with cleanup/transition/back-filling.

       

      Menu of Possible Tasks

      Based on one's general philosophy and capacity, one might pick-and-choose among these tasks:

      • Diagnostic: Examine behavior/history of the codebase
        • (DG-1) Perform a general bisection between CiviCRM 4.2.0 and 4.6.latest to determine the origin of the failure.
        • (DG-2) Verify that these standard FK's reference the original_id (rather than the latest id): civicrm_activity.parent_id civicrm_activity.original_id civicrm_activity_contact.activity_id civicrm_case_activity.activity_id
        • (DG-3) Verify that these dynamic FK's reference the original_id (rather than the latest id): civicrm_entity_file.entity_id civicrm_entity_tag.entity_id civicrm_log.entity_id
        • (DG-4) Verify that these dynamic FK's do not reference activities: civicrm_acl.entity_id civicrm_acl_entity_role.entity_id civicrm_entity_batch.entity_id civicrm_premiums.entity_id civicrm_entity_financial_account.entity_id civicrm_financial_item.entity_id civicrm_campaign_group.entity_id civicrm_tell_friend.entity_id civicrm_pledge_block.entity_id civicrm_note.entity_id civicrm_price_set_entity.entity_id civicrm_discount.entity_id civicrm_uf_join.entity_id civicrm_action_log.entity_id civicrm_mailing_group.entity_id civicrm_pcp_block.entity_id civicrm_pcp_block.target_entity_id civicrm_entity_financial_trxn.entity_id civicrm_line_item.entity_id
        • (DG-5) Verify that bookmarked URLs or emailed URLs reference original_id (rather than the latest id).
      • Sample data: Design or find useful datasets
        • (SD-Synthetic) Construct synthetic data-sets which reproduce the various combinations of {CAL,FUL}-{C,P,N}.
        • (SD-Organic) Collect a small library of real-world datasets. As appropriate, obtain authority to share with a tester (or find an able/responsive tester who already has authority).
      • Runtime Logic: Update the runtime behavior to consistently support one form
        • (RL-Fix) Fix BAO code which putatively fills in CAL data (or fix whatever issue was identified in DG-1)
        • (RL-Remove) Remove BAO code which putatively fills in CAL data.
        • (RL-Option) Change BAO code to allow an opt-out/phase-out of CAL tracking.
        • (RL-Chg-Report) Modify CiviCase UI to pull activity reports from log_civicrm_activity instead of civicrm_activity.
      • Transition: Develop documentation, scripts, APIs, or similar to provide for transitioning data
        • (TR-Backfill-CAL) Backfill civicrm_activity using log_civicrm_activity. This would be useful if you aim to restore CAL and have been using FUL. It has two variants:
          • (Simple) Blindly transferring all records. (Easier to conceive, but it only works well if CAL-N or FUL-N. Other scenarios lead to dupes.)
          • (Dedupe) Attempt to deduplicate activity history that is redundant between the two tables. (Harder, but should work in more cases.)
        • (TR-Backfill-FUL) Backfill log_civicrm_activity using civicrm_activity. This would be useful if you aim to consolidate on FUL while preserving as much data as possible from CAL. It has two variants:
          • (Simple) Blindly transferring all records. (Easier to conceive, but it only works well if CAL-N or FUL-N. Other scenarios lead to dupes.)
          • (Dedupe) Attempt to deduplicate activity history that is redundant between the two tables. (Harder, but should work in more cases.)
        • (TR-Consolidate-OriginalID) Purge/consolidate CAL history. Standardize on original_id.
          • If any foreign-keys reference later IDs, then replace them with original_id.
          • Copy data from latest civicrm_activity into original civicrm_activity. Toggle the is_current_revision.
          • Delete everything except the original activity.
          • Possibly repeat on log_civicrm_activity?
          • (Note: Based on quick testing, you should guard against the possibility of multiple records being flagged as the current revision. Look for the newest.)
        • (TR-Consolidate-LatestID) Purge/consolidate CAL history. Standardize on latest id.
          • If any foreign-keys reference earlier versions of the same original_id, then replace them with the newest ID.
          • On the latest record, change its original_id to match the newest id.
          • Delete everything except the latest activity.
          • Possibly repeat on log_civicrm_activity?
          • (Note: Based on quick testing, you should guard against the possibility of multiple records being flagged as the current revision. Look for the newest.)
      • Testing Given SD-1 and/or SD-2, upgrade some databases and:
        • (TS-1) Run the chosen transition scripts against sample data (like SD-1 or SD-2).
        • (TS-2) Check that historical data is stored and displayed correctly.
        • (TS-3) Check that new data is generated correctly.
      • Communication:
        • (C-1) Figure out what the heck to say about all this

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              timotten Tim Otten
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated: