  1. CDAP
  2. CDAP-6852

Log Saver fails to handle exceptions while checkpointing

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.6
    • Fix Version/s: 3.5.0, 3.3.7, 3.4.4
    • Component/s: Log
    • Labels:
      None
    • Release Notes:
      Fixes issues to make Log Saver more resilient to errors while checkpointing.

      Description

      Log Saver maintains, per partition, a list of currently open files that are synced during checkpointing.
      When syncing a file throws an exception during checkpointing (most often because none of the datanodes hosting the file are reachable), the file is not removed from the open file list. It is removed only when it receives new events. As a result, a single bad file prevents checkpointing for its partition from proceeding, which effectively halts event processing for that partition.
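
      The failure mode described above, and one way to avoid it, can be sketched as follows. This is a minimal illustration only, not the actual CDAP fix: the class CheckpointSketch, the interface OpenLogFile, and the methods track, openFileCount, and checkpoint are all hypothetical names. It also assumes that closing and dropping a file whose sync fails is acceptable; how the real Log Saver recovers the events of such a file is not covered here.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CheckpointSketch {

  /** Hypothetical stand-in for an open log file; not a CDAP type. */
  interface OpenLogFile extends Closeable {
    void sync() throws IOException;
  }

  /** Open files for one partition, keyed by a hypothetical file name. */
  private final Map<String, OpenLogFile> openFiles = new LinkedHashMap<>();

  void track(String name, OpenLogFile file) {
    openFiles.put(name, file);
  }

  int openFileCount() {
    return openFiles.size();
  }

  /**
   * Syncs every open file. A file whose sync fails is closed on a
   * best-effort basis and removed from the open file list, so it
   * cannot block later checkpoints of the same partition.
   *
   * @return names of the files that were dropped because sync failed
   */
  List<String> checkpoint() {
    List<String> dropped = new ArrayList<>();
    Iterator<Map.Entry<String, OpenLogFile>> it = openFiles.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, OpenLogFile> entry = it.next();
      String name = entry.getKey();
      OpenLogFile file = entry.getValue();
      try {
        file.sync();
      } catch (IOException syncFailure) {
        try {
          file.close();
        } catch (IOException ignored) {
          // The file is already considered bad; nothing more to do here.
        }
        it.remove();
        dropped.add(name);
      }
    }
    return dropped;
  }
}
```

      With this shape, one unreachable file costs a single failed sync; the remaining files in the partition are still synced and the checkpoint can proceed.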

            People

            • Assignee:
              Poorna Chandra
            • Reporter:
              Poorna Chandra
