CDAP-1423

System wide intent log for robust operation

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: App Fabric
    • Labels:
      None

      Description

      CDAP is a distributed system, which must often handle operations that span system boundaries. Even with the support of Tephra for HBase transactions, this can still create situations where consistency between systems is difficult to ensure. For example, registering a new partition for a file dataset requires 3 steps:

      1. create the new files / partition directory
      2. record the partition in CDAP metadata
      3. register the new partition in Hive

      If the overall transaction fails to commit, we can roll back step 2, but the partition may remain registered in Hive.

      In this situation, we could use an intent log to ensure that, even when failures occur, these distributed operations still reach a consistent state. In the case above, the intent log would record the start of the "add partition" operation. Once the operation completes successfully, the log would record success. If a failure occurs during the operation, the intent log can be replayed, allowing the operation to be either reapplied or undone, bringing the system back to a consistent state.
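
      A minimal sketch of the idea, in Python for brevity. This is not CDAP code: IntentLog, Systems, add_partition, recover, the STARTED / COMPLETED / UNDONE states, and the partition names are all hypothetical, and the three systems are modeled as in-memory sets. It chooses to undo incomplete operations on replay; reapplying them would be the alternative.

```python
from dataclasses import dataclass, field


@dataclass
class Systems:
    """Hypothetical stand-ins for the three systems touched by 'add partition'."""
    files: set = field(default_factory=set)     # step 1: partition directories
    metadata: set = field(default_factory=set)  # step 2: CDAP metadata
    hive: set = field(default_factory=set)      # step 3: Hive partitions


class IntentLog:
    """Append-only log of intents. In memory here; a real log must be durable."""

    def __init__(self):
        self.records = []

    def begin(self, op_id, op_type, partition):
        self.records.append((op_id, "STARTED", op_type, partition))

    def finish(self, op_id, outcome):
        self.records.append((op_id, outcome, None, None))

    def pending(self):
        """Return STARTED records that have no later outcome record."""
        finished = {r[0] for r in self.records if r[1] != "STARTED"}
        return [r for r in self.records
                if r[1] == "STARTED" and r[0] not in finished]


def add_partition(log, systems, op_id, partition, crash_after=None):
    """Run the three steps; crash_after simulates a failure after that step."""
    log.begin(op_id, "add_partition", partition)
    steps = [systems.files, systems.metadata, systems.hive]
    for number, target in enumerate(steps, start=1):
        target.add(partition)
        if crash_after == number:
            raise RuntimeError(f"simulated crash after step {number}")
    log.finish(op_id, "COMPLETED")


def recover(log, systems):
    """Replay the log and undo every operation that never completed."""
    for op_id, _, op_type, partition in log.pending():
        if op_type == "add_partition":
            # set.discard is idempotent, so recovery itself can be retried.
            systems.hive.discard(partition)
            systems.metadata.discard(partition)
            systems.files.discard(partition)
        log.finish(op_id, "UNDONE")


log, systems = IntentLog(), Systems()
add_partition(log, systems, "op-1", "p1")
try:
    add_partition(log, systems, "op-2", "p2", crash_after=3)
except RuntimeError:
    pass  # "p2" is now registered in all three systems, but never completed
recover(log, systems)
```

      After recover runs, only "p1" remains in the three sets and the log has no pending operations. The undo steps are written to be idempotent because recovery may itself be interrupted and replayed.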

      Note that this does nothing to guarantee atomicity of these distributed operations (which I don't think we can easily ensure across disparate systems). But it does make the overall system much more resilient by ensuring that we eventually return to a consistent state.

      Examples of similar functionality include FATE in Accumulo and the distributed procedure mechanism in HBase.


              People

              • Assignee:
                terence Terence Yim
                Reporter:
                gary Gary Helmling
