CDAP-13422

Scheduler shouldn't start program within transaction

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Scheduler

      Description

      In general, we should not assume starting a program is a fast operation. 

      Before 5.0, starting a program is relatively fast (it mostly involves talking to YARN and setting up the launch context), although that is not guaranteed. In 5.0+, however, starting a program involves uploading files to a remote cloud, which could exceed the tx timeout.

      Program start is not a transactional operation, so it can't be rolled back. If the tx times out while a launch is in progress, the launch itself is not undone, and any retry could result in multiple launches of the same program.

      As discussed before, we should generally log the intention inside a tx, commit that tx, execute the task outside of any tx, and then update the state in a separate tx.
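
      A minimal sketch of that pattern, to make the proposal concrete. This is not CDAP code: IntentLogScheduler, its State values, and the launcher callback are all hypothetical names, and an in-memory map with synchronized methods stands in for the transactional store, with each synchronized method modeling one short tx.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch of "log intent, commit, execute, update". Not a CDAP API. */
class IntentLogScheduler {
  enum State { PENDING_LAUNCH, LAUNCHING, RUNNING, FAILED }

  // Stand-in for a transactional job store (assumption for illustration).
  private final Map<String, State> jobs = new HashMap<>();
  // Stand-in for the slow, non-transactional program start.
  private final Consumer<String> launcher;

  IntentLogScheduler(Consumer<String> launcher) {
    this.launcher = launcher;
  }

  synchronized void schedule(String jobId) {
    jobs.putIfAbsent(jobId, State.PENDING_LAUNCH);
  }

  synchronized State stateOf(String jobId) {
    return jobs.get(jobId);
  }

  // "Tx 1": record the intention. Only one caller can move a job
  // from PENDING_LAUNCH to LAUNCHING, so a retry cannot claim it again.
  private synchronized boolean claim(String jobId) {
    if (jobs.get(jobId) != State.PENDING_LAUNCH) {
      return false;
    }
    jobs.put(jobId, State.LAUNCHING);
    return true;
  }

  // "Tx 2": record the outcome of the launch.
  private synchronized void record(String jobId, State state) {
    jobs.put(jobId, state);
  }

  /** Returns true only if this call performed a successful launch. */
  boolean tryLaunch(String jobId) {
    if (!claim(jobId)) {
      return false; // already claimed or finished; do not launch again
    }
    try {
      // The slow start runs outside of any tx, so no tx timeout applies to it.
      launcher.accept(jobId);
      record(jobId, State.RUNNING);
      return true;
    } catch (RuntimeException e) {
      record(jobId, State.FAILED);
      return false;
    }
  }
}
```

      With this shape, a second tryLaunch call for the same job returns false without invoking the launcher, because the intent was already committed. A real implementation would also need to decide how to recover jobs left in LAUNCHING after a crash, which this sketch does not attempt.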

            People

            • Assignee: Ali Anwar (ali.anwar)
            • Reporter: Terence Yim (terence)