Uploaded image for project: 'CDAP'
  1. CDAP
  2. CDAP-10941

StreamingSource cannot evaluate macros with checkpointing


    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Pipelines
    • Labels:
    • Rank:


      This is due to limitations in Spark Streaming checkpoints.

      The limitation is due to how Spark implements checkpointing and due to how Spark builds its dag.

      With checkpointing enabled, Spark will serialize all code into its checkpoint. This means that the StreamingSource that gets serialized is the one that has it macros already substituted in. This means that if you run the pipeline once and the macro evaluates to XYZ, it is forever frozen at that value because it is serialized in the checkpoint. If you stop the pipeline, update the runtime arguments so that the macro should evaluate to ABC, then run the pipeline again, Spark will simply pick up the StreamingSource from the checkpoint and macro evaluation will not occur. We work around this in other plugins by wrapping them in classes that dynamically instantiate the plugin and evaluate macros at runtime. However, this does not work for a StreamingSource because of how Spark builds its dag.

      When you create an InputDStream (Spark's class), it adds itself to the dag in the spark context. This means there isn't any way for us to use a dynamic version of the input dstream that performs the trick of using a wrapper class that instantiates the actual class and evaluates macros at runtime. It's possible we might be able to really hack into things and change the spark dag, but it would be complex and very brittle.

      If it can't be fixed, we should check for macros at configure time and fail pipeline creation.




            • Assignee:
              ashau Albert Shau
              ashau Albert Shau
            • Votes:
              0 Vote for this issue
              2 Start watching this issue


              • Created: