  1. CDAP
  2. CDAP-10228

Spark Streaming pipeline creates empty partitions every window interval duration

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.0
    • Fix Version/s: 4.3.0
    • Component/s: Pipelines
    • Labels:
      None
    • Release Notes:
      Sinks in streaming pipelines no longer have their prepareRun and onFinish methods called if the RDD for that batch is empty.

      Description

      I have attached a pipeline config for a Spark Streaming real-time pipeline that uses a window interval of 1 minute. When no new data is ingested, an empty partition is still created in the TPFS sink every window interval.
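
      The fix described in the release notes (skip the sink's prepareRun and onFinish when the batch is empty) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual CDAP change: the `BatchSink` interface, `RecordingSink`, and `EmptyBatchGuard.processBatch` are hypothetical names, and a plain `List<String>` stands in for the RDD of one window. In a real Spark job the equivalent check would presumably be made on the RDD itself (for example with `JavaRDD.isEmpty()`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a streaming pipeline sink; not the CDAP interface.
interface BatchSink {
    void prepareRun();
    void write(String record);
    void onFinish();
}

// Records every lifecycle call so the behaviour can be inspected afterwards.
class RecordingSink implements BatchSink {
    final List<String> events = new ArrayList<>();

    @Override public void prepareRun() { events.add("prepareRun"); }
    @Override public void write(String record) { events.add("write:" + record); }
    @Override public void onFinish() { events.add("onFinish"); }
}

class EmptyBatchGuard {
    // Runs the sink lifecycle for one window of data.
    // Returns false, without touching the sink, when the batch is empty,
    // so no output (such as an empty partition) is prepared for that window.
    static boolean processBatch(List<String> batch, BatchSink sink) {
        if (batch.isEmpty()) {
            return false;
        }
        sink.prepareRun();
        try {
            for (String record : batch) {
                sink.write(record);
            }
        } finally {
            // Always pair prepareRun with onFinish, even if a write fails.
            sink.onFinish();
        }
        return true;
    }

    public static void main(String[] args) {
        RecordingSink sink = new RecordingSink();
        processBatch(Collections.emptyList(), sink);  // empty window: skipped
        processBatch(Arrays.asList("a", "b"), sink);  // window with data
        System.out.println(sink.events);
    }
}
```

      With the guard in place, only the window that contains records triggers the sink lifecycle, so the run prints `[prepareRun, write:a, write:b, onFinish]`.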


            People

            • Assignee: Albert Shau (ashau)
            • Reporter: Gokul Gunasekaran (gokul)
