  1. CDAP
  2. CDAP-10663

Text source and sink improvements

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Applications, Pipelines
    • Labels:
      None

      Description

      There are Avro and Parquet file sources and sinks in Hydrator. They have a few limitations:

      1. Sources read based on an offset and a duration. This works only if the input is always available when the run starts, which is not always true. It would be better for them to use the batch partition consumer to read new partitions.

      2. The input/output format is not configurable, which means you need to write a plugin to support any other format. We should find a way to make this configurable, so that using text, ORC, or some other format is just a config change. The challenge lies in converting between each format's native representation and a record.
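
A rough sketch of how the two ideas above could fit together, written in Python for brevity. Nothing here is CDAP or Hydrator API: `PartitionConsumer`, `FORMATS`, `run_source`, and the partition keys are hypothetical names for illustration. The consumer remembers which partitions it has already read instead of relying on an offset and duration, and the format is looked up by a config string rather than hard-coded in the plugin.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List, Set, Tuple

Record = Dict[str, str]


def decode_text(data: str) -> List[Record]:
    # Text format: one record per line, stored under a single "body" field.
    return [{"body": line} for line in data.splitlines()]


def encode_text(records: Iterable[Record]) -> str:
    return "".join(r["body"] + "\n" for r in records)


def decode_csv(data: str) -> List[Record]:
    # CSV format: the header row supplies the field names.
    return [dict(row) for row in csv.DictReader(io.StringIO(data))]


def encode_csv(records: Iterable[Record]) -> str:
    rows = list(records)
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical registry: the config value picks the (decode, encode) pair,
# so supporting a new format means adding an entry, not writing a plugin.
FORMATS: Dict[str, Tuple[Callable[[str], List[Record]],
                         Callable[[Iterable[Record]], str]]] = {
    "text": (decode_text, encode_text),
    "csv": (decode_csv, encode_csv),
}


class PartitionConsumer:
    """Tracks consumed partition keys so each run reads only new partitions."""

    def __init__(self) -> None:
        self.consumed: Set[str] = set()

    def consume(self, partitions: Dict[str, str]) -> Dict[str, str]:
        fresh = {k: v for k, v in partitions.items() if k not in self.consumed}
        self.consumed.update(fresh)
        return fresh


def run_source(consumer: PartitionConsumer, partitions: Dict[str, str],
               format_name: str) -> List[Record]:
    # Decode only the partitions this consumer has not seen before.
    decode, _ = FORMATS[format_name]
    fresh = consumer.consume(partitions)
    records: List[Record] = []
    for key in sorted(fresh):
        records.extend(decode(fresh[key]))
    return records
```

With this shape, a run that starts before its input arrives simply reads nothing and picks the partition up on the next run, and switching from `"csv"` to `"text"` is a one-word config change. The per-format decode and encode functions are where the record conversion cost mentioned in item 2 would live.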

            People

            • Assignee:
              Nitin Motgi (nitin)
            • Reporter:
              Albert Shau (ashau)
