CDAP / CDAP-10241

XMLReader - File Pattern Ignored - Can't Use Pattern to Read Bulk Files

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.1
    • Fix Version/s: 4.2.0
    • Component/s: Pipelines
    • Labels:
      None
    • Sprint:
      App Eng Sprint 4, App Eng Sprint 5, App Eng Sprint 6, App Eng Sprint 7, App Eng Sprint 8
    • Release Notes:
      Added additional validations and documentation to the XML File Source plugin to improve user experience.
    • Rank:
      1|hzy2sn:

      Description

      The path must be either a fully qualified file path, or a directory path ending in "/" combined with the pattern ".*"; any other combination fails. Bulk files cannot be read using a more specific pattern. The failure surfaces as:
      org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist

      Following the catalog.xml example reproduces the error whenever a pattern is used to filter file names. Only two configurations work: a fully qualified path to a single file, or a path ending in "/" with the pattern ".*". No other pattern filters files successfully.

      https://github.com/caskdata/hydrator-plugins/blob/5e027172ecbb6685feb9b0fa039606b1c7789af8/core-plugins/src/main/java/co/cask/hydrator/plugin/common/BatchFileFilter.java
      https://github.com/caskdata/hydrator-plugins/search?utf8=%E2%9C%93&q=PathFilter
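
      The behavior the report expects can be sketched as follows. This is a minimal illustration only, assuming the pattern is a Java regular expression matched against the file name alone (not the full path). The class FilePatternSketch and the method filterByPattern are hypothetical names for this example and are not taken from the plugin's BatchFileFilter code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: not the plugin's actual implementation.
class FilePatternSketch {

    // Returns the file names whose entire name matches the given regex.
    // Assumption: the pattern applies to the file name only, not the directory.
    static List<String> filterByPattern(List<String> fileNames, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matched = new ArrayList<>();
        for (String name : fileNames) {
            if (pattern.matcher(name).matches()) {
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("catalog.xml", "catalog-2016.xml", "notes.txt");
        // A specific pattern should select only the XML catalog files.
        System.out.println(filterByPattern(names, "catalog.*\\.xml"));
        // The catch-all pattern, the only one the report says currently works.
        System.out.println(filterByPattern(names, ".*"));
    }
}
```

      With a filter like this, a directory path ending in "/" plus a pattern such as catalog.*\.xml would be expected to pick up only the matching files instead of raising InvalidInputException.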

              People

              • Assignee: Russ Savage (russellsavage)
              • Reporter: Ted Coyle (ted)
              • Votes: 0
              • Watchers: 3