  1. CDAP
  2. CDAP-4384

In HA HDFS mode, reading from a stream throws a java.lang.IllegalArgumentException: Wrong FS exception

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.2.0, 3.1.0, 3.0.0, 2.8.0
    • Fix Version/s: 3.3.0, 3.2.2
    • Component/s: App Fabric
    • Labels:
      None
    • Release Notes:
      Fixes an issue that prevents streams from being read in HA HDFS mode.

      Description

      In HA HDFS mode, uploading a file to a stream (on the order of a few MB) and then reading the stream back gives the following exception:

      java.lang.IllegalArgumentException: Wrong FS: hdfs://prodnameservice1:8020/<PATH>, expected: hdfs://prodnameservice1
              at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645) ~[hadoop-common-2.5.0-cdh5.3.2.jar:na]
              at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:192) ~[hadoop-hdfs.jar:na]
              at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:104) ~[hadoop-hdfs.jar:na]
              at org.apache.hadoop.hdfs.DistributedFileSystem$32.doCall(DistributedFileSystem.java:1569) ~[hadoop-hdfs.jar:na]
              at org.apache.hadoop.hdfs.DistributedFileSystem$32.doCall(DistributedFileSystem.java:1565) ~[hadoop-hdfs.jar:na]
              at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-2.5.0-cdh5.3.2.jar:na]
              at org.apache.hadoop.hdfs.DistributedFileSystem.isFileClosed(DistributedFileSystem.java:1565) ~[hadoop-hdfs.jar:na]
              at co.cask.cdap.common.io.Locations$9.size(Locations.java:365) ~[co.cask.cdap.cdap-common-3.2.1.jar:na]
              at co.cask.cdap.common.io.Locations$11.size(Locations.java:406) ~[co.cask.cdap.cdap-common-3.2.1.jar:na]
              at co.cask.cdap.common.io.DFSSeekableInputStream.size(DFSSeekableInputStream.java:51) ~[co.cask.cdap.cdap-common-3.2.1.jar:na]
              at co.cask.cdap.data.stream.StreamDataFileReader.createEventTemplate(StreamDataFileReader.java:344) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.stream.StreamDataFileReader.readHeader(StreamDataFileReader.java:305) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.stream.StreamDataFileReader.init(StreamDataFileReader.java:280) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.stream.StreamDataFileReader.doOpen(StreamDataFileReader.java:252) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.stream.StreamDataFileReader.initialize(StreamDataFileReader.java:139) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.stream.LiveStreamFileReader$StreamPositionTransformFileReader.initialize(LiveStreamFileReader.java:169) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.stream.LiveStreamFileReader.renewReader(LiveStreamFileReader.java:81) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.file.LiveFileReader.initialize(LiveFileReader.java:42) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.stream.MultiLiveStreamFileReader$StreamEventSource.initialize(MultiLiveStreamFileReader.java:175) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.stream.MultiLiveStreamFileReader.initialize(MultiLiveStreamFileReader.java:72) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.stream.service.StreamFetchHandler.createReader(StreamFetchHandler.java:286) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at co.cask.cdap.data.stream.service.StreamFetchHandler.fetch(StreamFetchHandler.java:124) ~[co.cask.cdap.cdap-data-fabric-3.2.1.jar:na]
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_67]
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_67]
      

      This happens because, in HA mode, the URI returned from org.apache.hadoop.fs.FileContext is not compatible with the one expected by org.apache.hadoop.hdfs.DistributedFileSystem. FileContext is not HA aware and always appends the port (hdfs://prodnameservice1:8020 in the trace above), whereas DistributedFileSystem uses the logical name without a port (hdfs://prodnameservice1), so FileSystem.checkPath rejects the path.

      Related HDFS JIRA: https://issues.apache.org/jira/browse/HADOOP-9617

      Until the HDFS JIRA is fixed, we need a workaround in CDAP that strips the port from the URI when running in HA mode.
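
      A minimal sketch of what such a workaround could look like, using only java.net.URI. This is not the actual CDAP fix: the class name HaUriSanitizer, the method stripPortIfLogical, and the idea of passing the logical nameservice names in as a set are all assumptions made for illustration. In a real deployment the names would presumably come from the dfs.nameservices Hadoop setting.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Set;

// Hypothetical helper, not part of CDAP or Hadoop.
final class HaUriSanitizer {

  private HaUriSanitizer() { }

  /**
   * Returns the URI without its port when the host is one of the given
   * logical HA nameservice names; otherwise returns the URI unchanged.
   *
   * @param uri          the URI to sanitize, e.g. one obtained via FileContext
   * @param nameservices logical nameservice names (assumed to be supplied by the caller)
   */
  static URI stripPortIfLogical(URI uri, Set<String> nameservices) {
    String host = uri.getHost();
    boolean isHdfs = "hdfs".equalsIgnoreCase(uri.getScheme());

    // Leave the URI alone unless it is an hdfs:// URI that carries a port
    // and whose host is a logical nameservice rather than a real NameNode.
    if (!isHdfs || host == null || uri.getPort() == -1 || !nameservices.contains(host)) {
      return uri;
    }

    try {
      // A port of -1 tells the URI constructor to omit the port.
      return new URI(uri.getScheme(), uri.getUserInfo(), host, -1,
                     uri.getPath(), uri.getQuery(), uri.getFragment());
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Cannot rebuild URI without port: " + uri, e);
    }
  }
}
```

      With the nameservice from the trace above, stripPortIfLogical would turn a URI of the form hdfs://prodnameservice1:8020/<PATH> into hdfs://prodnameservice1/<PATH>, which matches what DistributedFileSystem expects. URIs that point at a concrete NameNode host are returned unchanged, so non-HA clusters would not be affected.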

            People

            • Assignee:
              terence Terence Yim
              Reporter:
              sree Sreevatsan Raman