CDAP-3829

ETL snapshot fileset not explorable

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.1
    • Component/s: ETL
    • Labels:
      None
    • Release Notes:
      Fixed snapshot sink so that the data is explorable as a PartitionedFileSet.

      Description

      The snapshot filesets cannot be explored through Hive. Attempting to query one fails with an error like:

      Caused by: java.io.IOException: Not a file: file:/Users/ashau/dev/cdap/data/namespaces/default/data/testfiles/latest
      	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:277) ~[hadoop-mapreduce-client-core-2.3.0.jar:na]
      	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:362) ~[hive-exec-1.1.0.jar:1.1.0]
      	at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:294) ~[hive-exec-1.1.0.jar:1.1.0]
      	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445) ~[hive-exec-1.1.0.jar:1.1.0]
      	... 26 common frames omitted
      

      This happens because the CREATE TABLE statement for a fileset sets the Hive table location to the base path of the FileSet, and that base path contains subdirectories. Hive raises this error unless it is told that subdirectories are supported. Even with subdirectory support enabled, though, that would not be enough in this particular use case.
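
      The failure mode can be reproduced in miniature. The sketch below is a toy model, not Hadoop's or Hive's actual code: `get_splits` and `make_snapshot_layout` are hypothetical names, and the layout (a `latest` directory holding one data file) is an assumption based on the path in the stack trace. It shows a non-recursive listing rejecting a directory found at the table location.

```python
import os
import tempfile


def get_splits(location, recursive=False):
    """Toy model of listing input files at a table location.

    Not Hadoop's implementation: it only mimics the behaviour of
    rejecting a directory unless recursion is enabled.
    """
    splits = []
    for name in sorted(os.listdir(location)):
        path = os.path.join(location, name)
        if os.path.isdir(path):
            if not recursive:
                # Same shape as the message in the stack trace above.
                raise IOError("Not a file: " + path)
            splits.extend(get_splits(path, recursive=True))
        else:
            splits.append(path)
    return splits


def make_snapshot_layout(base):
    """Create an assumed layout: <base>/latest/part-0 (hypothetical file name)."""
    latest = os.path.join(base, "latest")
    os.makedirs(latest)
    data_file = os.path.join(latest, "part-0")
    with open(data_file, "w") as f:
        f.write("record\n")
    return latest, data_file
```

      Calling `get_splits` on the base path raises the "Not a file" error, while calling it with `recursive=True`, or directly on the `latest` directory, returns the data file.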

      Subdirectory support alone is insufficient because the fileset uses other directories under its base path to perform data swapping, and those directories should not be explorable. We would therefore need a way to set the explore location when creating the dataset.
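
      To make the second problem concrete, here is a minimal sketch under stated assumptions: the directory names `latest` and `staging` and the helpers `build_layout` and `list_data_files` are hypothetical, chosen only to illustrate a base path that holds both the current snapshot and a swap directory. A recursive scan from the base path picks up the swap data too, whereas a scan rooted at a dedicated explore location does not.

```python
import os
import tempfile


def list_data_files(location):
    """Recursively collect every file under location, as a recursive scan would."""
    found = []
    for root, _dirs, files in os.walk(location):
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)


def build_layout(base):
    """Create a hypothetical base path with a current snapshot and a swap directory."""
    paths = {}
    for sub in ("latest", "staging"):  # assumed names, not taken from CDAP
        directory = os.path.join(base, sub)
        os.makedirs(directory)
        file_path = os.path.join(directory, "part-0")
        with open(file_path, "w") as f:
            f.write(sub + " record\n")
        paths[sub] = file_path
    return paths
```

      With this layout, `list_data_files(base)` returns files from both directories, while `list_data_files(os.path.join(base, "latest"))` returns only the current snapshot, which is the behaviour a separate explore location would provide.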

              People

              • Assignee:
                ashau Albert Shau
              • Reporter:
                ashau Albert Shau
