CDAP-4556

Hadoop dependencies classpath should be set as an environment variable for Hadoop-less Spark builds

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0, 3.2.0
    • Fix Version/s: 4.1.0
    • Component/s: Distribution, Master, Spark
    • Labels:
    • Release Notes:
      CDAP now uses environment variables in the spark-env.sh and properties in the spark-defaults.conf when launching Spark programs.

      Description

With a Hadoop-less ("Hadoop free") Spark build, the user needs to set SPARK_DIST_CLASSPATH to include all the Hadoop dependency classes, which are then added to the classpath of the Spark program.

      https://spark.apache.org/docs/latest/hadoop-provided.html
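The Spark documentation linked above shows how this is typically done: a line in conf/spark-env.sh points the Hadoop-free build at the installed Hadoop jars.

```shell
# In conf/spark-env.sh of a "Hadoop free" Spark build:
# use the output of `hadoop classpath` to populate SPARK_DIST_CLASSPATH,
# which spark-submit prepends to the classpath of launched programs.
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
```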

Cloudera ships a Hadoop-less Spark install and modifies spark-env.sh to add the Hadoop dependency classes to the SPARK_DIST_CLASSPATH environment variable, which is later used by spark-submit. To support CDAP on such a Spark install, we will need to source the spark-env.sh script and set this environment variable for CDH deployed through Cloudera Manager.

On a CM cluster, the classpath.txt file is present under the conf dir of the Spark home, and spark-env.sh is present in the same dir.
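A minimal sketch of the sourcing step described above, assuming the CDH layout where spark-env.sh (which builds SPARK_DIST_CLASSPATH from the adjacent classpath.txt) lives in the Spark conf dir. The function name and the demo fixture are illustrative, not CDAP's actual code.

```shell
# Hedged sketch: source spark-env.sh from a Spark conf dir, if present, so
# any SPARK_DIST_CLASSPATH it exports is inherited by the process that later
# invokes spark-submit.
load_spark_env() {
  conf_dir="$1"
  if [ -f "$conf_dir/spark-env.sh" ]; then
    # On CDH, spark-env.sh derives SPARK_DIST_CLASSPATH from classpath.txt
    # in the same directory; sourcing exports it into this shell.
    . "$conf_dir/spark-env.sh"
  fi
}

# Demo with a stand-in conf dir (a real cluster would use $SPARK_HOME/conf).
demo=$(mktemp -d)
printf 'export SPARK_DIST_CLASSPATH=/opt/hadoop/lib/a.jar:/opt/hadoop/lib/b.jar\n' \
  > "$demo/spark-env.sh"
load_spark_env "$demo"
echo "$SPARK_DIST_CLASSPATH"
```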


People

• Assignee: Terence Yim (terence)
• Reporter: Rohit Sinha (rsinha)
• Votes: 0
• Watchers: 4
