  1. CDAP
  2. CDAP-6469

On a non-secure cluster, containers always run as the yarn user

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.5.0
    • Component/s: CDAP, Docs

      Description

      On a cluster without Kerberos enabled, the master service that launches the Twill application runs as the cdap user, but the master Twill application itself runs as the "yarn" user.

      One side effect is that datasets created on HDFS are owned by the "yarn" user. If hdfs.user is configured as cdap, subsequent MapReduce jobs that write to those datasets fail with a permission-denied error.
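One way to confirm the ownership mismatch is to tally the owner column of a recursive HDFS listing. This is a minimal sketch, assuming the usual `hdfs dfs -ls -R` layout (owner in the third field). The function name `count_hdfs_owners` is hypothetical, and `/cdap` is a placeholder for the CDAP namespace path, not something taken from this report.

```shell
# Tally file owners from `hdfs dfs -ls -R` output read on stdin.
# Assumed listing layout: perms, replication, owner, group, size, date, time, path.
count_hdfs_owners() {
  # Lines with fewer than 8 fields (e.g. "Found N items") are skipped.
  awk 'NF >= 8 { owners[$3]++ }
       END { for (o in owners) print o, owners[o] }' | sort
}

# Example (the /cdap path is an assumption; adjust it for the cluster):
#   hdfs dfs -ls -R /cdap | count_hdfs_owners
```

If the output shows "yarn" as an owner while hdfs.user is cdap, that matches the failure described above.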

      Master process running as the cdap user:

      ps -aef | grep cdap.service=master | grep -v grep
      cdap     22902     1  1 04:56 ?        00:08:46 /usr/lib/jvm/java/bin/java -Dcdap.service=master -Xmx1024m -Duser.dir=/var/tmp/cdap -Dexplore.conf.files=/usr/hdp/2.3.4.7-4/hadoop/conf/capacity-scheduler.xml:/usr/hdp/2.3.4.7-4/hadoop/conf/core-site.xml:/usr/hdp/2.3.4.7-4/hadoop/conf/hadoop-env.sh*:/usr/hdp/2.3.4.7-4/hadoop/conf/hadoop-metrics.properties:/usr/hdp/2.3.4.7-4/hadoop/conf/hdfs-site.xml:/usr/hdp/2.3.4.7-4/hadoop/conf/log4j.properties:/usr/hdp/2.3.4.7-4/hadoop/conf/mapred-env.sh*:/usr/hdp/2.3.4.7-4/hadoop/conf/mapred-site.xml:/usr/hdp/2.3.4.7-4/hadoop/conf/yarn-env.sh*:/usr/hdp/2.3.4.7-4/hadoop/conf/yarn-site.xml:/usr/hdp/2.3.4.7-4/hive/conf/hive-env.sh*:/usr/hdp/2.3.4.7-4/hive/conf/hive-site.xml: -Dexplore.classpath=/usr/hdp/2.3.4.7-4/hive/lib/accumulo-core-1.7.0.2.3.4.7-4.jar:/usr/hdp/2.3.4.7-4/hive/lib/accumulo-fate-1.7.0.2.3.4.7-4.jar:/usr/hdp/2.3.4.7-4/hive/lib/accumulo-start-1.7.0.2.3.4.7-4.jar:/usr/hdp/2.3.4.7-4/hive/lib/accumulo-trace-1.7.0.2.3.4.7-4.jar:/usr/hdp/2.3.4.7-4/hive/lib/activation-1.1.jar:/usr/hdp/2.3.4.7-4/hive/lib/ant-1.9.1.jar:/usr/hdp/2.3.4.7-4/hive/lib/ant-launcher-1.9.1.jar:/usr/hdp/2.3.4.7-4/hive/lib/antlr-2.7.7.jar:/usr/hdp/2.3.4.7-4/hive/lib/antlr-runtime-3.4.jar:/usr/hdp/2.3.4.7-4/hive/lib/apache-log4j-extras-1.2.17.jar:/usr/hdp/2.3.4.7-4/hive/lib/asm-commons-3.1.jar:/usr/hdp/2.3.4.7-4/hive/lib/asm-tree-3.1.jar:/usr/hdp/2.3.4.7-4/hive/lib/avro-1.7.5.jar:/usr/hdp/2.
      

      Containers running as the yarn user:

      yarn     24599 24597  0 04:57 ?        00:00:00 /bin/bash -c /usr/lib/jvm/java/bin/java -Djava.io.tmpdir=tmp -Dyarn.container=container_1468442130591_0024_01_000003 -Dtwill.runnable=master.services.dataset.executor -cp launcher.jar:/etc/hadoop/conf -Xmx359m -XX:MaxPermSize=128M -verbose:gc -Xloggc:/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000003/gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M org.apache.twill.launcher.TwillLauncher container.jar org.apache.twill.internal.container.TwillContainerMain true 1>/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000003/stdout 2>/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000003/stderr
      yarn     24609 24599  0 04:57 ?        00:01:49 /usr/lib/jvm/java/bin/java -Djava.io.tmpdir=tmp -Dyarn.container=container_1468442130591_0024_01_000003 -Dtwill.runnable=master.services.dataset.executor -cp launcher.jar:/etc/hadoop/conf -Xmx359m -XX:MaxPermSize=128M -verbose:gc -Xloggc:/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000003/gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M org.apache.twill.launcher.TwillLauncher container.jar org.apache.twill.internal.container.TwillContainerMain true
      yarn     24844 24600  0 04:57 ?        00:00:00 bash /data/yarn/local/usercache/cdap/appcache/application_1468442130591_0024/container_1468442130591_0024_01_000005/default_container_executor.sh
      yarn     24845 24844  0 04:57 ?        00:00:00 /bin/bash -c /usr/lib/jvm/java/bin/java -Djava.io.tmpdir=tmp -Dyarn.container=container_1468442130591_0024_01_000005 -Dtwill.runnable=master.services.metrics -cp launcher.jar:/etc/hadoop/conf -Xmx180m -XX:MaxPermSize=128M -verbose:gc -Xloggc:/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000005/gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M org.apache.twill.launcher.TwillLauncher container.jar org.apache.twill.internal.container.TwillContainerMain true 1>/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000005/stdout 2>/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000005/stderr
      yarn     24855 24845  0 04:57 ?        00:01:40 /usr/lib/jvm/java/bin/java -Djava.io.tmpdir=tmp -Dyarn.container=container_1468442130591_0024_01_000005 -Dtwill.runnable=master.services.metrics -cp launcher.jar:/etc/hadoop/conf -Xmx180m -XX:MaxPermSize=128M -verbose:gc -Xloggc:/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000005/gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M org.apache.twill.launcher.TwillLauncher container.jar org.apache.twill.internal.container.TwillContainerMain true
      yarn     27116 24600  0 04:57 ?        00:00:00 bash /data/yarn/local/usercache/cdap/appcache/application_1468442130591_0024/container_1468442130591_0024_01_000008/default_container_executor.sh
      yarn     27118 27116  0 04:57 ?        00:00:00 /bin/bash -c /usr/lib/jvm/java/bin/java -Djava.io.tmpdir=tmp -Dyarn.container=container_1468442130591_0024_01_000008 -Dtwill.runnable=master.services.explore.service -cp launcher.jar:/etc/hadoop/conf -Xmx774m -XX:MaxPermSize=128M -verbose:gc -Xloggc:/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000008/gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M org.apache.twill.launcher.TwillLauncher container.jar org.apache.twill.internal.container.TwillContainerMain true 1>/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000008/stdout 2>/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000008/stderr
      yarn     27318 27118  0 04:57 ?        00:01:47 /usr/lib/jvm/java/bin/java -Djava.io.tmpdir=tmp -Dyarn.container=container_1468442130591_0024_01_000008 -Dtwill.runnable=master.services.explore.service -cp launcher.jar:/etc/hadoop/conf -Xmx774m -XX:MaxPermSize=128M -verbose:gc -Xloggc:/data/logs/hadoop-yarn/userlogs/application_1468442130591_0024/container_1468442130591_0024_01_000008/gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M org.apache.twill.launcher.TwillLauncher container.jar org.apache.twill.internal.container.TwillContainerMain true
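The listing above is long, so it can help to reduce it to one line per OS user and Twill runnable. The sketch below is only an illustration: it keys on the `-Dtwill.runnable=` argument visible in the process lines, and the function name `twill_container_users` is hypothetical.

```shell
# Print unique "user runnable" pairs for Twill container processes
# found in `ps -ef`-style input read on stdin.
twill_container_users() {
  awk '{
    for (i = 2; i <= NF; i++) {
      if ($i ~ /^-Dtwill\.runnable=/) {
        sub(/^-Dtwill\.runnable=/, "", $i)  # keep only the runnable name
        print $1, $i                        # $1 is the owning OS user
        break
      }
    }
  }' | sort -u
}

# Example:
#   ps -ef | twill_container_users
```

The `bash -c` wrapper and the JVM it starts carry the same argument, so `sort -u` collapses them into a single line per runnable.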
      

      This seems to be a bug in YARN where the process always runs as the yarn user:
      http://stackoverflow.com/questions/33550135/how-to-set-user-in-linuxcontainerexecutor
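The linked question is about the NodeManager container executor settings, so checking them on the affected cluster may be a useful first step. To my understanding, with DefaultContainerExecutor the containers run as the NodeManager's own user, while LinuxContainerExecutor in non-secure mode is governed by the `nonsecure-mode.limit-users` and `nonsecure-mode.local-user` properties. The helper below is a rough sketch: `hadoop_conf_get` is a hypothetical name, the yarn-site.xml location is an assumption, and the parsing only handles `<name>` and `<value>` elements that each sit on a single line.

```shell
# Print the <value> of a Hadoop-style property from a *-site.xml file.
# Simplification: each <name> and <value> element must be on a single line.
hadoop_conf_get() {
  local file=$1 name=$2
  awk -v want="$name" '
    index($0, "<name>" want "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-char <value> and 8-char </value> tags.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$file"
}

# Example (the file path is an assumption):
#   for key in yarn.nodemanager.container-executor.class \
#              yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users \
#              yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user; do
#     echo "$key=$(hadoop_conf_get /etc/hadoop/conf/yarn-site.xml "$key")"
#   done
```

An empty result means the property is not set in that file, in which case the YARN default applies.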

              People

              • Assignee:
                John John Jackson
                Reporter:
                sree Sreevatsan Raman
              • Votes:
                0
                Watchers:
                3
