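The Hadoop application platform reported an error. I tried several fixes found online for that error message, but none of them solved the problem, so I went and looked at the YARN logs.
The YARN log is as follows: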
2019-03-27 13:53:08,961 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Localizer CWD set to /opt/data/yarn/nm/usercache/root/appcache/application_1553509480321_0011 = file:/opt/data/yarn/nm/usercache/root/appcache/application_1553509480321_0011
2019-03-27 13:53:08,983 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: hdfs://cdh03:8020/user/root/.sparkStaging/application_1553509480321_0011/__spark_conf__.zip
2019-03-27 13:53:08,985 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: { hdfs://cdh03:8020/user/root/.sparkStaging/application_1553509480321_0011/__spark_conf__.zip, 1553665982289, ARCHIVE, null } failed: File does not exist: hdfs://cdh03:8020/user/root/.sparkStaging/application_1553509480321_0011/__spark_conf__.zip
java.io.FileNotFoundException: File does not exist: hdfs://cdh03:8020/user/root/.sparkStaging/application_1553509480321_0011/__spark_conf__.zip
at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1269)
at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1261)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1261)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:364)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-03-27 13:53:08,986 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1553509480321_0011_02_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2019-03-27 13:53:08,986 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_1553509480321_0011_02_000001 sent RELEASE event on a resource request { hdfs://cdh03:8020/user/root/.sparkStaging/application_1553509480321_0011/__spark_conf__.zip, 1553665982289, ARCHIVE, null } not present in cache.
2019-03-27 13:53:08,986 WARN org.apache.hadoop.ipc.Client: interrupted waiting to send rpc request to server
java.lang.InterruptedException
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1088)
at org.apache.hadoop.ipc.Client.call(Client.java:1483)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy89.heartbeat(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:257)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:171)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:131)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1147)
2019-03-27 13:53:08,987 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=root OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: LOCALIZATION_FAILED APPID=application_1553509480321_0011 CONTAINERID=container_1553509480321_0011_02_000001
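A NodeManager log like the one above is long, and the useful facts are scattered through it: which HDFS path was reported missing, and which container failed localization. The following is a minimal sketch of how those two facts could be pulled out with a script. The function name `summarize_localization_failures` is made up for illustration and is not part of Hadoop; the two patterns are taken only from the message shapes visible in this log and may not match other Hadoop versions.

```python
import re

# Hypothetical helper (not a Hadoop API). The patterns below are assumptions
# based solely on the log lines shown in this post.
MISSING_FILE = re.compile(r"File does not exist: (\S+)")
LOCALIZATION_FAILED = re.compile(
    r"Container (container_\S+) transitioned from LOCALIZING to LOCALIZATION_FAILED"
)


def summarize_localization_failures(log_text):
    """Return (missing_paths, failed_containers), each de-duplicated in log order."""
    missing_paths = []
    failed_containers = []
    for line in log_text.splitlines():
        # A single line can mention the same missing path more than once.
        for path in MISSING_FILE.findall(line):
            if path not in missing_paths:
                missing_paths.append(path)
        match = LOCALIZATION_FAILED.search(line)
        if match and match.group(1) not in failed_containers:
            failed_containers.append(match.group(1))
    return missing_paths, failed_containers
```

To use it, read the NodeManager log file into a string and pass it in; for the log above it would report the `__spark_conf__.zip` path under `.sparkStaging` and the container `container_1553509480321_0011_02_000001`.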
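The message "PriviledgedActionException as:root (auth:SIMPLE)" suggests a permissions problem. Reference article: http://hadoop-common.472056.n3.nabble.com/UserGroupInformation-PriviledgedActionException-as-root-auth-SIMPLE-td4038525.html
Setting the following property in core-site.xml on every NodeManager (NM) host fixed it. (In Hadoop 2.x and later, dfs.permissions is the deprecated name of dfs.permissions.enabled.)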
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
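Because the property has to be set on every NodeManager host, it helps to verify that each copy of the file really contains it. Below is a small sketch that reads a Hadoop-style configuration document and returns the value of one property. It assumes the usual `<configuration>` / `<property>` / `<name>` / `<value>` layout; `read_hadoop_property` is a hypothetical name chosen for this example, and the inline sample stands in for the real core-site.xml, whose location on the cluster is not given in this post.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the <value> of the named <property>, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Sample document standing in for a real core-site.xml (assumption).
sample = """<configuration>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>"""

print(read_hadoop_property(sample, "dfs.permissions"))  # prints: false
```

A return value of `None` means the property is missing from that file, which is worth checking on each host before restarting the NodeManagers.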
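This post records an error encountered in a Hadoop application and how it was resolved. The error came from a Spark job that failed to start because the configuration archive __spark_conf__.zip could not be found; setting the dfs.permissions property to false resolved the permissions problem.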




