windows安装Flume将本地目录下文件上传到HDFS(kerberos)问题汇总

本文记录了在Windows环境下,使用Flume连接HDFS过程中遇到的七个常见问题及解决方案,涉及类未定义错误、权限认证、依赖包缺失等,最终通过调整配置和补充依赖成功实现数据传输。

环境

本地系统:Windows10
CDH:6.1.0
Hadoop: 3.0
Flume:1.8.0
JDK:1.8
安全:kerberos

问题一:

找不到com/ctc/wstx/io/InputBootstrapper
2020-09-02 11:02:22,640 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:447)] process failed
java.lang.NoClassDefFoundError: com/ctc/wstx/io/InputBootstrapper
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:224)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:541)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at org.apache.flume.SinkRunnerPollingRunner.run(SinkRunner.java:145)atjava.lang.Thread.run(Thread.java:748)Causedby:java.lang.ClassNotFoundException:com.ctc.wstx.io.InputBootstrapperatjava.net.URLClassLoader.findClass(URLClassLoader.java:381)atjava.lang.ClassLoader.loadClass(ClassLoader.java:424)atsun.misc.LauncherPollingRunner.run(SinkRunner.java:145) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ClassNotFoundException: com.ctc.wstx.io.InputBootstrapper at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.LauncherPollingRunner.run(SinkRunner.java:145)atjava.lang.Thread.run(Thread.java:748)Causedby:java.lang.ClassNotFoundException:com.ctc.wstx.io.InputBootstrapperatjava.net.URLClassLoader.findClass(URLClassLoader.java:381)atjava.lang.ClassLoader.loadClass(ClassLoader.java:424)atsun.misc.LauncherAppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
… 6 more
Exception in thread “SinkRunner-PollingRunner-DefaultSinkProcessor” java.lang.NoClassDefFoundError: com/ctc/wstx/io/InputBootstrapper
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:224)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:541)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at org.apache.flume.SinkRunnerPollingRunner.run(SinkRunner.java:145)atjava.lang.Thread.run(Thread.java:748)Causedby:java.lang.ClassNotFoundException:com.ctc.wstx.io.InputBootstrapperatjava.net.URLClassLoader.findClass(URLClassLoader.java:381)atjava.lang.ClassLoader.loadClass(ClassLoader.java:424)atsun.misc.LauncherPollingRunner.run(SinkRunner.java:145) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ClassNotFoundException: com.ctc.wstx.io.InputBootstrapper at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.LauncherPollingRunner.run(SinkRunner.java:145)atjava.lang.Thread.run(Thread.java:748)Causedby:java.lang.ClassNotFoundException:com.ctc.wstx.io.InputBootstrapperatjava.net.URLClassLoader.findClass(URLClassLoader.java:381)atjava.lang.ClassLoader.loadClass(ClassLoader.java:424)atsun.misc.LauncherAppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
… 6 more

问题解决:

在Hadoop集群中将Hadoop下的jar包放到flume的lib下:
woodstox-core-5.0.3.jar
hadoop-hdfs-3.0.0-cdh6.1.0.jar
commons-configuration2-2.1.jar
hadoop-common-3.0.0-cdh6.1.0.jar
commons-io-2.4.jar
hadoop-auth-3.0.0-cdh6.1.0.jar

问题二:

2020-09-02 14:34:01,915 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:447)] process failed
java.lang.NoClassDefFoundError: org/codehaus/stax2/XMLInputFactory2
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader1.run(URLClassLoader.java:362)atjava.security.AccessController.doPrivileged(NativeMethod)atjava.net.URLClassLoader.findClass(URLClassLoader.java:361)atjava.lang.ClassLoader.loadClass(ClassLoader.java:424)atsun.misc.Launcher1.run(URLClassLoader.java:362) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:361) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher1.run(URLClassLoader.java:362)atjava.security.AccessController.doPrivileged(NativeMethod)atjava.net.URLClassLoader.findClass(URLClassLoader.java:361)atjava.lang.ClassLoader.loadClass(ClassLoader.java:424)atsun.misc.LauncherAppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.hadoop.conf.Configuration.(Configuration.java:321)
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:224)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:541)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at org.apache.flume.SinkRunnerPollingRunner.run(SinkRunner.java:145)atjava.lang.Thread.run(Thread.java:748)Causedby:java.lang.ClassNotFoundException:org.codehaus.stax2.XMLInputFactory2atjava.net.URLClassLoader.findClass(URLClassLoader.java:381)atjava.lang.ClassLoader.loadClass(ClassLoader.java:424)atsun.misc.LauncherPollingRunner.run(SinkRunner.java:145) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ClassNotFoundException: org.codehaus.stax2.XMLInputFactory2 at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.LauncherPollingRunner.run(SinkRunner.java:145)atjava.lang.Thread.run(Thread.java:748)Causedby:java.lang.ClassNotFoundException:org.codehaus.stax2.XMLInputFactory2atjava.net.URLClassLoader.findClass(URLClassLoader.java:381)atjava.lang.ClassLoader.loadClass(ClassLoader.java:424)atsun.misc.LauncherAppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
… 19 more

问题解决:

stax2-api-3.1.4.jar

问题三:

2020-09-02 14:52:40,947 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:447)] process failed
java.lang.NoClassDefFoundError: org/apache/htrace/core/Tracer$Builder
at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3283)
at org.apache.hadoop.fs.FileSystem.access200(FileSystem.java:123)atorg.apache.hadoop.fs.FileSystem200(FileSystem.java:123) at org.apache.hadoop.fs.FileSystem200(FileSystem.java:123)atorg.apache.hadoop.fs.FileSystemCache.getInternal(FileSystem.java:3337)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3305)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:476)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:260)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:252)
at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:701)
at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
at org.apache.flume.sink.hdfs.BucketWriter9.call(BucketWriter.java:698)atjava.util.concurrent.FutureTask.run(FutureTask.java:266)atjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)atjava.util.concurrent.ThreadPoolExecutor9.call(BucketWriter.java:698) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor9.call(BucketWriter.java:698)atjava.util.concurrent.FutureTask.run(FutureTask.java:266)atjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)atjava.util.concurrent.ThreadPoolExecutorWorker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.htrace.core.TracerBuilderatjava.net.URLClassLoader.findClass(URLClassLoader.java:381)atjava.lang.ClassLoader.loadClass(ClassLoader.java:424)atsun.misc.LauncherBuilder at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.LauncherBuilderatjava.net.URLClassLoader.findClass(URLClassLoader.java:381)atjava.lang.ClassLoader.loadClass(ClassLoader.java:424)atsun.misc.LauncherAppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
… 16 more

问题解决:

htrace-core4-4.2.0-incubating.jar

问题四:

2020-09-02 14:55:10,550 (hdfs-k3-call-runner-0) [WARN - org.apache.hadoop.util.NativeCodeLoader.(NativeCodeLoader.java:60)] Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
2020-09-02 14:55:10,996 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:443)] HDFS IO error
org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme “hdfs”
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3266)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3286)
at org.apache.hadoop.fs.FileSystem.access200(FileSystem.java:123)atorg.apache.hadoop.fs.FileSystem200(FileSystem.java:123) at org.apache.hadoop.fs.FileSystem200(FileSystem.java:123)atorg.apache.hadoop.fs.FileSystemCache.getInternal(FileSystem.java:3337)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3305)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:476)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:260)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:252)
at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:701)
at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
at org.apache.flume.sink.hdfs.BucketWriter9.call(BucketWriter.java:698)atjava.util.concurrent.FutureTask.run(FutureTask.java:266)atjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)atjava.util.concurrent.ThreadPoolExecutor9.call(BucketWriter.java:698) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor9.call(BucketWriter.java:698)atjava.util.concurrent.FutureTask.run(FutureTask.java:266)atjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)atjava.util.concurrent.ThreadPoolExecutorWorker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

问题解决:windows安装flume需要将所有hadoop-hdfs*的jar包拷贝到flume的lib下

hadoop-hdfs-3.0.0-cdh6.1.0.jar
hadoop-hdfs-3.0.0-cdh6.1.0-tests.jar
hadoop-hdfs-client-3.0.0-cdh6.1.0.jar
hadoop-hdfs-client-3.0.0-cdh6.1.0-tests.jar
hadoop-hdfs-httpfs-3.0.0-cdh6.1.0.jar
hadoop-hdfs-native-client-3.0.0-cdh6.1.0.jar
hadoop-hdfs-native-client-3.0.0-cdh6.1.0-tests.jar
hadoop-hdfs-nfs-3.0.0-cdh6.1.0.jar

问题四:

2020-09-02 15:12:25,698 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:443)] HDFS IO error
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:281)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1176)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1155)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1093)
at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:463)
at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:460)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:474)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:401)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1103)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1083)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:972)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:960)
at org.apache.flume.sink.hdfs.HDFSDataStream.doOpen(HDFSDataStream.java:81)
at org.apache.flume.sink.hdfs.HDFSDataStream.open(HDFSDataStream.java:108)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:262)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:252)
at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:701)
at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
at org.apache.flume.sink.hdfs.BucketWriter9.call(BucketWriter.java:698)atjava.util.concurrent.FutureTask.run(FutureTask.java:266)atjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)atjava.util.concurrent.ThreadPoolExecutor9.call(BucketWriter.java:698) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor9.call(BucketWriter.java:698)atjava.util.concurrent.FutureTask.run(FutureTask.java:266)atjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)atjava.util.concurrent.ThreadPoolExecutorWorker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1499)
at org.apache.hadoop.ipc.Client.call(Client.java:1445)
at org.apache.hadoop.ipc.Client.call(Client.java:1355)
at org.apache.hadoop.ipc.ProtobufRpcEngineInvoker.invoke(ProtobufRpcEngine.java:228)atorg.apache.hadoop.ipc.ProtobufRpcEngineInvoker.invoke(ProtobufRpcEngine.java:228) at org.apache.hadoop.ipc.ProtobufRpcEngineInvoker.invoke(ProtobufRpcEngine.java:228)atorg.apache.hadoop.ipc.ProtobufRpcEngineInvoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.Proxy12.create(UnknownSource)atorg.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:349)atsun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethod)atsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)atsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)atjava.lang.reflect.Method.invoke(Method.java:498)atorg.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)atorg.apache.hadoop.io.retry.RetryInvocationHandlerProxy12.create(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:349) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) at org.apache.hadoop.io.retry.RetryInvocationHandlerProxy12.create(UnknownSource)atorg.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:349)atsun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethod)atsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)atsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)atjava.lang.reflect.Method.invoke(Method.java:498)atorg.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)atorg.apache.hadoop.io.retry.RetryInvocationHandlerCall.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandlerCall.invoke(RetryInvocationHandler.java:157)atorg.apache.hadoop.io.retry.RetryInvocationHandlerCall.invoke(RetryInvocationHandler.java:157) at org.apache.hadoop.io.retry.RetryInvocationHandlerCall.invoke(RetryInvocationHandler.java:157)atorg.apache.hadoop.io.retry.RetryInvocationHandlerCall.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy13.create(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:276)
… 23 more

问题解决:

tier1.sinks.sink1.hdfs.kerberosKeytab= D:/…/hdfs.keytab
tier1.sinks.sink1.hdfs.kerberosPrincipal= hdfs@AUTO.COM

问题五:

java.lang.IllegalArgumentException: Can’t get Kerberos realm
at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:65)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:318)
at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:364)
at org.apache.flume.auth.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:143)
at org.apache.flume.auth.FlumeAuthenticationUtil.getAuthenticator(FlumeAuthenticationUtil.java:68)
at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:256)
at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:411)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
at org.apache.flume.node.PollingPropertiesFileConfigurationProviderFileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)atjava.util.concurrent.ExecutorsFileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141) at java.util.concurrent.ExecutorsFileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)atjava.util.concurrent.ExecutorsRunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access301(ScheduledThreadPoolExecutor.java:180)atjava.util.concurrent.ScheduledThreadPoolExecutor301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor301(ScheduledThreadPoolExecutor.java:180)atjava.util.concurrent.ScheduledThreadPoolExecutorScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.security.authentication.util.KerberosUtil.getDefaultRealm(KerberosUtil.java:110)
at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:63)
… 16 more
Caused by: KrbException: Cannot locate default realm
at sun.security.krb5.Config.getDefaultRealm(Config.java:1029)
… 22 more
问题解决:
配置krb5.conf,我是在windows端部署的flume,所以在flume的安装目录的conf下的flume-env.ps1添加如下krb5.conf路径
JAVAOPTS="JAVA_OPTS="JAVAOPTS="JAVA_OPTS -Djava.security.krb5.conf=D:\tools\apache-flume-1.8.0-bin\conf\krb5.conf"

问题六:
2020-09-02 19:18:55,867 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:443)] HDFS IO error
java.io.IOException: Failed on local exception: java.io.IOException: Couldn’t set up IO streams: java.lang.NoClassDefFoundError: com/google/re2j/PatternSyntaxException; Host Details : local host is: “LAPTOP-NOJK9QO5/169.254.232.127”; destination host is: “hostname01”:8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:808)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1503)
at org.apache.hadoop.ipc.Client.call(Client.java:1445)
at org.apache.hadoop.ipc.Client.call(Client.java:1355)
at org.apache.hadoop.ipc.ProtobufRpcEngineInvoker.invoke(ProtobufRpcEngine.java:228)atorg.apache.hadoop.ipc.ProtobufRpcEngineInvoker.invoke(ProtobufRpcEngine.java:228) at org.apache.hadoop.ipc.ProtobufRpcEngineInvoker.invoke(ProtobufRpcEngine.java:228)atorg.apache.hadoop.ipc.ProtobufRpcEngineInvoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.Proxy12.create(UnknownSource)atorg.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:349)atsun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethod)atsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)atsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)atjava.lang.reflect.Method.invoke(Method.java:498)atorg.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)atorg.apache.hadoop.io.retry.RetryInvocationHandlerProxy12.create(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:349) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) at org.apache.hadoop.io.retry.RetryInvocationHandlerProxy12.create(UnknownSource)atorg.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:349)atsun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethod)atsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)atsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)atjava.lang.reflect.Method.invoke(Method.java:498)atorg.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)atorg.apache.hadoop.io.retry.RetryInvocationHandlerCall.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandlerCall.invoke(RetryInvocationHandler.java:157)atorg.apache.hadoop.io.retry.RetryInvocationHandlerCall.invoke(RetryInvocationHandler.java:157) at org.apache.hadoop.io.retry.RetryInvocationHandlerCall.invoke(RetryInvocationHandler.java:157)atorg.apache.hadoop.io.retry.RetryInvocationHandlerCall.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy13.create(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:276)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1176)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1155)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1093)
at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:463)
at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:460)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:474)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:401)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1103)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1083)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:972)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:960)
at org.apache.flume.sink.hdfs.HDFSDataStream.doOpen(HDFSDataStream.java:81)
at org.apache.flume.sink.hdfs.HDFSDataStream.open(HDFSDataStream.java:108)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:262)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:252)
at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:701)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.flume.auth.UGIExecutor.execute(UGIExecutor.java:46)
at org.apache.flume.auth.KerberosAuthenticator.execute(KerberosAuthenticator.java:64)
at org.apache.flume.sink.hdfs.BucketWriter9.call(BucketWriter.java:698)atjava.util.concurrent.FutureTask.run(FutureTask.java:266)atjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)atjava.util.concurrent.ThreadPoolExecutor9.call(BucketWriter.java:698) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor9.call(BucketWriter.java:698)atjava.util.concurrent.FutureTask.run(FutureTask.java:266)atjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)atjava.util.concurrent.ThreadPoolExecutorWorker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Couldn’t set up IO streams: java.lang.NoClassDefFoundError: com/google/re2j/PatternSyntaxException
at org.apache.hadoop.ipc.ClientConnection.setupIOstreams(Client.java:861)atorg.apache.hadoop.ipc.ClientConnection.setupIOstreams(Client.java:861) at org.apache.hadoop.ipc.ClientConnection.setupIOstreams(Client.java:861)atorg.apache.hadoop.ipc.ClientConnection.access3600(Client.java:410)atorg.apache.hadoop.ipc.Client.getConnection(Client.java:1560)atorg.apache.hadoop.ipc.Client.call(Client.java:1391)...43moreCausedby:java.lang.NoClassDefFoundError:com/google/re2j/PatternSyntaxExceptionatorg.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:311)atorg.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:234)atorg.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:160)atorg.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390)atorg.apache.hadoop.ipc.Client3600(Client.java:410) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1560) at org.apache.hadoop.ipc.Client.call(Client.java:1391) ... 43 more Caused by: java.lang.NoClassDefFoundError: com/google/re2j/PatternSyntaxException at org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:311) at org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:234) at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:160) at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390) at org.apache.hadoop.ipc.Client3600(Client.java:410)atorg.apache.hadoop.ipc.Client.getConnection(Client.java:1560)atorg.apache.hadoop.ipc.Client.call(Client.java:1391)...43moreCausedby:java.lang.NoClassDefFoundError:com/google/re2j/PatternSyntaxExceptionatorg.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:311)atorg.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:234)atorg.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:160)atorg.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390)atorg.apache.hadoop.ipc.ClientConnection.setupSaslConnection(Client.java:614)
at org.apache.hadoop.ipc.Client$Connection.access2300(Client.java:410)atorg.apache.hadoop.ipc.Client2300(Client.java:410) at org.apache.hadoop.ipc.Client2300(Client.java:410)atorg.apache.hadoop.ipc.ClientConnection2.run(Client.java:799)atorg.apache.hadoop.ipc.Client2.run(Client.java:799) at org.apache.hadoop.ipc.Client2.run(Client.java:799)atorg.apache.hadoop.ipc.ClientConnection2.run(Client.java:795)atjava.security.AccessController.doPrivileged(NativeMethod)atjavax.security.auth.Subject.doAs(Subject.java:422)atorg.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)atorg.apache.hadoop.ipc.Client2.run(Client.java:795) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) at org.apache.hadoop.ipc.Client2.run(Client.java:795)atjava.security.AccessController.doPrivileged(NativeMethod)atjavax.security.auth.Subject.doAs(Subject.java:422)atorg.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)atorg.apache.hadoop.ipc.ClientConnection.setupIOstreams(Client.java:795)
… 46 more
Caused by: java.lang.ClassNotFoundException: com.google.re2j.PatternSyntaxException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
… 58 more

问题解决:补充hadoop jar下的依赖包

re2j-1.0.jar

问题七:问题记录,暂时没有找到问题在哪,我再集群本地是可以上传文件不报错的,也不是内存问题,我怀疑是因为集群部署的kerberos导致认证时节点不互通

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/flume/biqas/FlumeData.1599096698303.tmp could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2095)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2671)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol2.callBlockingMethod(ClientNamenodeProtocolProtos.java)atorg.apache.hadoop.ipc.ProtobufRpcEngine2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine2.callBlockingMethod(ClientNamenodeProtocolProtos.java)atorg.apache.hadoop.ipc.ProtobufRpcEngineServerProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)atorg.apache.hadoop.ipc.RPCProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPCProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)atorg.apache.hadoop.ipc.RPCServer.call(RPC.java:991)
at org.apache.hadoop.ipc.ServerRpcCall.run(Server.java:869)atorg.apache.hadoop.ipc.ServerRpcCall.run(Server.java:869) at org.apache.hadoop.ipc.ServerRpcCall.run(Server.java:869)atorg.apache.hadoop.ipc.ServerRpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)

    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1499)
    at org.apache.hadoop.ipc.Client.call(Client.java:1445)
    at org.apache.hadoop.ipc.Client.call(Client.java:1355)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:497)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy13.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1083)
    at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1865)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1668)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)

2020-09-03 09:32:54,764 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:99)] Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:166)
at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:85)
at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:610)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:545)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
at java.lang.Thread.run(Thread.java:748)

Window ⇒ Flume ⇒ HDFS方案:

由于问题七没有得到解决被搁置。

Window ⇒ Flume ⇒ Flume ⇒ HDFS方案:

通过Flume端口传输成功将数据从Windows端传输到集群的HDFS中。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值