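I. Flume examples
1. Example 1: Avro
The avro-client can send a given file to Flume; the Avro source receives it over the Avro RPC mechanism.
1) Create the agent configuration file avro.conf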
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
2) Start Flume
[root@logsrv03 apache-flume-1.6.0-bin]# bin/flume-ng agent -c . -f conf/avro.conf -n a1 -Dflume.root.logger=INFO,console
3) Create the file to send (in the current directory, ./)
[root@logsrv03 apache-flume-1.6.0-bin]# echo "hello world" > ./log.0
4) Send the file with avro-client (-H also accepts a hostname; here it is logsrv03)
[root@logsrv03 apache-flume-1.6.0-bin]# bin/flume-ng avro-client -c . -H 172.17.6.148 -p 4141 -F ./log.0
5) The agent's console on logsrv03 then shows the following output; note the last line:
15/08/17 16:11:06 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
15/08/17 16:11:06 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:conf/avro.conf
15/08/17 16:11:06 INFO conf.FlumeConfiguration: Added sinks: k1 Agent: a1
15/08/17 16:11:06 INFO conf.FlumeConfiguration: Processing:k1
15/08/17 16:11:06 INFO conf.FlumeConfiguration: Processing:k1
15/08/17 16:11:06 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [a1]
15/08/17 16:11:06 INFO node.AbstractConfigurationProvider: Creating channels
15/08/17 16:11:06 INFO channel.DefaultChannelFactory: Creating instance of channel c1 type memory
15/08/17 16:11:06 INFO node.AbstractConfigurationProvider: Created channel c1
15/08/17 16:11:06 INFO source.DefaultSourceFactory: Creating instance of source r1, type avro
15/08/17 16:11:06 INFO sink.DefaultSinkFactory: Creating instance of sink: k1, type: logger
15/08/17 16:11:07 INFO node.AbstractConfigurationProvider: Channel c1 connected to [r1, k1]
15/08/17 16:11:07 INFO node.Application: Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:Avro source r1: { bindAddress: 0.0.0.0, port: 4141 } }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@245babce counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
15/08/17 16:11:07 INFO node.Application: Starting Channel c1
15/08/17 16:11:07 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
15/08/17 16:11:07 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: c1 started
15/08/17 16:11:07 INFO node.Application: Starting Sink k1
15/08/17 16:11:07 INFO node.Application: Starting Source r1
15/08/17 16:11:07 INFO source.AvroSource: Starting Avro source r1: { bindAddress: 0.0.0.0, port: 4141 }…
15/08/17 16:11:07 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean.
15/08/17 16:11:07 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: r1 started
15/08/17 16:11:07 INFO source.AvroSource: Avro source r1 started.
15/08/17 16:11:19 INFO ipc.NettyServer: [id: 0xe795f282, /172.17.6.148:56953 => /172.17.6.148:4141] OPEN
15/08/17 16:11:19 INFO ipc.NettyServer: [id: 0xe795f282, /172.17.6.148:56953 => /172.17.6.148:4141] BOUND: /172.17.6.148:4141
15/08/17 16:11:19 INFO ipc.NettyServer: [id: 0xe795f282, /172.17.6.148:56953 => /172.17.6.148:4141] CONNECTED: /172.17.6.148:56953
15/08/17 16:11:19 INFO ipc.NettyServer: [id: 0xe795f282, /172.17.6.148:56953 :> /172.17.6.148:4141] DISCONNECTED
15/08/17 16:11:19 INFO ipc.NettyServer: [id: 0xe795f282, /172.17.6.148:56953 :> /172.17.6.148:4141] UNBOUND
15/08/17 16:11:19 INFO ipc.NettyServer: [id: 0xe795f282, /172.17.6.148:56953 :> /172.17.6.148:4141] CLOSED
15/08/17 16:11:19 INFO ipc.NettyServer: Connection to /172.17.6.148:56953 disconnected.
15/08/17 16:11:22 INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64 hello world }
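The last log line prints the event body twice: as hex bytes and as text. To check which bytes actually reached the sink, the hex part can be decoded by hand. The following is a minimal sketch; `decode_body` is a hypothetical helper name, not part of Flume, and it assumes the hex bytes have been copied out of the log as a space-separated string.

```shell
#!/bin/sh
# decode_body HEX
# Hypothetical helper (not part of Flume): turns the space-separated hex
# bytes printed by the logger sink back into text.
decode_body() {
    for byte in $1; do
        # Convert each hex byte to a three-digit octal escape, because
        # POSIX printf only guarantees octal escapes in the format string.
        printf "\\$(printf '%03o' "0x$byte")"
    done
    printf '\n'
}

# Hex bytes copied from the LoggerSink line above.
decode_body "68 65 6C 6C 6F 20 77 6F 72 6C 64"
# prints: hello world
```

If the decoded text differs from the file that was sent, the agent most likely received a different file or an older event.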
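2. Example 2: Spool
The spooling directory source watches the configured directory for newly added files and reads the data out of them.
1) Create the configuration file spool.conf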
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir = /home/hadoop/flume-1.5.0-bin/logs
a1.sources.r1.fileHeader = true
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
2) Start Flume
[root@logsrv03 apache-flume-1.6.0-bin]# bin/flume-ng agent -c . -f conf/spool.conf -n a1 -Dflume.root.logger=INFO,console
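3) Add a file to the spool directory (given here as /usr/local/logs; the logs folder has to be created manually). The target must be the same directory that spoolDir names in spool.conf.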
[root@logsrv03 apache-flume-1.6.0-bin]# echo "spool test" > ./logs/spool_test.log
4) The console then shows the following output:
15/08/17 16:31:06 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
15/08/17 16:31:06 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:conf/spool.conf
15/08/17 16:31:06 INFO conf.FlumeConfiguration: Added sinks: k1 Agent: a1
15/08/17 16:31:06 INFO conf.FlumeConfiguration: Processing:k1
15/08/17 16:31:06 INFO conf.FlumeConfiguration: Processing:k1
15/08/17 16:31:06 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [a1]
15/08/17 16:31:06 INFO node.AbstractConfigurationProvider: Creating channels
15/08/17 16:31:06 INFO channel.DefaultChannelFactory: Creating instance of channel c1 type memory
15/08/17 16:31:07 INFO node.AbstractConfigurationProvider: Created channel c1
15/08/17 16:31:07 INFO source.DefaultSourceFactory: Creating instance of source r1, type spooldir
15/08/17 16:31:07 INFO sink.DefaultSinkFactory: Creating instance of sink: k1, type: logger
15/08/17 16:31:07 INFO node.AbstractConfigurationProvider: Channel c1 connected to [r1, k1]
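
Step 3 writes the test file directly inside the spool directory, which is fine for a one-line file. For larger files a safer habit is to write the file somewhere else and then move it in, since the spooling directory source expects a file to be complete and unchanged once it appears, and expects file names not to be reused. The following is a minimal sketch of that habit; `spool_drop` is a hypothetical helper name, and it assumes the parent of the spool directory is on the same filesystem, so that the final mv is a rename and not a copy.

```shell
#!/bin/sh
# spool_drop DIR NAME TEXT
# Hypothetical helper (not part of Flume): stages TEXT in a temporary file
# next to the spool directory, then moves it in under NAME.
spool_drop() {
    spool_dir=$1   # directory that spoolDir points to in spool.conf
    name=$2        # final file name; must not already exist in spool_dir
    text=$3        # content to write

    mkdir -p "$spool_dir" || return 1

    # Refuse to reuse a file name that is already present.
    if [ -e "$spool_dir/$name" ]; then
        echo "refusing to reuse file name: $spool_dir/$name" >&2
        return 1
    fi

    # Stage outside the spool directory so the source never sees a
    # half-written file.
    tmp=$(mktemp "$(dirname "$spool_dir")/.spool_stage.XXXXXX") || return 1
    printf '%s\n' "$text" > "$tmp"

    # Move the finished file into place.
    mv "$tmp" "$spool_dir/$name"
}

# Example call, mirroring step 3:
# spool_drop ./logs spool_test.log "spool test"
```

With default settings the source renames each file it has finished reading by appending .COMPLETED, so a quick way to confirm a file was consumed is to list the spool directory afterwards.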

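This article walks through four Apache Flume use cases: Avro, Spool, Exec, and SyslogTCP. Each agent is started from a configuration file. The Avro source receives files over Avro RPC; the Spool source watches a directory for new files; the Exec source runs a command and captures its output; the SyslogTCP source listens on a TCP port for incoming data. Each case shows the startup and execution steps in detail.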
