1. Scenario
Generate random data and write the generated data to HDFS.
2. Random data source
Code:
package com.wudl.flink.hdfs.source;
import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.api.java.tuple.Tuple4;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Random;
/**
* @author :wudl
* @date :Created in 2021-12-27 0:29
* @description:
* @modified By:
* @version: 1.0
*/
public class MySource implements SourceFunction<String> {
    // volatile: run() reads this flag while cancel() is invoked from another thread
    private volatile boolean isRunning = true;
    // Candidate region names; one is picked at random for each record
    String[] citys = {
        "北京","广东","山东","江苏","河南","上海","河北","浙江","香港","山西","陕西","湖南","重庆","福建","天津","云南","四川","广西","安徽","海南","江西","湖北","山西","辽宁","内蒙古"};
    int i = 0;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        Random random = new Random();
        SimpleDateFormat df = new SimpleDateFormat

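This article shows how to use Apache Flink to implement a random data generator, write the generated data to HDFS, and configure the HDFS sink with a file format and a rolling policy. MySource creates the data stream, and HdfsSink persists the data.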
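The generation step inside MySource can be sketched in plain Java, without the Flink dependency. The source listing is cut off before the record is assembled, so the record layout (id, city, amount, timestamp), the value range, and the date pattern below are assumptions, and the names RandomRecordDemo and nextRecord are hypothetical. Only a subset of the city list is used.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Random;

// Plain-Java sketch of the random record generation; not the author's full source.
class RandomRecordDemo {
    // Subset of the city list from MySource.
    static final String[] CITIES = {"北京", "广东", "山东", "江苏", "河南", "上海"};

    // Assumed record layout: id,city,amount,timestamp
    static String nextRecord(int id, Random random, SimpleDateFormat df) {
        String city = CITIES[random.nextInt(CITIES.length)];
        int amount = random.nextInt(1000); // assumed value range: 0..999
        return id + "," + city + "," + amount + "," + df.format(new Date());
    }

    public static void main(String[] args) {
        Random random = new Random();
        // Assumed timestamp pattern; the original pattern is not visible.
        SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        for (int i = 1; i <= 5; i++) {
            System.out.println(nextRecord(i, random, df));
        }
    }
}
```

In a Flink SourceFunction, a method like this would typically be called inside the run() loop while isRunning is true, with each record passed to ctx.collect(...).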
