- Armbrust M, Xin R S, Lian C, et al. Spark SQL: Relational Data Processing in Spark[C]// ACM SIGMOD International Conference on Management of Data. ACM, 2015:1383-1394.
- Meng X, Bradley J, Yavuz B, et al. MLlib: Machine Learning in Apache Spark[J]. Journal of Machine Learning Research, 2016, 17(1):1235-1241.
- Zaharia M, Chowdhury M, Das T, et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing[C]// USENIX Conference on Networked Systems Design and Implementation. USENIX Association, 2012:2-2.
- Zaharia M, Chowdhury M, Franklin M J, et al. Spark: Cluster Computing with Working Sets[C]// USENIX Conference on Hot Topics in Cloud Computing. USENIX Association, 2010:10-10.
Recommended Papers: Spark
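This post covers the core technologies of Apache Spark: relational data processing as implemented by Spark SQL, the machine learning algorithms provided by MLlib, Resilient Distributed Datasets (RDDs) as a fault-tolerant abstraction for in-memory cluster computing, and how Spark supports workloads that repeatedly reuse a working set of data.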




