有这么一个数据集:
| session_id | row_number | sku_id |
| aa | 2 | a |
| aa | 3 | c |
| aa | 1 | b |
| bb | 2 | d |
| bb | 3 | c |
| bb | 1 | a |
| bb | 4 | a |
根据session_id分组,将sku_id按照row_number的顺序,多行拼接成一行。
代码:
select
session_id,
collect_list(sku_id),
collect_set(sku_id)
from
(
select * from temp.tmp_as_test order by row_number desc
)
b
group by
session_id
代码比较简单,主要是比较巧妙。输出结果为:
| aa | ["c","a","b"] | ["c","a","b"] |
| bb | ["a","c","d","a"] | ["a","c","d"] |
达到了预期的要求。
本文介绍了一种使用SQL的collect_list和collect_set函数,根据session_id分组并将sku_id按row_number顺序多行拼接成一行的方法。通过示例数据集和简洁的代码实现了预期的数据处理目标。

5430

被折叠的 条评论
为什么被折叠?



