hive sql中的行转列和列转行

HQL中的行转列和列转行-CSDN博客

一、测试数据准备

1、创建表单

1
2
3
4
5
6
7
create table if not exists stu_score(
score_id int,
stu_name string,
stu_subject string,
score int
)
row format delimited fields terminated by ','; -- 指定字段之间的分隔符为逗号

2、装载数据

文件上传到虚拟机: 主机虚拟机互传文件:SCP命令和MobaXterm工具

linux目录:root/data/data.txt

1
2
3
4
5
6
7
8
9
10
11
12
1,张三,语文,80
2,张三,数学,90
3,张三,英语,70
4,张三,生物,85
5,李四,语文,80
6,李四,数学,92
7,李四,英语,76
8,李四,生物,88
9,王五,语文,60
10,王五,数学,82
11,王五,英语,96
12,王五,生物,78
1
2
load data local inpath '/root/data/data.txt'
overwrite into table stu_score;

3、检验数据是否装载好

1
select * from stu_score;

二、行转列

行转列预期结果

1、使用 case when… then…

1
2
3
4
5
6
7
8
SELECT
stu_name,
MAX(CASE WHEN stu_subject = '语文' THEN score END) AS chinese,
MAX(CASE WHEN stu_subject = '数学' THEN score END) AS math,
MAX(CASE WHEN stu_subject = '英语' THEN score END) AS english,
MAX(CASE WHEN stu_subject = '生物' THEN score END) AS biology
FROM stu_score
GROUP BY stu_name;

2、使用 if()

1
2
3
4
5
6
7
8
SELECT 
stu_name,
SUM(IF(stu_subject='语文', score,0)) as chinese,
SUM(IF(stu_subject='数学', score,0)) as math,
SUM(IF(stu_subject='英语' ,score,0)) as english,
SUM(IF(stu_subject='生物', score,0)) as biology
FROM stu_score
GROUP BY stu_name;

3、函数

  1. CONCAT(str1, str2)

    CONCAT 函数用于将两个或多个字符串连接在一起。

  2. CONCAT_WS(separator, str1, str2,…):

    CONCAT_WS 函数用于将字符串连接在一起,并使用指定的分隔符将它们分开。

  3. COLLECT_SET(col):

    COLLECT_SET 函数通常用于处理复杂数据类型,用于收集每个分组内的唯一值,并返回一个集合

三、列转行:

数据准备:

以上面的结果创建临时表

1
2
3
4
5
6
7
8
9
CREATE TEMPORARY TABLE IF NOT EXISTS temp_stu_summary AS
SELECT
stu_name,
SUM(IF(stu_subject='语文', score, 0)) AS chinese,
SUM(IF(stu_subject='数学', score, 0)) AS math,
SUM(IF(stu_subject='英语', score, 0)) AS english,
SUM(IF(stu_subject='生物', score, 0)) AS biology
FROM stu_score
GROUP BY stu_name;

临时表temp_stu_summary

1
select * from temp_stu_summary 

列转行预期结果:

1、使用 union all

1
2
3
4
5
6
7
SELECT stu_name, '语文' AS subject, chinese AS score FROM temp_stu_summary
UNION ALL
SELECT stu_name, '数学' AS subject, math AS score FROM temp_stu_summary
UNION ALL
SELECT stu_name, '英语' AS subject, english AS score FROM temp_stu_summary
UNION ALL
SELECT stu_name, '生物' AS subject, biology AS score FROM temp_stu_summary;

2、函数 EXPLODE

1
2
3
4
5
6
SELECT
stu_name,
A.subject,
A.score
FROM temp_stu_summary
LATERAL VIEW EXPLODE(map("语文", chinese,"数学",math,"英语", english,"生物", biology)) A AS subject, score;

EXPLODE(col):将hive一列中复杂的array或者map结构拆分成多行

LATERAL VIEW 是 Apache Hive 中用于处理复杂数据类型的查询语法。在 Hive 中,复杂数据类型通常包括数组(Array)、映(Map)和结构(Struct)。

1
lateral view + explode(array|map)` 或 `lateral view + inline(array_struct)

hive sql中的行转列和列转行
https://leaf-domain.gitee.io/2024/01/21/hive-sqlHL/
作者
叶域
发布于
2024年1月21日
许可协议