l******9 发帖数: 579 | 1 【 以下文字转载自 Database 讨论区 】
发信人: light009 (light009), 信区: Database
标 题: SQL find distinct values in large table
发信站: BBS 未名空间站 (Wed May 21 17:20:16 2014, 美东)
I have a very large table with 1 billion rows and 12 columns, which are int,
double, varchar. I need to know the distinct values for each volume.
the table has columns :
id id_1 id_2 id_3 id_4 id_5 id_6 id_7 id_8 id_9 id_10 id_11
SELECT COUNT(distinct a.id) as num_dist_id
FROM my_large_table as a
SELECT distinct a.id
FROM my_large_table as a
It is very slow.
SELECT distinct a.id , a.id_1 , a.id_2, a.id_3, a.id_4, a.id_5, a.id_6 ...
FROM my_large_table as a
ORDER BY a.id, a.id_1, a.id_2
i need to know how many distinct values that are associated with the same
column vlue.
For example, for id = 1, how many distinct values of id_1 associated with it
.
Can I speed up the query ?
Thanks ! | i**z 发帖数: 194 | 2 你如果用SAS 的话,用data step,可能会比 proc sql 快。
int,
【在 l******9 的大作中提到】 : 【 以下文字转载自 Database 讨论区 】 : 发信人: light009 (light009), 信区: Database : 标 题: SQL find distinct values in large table : 发信站: BBS 未名空间站 (Wed May 21 17:20:16 2014, 美东) : I have a very large table with 1 billion rows and 12 columns, which are int, : double, varchar. I need to know the distinct values for each volume. : the table has columns : : id id_1 id_2 id_3 id_4 id_5 id_6 id_7 id_8 id_9 id_10 id_11 : : SELECT COUNT(distinct a.id) as num_dist_id
| c******y 发帖数: 3269 | 3 try "group" instead of "distinct" |
|