问个关于sql query 运行速度的问题 - Database版

本页内容为未名空间相应帖子的节选和存档，一周内的贴子最多显示50字，超过一周显示500字访问原贴

Database版 - 问个关于sql query 运行速度的问题

相关主题
● 挖一大坑+奔	● 请教 sql server index问题
● teradata 的问题	● Business Intelligence vs DataWareHouse
● 有谁知道TeraData多少钱？	● 找工作求助
● SQL脚本？	● 【招聘】淘宝网 - DB Architect, Sr. DB Engineer, Sr. DBA - (转载)
● Is very-large database the same w/ datawarehouse?	● SQL Server DBA vs BI & DW
● 请教sql server DB 大侠	● JDBC
● beijing呀，教教我怎么optimize sql server吧。	● best practices for sql developer
● 一个求和的有日期限制的sql query 问题.	● 面试问题How would you improve table insert performance? (give five or more ideas)

相关话题的讨论汇总
话题: item话题: 123话题: 456话题: select话题: total

进入Database版参与讨论

(共1页)

d*******8
发帖数: 3182

oracle, 3个tables，每个超过1m条数据，table A 有 ID 。table B 有 ID 和B_ITEM,
table C 有 ID 和C_ITEM，问题就是那种query 运行速度快？
1，select * from A
inner join B on ...
inner join C on ...
where B_ITEM = 123 and C_ITEM =456
2, select * from A where ID in (select ID from B where B_ITEM=123 union all
select ID form C where C_ITEM=456 )
3, 还有更有的方案吗？
另外，REGEXP_LIKE() 与REGEXP_INSTR() 那个快？
谢谢。

B*****g
发帖数: 34098

without the detail off your table and data, can tell too much.
but I would guess 2.

ITEM,
all

【在 d*******8 的大作中提到】

: oracle, 3个tables，每个超过1m条数据，table A 有 ID 。table B 有 ID 和B_ITEM,
: table C 有 ID 和C_ITEM，问题就是那种query 运行速度快？
: 1，select * from A
: inner join B on ...
: inner join C on ...
: where B_ITEM = 123 and C_ITEM =456
: 2, select * from A where ID in (select ID from B where B_ITEM=123 union all
: select ID form C where C_ITEM=456 )
: 3, 还有更有的方案吗？
: 另外，REGEXP_LIKE() 与REGEXP_INSTR() 那个快？

c*****d
发帖数: 6045

in general, I would guess 1

【在 B*****g 的大作中提到】

: without the detail off your table and data, can tell too much.
: but I would guess 2.
:
: ITEM,
: all

B*****g
发帖数: 34098

原因？呵呵

【在 c*****d 的大作中提到】

: in general, I would guess 1

w*******e
发帖数: 1622

second that.

【在 c*****d 的大作中提到】

: in general, I would guess 1

d*******8
发帖数: 3182

tables 只有 key，没有index，我没权限搞这些。也不想麻烦那帮大爷，反正一个
qurey即使运行一天我也刷网等结果。只不过等的过程有点烦。
id 是number(9)，B_ITEM和C_ITEM都是varchar(4000),that is why i also asked REGEXP_
LIKE() 与 REGEXP_INSTR() 那个快？

B*****g
发帖数: 34098

晕，1M的要做query的table没有index，赞一个。
另一个问题不会。

REGEXP_

【在 d*******8 的大作中提到】

: tables 只有 key，没有index，我没权限搞这些。也不想麻烦那帮大爷，反正一个
: qurey即使运行一天我也刷网等结果。只不过等的过程有点烦。
: id 是number(9)，B_ITEM和C_ITEM都是varchar(4000),that is why i also asked REGEXP_
: LIKE() 与 REGEXP_INSTR() 那个快？

B*****g
发帖数: 34098

另外，难道他们把key 的index强行拿掉了？

【在 B*****g 的大作中提到】

: 晕，1M的要做query的table没有index，赞一个。
: 另一个问题不会。
:
: REGEXP_

d*******8
发帖数: 3182

no，i mean only id is the key

j*****n
发帖数: 1781

I bet 1 is slightly better than 2.
In common sense, usually ID key has default clustered index and is arranged
by order.
1 does narrow down the searching range then do the table scan for character
match.
2 is doing two table scans and then starting join with table A.
both REGEXP_LIKE() and REGEXP_INSTR() are doing heavy load pattern matching,
I would say they are similar performance.
try Like and Instr if possible, they are simple matching so that are lighter
load.

相关主题
● 请教sql server DB 大侠	● 请教 sql server index问题
● beijing呀，教教我怎么optimize sql server吧。	● Business Intelligence vs DataWareHouse
● 一个求和的有日期限制的sql query 问题.	● 找工作求助
进入Database版参与讨论

j*****n
发帖数: 1781

btw, in M$SQL, inner join is better performance than IN.

c*****t
发帖数: 1879

The way database resolves query starts from actual data,
and then apply filters, does joins etc. Not the other way
around. This is how fundamentally databases work.
In this case, 1st query is resolved by applying predicates
on table B and table C (i.e. B.ITEM_B = 123). From here,
generates the necessary B.ID, and C.ID which can then be
compared with A.ID. If B.ITEM_B = 123 generates NULL set,
there won't even be a join operation.
Also, prior join / set operations, usually there are internal

B*****g
发帖数: 34098

我不信sql server不是case by case

【在 j*****n 的大作中提到】

: btw, in M$SQL, inner join is better performance than IN.

B*****g
发帖数: 34098

顶一个。周末咱试试，呵呵

【在 c*****t 的大作中提到】

: The way database resolves query starts from actual data,
: and then apply filters, does joins etc. Not the other way
: around. This is how fundamentally databases work.
: In this case, 1st query is resolved by applying predicates
: on table B and table C (i.e. B.ITEM_B = 123). From here,
: generates the necessary B.ID, and C.ID which can then be
: compared with A.ID. If B.ITEM_B = 123 generates NULL set,
: there won't even be a join operation.
: Also, prior join / set operations, usually there are internal

c*******e
发帖数: 8624

1M rows is nothing here, I deal with tables like 10/M a day X 2 years, hehe

【在 B*****g 的大作中提到】

: 晕，1M的要做query的table没有index，赞一个。
: 另一个问题不会。
:
: REGEXP_

B*****g
发帖数: 34098

也没有index？呵呵。

hehe

【在 c*******e 的大作中提到】

: 1M rows is nothing here, I deal with tables like 10/M a day X 2 years, hehe

d*******8
发帖数: 3182

OK, I already wrote a script and will test them tonight and this weekend (2
times):
SQL1 like:
select * from A where ID in (select ID from B where B_ITEM=123 union
all select ID form C where C_ITEM=456 )
SQL2 like:
select * from A
inner join B on ...
inner join C on ...
where B_ITEM = 123 and C_ITEM =456
SQL3 like:
select a.ID
from A a, B b, C c
where (b.ITEM_B = 123 AND b.ID = A.ID) OR
(c.ITEM_C = 456 AND c.ID = A.ID);
是同时用3个连接测还是1 by 1测精确？
any suggestion?

B*****g
发帖数: 34098

I doubt you can trace the sql.
http://youngcow.net/doc/oracle9i/server.920/a96533/sqltrace.htm#8760

2

【在 d*******8 的大作中提到】

: OK, I already wrote a script and will test them tonight and this weekend (2
: times):
: SQL1 like:
: select * from A where ID in (select ID from B where B_ITEM=123 union
: all select ID form C where C_ITEM=456 )
: SQL2 like:
: select * from A
: inner join B on ...
: inner join C on ...
: where B_ITEM = 123 and C_ITEM =456

k********e
发帖数: 702

DBA们，SQL1和SQL2根本求的不是一个结果吧
apple vs. orange 比什么比？

c*****d
发帖数: 6045

为啥不是一个结果？

【在 k********e 的大作中提到】

: DBA们，SQL1和SQL2根本求的不是一个结果吧
: apple vs. orange 比什么比？

相关主题
● 【招聘】淘宝网 - DB Architect, Sr. DB Engineer, Sr. DBA - (转载)	● best practices for sql developer
● SQL Server DBA vs BI & DW	● 面试问题How would you improve table insert performance? (give five or more ideas)
● JDBC	● Enterprise data warehouse team is looking for developers an
进入Database版参与讨论

c*****d
发帖数: 6045

我也见过10M条记录没索引，令人叫绝的是，该表有800多个字段
当时佩服的是五体投地

1M rows is nothing here, I deal with tables like 10/M a day X 2 years, hehe

【在 c*******e 的大作中提到】

: 1M rows is nothing here, I deal with tables like 10/M a day X 2 years, hehe

c*****d
发帖数: 6045

one by one is OK

2

【在 d*******8 的大作中提到】

B*****g
发帖数: 34098

Only below one got what "I" want. haha
select * from A where ID in (select ID from B where B_ITEM=123 union
all select ID form C where C_ITEM=456 )
A.
ID
1
2
3
4
B.
ID B_ITEM
1 123
2 123
3 456
C.
ID C_ITEM
1 456
3 456
4 123

【在 c*****d 的大作中提到】

: 为啥不是一个结果？

c*******e
发帖数: 8624

我这些table都是有stats的,太大了,10M * 365 * 2 = 7.3 billion了

hehe

【在 c*****d 的大作中提到】

: 我也见过10M条记录没索引，令人叫绝的是，该表有800多个字段
: 当时佩服的是五体投地
:
: 1M rows is nothing here, I deal with tables like 10/M a day X 2 years, hehe

B*****g
发帖数: 34098

干嘛要搞那么大？

【在 c*******e 的大作中提到】

: 我这些table都是有stats的,太大了,10M * 365 * 2 = 7.3 billion了
:
: hehe

c*******e
发帖数: 8624

都是交易记录啊

【在 B*****g 的大作中提到】

: 干嘛要搞那么大？

B*****g
发帖数: 34098

还要人吗？哈哈。
不过话说回来了老交易干嘛要都放在一个database里

【在 c*******e 的大作中提到】

: 都是交易记录啊

c*******e
发帖数: 8624

时不时要查一下查的,呵呵.

【在 B*****g 的大作中提到】

: 还要人吗？哈哈。
: 不过话说回来了老交易干嘛要都放在一个database里

j*****n
发帖数: 1781

history data should not be in OLTP, usually put in another Server, populated
by night batch feed. OLTP only keep few months data. Otherwise I doubt what
kind of server can support such heavy load for both OLTP and OLAP.
specially for 2m/day.
and for such large amount of data, partitioning usually should be considered.

【在 B*****g 的大作中提到】

: 还要人吗？哈哈。
: 不过话说回来了老交易干嘛要都放在一个database里

d*******8
发帖数: 3182

04/11/2008
SQL1 Total 2165.72694492
SQL2 Total 2200.25103188
SQL3 Total 2922.04847908
04/12/2008
SQL1 Total 2316.902812
SQL2 Total 2361.45646501
SQL3 Total 3111.09219217
04/13/2008
SQL1 Total 2231.486696
SQL2 Total 2277.06693292
SQL3 Total 2951.74333119

相关主题
● Database vs Data Warehouse	● teradata 的问题
● 大家绝不觉得干DB这行挺难进步的？	● 有谁知道TeraData多少钱？
● 挖一大坑+奔	● SQL脚本？
进入Database版参与讨论

c*****t
发帖数: 1879

Interesting, the OR operator is slow, close to 1/3 slower. I
guess this server could have performed implicit inner joins of all
3 and then perform evaluation. In this case, OR didn't get optimized
into UNION, so manual UNION is needed.

【在 d*******8 的大作中提到】

: 04/11/2008
: SQL1 Total 2165.72694492
: SQL2 Total 2200.25103188
: SQL3 Total 2922.04847908
: 04/12/2008
: SQL1 Total 2316.902812
: SQL2 Total 2361.45646501
: SQL3 Total 3111.09219217
: 04/13/2008
: SQL1 Total 2231.486696

B*****g
发帖数: 34098

do you get same result for sql1 ,sql2?

【在 d*******8 的大作中提到】

c**t
发帖数: 2744

Don't be fooled. It depends on the settings. Database might be tuned by
changing some default parameters to better satify applications.

【在 c*****t 的大作中提到】

: Interesting, the OR operator is slow, close to 1/3 slower. I
: guess this server could have performed implicit inner joins of all
: 3 and then perform evaluation. In this case, OR didn't get optimized
: into UNION, so manual UNION is needed.

d*******8
发帖数: 3182

04/14/2008
SQL1 Total 2386.87175417
SQL2 Total 2836.06475782
SQL3 Total 4909.92693806
Results1 is as same as Results2.
Results3 is as same as Results2.
Results1 is as same as Results3.

B*****g
发帖数: 34098

so the data will not have this situation:
A.
ID
1
2
3
4
B.
ID B_ITEM
1 123
2 123
3 456
C.
ID C_ITEM
1 456
3 456
4 123

【在 d*******8 的大作中提到】

: 04/14/2008
: SQL1 Total 2386.87175417
: SQL2 Total 2836.06475782
: SQL3 Total 4909.92693806
: Results1 is as same as Results2.
: Results3 is as same as Results2.
: Results1 is as same as Results3.

c*****d
发帖数: 6045

有点意思，能不能把exec plan贴出来看看

【在 d*******8 的大作中提到】

d*******8
发帖数: 3182

To Beijing:
这里的SQL1，SQL2，SQL3 只是简化的用以表述的例子，真正的SQL后面还有〉100Char
的条件的呢。所以归根结底的结果应该是相同的。与你用以举例的3个简单table不同。
To coolbid:
不懂什么是exec plan，就简单写下我的script吧：
ls = [sql1, sql2, sql3]
t = []
r = []
for l in ls:
....start = time.time()
....r.append(getResultSetFromDB(l))
....end = time.time()
....t.append(end-start)

B*****g
发帖数: 34098

ok.
How do you test sql 1-3? always in order sql1, sql 2, sql3? if yes, can you
try sql3, sql2 ,sq1.

100Char

【在 d*******8 的大作中提到】

: To Beijing:
: 这里的SQL1，SQL2，SQL3 只是简化的用以表述的例子，真正的SQL后面还有〉100Char
: 的条件的呢。所以归根结底的结果应该是相同的。与你用以举例的3个简单table不同。
: To coolbid:
: 不懂什么是exec plan，就简单写下我的script吧：
: ls = [sql1, sql2, sql3]
: t = []
: r = []
: for l in ls:
: ....start = time.time()

d*******8
发帖数: 3182

yes, i tried 3, 2, 1 on 04/14/2008

you

【在 B*****g 的大作中提到】

: ok.
: How do you test sql 1-3? always in order sql1, sql 2, sql3? if yes, can you
: try sql3, sql2 ,sq1.
:
: 100Char

c*****d
发帖数: 6045

beijing，顺序不会对这个结果会有影响的吧？

you

【在 B*****g 的大作中提到】

: ok.
: How do you test sql 1-3? always in order sql1, sql 2, sql3? if yes, can you
: try sql3, sql2 ,sq1.
:
: 100Char

相关主题
● SQL脚本？	● beijing呀，教教我怎么optimize sql server吧。
● Is very-large database the same w/ datawarehouse?	● 一个求和的有日期限制的sql query 问题.
● 请教sql server DB 大侠	● 请教 sql server index问题
进入Database版参与讨论

w*******e
发帖数: 1622

恩, 根据结果看, 次序没影响
如果有影响, 时间应该越来越短的

【在 c*****d 的大作中提到】

: beijing，顺序不会对这个结果会有影响的吧？
:
: you

B*****g
发帖数: 34098

要是DBA在run SQL3（例如每天2am），进行大规模的bk，analyssis。。。。

【在 c*****d 的大作中提到】

: beijing，顺序不会对这个结果会有影响的吧？
:
: you

d*******8
发帖数: 3182

嗯，我也考虑到这方面的因素，所以在最后一次比较结果的时候顺便将次序换了一下。

【在 B*****g 的大作中提到】

: 要是DBA在run SQL3（例如每天2am），进行大规模的bk，analyssis。。。。

w*r
发帖数: 2421

不乖，上次问你的问题你怎么没有反应地！
你那个数据库那么巨大啊?每天10M一年就是3B,两年的事6B...，你什么时候跑到ATT/
SBC工作的？

hehe

【在 c*******e 的大作中提到】

: 1M rows is nothing here, I deal with tables like 10/M a day X 2 years, hehe

c*******e
发帖数: 8624

没注意看啊,去找找

【在 w*r 的大作中提到】

: 不乖，上次问你的问题你怎么没有反应地！
: 你那个数据库那么巨大啊?每天10M一年就是3B,两年的事6B...，你什么时候跑到ATT/
: SBC工作的？
:
: hehe

z***y
发帖数: 7151

union all 会导致merge scan, 不必要的使用tempdb tablespace.
1 might be better.

ITEM,
all

【在 d*******8 的大作中提到】

c*****d
发帖数: 6045

你说反了吧
union all不会导致merge scan

【在 z***y 的大作中提到】

: union all 会导致merge scan, 不必要的使用tempdb tablespace.
: 1 might be better.
:
: ITEM,
: all

B*****g
发帖数: 34098

ATT用啥数据库

ATT/

【在 c*******e 的大作中提到】

: 没注意看啊,去找找

c*******e
发帖数: 8624

not sure about att, we use teradata + oracle

【在 B*****g 的大作中提到】

: ATT用啥数据库
:
: ATT/

w*r
发帖数: 2421

ATT is so far the largest Teradata implementation across the world. Walmart
used to be the No.1, now
walmart is moving toward neoview.

【在 B*****g 的大作中提到】

: ATT用啥数据库
:
: ATT/

相关主题
● Business Intelligence vs DataWareHouse	● SQL Server DBA vs BI & DW
● 找工作求助	● JDBC
● 【招聘】淘宝网 - DB Architect, Sr. DB Engineer, Sr. DBA - (转载)	● best practices for sql developer
进入Database版参与讨论

w*r
发帖数: 2421

yi?
你不在ATT...哪个公司有一天10M以上的transaction啊？walmart...

【在 c*******e 的大作中提到】

: not sure about att, we use teradata + oracle

B*****g
发帖数: 34098

NASA

【在 w*r 的大作中提到】

: yi?
: 你不在ATT...哪个公司有一天10M以上的transaction啊？walmart...

B*****g
发帖数: 34098

你那有oracle不推荐我？

【在 c*******e 的大作中提到】

: not sure about att, we use teradata + oracle

c*******e
发帖数: 8624

我们data warehouse是teradata,real time的是用oracle,关键是得有EAD
H1B连转的都不要了

【在 B*****g 的大作中提到】

: 你那有oracle不推荐我？

(共1页)

进入Database版参与讨论

相关主题
● 面试问题How would you improve table insert performance? (give five or more ideas)	● Is very-large database the same w/ datawarehouse?
● Enterprise data warehouse team is looking for developers an	● 请教sql server DB 大侠
● Database vs Data Warehouse	● beijing呀，教教我怎么optimize sql server吧。
● 大家绝不觉得干DB这行挺难进步的？	● 一个求和的有日期限制的sql query 问题.
● 挖一大坑+奔	● 请教 sql server index问题
● teradata 的问题	● Business Intelligence vs DataWareHouse
● 有谁知道TeraData多少钱？	● 找工作求助
● SQL脚本？	● 【招聘】淘宝网 - DB Architect, Sr. DB Engineer, Sr. DBA - (转载)

相关话题的讨论汇总
话题: item话题: 123话题: 456话题: select话题: total

#	版面	帖数(主题数)
-	全站	4871 (796)
1	Military	3777 (569)
2	Stock	341 (51)
3	Joke	117 (17)
4	History	116 (3)
5	Automobile	100 (9)
6	USANews	55 (9)
7	Midlife	45 (1)
8	Headline	41 (41)
9	Dreamer	33 (13)
10	FleaMarket	32 (20)
11	Living	30 (7)

boards

未名新帖统计// 7月16日

历史上的今天