由买买提看人间百态

boards

本页内容为未名空间相应帖子的节选和存档,一周内的贴子最多显示50字,超过一周显示500字 访问原贴
Database版 - SQL, recruiter发过来的面试题 (转载)
相关主题
SQL Server - convert datetime to a string YY-MM-DD HH一个关于T-SQL的问题
Oracle新手请教一个问题再请教大牛一个问题
date format转换问题请教请帮忙在PL/SQL里面执行下面的计算,一个笔记软件破解用的
有大侠知道怎么format下面这个query的时间么SSRS report failing to display dataset string
MySQL table either insert or drop/truncate table running forever转换成时间的格式 MS SQL
oracle help?SQL问题(有包子)
Oracle 求助Help with database design
Informix中的trigger问题问个查询问题。
相关话题的讨论汇总
话题: null话题: load话题: varchar话题: positions话题: table
进入Database版参与讨论
1 (共1页)
p****1
发帖数: 275
1
【 以下文字转载自 JobHunting 讨论区 】
发信人: peace1 (peace1), 信区: JobHunting
标 题: SQL, recruiter发过来的面试题
发信站: BBS 未名空间站 (Thu Sep 8 22:14:20 2016, 美东)
有兴趣做不? 我琢磨着, 出题的人是在寻找最优解 :)
We work with large datasets, and are always performance conscious since
extended processing times will impact our time to market. Keep this in mind
as you answer the following questions:
There is a table defined as:
CREATE TABLE [Positions](
[load_id] [int] NOT NULL,
[acct_cd] [varchar](20) NOT NULL,
[acct_num] [varchar](255) NULL,
[sec_id] [varchar](50) NOT NULL,
[long_sht_cd] [varchar](3) NOT NULL,
[sedol] [varchar](15) NULL,
[isin] [varchar](15) NULL,
[cusip] [varchar](9) NULL,
[sec_type] [varchar](8) NULL,
[sec_name] [varchar](100) NULL,
[currency_cd] [varchar](3) NULL,
[total_holding] [decimal](18, 4) NULL,
[mkt_price] [float] NULL,
[datetime_stamp] [datetime] NULL,
CONSTRAINT [pk_Positions] PRIMARY KEY CLUSTERED (
[load_id] ASC,
[acct_cd] ASC,
[sec_id] ASC,
[long_sht_cd] ASC )
)
This table holds account positions data that are appended to multiple times
a day
There are currently some 24 million rows in the table. Every time we append
additional positions we add approximately 32,000 entries to this table, and
all 32,000 entries will have the same load_id. The load_id is incremented by
one each time we load a batch of 32,000 entries (i.e. the first 32K entries
have load_id=1, the next 32K has load_id=2, etc...). The datetime_stamp
field shows the time at which the entries were loaded and is the same for
all 32K entries in a single load.
How would you efficiently retrieve the first set of positions for the
current day given the above table definition?
Example:
Today, positions were loaded into this table at 8am, 10am and 3pm. At 5pm
today we want to know what positions were loaded at 8am since that is the
first load that occurred today. Note that for any given day, there can be
different number of loads and the times that the loads occur will vary.
n****f
发帖数: 905
2
很有意思的考題。
就針對單一表來說。 我第一步, 要建立為 TIMESTAMP 索引。
第二步, 然後找出今天 TIMESTAMP 的第一個 LOAD_ID。
SELECT * FROM Positions WHERE LOAD_ID = (HERE YOUR SQL)
但是, 在現實當中, 這類事情不應該是這樣設計的。
DEVELOPER 要是這樣幹, 肯定是不行的。
應該要有一個 LOAD 的 TABLE。裡面有
LOAD—ID (PK)
TIMESTAMP (DATETIME)
這樣, 用戶要是要得到今天的 LOADID , 簡單查這個表就 OK 了。
有什麼原因要在 POSITION 表裡面加一個 重複成千上萬次的TIMESTAMP COLUMN? 查詢
的代價太大。
這個考題, 對於初學是能回答出結果, 但是很難提出改進意見。
因此說, 這不是考你 SQL , 是考你實戰經驗。
正確回答是, 給出 SQL , 但是指出這樣設計的缺欠, 並提供修正意見
p****1
发帖数: 275
3
Thanks very mu h for your reply.
Would appreciate if you could let me know how to do below?
就針對單一表來說。 我第一步, 要建立為 TIMESTAMP 索引。
第二步, 然後找出今天 TIMESTAMP 的第一個 LOAD_ID。

【在 n****f 的大作中提到】
: 很有意思的考題。
: 就針對單一表來說。 我第一步, 要建立為 TIMESTAMP 索引。
: 第二步, 然後找出今天 TIMESTAMP 的第一個 LOAD_ID。
: SELECT * FROM Positions WHERE LOAD_ID = (HERE YOUR SQL)
: 但是, 在現實當中, 這類事情不應該是這樣設計的。
: DEVELOPER 要是這樣幹, 肯定是不行的。
: 應該要有一個 LOAD 的 TABLE。裡面有
: LOAD—ID (PK)
: TIMESTAMP (DATETIME)
: 這樣, 用戶要是要得到今天的 LOADID , 簡單查這個表就 OK 了。

n****f
发帖数: 905
4
In Oracle, it will be:
1. Create Index:
CREATE INDEX INDX_TIMESTAMP ON POSITIONS
(TO_CHAR (DATETIME_STAMP,'YYYY_MM_DD'));
2. Query Data:
SELECT *
FROM POSITIONS
WHERE ID =
(SELECT MIN (LOAD_ID)
FROM POSITIONS
WHERE TO_CHAR (DATETIME_STAMP, 'YYYY_MM_DD') =
TO_CHAR (SYSDATE, 'YYYY_MM_DD'))
p****1
发帖数: 275
5
Thank you very much!

【在 n****f 的大作中提到】
: In Oracle, it will be:
: 1. Create Index:
: CREATE INDEX INDX_TIMESTAMP ON POSITIONS
: (TO_CHAR (DATETIME_STAMP,'YYYY_MM_DD'));
: 2. Query Data:
: SELECT *
: FROM POSITIONS
: WHERE ID =
: (SELECT MIN (LOAD_ID)
: FROM POSITIONS

s**********o
发帖数: 14359
6
如果是我,我就把出题的训一顿。这么多数据,成年累月的往一个TABLE里堆,
一看就是不懂行的,这种TABLE要么有HISTORY/ARCHIVE TABLE或者是
TABLE PARTITION,否则你REBUILD INDEX会成NIGHTMARE。你真的
每天都需要10年前的POSITION吗?
g*****y
发帖数: 36
7
Further optimization
1. create composite index on timestamp and load_id.
create index idx_pos on (DATETIME_STAMP,load_id);
2. Remove functional filter against indexed column.
select * from positions where load_id =
(select min(load_id) from positions where datetime_stamp >=trunc(sysdate));

【在 n****f 的大作中提到】
: In Oracle, it will be:
: 1. Create Index:
: CREATE INDEX INDX_TIMESTAMP ON POSITIONS
: (TO_CHAR (DATETIME_STAMP,'YYYY_MM_DD'));
: 2. Query Data:
: SELECT *
: FROM POSITIONS
: WHERE ID =
: (SELECT MIN (LOAD_ID)
: FROM POSITIONS

w*r
发帖数: 2421
8
select * from
(select
rank() over(partition by cast(datetime_stamp as date),load_id ) as by_day_
seq
a.*
from table a
where cast(datetime_stamp as date) = current_date ) a
where a.by_day_seq =1
c*****l
发帖数: 312
9
Oracle有clustered PK吗?

【在 n****f 的大作中提到】
: In Oracle, it will be:
: 1. Create Index:
: CREATE INDEX INDX_TIMESTAMP ON POSITIONS
: (TO_CHAR (DATETIME_STAMP,'YYYY_MM_DD'));
: 2. Query Data:
: SELECT *
: FROM POSITIONS
: WHERE ID =
: (SELECT MIN (LOAD_ID)
: FROM POSITIONS

p****1
发帖数: 275
10
will it be faster if we add clustered index to datetime_stamp first?

【在 w*r 的大作中提到】
: select * from
: (select
: rank() over(partition by cast(datetime_stamp as date),load_id ) as by_day_
: seq
: a.*
: from table a
: where cast(datetime_stamp as date) = current_date ) a
: where a.by_day_seq =1

n****f
发帖数: 905
11
You should understand what is the "Clustered" mean.
then you will have the answer.
http://sql-plsql.blogspot.com/2010/06/non-clustered-indexes.html

【在 p****1 的大作中提到】
: will it be faster if we add clustered index to datetime_stamp first?
p****1
发帖数: 275
12
非常感谢。
One table can only have one clustered index, and the table has Primary Key
Clustered already, so, I can only add non clustered index to datetime_stamp.
:)

【在 n****f 的大作中提到】
: You should understand what is the "Clustered" mean.
: then you will have the answer.
: http://sql-plsql.blogspot.com/2010/06/non-clustered-indexes.html

n****f
发帖数: 905
13
Please check DOC from Oracle:
http://docs.oracle.com/database/121/TGSQL/tgsql_indc.htm#TGSQL850
And you will find out when you should use the clustered index.

stamp.

【在 p****1 的大作中提到】
: 非常感谢。
: One table can only have one clustered index, and the table has Primary Key
: Clustered already, so, I can only add non clustered index to datetime_stamp.
: :)

1 (共1页)
进入Database版参与讨论
相关主题
问个查询问题。MySQL table either insert or drop/truncate table running forever
一道题 PL/SQLoracle help?
convert datetime to ''Oracle 求助
question on large tables (>=800 million records, 10 G bytes of data)Informix中的trigger问题
SQL Server - convert datetime to a string YY-MM-DD HH一个关于T-SQL的问题
Oracle新手请教一个问题再请教大牛一个问题
date format转换问题请教请帮忙在PL/SQL里面执行下面的计算,一个笔记软件破解用的
有大侠知道怎么format下面这个query的时间么SSRS report failing to display dataset string
相关话题的讨论汇总
话题: null话题: load话题: varchar话题: positions话题: table