z**k 发帖数: 378 | 1 【 以下文字转载自 Database 讨论区 】
发信人: zerk (阿呆), 信区: Database
标 题: 请问:query about checking consistency
发信站: BBS 未名空间站 (Fri Jul 16 13:10:46 2010, 美东)
sorry i cannot type chinese at work.
I have a collection of records, ordered by date, the record has two other
attributes, say attr and prev, here's a simple example:
date, attr, prev | z**k 发帖数: 378 | 2 btw, i don't wanna use the any sas-only sql function | D******n 发帖数: 2836 | 3 at least u should give a sample desired output, dude.
are u to just check 1 obs ahead? why did u mention date? i don't see
anything to do with the problem unless the date is not sorted. | z**k 发帖数: 378 | 4 well, let me make it concrete. (and u r right, the date attr is redundent, it's just part of the dataset)
I took record of stock price every trading day, for each tuple in the db, I
have 3 attributes:
date, previous_day_close_price, close_price
and I want to check if the system is consistent, if I see something like:
... ...
2010-07-14, 5.17, 5.21
2010-07-15, 5.21, *5.14
2010-07-16, *5.10, 5.06
... ...
(* - inconsist prices)
I know something's wrong here, since 2010-07-15's previous close pr
【在 D******n 的大作中提到】 : at least u should give a sample desired output, dude. : are u to just check 1 obs ahead? why did u mention date? i don't see : anything to do with the problem unless the date is not sorted.
| D******n 发帖数: 2836 | 5 not a sql guru, i can only see this can be quite easily done in sas. | A*******s 发帖数: 3942 | 6 you can do it in data step with retain statement or lag function.
With proc sql, i think you need subqueries. Not sure whether the code work
but you can give it a shot.
SELECT t1.date
FROM table as t1, (select *, date-1 as pre_date from table) as t2
WHERE t1.prevclose <> t2.close
AND t1.date = t2.pre_date;
it's just part of the dataset)
I
【在 z**k 的大作中提到】 : well, let me make it concrete. (and u r right, the date attr is redundent, it's just part of the dataset) : I took record of stock price every trading day, for each tuple in the db, I : have 3 attributes: : date, previous_day_close_price, close_price : and I want to check if the system is consistent, if I see something like: : ... ... : 2010-07-14, 5.17, 5.21 : 2010-07-15, 5.21, *5.14 : 2010-07-16, *5.10, 5.06 : ... ...
| z**k 发帖数: 378 | 7 my point is the trading day is not consecutive, only the row index
matters here, for example, 2010-07-19 is a Monday, so the previous
trading day is the previous Friday, in the data set, i only have:
--Date-- + -Prev- + -close-
2010-07-16, xxxxx, 35.17 # Friday
2010-07-19, 35.17, yyyyy # Monday
all I wanna do is create a table that shift original table by one, you
can view the Date column here as just a string or row name, not
something appears in the WHERE clause
work
【在 A*******s 的大作中提到】 : you can do it in data step with retain statement or lag function. : With proc sql, i think you need subqueries. Not sure whether the code work : but you can give it a shot. : SELECT t1.date : FROM table as t1, (select *, date-1 as pre_date from table) as t2 : WHERE t1.prevclose <> t2.close : AND t1.date = t2.pre_date; : : it's just part of the dataset) : I
| z**k 发帖数: 378 | | l*********s 发帖数: 5409 | 9 although daily stock price dataset is pretty small, use table join to find
the last record is highly inefficient.
【在 z**k 的大作中提到】 : ok, Problem solved. : http://www.mitbbs.com/article_t0/Database/31146665.html
| z**k 发帖数: 378 | 10 I hv hash index on ticker and date, as long as the data fits in ram, the
query could be done in linear time. I guess for large data set the
clustered tree index could greatly reduce disk io.
what's ur suggestion?
find
【在 l*********s 的大作中提到】 : although daily stock price dataset is pretty small, use table join to find : the last record is highly inefficient.
| l*********s 发帖数: 5409 | 11 I think you shall follow Acturies's advice, use lag function in SAS or last
function in SQL. A full join is not good, as most entries are going to be
waste and the excessive "if" conditioning.
【在 z**k 的大作中提到】 : I hv hash index on ticker and date, as long as the data fits in ram, the : query could be done in linear time. I guess for large data set the : clustered tree index could greatly reduce disk io. : what's ur suggestion? : : find
| z**k 发帖数: 378 | 12 ... well, i m not suppose to use any sas function.
last
be
【在 l*********s 的大作中提到】 : I think you shall follow Acturies's advice, use lag function in SAS or last : function in SQL. A full join is not good, as most entries are going to be : waste and the excessive "if" conditioning.
|
|