请问:query about checking consistency (转载) - Statistics版 - 未名存档

本页内容为未名空间相应帖子的节选和存档，一周内的贴子最多显示50字，超过一周显示500字访问原贴

Statistics版 - 请问:query about checking consistency (转载)

相关主题
● 学习R的过程还是挺痛苦的！	● 请教一下R高手们
● 关于rank sum test（Mann-Whitney test）	● Clustering analysis with categorical variables
● Please help me with SAS! Thank you!	● [合集] k-mean clustering
● 昨天考了SAS ADVANCED，不难	● 请教一个R:K-means的问题
● 请问如何把SAS dataset转到R里？	● 请教一个频率优化问题（相关性？）
● 求问一道SAS adv 题	● 这种情况应该用什么hypothesis test。
● 如何把access query转成sas dataset?	● 从大data 产生多个小data 的方法
● 请教一个数据分类的问题	● 问个弱弱的sas问题

相关话题的讨论汇总
话题: date话题: prev话题: attr话题: query

进入Statistics版参与讨论

1

(共1页)

z**k 发帖数: 378	1 【以下文字转载自 Database 讨论区】发信人: zerk (阿呆), 信区: Database 标题: 请问:query about checking consistency 发信站: BBS 未名空间站 (Fri Jul 16 13:10:46 2010, 美东) sorry i cannot type chinese at work. I have a collection of records, ordered by date, the record has two other attributes, say attr and prev, here's a simple example: date, attr, prev
z**k 发帖数: 378	2 btw, i don't wanna use the any sas-only sql function
D******n 发帖数: 2836	3 at least u should give a sample desired output, dude. are u to just check 1 obs ahead? why did u mention date? i don't see anything to do with the problem unless the date is not sorted.
z**k 发帖数: 378	4 well, let me make it concrete. (and u r right, the date attr is redundent, it's just part of the dataset) I took record of stock price every trading day, for each tuple in the db, I have 3 attributes: date, previous_day_close_price, close_price and I want to check if the system is consistent, if I see something like: ... ... 2010-07-14, 5.17, 5.21 2010-07-15, 5.21, 5.14 2010-07-16, 5.10, 5.06 ... ... (* - inconsist prices) I know something's wrong here, since 2010-07-15's previous close pr 【在 D******n 的大作中提到】 : at least u should give a sample desired output, dude. : are u to just check 1 obs ahead? why did u mention date? i don't see : anything to do with the problem unless the date is not sorted.
D******n 发帖数: 2836	5 not a sql guru, i can only see this can be quite easily done in sas.
A*******s 发帖数: 3942	6 you can do it in data step with retain statement or lag function. With proc sql, i think you need subqueries. Not sure whether the code work but you can give it a shot. SELECT t1.date FROM table as t1, (select , date-1 as pre_date from table) as t2 WHERE t1.prevclose <> t2.close AND t1.date = t2.pre_date; it's just part of the dataset) I 【在 zk 的大作中提到】 : well, let me make it concrete. (and u r right, the date attr is redundent, it's just part of the dataset) : I took record of stock price every trading day, for each tuple in the db, I : have 3 attributes: : date, previous_day_close_price, close_price : and I want to check if the system is consistent, if I see something like: : ... ... : 2010-07-14, 5.17, 5.21 : 2010-07-15, 5.21, 5.14 : 2010-07-16, *5.10, 5.06 : ... ...
z**k 发帖数: 378	7 my point is the trading day is not consecutive, only the row index matters here, for example, 2010-07-19 is a Monday, so the previous trading day is the previous Friday, in the data set, i only have: --Date-- + -Prev- + -close- 2010-07-16, xxxxx, 35.17 # Friday 2010-07-19, 35.17, yyyyy # Monday all I wanna do is create a table that shift original table by one, you can view the Date column here as just a string or row name, not something appears in the WHERE clause work 【在 A******s 的大作中提到】 : you can do it in data step with retain statement or lag function. : With proc sql, i think you need subqueries. Not sure whether the code work : but you can give it a shot. : SELECT t1.date : FROM table as t1, (select , date-1 as pre_date from table) as t2 : WHERE t1.prevclose <> t2.close : AND t1.date = t2.pre_date; : : it's just part of the dataset) : I
z**k 发帖数: 378	8 ok, Problem solved. http://www.mitbbs.com/article_t0/Database/31146665.html
l*********s 发帖数: 5409	9 although daily stock price dataset is pretty small, use table join to find the last record is highly inefficient. 【在 z**k 的大作中提到】 : ok, Problem solved. : http://www.mitbbs.com/article_t0/Database/31146665.html
z**k 发帖数: 378	10 I hv hash index on ticker and date, as long as the data fits in ram, the query could be done in linear time. I guess for large data set the clustered tree index could greatly reduce disk io. what's ur suggestion? find 【在 l*********s 的大作中提到】 : although daily stock price dataset is pretty small, use table join to find : the last record is highly inefficient.
l*********s 发帖数: 5409	11 I think you shall follow Acturies's advice, use lag function in SAS or last function in SQL. A full join is not good, as most entries are going to be waste and the excessive "if" conditioning. 【在 z**k 的大作中提到】 : I hv hash index on ticker and date, as long as the data fits in ram, the : query could be done in linear time. I guess for large data set the : clustered tree index could greatly reduce disk io. : what's ur suggestion? : : find
z**k 发帖数: 378	12 ... well, i m not suppose to use any sas function. last be 【在 l*********s 的大作中提到】 : I think you shall follow Acturies's advice, use lag function in SAS or last : function in SQL. A full join is not good, as most entries are going to be : waste and the excessive "if" conditioning.

1

(共1页)

进入Statistics版参与讨论

相关主题
● 问个弱弱的sas问题	● 请问如何把SAS dataset转到R里？
● 发包子求大牛解SAS问题，急	● 求问一道SAS adv 题
● SAS Question	● 如何把access query转成sas dataset?
● 用什么可以画这个clustering 图? R?	● 请教一个数据分类的问题
● 学习R的过程还是挺痛苦的！	● 请教一下R高手们
● 关于rank sum test（Mann-Whitney test）	● Clustering analysis with categorical variables
● Please help me with SAS! Thank you!	● [合集] k-mean clustering
● 昨天考了SAS ADVANCED，不难	● 请教一个R:K-means的问题

相关话题的讨论汇总
话题: date话题: prev话题: attr话题: query

未名新帖统计// 7月16日

#	版面	帖数(主题数)
-	全站	4871 (796)
1	Military	3777 (569)
2	Stock	341 (51)
3	Joke	117 (17)
4	History	116 (3)
5	Automobile	100 (9)
6	USANews	55 (9)
7	Midlife	45 (1)
8	Headline	41 (41)
9	Dreamer	33 (13)
10	FleaMarket	32 (20)
11	Living	30 (7)

* 这里只显示发帖超过25的版面，努力灌水吧:-)