l**b Posts: 11 | 1 Is there any way to create a table with more than 1000 columns?
| aw Posts: 127 | 2 Why do you need such a table? It is hard to imagine.
[Quoting l**b's post] : Is there any way to create a table with more than 1000 columns?
| l**b Posts: 11 | 3 For example, a 20000x2000 matrix, which is very common in data mining.
[Quoting aw's post] : Why do you need such a table? : It is hard to imagine.
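| aw Posts: 127 | 4 Pardon my ignorance, but are you sure a single record has 1000 attributes? Even Oracle has a limit of 1000 columns per table. Could you describe your design in a bit more detail?
[Quoting l**b's post] : For example, a 20000x2000 matrix, which is very common in data mining.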
| l**b Posts: 11 | 5 A typical web server log is just a flat text file.
Suppose a web site has 2000 pages and 20000 users have been identified.
If each row records the pages visited by one user (possibly only some of
the 2000), the text file is a 20000x2000 matrix whose cell value is the
number of seconds the user spent on that page. How do we load such records
into a database?
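[Quoting aw's post] : Pardon my ignorance, but are you sure a single record has 1000 attributes? Even Oracle has a limit of 1000 columns per table. : Could you describe your design in a bit more detail?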
| s***e Posts: 284 | 6 table 1: web page (page_id, ...)
table 2: user (user_id, ...)
table 3: access record (user_id, page_id, access_time)
[Quoting l**b's post] : A typical web server log is just a flat text file. : Suppose a web site has 2000 pages and 20000 users have been identified. : If each row records the pages visited by one user (possibly only some of : the 2000), the text file is a 20000x2000 matrix whose cell value is the : number of seconds the user spent on that page. How do we load such records : into a database?
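The three-table layout suggested above can be sketched roughly as follows. This is only an illustration, assuming SQLite through Python's standard `sqlite3` module rather than whatever database the original poster uses. The table names `web_page` and `site_user`, the tiny sample matrix, and the choice to skip zero cells are all assumptions made for the example; only `user_id`, `page_id`, and `access_time` come from the thread.

```python
import sqlite3

# Hypothetical wide matrix: one row per user, one column per page,
# cell = seconds spent on the page (0 means the page was not visited).
matrix = [
    [0, 12, 0, 5],  # user 0
    [7, 0, 0, 0],   # user 1
    [0, 0, 0, 0],   # user 2 visited nothing
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE web_page  (page_id INTEGER PRIMARY KEY);
    CREATE TABLE site_user (user_id INTEGER PRIMARY KEY);
    CREATE TABLE access_record (
        user_id     INTEGER NOT NULL REFERENCES site_user(user_id),
        page_id     INTEGER NOT NULL REFERENCES web_page(page_id),
        access_time INTEGER NOT NULL,  -- seconds spent on the page
        PRIMARY KEY (user_id, page_id)
    );
""")

n_users, n_pages = len(matrix), len(matrix[0])
conn.executemany("INSERT INTO site_user VALUES (?)", [(u,) for u in range(n_users)])
conn.executemany("INSERT INTO web_page VALUES (?)", [(p,) for p in range(n_pages)])

# Turn the wide matrix into (user_id, page_id, access_time) triples,
# keeping only the non-zero cells.
rows = [
    (u, p, secs)
    for u, row in enumerate(matrix)
    for p, secs in enumerate(row)
    if secs > 0
]
conn.executemany("INSERT INTO access_record VALUES (?, ?, ?)", rows)
conn.commit()

stored = conn.execute("SELECT COUNT(*) FROM access_record").fetchone()[0]
print(stored)
```

Each matrix cell becomes at most one row in the access record table, so the table grows with the number of visits rather than with the number of pages, and no table ever needs more than three columns.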
| aw Posts: 127 | 7 lieb, you should go back and review basic database concepts. No offense intended.
[Quoting s***e's post] : table 1: web page (page_id, ...) : table 2: user (user_id, ...) : table 3: access record (user_id, page_id, access_time)
| l**b Posts: 11 | 8 Thanks, shuke. You are definitely right from the database design perspective.
Table 3 would then have up to 40,000,000 rows (20000 x 2000; far fewer if the
original matrix is sparse) but only 3 columns.
My only concern is that with the data split across more tables, more aggregation
code (join queries) has to be written to get the data back for
matrix-oriented computation.
I am not sure which is faster, the database tables or a single 20000x2000
matrix-like text file, in terms of gett
[Quoting s***e's post] : table 1: web page (page_id, ...) : table 2: user (user_id, ...) : table 3: access record (user_id, page_id, access_time)
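On the concern about getting the data back: if `user_id` and `page_id` are 0-based integers that double as matrix indices (an assumption, not something stated in the thread), the matrix can be rebuilt from the access record table alone with a single scan and no join. A rough sketch using Python's `sqlite3`, with a hypothetical helper `load_matrix` and made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access_record (user_id INTEGER, page_id INTEGER, access_time INTEGER)"
)
# Made-up sample data: only the non-zero cells of the matrix are stored.
conn.executemany(
    "INSERT INTO access_record VALUES (?, ?, ?)",
    [(0, 1, 12), (0, 3, 5), (1, 0, 7)],
)


def load_matrix(conn, n_users, n_pages):
    """Rebuild a dense n_users x n_pages matrix from the stored triples.

    Assumes user_id and page_id are 0-based row and column indices.
    Cells with no stored row stay 0.
    """
    matrix = [[0] * n_pages for _ in range(n_users)]
    query = "SELECT user_id, page_id, access_time FROM access_record"
    for user_id, page_id, secs in conn.execute(query):
        matrix[user_id][page_id] = secs
    return matrix


matrix = load_matrix(conn, n_users=3, n_pages=4)

# Upper bound on rows in table 3 for the sizes discussed in the thread.
max_rows = 20000 * 2000
print(matrix, max_rows)
```

Joins only come into play when the computation needs attributes from the page or user tables; the matrix values themselves live entirely in table 3.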
| l**b Posts: 11 | 9 (blush) Good reminder. I have not used a database for a long time...
too much text mining...
[Quoting aw's post] : lieb, you should go back and review basic database concepts. No offense intended.
| b*e Posts: 3845 | 10 Exactly.
[Quoting aw's post] : lieb, you should go back and review basic database concepts. No offense intended.
| s***e Posts: 284 | 11 If you import the data into a database, you had better stick with the
database for as much of the further computation as possible. For example,
creating indexes on user_id or page_id in table 3 could improve the
performance of your join queries a lot. If you cannot take advantage of
what the database offers, just forget it.
[Quoting l**b's post] : Thanks, shuke. You are definitely right from the database design perspective. : Table 3 would then have up to 40,000,000 rows (20000 x 2000; far fewer if the : original matrix is sparse) but only 3 columns. : My only concern is that with the data split across more tables, more aggregation : code (join queries) has to be written to get the data back for : matrix-oriented computation. : I am not sure which is faster, the database tables or a single 20000x2000 : matrix-like text file, in terms of gett
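As a rough illustration of the advice about indexes and keeping the computation in the database, the sketch below indexes `user_id` and `page_id` on the access record table and lets the database do a per-page aggregation with a join. It assumes SQLite through Python's `sqlite3`; the index names, the `url` column, and the sample rows are hypothetical. On data this small the indexes make no measurable difference, so this shows only the shape of the approach, not a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE web_page (page_id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE access_record (
        user_id     INTEGER,
        page_id     INTEGER,
        access_time INTEGER  -- seconds spent on the page
    );
    -- Indexes on the two columns used for joins and filters.
    CREATE INDEX idx_access_user ON access_record (user_id);
    CREATE INDEX idx_access_page ON access_record (page_id);
""")
conn.executemany(
    "INSERT INTO web_page VALUES (?, ?)",
    [(0, "/home"), (1, "/docs"), (3, "/faq")],
)
conn.executemany(
    "INSERT INTO access_record VALUES (?, ?, ?)",
    [(0, 1, 12), (0, 3, 5), (1, 0, 7), (1, 1, 3)],
)

# Aggregate inside the database instead of pulling the whole matrix out:
# total seconds and number of visitors per page.
totals = conn.execute("""
    SELECT p.url, SUM(a.access_time) AS total_seconds, COUNT(*) AS visitors
    FROM access_record AS a
    JOIN web_page AS p ON p.page_id = a.page_id
    GROUP BY p.page_id
    ORDER BY total_seconds DESC
""").fetchall()
print(totals)
```

Column sums, row sums, and similar per-user or per-page statistics translate directly into `GROUP BY` queries like this one, which is where the database layout tends to pay off compared with re-parsing a flat text file.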