w*r posts: 2421 | 1 From the very beginning, Oracle's design targeted OLTP. A shared-everything system has a very hard time doing distributed computation well; once the tables get at all large, strategic-query optimization falls apart completely. It does not look like Oracle will catch up with DB2/Teradata any time soon. | c*******e posts: 8624 | 2 Database query languages are still far too limited today, though that may just be my own lack of skill.
| w*r posts: 2421 | 3 Standard ANSI SQL only goes so far; every vendor has its own extensions. Teradata's extensions are decent, with relatively stable performance. Its OLAP functions are only moderately well optimized, but they still beat the ones in Oracle, which look very powerful but are fancy features with unstable performance.
| c*******e posts: 8624 | 4 Teradata is far more efficient than Oracle.
| I******e posts: 101 | 5 I did not get it. As long as you are not playing with more than a 10 TB dataset, Oracle is more than enough to handle any strategic query.
Can you show me an example where Oracle did not get it right? | w*r posts: 2421 | 6 TB1 200K rows,
TB2 9M rows,
TB1 joins TB2 on certain conditions; all condition columns have been indexed individually, and function-based indexes have been built as well.
Because of PL/SQL's low efficiency, all functions were converted to Java functions and embedded into the database, and the indexes were built on top of these functions.
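Roughly the kind of setup I mean, as a sketch with made-up names (the real tables, columns, and functions are different):

  -- Java implementation published to SQL; DETERMINISTIC so it can back an index
  CREATE OR REPLACE FUNCTION norm_key (p VARCHAR2) RETURN VARCHAR2
  DETERMINISTIC
  AS LANGUAGE JAVA NAME 'KeyUtil.norm(java.lang.String) return java.lang.String';

  -- function-based index on the join column
  CREATE INDEX tb2_norm_idx ON tb2 (norm_key(cond_col));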
The Cartesian factor from TB1 to TB2 is about 4, with an expected result set of about 1M rows.
Oracle could not correctly choose the most efficient query plan using the indexes; it insisted on a full table scan. | I******e posts: 101 | 7 This depends a lot on your join conditions and on where the system is bottlenecked: IO, CPU, or network.
If you only do a natural join between TB1 and TB2, a hash join based on table scans should be chosen. If you have complex function-based indexes, then because the PL/SQL or Java engine sits outside the SQL engine (even though they are in the same binary), the context switches are still very expensive.
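For reference, the usual way to pull a plan in 10g (substitute your real query; the toy query below just assumes your tb1/tb2 names):

  EXPLAIN PLAN FOR
    SELECT COUNT(*) FROM tb1 t1, tb2 t2
    WHERE  t1.cond_col = t2.cond_col;
  SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);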
Can you show me the explain plan output? Are you using Oracle 10g? | w*r posts: 2421 | 8 I am using Oracle 10g; Java stored procedures are way more efficient than PL/SQL
stored procedures. Indexes were built on the columns I tried to join.
Basically, if the larger table has full statistics, the optimizer chooses a full scan of the larger table plus a hash join. If I delete the statistics on the larger table and leave only the index statistics, the optimizer chooses a bitmap index join first and then accesses the table by the rowids from the join :D
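Roughly what I did to flip the plan (the schema/table names are made up):

  -- delete the table-level stats but keep the index statistics
  EXEC DBMS_STATS.DELETE_TABLE_STATS('APP', 'TB2', cascade_indexes => FALSE);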
| I******e posts: 101 | 9 TB1 200K rows,
TB2 9M rows,
Let us assume each row is 1 KB; then TB1 is 200 MB and TB2 is 9 GB. Assuming you have decent IO capacity, say 50 MB/s, the table scans should only take
((9 GB) + (200 MB)) / (50 MB/s) = 188.32 s, around 3 minutes.
So the table scans should not be the problem.
If you have enough memory for the hash join, it should be much faster than any index join. Are you using automatic memory management or fixed memory management in 10g?
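To be clear about what I mean by the two modes (the sizes here are only examples):

  -- automatic: one PGA budget, Oracle sizes each work area itself
  ALTER SYSTEM SET workarea_size_policy = AUTO;
  ALTER SYSTEM SET pga_aggregate_target = 2G;

  -- fixed/manual: you size the hash area yourself, per session
  ALTER SESSION SET workarea_size_policy = MANUAL;
  ALTER SESSION SET hash_area_size = 134217728;  -- 128 MB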
| w*r 发帖数: 2421 | 10 well, you would not imagine that I would let the query just return the 1.xM
rows to query cursor, right? the, CTAS was used to store the result into a
new table, the hash join sounds fine, however once PLSQL involved in the
return result in SELECT statement, it will dramatically increase the CPU
usage, after change the PLSQL to java function, then I cut the run time by
almost 70%. By disable the hash join, the run time cut another 40%.
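The shape of it, reusing the made-up names from above (the hash join was disabled with a hint; this is a sketch, not the real query):

  CREATE TABLE result_tab AS
  SELECT /*+ ORDERED USE_NL(t2) */   -- force nested loops instead of the hash join
         t1.id, t2.payload
  FROM   tb1 t1, tb2 t2
  WHERE  norm_key(t1.cond_col) = norm_key(t2.cond_col);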
BTW, your calculation assumes the result and the join will
| I******e posts: 101 | 11 Your understanding of hash join's memory usage might not be correct.
A hash join only requires around sqrt(data size) of memory to do an in-memory join. Even if you are joining terabytes of data, a hash join can still run in memory with around 300 MB, which is the power of hash join.
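(Back-of-envelope for the sqrt claim, assuming 8 KB blocks: 1 TB is roughly 1.2e8 blocks, sqrt(1.2e8) ≈ 11,000 blocks ≈ 90 MB; with the usual fudge factor for the hybrid hash join's partitioning you land in the low hundreds of MB.)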
I wonder how much memory the hash join is using in your case. That depends on whether you are using automatic memory management in 10g. A hash join's memory usage is not related to the buffer cache or SGA, which can be influenced
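You can watch the work-area usage live while the join runs, with something like:

  SELECT operation_type, actual_mem_used, tempseg_size
  FROM   v$sql_workarea_active;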
| w*r posts: 2421 | 12 Well, the system I was using does not have automatic SGA enabled. It is a VLM configuration, which was an effort to use more than 4 GB of memory. Once VLM is enabled, automatic SGA cannot be deployed. I believe this might have caused the multiple read/write problem I described.
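For reference, the VLM-related lines in our init.ora look roughly like this (values are examples rather than our real ones; this is the 32-bit Windows style of VLM):

  # indirect data buffers = the AWE/VLM window on 32-bit Windows
  use_indirect_data_buffers = true
  db_block_size    = 8192
  db_block_buffers = 786432    # ~6 GB of buffer cache through the window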
Also, I have another question, since you sound like an Oracle expert. I once had an issue with a memory setting; I posted a question on Metalink, but nobody has given any hints so far. I did follow Oracle's memory setting
| I******e posts: 101 | 13 I am not sure about Oracle's parallel load. The error you see means that Oracle could not allocate the requested 64 KB of memory. That can have a lot of causes; the most likely one is memory fragmentation, and Oracle is still very much limited by the 3 GB address space. I do not remember whether VLM is certified by Oracle on Windows.
Oracle does limit the parallel engine's memory usage: each parallel operator cannot use more than 30% of pga_aggregate_target. Still, if your query is very complex and
| w*r 发帖数: 2421 | 14 There are 4 Xeon processors,
when you say parallel operator, what exactly does it mean? does it mean the
slave thread? or something else.
VLM is mentioned in every oracle documentatoin since 9i, I only tried to use
it after I failed many times to make Oracle efficiently use memory
without generating this type of error. After VLM deployed, it seems that a
lot of queries which used to cause this error stopped throw exception out.
However, when I run the create MTV command with parallel option for
| I******e posts: 101 | 15 > There are 4 Xeon processors.
What kind of I/O system do you have, and how much throughput can you get? Do you have 20 RAID disks that can make 8 CPUs happy?
> When you say parallel operator, what exactly does it mean? Does it mean
> the slave thread, or something else?
I mean an operator, like a hash join or a sort, that runs in parallel.
| w*r posts: 2421 | 16
Well, to be honest, I was not the DBA for our Oracle system, and I still don't consider myself an Oracle DBA today. I wasn't hired for this, and I neither was nor am paid to do it. The reason I am dealing with it now is that the so-called Oracle DBAs in our company are complete idiots. The first time I checked the Oracle settings, everything was at the defaults: no settings, no configuration. They are just the kind of people who believe in the click-and-go theory. That's why the whole thing was so screwed up.
| I******e posts: 101 | 17 Thanks for the insight! DBAs and configuration really do control a lot of database performance. When I joined my new company, they had replaced the Oracle system with a Netezza database, which was claimed to be 100x faster. But when I looked at the Oracle system, it was a quad-CPU box with only one disk; no RAID needed :) My friends interviewed several DBAs who did not really understand how a composite-key index works.
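The kind of thing I mean, a textbook two-column example (names are made up):

  CREATE INDEX emp_dept_hire_idx ON emp (dept_id, hire_date);

  -- can use the index: the leading column is in the predicate
  SELECT * FROM emp WHERE dept_id = 10 AND hire_date > DATE '2006-01-01';

  -- cannot use it as a plain range scan: the leading column is missing
  -- (10g may fall back to an index skip scan, but that is a different thing)
  SELECT * FROM emp WHERE hire_date > DATE '2006-01-01';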
As for the CREATE MV memory failure, have you tried creating it as a plain table to see whether that goes through? | w*r posts: 2421 | 18 I have given up on the idea of finding a good DBA. A good DBA is not just someone who knows
the technology; a DBA also has to be familiar with the business model, which takes time and effort. When I tried to interview the DBAs on the market, I would most likely end up with someone who claims to know things but does not have the insight or patience to learn. Nowadays there are too many click-and-go DBAs on the market.
I would rather keep our current structure and fire the on-board Oracle people one by one until the group cannot support
【在 I******e 的大作中提到】 : Thanks for the insight! DBAs and configurations do control a lot of : database performance. When I joined the new company, they replaced Oracle : system with Netezza database, which claimed to be 100x faster. But, when I : looked at the Oracle system, it is a quad-CPU with only one disk, no RAID is : needed:) My friends interviewed several DBAs who do not really understand : how composite key index works. : For Create MV memory failure, have you tried creating table and see whether : it will went through?
| I******e posts: 101 | 19 I disagree here: DBAs should care most about the technology. They are responsible for the system's efficiency and for the application's schema. At your data size, a couple of DBAs should be more than enough.
A big system might disappoint you. I agree that Teradata is better because of its clustering technology (I never believed DB2 was better), but you have to pay the price in hardware, and a lot of the tuning is hidden inside their legacy Teradata hardware. My company replaced Oracle with the fast Netezza