A question (问个问题) - Java board - 未名空间 archive

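This page is an excerpt and archive of the corresponding thread on 未名空间. Posts less than a week old show at most 50 characters; posts older than a week show 500 characters, which is why some posts below break off mid-sentence.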

t**2
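Posts: 75

This is about data storage.
When designing a web application, there is the question of how to store the data in the database.
One view is to give every domain object its own database table.
The other view is to create as few tables and table columns as possible, store most of the information in the database as XML, and handle the data inside the XML in the application layer.
Take an Order table as an example. With the first approach, all of the order's information has to become table columns.
With the second approach, the order table stores only some essential fields, such as customerId and orderDate, and most of the rest goes into one text field holding an OrderXml.
What are the pros and cons of these two storage approaches?
Thanks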

g*****g
Posts: 34805

I don't see any benefit with the 2nd approach, it'd be really slow
if you need to do search.

[In reply to t**2]

t**2
Posts: 75

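Anything that needs to be searched gets pulled out and stored in its own column.
The main argument from supporters of the second approach is to keep the database design as simple as possible, which helps future maintainability.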

[In reply to g*****g]

g*****g
Posts: 34805

I don't see how it makes it simpler: you have to write code to transform
to and from XML, and performance is gonna be a big issue. Over years of
architecture design, one thing I've learned is: worry about the future when
the future comes.

[In reply to t**2]
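To make the "code to transform to and from XML" point concrete, here is a minimal sketch of the round trip the second approach needs on every read and write. The class name OrderXmlCodec, the <order> root element, and the flat key/value layout are assumptions for illustration, not anything described in the thread; only JDK XML classes are used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical codec for the non-indexed attributes of an order. */
final class OrderXmlCodec {

    /** Writes each entry as a child element of <order>. Keys must be valid XML element names. */
    static String toXml(Map<String, String> attributes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("order");
            doc.appendChild(root);
            for (Map.Entry<String, String> entry : attributes.entrySet()) {
                Element child = doc.createElement(entry.getKey());
                child.setTextContent(entry.getValue());
                root.appendChild(child);
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize order attributes", e);
        }
    }

    /** Parses the output of toXml back into a map, preserving element order. */
    static Map<String, String> fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Map<String, String> attributes = new LinkedHashMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                if (node.getNodeType() == Node.ELEMENT_NODE) {
                    attributes.put(node.getNodeName(), node.getTextContent());
                }
            }
            return attributes;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse order XML", e);
        }
    }
}
```

Even this flat case needs two methods and a parse per row; nested order lines or typed values would add more mapping code on top.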

c*****t
Posts: 1879

The second approach is a bit problematic since you would have a hard time
querying fields within the XML document. Integrity is not much of an
issue since you can always verify the integrity of the XML document
using triggers. This approach is really for cases where you could
potentially get thousands of columns without a clear idea of the schema
and the table is sparse (some e-commerce data could be like this).
In those cases, unless you are doing research, use native XML databases
instead, such

[In reply to t**2]
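A rough sketch of why querying a field inside the XML column is awkward when the database has no XML support: the application has to parse every row and evaluate an XPath against it. The class XmlColumnScan and the element names are hypothetical; the list of strings stands in for the values of the OrderXml column.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical application-side search over an XML text column. */
final class XmlColumnScan {

    /** Returns the positions of the rows whose XML holds the wanted value at the given XPath. */
    static List<Integer> findRows(List<String> orderXmlColumn, String xpathExpr, String wanted) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<Integer> hits = new ArrayList<>();
        for (int row = 0; row < orderXmlColumn.size(); row++) {
            try {
                // Every row is parsed from scratch; there is no index to consult.
                String value = xpath.evaluate(xpathExpr,
                        new InputSource(new StringReader(orderXmlColumn.get(row))));
                if (wanted.equals(value)) {
                    hits.add(row);
                }
            } catch (XPathExpressionException e) {
                throw new IllegalStateException("row " + row + " holds unparsable XML", e);
            }
        }
        return hits;
    }
}
```

The cost grows with the number of rows and the size of each document, which is the search slowdown raised earlier in the thread.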

g**e
Posts: 6127

+1 to the last sentence.

[In reply to g*****g]

F****n
Posts: 3271

The second approach is actually not as bad as it looks. It amounts to saving a
big chunk of serialized object in a database. Since XML parsing /
serialization is standardized, this approach eliminates any future evolution
on the DB side, which can be expensive.
For example, if you have a new property in your data, in the first approach
you need to revise your DB design, but in the second approach you don't
need to do anything because XML wraps all your crap:)

[In reply to g*****g]

g*****g
Posts: 34805

Unless you have a design in which you allow users to add new columns
dynamically, I don't see the pros outweighing the cons.

[In reply to F****n]

A**o
Posts: 1550

and it's debugging hell once you have data corruption in the xml column.
the plain old table design on the other hand is much easier to debug.

[In reply to g*****g]

F****n
Posts: 3271

Simply put, the 1st approach applies to a closed domain with limited types.
The second approach is suitable to open domains. The idea is to only index
those fields you have to, and leave others as serialized objects. I believe
most applications use some variants of this approach.

[In reply to g*****g]
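The "index only the fields you have to, serialize the rest" idea can be sketched as follows. This is an in-memory stand-in under stated assumptions, not a real persistence layer: HybridOrderRow and HybridOrderTable are hypothetical names, and a HashMap plays the role of a database index on customerId.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One row of the hybrid layout: a few real columns plus an opaque payload. */
final class HybridOrderRow {
    final long customerId;      // promoted to a column because it is searched on
    final LocalDate orderDate;  // promoted to a column because it is searched on
    final String orderXml;      // everything else; never inspected by the table

    HybridOrderRow(long customerId, LocalDate orderDate, String orderXml) {
        this.customerId = customerId;
        this.orderDate = orderDate;
        this.orderXml = orderXml;
    }
}

/** In-memory stand-in for the order table with an index on customerId. */
final class HybridOrderTable {
    private final List<HybridOrderRow> rows = new ArrayList<>();
    private final Map<Long, List<HybridOrderRow>> byCustomer = new HashMap<>();

    void insert(HybridOrderRow row) {
        rows.add(row);
        byCustomer.computeIfAbsent(row.customerId, k -> new ArrayList<>()).add(row);
    }

    /** Indexed lookup: the XML payload is not parsed to answer this query. */
    List<HybridOrderRow> findByCustomer(long customerId) {
        return byCustomer.getOrDefault(customerId, Collections.emptyList());
    }

    int size() {
        return rows.size();
    }
}
```

Adding a property to the payload changes nothing here; promoting a property to a searchable field later means adding a column and backfilling it from the stored XML.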


c*****t
Posts: 1879

Check out templatedb.sourceforge.net.

[In reply to F****n]

k***r
Posts: 4260

Second this. If the order details are likely to change and
the changing fields are less likely to be searched on, the
second approach makes a lot of sense. No db schema changes.

[In reply to F****n]

F****n
Posts: 3271

What's the difference between the so called template db and predicate logic
or other knowledge representation systems?

[In reply to c*****t]

c*****t
Posts: 1879

templatedb is a relational database, except that a relational schema
is not required up front. It builds on the fact that SQL queries have
absolutely nothing to do with relational schema. Many limitations of
RDBMS as we know it (fixed number of columns etc) are artifacts of
the relational schema requirement, which is in turn due to optimization
(which has a lot of assumptions that may not hold).
So templatedb is purely a modification to the data storage system, which
brought the change that elim

[In reply to F****n]

g*****g
Posts: 34805

Why don't we just use an object DB instead?
I agree DB schema is just for optimization.
I would like to see ORM, or any object-storage mapping,
implemented and optimized in the database. And I believe
that will be the case sooner or later.

[In reply to c*****t]

A**o
Posts: 1550

sounds like m$ has been playing around with the idea?
when will the java version be implemented?

[In reply to g*****g]

F****n
Posts: 3271

Relational DB is a special case of predicate logic. Predicate Logic is like
relational DB without the constraints of tabular structure or schema. So
this templatedb really sounds like a predicate logic based knowledge
representation system, which has been studied for 50 years.

[In reply to c*****t]

F****n
Posts: 3271

Because in an open domain, you cannot assume you know the structure of all
your objects. Such structure may be extremely expensive to learn and very
difficult to maintain. And you still want to index some properties that you
are sure about. Just imagine that you only need to index on 5 properties, it
is pretty easy to extract these 5 from all objects and throw others to
serialization. On the contrary, it will be unnecessarily difficult to
develop an ORM for all objects, not to mention the much h

[In reply to g*****g]

F****n
Posts: 3271

OODB was technically mature 10 years ago. However big companies and "stupid"
users rejected it. It is thought to be too complicated compared with
relational DB.

[In reply to A**o]

g*****g
Posts: 34805

I understand that the DB is not smart enough and I don't mind giving it
some hints, e.g. adding an annotation on the domain object to index certain
properties. But I'd like to see ORM, plus schema, plus most of the EJB3
annotations being maintained automatically by the DB.

[In reply to F****n]


m******t
Posts: 2416

The core domain of our current applications is serialized to XML.
It's hard to query, hard to troubleshoot, and (counterintuitively)
hard to morph because of class version compatibility issues.

[In reply to F****n]
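One way the "class version compatibility" problem can show up, sketched under assumptions: suppose a currency field was added to the order payload after rows had already been written. The names OrderPayloadV2 and OrderPayloadReader, the field itself, and the "USD" fallback are all hypothetical; the post does not say what the actual incompatibilities were.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical version 2 of the order payload; version 1 had no currency. */
final class OrderPayloadV2 {
    final String shippingMethod;
    final String currency;

    OrderPayloadV2(String shippingMethod, String currency) {
        this.shippingMethod = shippingMethod;
        this.currency = currency;
    }
}

final class OrderPayloadReader {
    /** Assumed default for rows written before the currency field existed. */
    static final String LEGACY_CURRENCY = "USD";

    static OrderPayloadV2 read(String orderXml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            String shipping = xpath.evaluate("/order/shippingMethod", source(orderXml));
            String currency = xpath.evaluate("/order/currency", source(orderXml));
            // XPath yields "" for a missing element, so old rows need an explicit fallback.
            if (currency.isEmpty()) {
                currency = LEGACY_CURRENCY;
            }
            return new OrderPayloadV2(shipping, currency);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("corrupt order XML", e);
        }
    }

    // A Reader can be consumed only once, so each evaluation gets a fresh source.
    private static InputSource source(String xml) {
        return new InputSource(new StringReader(xml));
    }
}
```

With real columns the same change is a single ALTER TABLE with a default; with serialized payloads every reader has to carry the knowledge of all earlier versions.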

m******t
Posts: 2416

Err... I thought the whole point of having an ORM is to abstract away
the database?

[In reply to g*****g]

g*****g
Posts: 34805

Abstract away the DB? Why not have the DB be abstract itself?

[In reply to m******t]

m******t
Posts: 2416

Or abstract away the DB _vendors_, to be exact. So it's kind of
hard to abstract somebody away if you also let them maintain the
abstraction layer itself...

[In reply to g*****g]

g*****g
Posts: 34805

EJB3 already has the entity bean/manager JSR. A little bit more
effort should get it done. I would like to see
1. No DB schema needed
2. Standard save/update/query api in JDK
3. Minimal entity bean annotations, so you don't have to specify table/column,
plus an indexing annotation.

[In reply to m******t]
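What item 3 of the wish list might look like from the application side, as a sketch only: the @Indexed annotation below is hypothetical and is not part of EJB3, JPA, or the JDK. The domain class carries no table or column mapping, and a schema-less store would discover the index hints by reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical index hint; not part of EJB3, JPA, or the JDK. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Indexed {}

/** A domain object with no table or column mapping, only index hints. */
final class PurchaseOrder {
    @Indexed long customerId;
    @Indexed String orderDate;
    String giftNote;
    String shippingMethod;
}

final class IndexHints {
    /** Lists the fields a schema-less store would be asked to index, in alphabetical order. */
    static List<String> indexedFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(Indexed.class)) {
                names.add(field.getName());
            }
        }
        // getDeclaredFields() does not guarantee any order, so sort for stable output.
        Collections.sort(names);
        return names;
    }
}
```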

F****n
Posts: 3271

10 years ago there was this idea called "persistent programming" where every
object can be persistent and managed by an OODB. The idea, as well as OODB,
was rejected by the mainstream. That's why we have Object Relational MAPPING
instead of a direct Object Persistence Model.
It's interesting that people are trying to use ORM to re-create those
features. Personally I don't think they will be very useful.

[In reply to m******t]

g*****g
Posts: 34805

I don't see why they shouldn't be very useful. The problem with OODB is
more that big companies like Oracle have an ecosystem around Oracle, too much
investment, and they don't want people going away from it. Tool support
is what's been lacking in OODB.
OODB can solve many classic 1+N queries and you don't need a DBA guru to write
those join queries. It's definitely better for navigational logic, which is
the case for most j2ee apps.

[In reply to F****n]
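For readers unfamiliar with the 1+N query pattern mentioned above, here is a small simulation. CountingOrderDb is a purely illustrative in-memory stand-in that counts round trips; the SQL in the comments is what each call would correspond to, not something the code executes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for a database that counts round trips. */
final class CountingOrderDb {
    private final Map<Long, List<String>> itemsByOrder = new HashMap<>();
    int queries = 0;

    void add(long orderId, String item) {
        itemsByOrder.computeIfAbsent(orderId, k -> new ArrayList<>()).add(item);
    }

    /** One round trip, e.g. SELECT id FROM orders. */
    List<Long> findOrderIds() {
        queries++;
        List<Long> ids = new ArrayList<>(itemsByOrder.keySet());
        Collections.sort(ids);
        return ids;
    }

    /** One round trip per call, e.g. SELECT item FROM order_items WHERE order_id = ?. */
    List<String> findItems(long orderId) {
        queries++;
        return itemsByOrder.getOrDefault(orderId, new ArrayList<>());
    }

    /** One round trip, e.g. orders JOIN order_items, grouped by order. */
    Map<Long, List<String>> findAllWithItems() {
        queries++;
        return new HashMap<>(itemsByOrder);
    }
}

final class OnePlusN {
    /** Navigational access: 1 query for the orders plus 1 more per order. */
    static int countItemsLazily(CountingOrderDb db) {
        int total = 0;
        for (long orderId : db.findOrderIds()) {
            total += db.findItems(orderId).size();
        }
        return total;
    }

    /** The same total from a single joined fetch. */
    static int countItemsJoined(CountingOrderDb db) {
        int total = 0;
        for (List<String> items : db.findAllWithItems().values()) {
            total += items.size();
        }
        return total;
    }
}
```

With N orders, the lazy version issues N + 1 round trips while the joined version issues one, which is why the join usually has to be written by hand or requested from the ORM explicitly.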

F****n
Posts: 3271

What you said about 1+N query is absolutely true. But in many cases there is
a simpler solution, which is to avoid those queries in the first place.
Simplify your data. 1 Table. No joins.

[In reply to g*****g]

c*****t
Posts: 1879

I think that you misunderstood.
Predicate logic is merely the fundamental of the query mechanism.
It has nothing to do with physical representation.
However, in practice, triples are used to store the knowledge, as in
Prolog, decl, RDF etc. Triples, in a way, are EAV models (entity,
attribute, values). Sometimes it may be represented as target>, or action.
If you consider entities as tuples in the database, and consider all
attributes as columns for the tuples

[In reply to F****n]
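The relationship between triples and tuples described above can be sketched as a pivot: group the (entity, attribute, value) facts by entity, and each entity becomes one sparse row whose columns are whatever attributes it happens to have. Triple and EavPivot are hypothetical names; this shows only the data-shape idea, not how templatedb or any RDF store actually lays out storage.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** One (entity, attribute, value) fact. */
final class Triple {
    final String entity;
    final String attribute;
    final String value;

    Triple(String entity, String attribute, String value) {
        this.entity = entity;
        this.attribute = attribute;
        this.value = value;
    }
}

final class EavPivot {
    /**
     * Groups triples by entity so that each entity becomes one sparse row
     * keyed by attribute name. Attributes an entity lacks are simply absent.
     * A later triple for the same entity and attribute overwrites the earlier one.
     */
    static Map<String, Map<String, String>> toRows(List<Triple> triples) {
        Map<String, Map<String, String>> rows = new TreeMap<>();
        for (Triple triple : triples) {
            rows.computeIfAbsent(triple.entity, k -> new TreeMap<>())
                .put(triple.attribute, triple.value);
        }
        return rows;
    }
}
```

No column list is fixed up front: a new attribute is just a new triple, which is the schema flexibility being discussed, at the price of reassembling rows at query time.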

F****n
Posts: 3271

There is a branch in AI called knowledge representation, which studies logic
at both conceptual and system level (e.g. automated theorem proving). One
example is Prolog. Another example is expert systems. It is in the same
category as DB in the sense that they both look for methods to quickly index
/ search structured data.
In AI terms, a query on a relational DBMS is essentially a special case of
backward chaining (in the same camp as Prolog).
Indexing on relational DBMS takes advantage of the fi

[In reply to c*****t]


c*****t
Posts: 1879

Hmm, as I pointed out, storing triples as tuples is one way of doing it,
with sacrifices. AFAIK, there are 3 categories of storing RDF data
(w/ considering the solving mechanism) and tuple is just one; storing
as triple is another.
I am not particularly familiar with the knowledge representations you
mentioned. It is a big topic that covers EE, AI and SAT problems.
Although there is quite a bit of common background between AI and
databases, I don't think DB people would normally link the two as far

[In reply to F****n]

m******t
Posts: 2416

I am not sure I'd like that. It might work well when a Java app uses
the db exclusively, but if the db is shared among apps - especially
among heterogeneous systems, then you have to have an explicit
schema definition as the contract for everyone.

[In reply to g*****g]

g*****g
Posts: 34805

It will work well as an embedded DB for sure.
And you can have a driver to do the conversion
for other languages.

[In reply to m******t]

m******t
Posts: 2416

IIUC, the reason OODB never really took off was that the performance
penalty was simply too high.
So it seems to me that we have got the appropriate mix right here -
RDBMSs have gotten so sophisticated that they are extremely efficient as far
as tabular data processing goes, and then for queries that are truly complex
and dynamic enough to justify an OO-based approach, we have the ORMs.

[In reply to F****n]

g*****g
Posts: 34805

I kind of feel this may change with cloud computing. The shared-nothing
DB clustering strategy will work extremely well with OODB.

[In reply to m******t]

F****n
Posts: 3271

I don't think performance is an issue. Most indexing techniques used in
relational DB can also be used in OODB.
The main reason, I believe, is simply because most users familiar with
tables refused to switch to OO world, and big companies did not promote it
hard due to their own reasons. In late 1990s and early 2000s, almost all DB
classes in America taught OODB and mentioned relational DB as a legacy and
soon-to-be-obsolete technology. This is because from a pure technical level,
OODB's super

[In reply to m******t]

c*****t
Posts: 1879

OODB is not technologically superior to RDBMS at all.
1. OODB performance is only suitable for certain types of queries.
Even for the queries OODB actually does better on, the gain is relatively
small.
2. Behaviors of OODB can be mimicked using ORM tools, and some features
can be added to RDBMS with a bit of effort without dramatic changes.
Examples: OID, composite type, type/table inheritance etc.
3. OODB still does not solve the more critical problems of RDBMS,
like schema issues, types

[In reply to F****n]

F****n
Posts: 3271

Query performance is mainly dependent on indexing. As I said before, most indexing techniques used in relational DB can also be used in OODB. The consensus is that OODB is only significantly faster than relational DB in certain queries. But you obviously misinterpret this fact. It does NOT mean OODB is slower than relational DB in relational queries. In fact in those cases OODB's performance is on the same level as relational DB. In other words, OODB is BETTER or EQUIVALE

[In reply to c*****t]

c*****t
Posts: 1879

Since the replies are too long, I will just use # here.
1. See http://cs.wisc.edu/~cs764-1/buckybenchmark.pdf . Like all
benchmarks, implementation is certainly an issue. I think that
another paper mentioned OODB performances are quite inconsistent.
So, you could always argue that's not the best implementation ^_^
Relying on indexes does not always work. For one thing, indexes
themselves are extra tables of data to load. Secondary indexes
are also going to be slower than pr

F****n
Posts: 3271

I found it would be too expensive to discuss all those questions:))
But as to the performance issue, the key thing is, all relational operations can be implemented with the same or better performance in OODB with a proper storage / indexing strategy. In other words, conceptually RDBMS can be regarded as a special case of OODB. Downgrading an OODB to RDBMS is very easy, but upgrading an RDBMS to OODB would be difficult, requiring ugly patching such as ORM. That's why 10 years ago most people in universi

[In reply to c*****t]
