A question for the experts who really understand NoSQL - Programming board

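This page is an excerpt and archive of the corresponding thread on MITBBS (未名空间). Posts less than a week old show at most 50 characters, older posts at most 500; visit the original thread for the full text.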


k****i
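Posts: 128

On a single node, HBase, Bigtable, and Cassandra all store data as a memtable
plus SSTs. Embedded stores such as LevelDB and RocksDB use a memtable plus SSTs
too, but LevelDB keeps an index in memory while the NoSQL databases above do
not, so a read has to check all the memtables and SSTs (usually optimized with
Bloom filters). Why can't they maintain an index?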
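
The read path described above can be sketched roughly as follows. This is a toy model in Python, not the actual code of HBase, Bigtable, or Cassandra: the names `BloomFilter`, `SSTable`, and `lsm_get` are hypothetical, a dict stands in for each on-disk file, and the Bloom filter sizing is arbitrary.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    """Immutable sorted table; a dict stands in for the on-disk file."""

    def __init__(self, data):
        self.data = dict(sorted(data.items()))
        self.bloom = BloomFilter()
        for key in self.data:
            self.bloom.add(key)
        self.disk_reads = 0  # counts simulated disk accesses

    def get(self, key):
        self.disk_reads += 1
        return self.data.get(key)


def lsm_get(key, memtable, sstables):
    """Look up key in the memtable first, then SSTables newest to oldest."""
    if key in memtable:
        return memtable[key]
    for table in sstables:  # assumed to be ordered newest first
        if not table.bloom.might_contain(key):
            continue  # definitely absent, so skip the disk read
        value = table.get(key)
        if value is not None:
            return value
    return None
```

For example, with `memtable = {"a": 1}` and two tables `SSTable({"c": 30})` (newer) and `SSTable({"b": 2, "c": 3})` (older), `lsm_get("c", ...)` returns the newer value without touching the older table.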

w**z
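Posts: 8232

It takes too much memory.

[k****i wrote:]

: On a single node, HBase, Bigtable, and Cassandra all store data as a memtable
: plus SSTs. Embedded stores such as LevelDB and RocksDB use a memtable plus SSTs
: too, but LevelDB keeps an index in memory while the NoSQL databases above do
: not, so a read has to check all the memtables and SSTs (usually optimized with
: Bloom filters). Why can't they maintain an index?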

k****i
Posts: 128

Then why does LevelDB add an index?

[w**z wrote:]

: It takes too much memory.

w**z
Posts: 8232

I don't know much about LevelDB; I am only familiar with Cassandra. There are
several knobs you can turn in Cassandra:
Key Cache
Row Cache
Bloom filter
index_interval within the SSTable (each SSTable does have an index to
speed up lookups)
It's a trade-off between memory consumption and read performance. For
Cassandra, you don't want to use more than 8 GB of heap, to avoid long GC
pauses. Cassandra is built so that horizontal scaling is easy and efficient.
An individual node normally holds less than 1 TB of data. Of course, it
depends on your use case.

[k****i wrote:]

: Then why does LevelDB add an index?

p*****2
Posts: 21240

NoSQL isn't a single kind of database; you can't compare them like this.

[k****i wrote:]

: Then why does LevelDB add an index?

w**z
Posts: 8232

One more thing to add: if compaction on a Cassandra node is not too far behind,
the number of SSTables touched by each read should be fewer than 5, or even 3.
It also depends on the compaction strategy. So Bloom filters plus the SSTable
index are sufficient.
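
As a back-of-the-envelope illustration of why Bloom filters keep the number of SSTable reads low, here is a small Python sketch. It assumes a simplified model that is not taken from Cassandra itself: every SSTable's filter is consulted, the key lives in exactly one SSTable, and false positives are independent. The function name and the false-positive rate in the usage note are hypothetical.

```python
def expected_sstable_reads(num_sstables, false_positive_rate, key_exists=True):
    """Expected SSTable disk probes for one point read under a simple model.

    Assumes the key lives in at most one SSTable and that Bloom filter
    false positives are independent across SSTables.
    """
    if key_exists:
        # One real hit plus false positives from the remaining tables.
        return 1 + (num_sstables - 1) * false_positive_rate
    # Key is absent everywhere: only false positives cause disk probes.
    return num_sstables * false_positive_rate
```

Under these assumptions, 5 SSTables with a 1% false-positive rate cost about 1.04 probes for an existing key, compared with 5 probes if every SSTable had to be read.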

[w**z wrote:]

: I don't know much about LevelDB; I am only familiar with Cassandra. There are
: several knobs you can turn in Cassandra:
: Key Cache
: Row Cache
: Bloom filter
: index_interval within the SSTable (each SSTable does have an index to
: speed up lookups)
: It's a trade-off between memory consumption and read performance. For
: Cassandra, you don't want to use more than 8 GB of heap, to avoid long GC
: pauses. Cassandra is built so that horizontal scaling is easy and efficient.

B*****g
Posts: 34098

So the expert works at DataStax?

[w**z wrote:]

: One more thing to add: if compaction on a Cassandra node is not too far behind,
: the number of SSTables touched by each read should be fewer than 5, or even 3.
: It also depends on the compaction strategy. So Bloom filters plus the SSTable
: index are sufficient.

k****i
Posts: 128

I mean the NoSQL databases that are based on the LSM idea.

[p*****2 wrote:]

: NoSQL isn't a single kind of database; you can't compare them like this.

k****i
Posts: 128

Three to five disk reads for each read request can be a huge number.

[w**z wrote:]

k****i
Posts: 128

If we have an index for each SSTable, why not combine them into one? What is
the difference in memory consumption between the two approaches?

[w**z wrote:]

w**z
Posts: 8232

You can have a key cache, which is typically 10% of your heap. 800 MB can
store a lot of keys.
For us, the 99th percentile read latency is < 10 ms and the 50th percentile is
< 3 ms. If you are looking for something sub-millisecond, Cassandra is not for
you.
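
To put a rough number on "a lot of keys", here is a small Python estimate. The per-entry size of 100 bytes is a made-up illustrative figure, not a Cassandra constant, and the function name is hypothetical; real entry sizes depend on key length and version.

```python
def key_cache_capacity(heap_bytes, cache_percent=10, bytes_per_entry=100):
    """Rough number of key cache entries that fit in a share of the heap.

    bytes_per_entry is an assumed figure for illustration only.
    """
    cache_bytes = heap_bytes * cache_percent // 100
    return cache_bytes // bytes_per_entry
```

With an 8 GiB heap, `key_cache_capacity(8 * 1024**3)` gives roughly 8.6 million entries under that assumption.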

[k****i wrote:]

: Three to five disk reads for each read request can be a huge number.

w**z
Posts: 8232

It doesn't index every key in the SSTable; it uses an index interval of 128 by
default, which is tunable. A read might access more than one SSTable, which
can be optimized with Bloom filters. Combining the indexes from different
SSTables in memory would make things very complicated. Remember that SSTables
are merged and rewritten during compaction. Maintaining a global index is
not practical.
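
The idea of a sampled index with an interval of 128 can be sketched in Python as below. This is only an illustration of the technique, not Cassandra's implementation: the class `SampledIndex` is hypothetical, and a plain sorted list stands in for the on-disk index of one SSTable.

```python
import bisect


class SampledIndex:
    """Keep every `interval`-th key of a sorted key list in memory."""

    def __init__(self, sorted_keys, interval=128):
        self.sorted_keys = sorted_keys  # stands in for the on-disk index
        self.interval = interval
        self.samples = sorted_keys[::interval]  # the in-memory sample

    def find(self, key):
        """Return the position of key in sorted_keys, or None if absent."""
        # Binary search for the last sampled key that is <= key.
        slot = bisect.bisect_right(self.samples, key) - 1
        if slot < 0:
            return None  # key sorts before the first sampled key
        start = slot * self.interval
        end = min(start + self.interval, len(self.sorted_keys))
        # Simulated short sequential scan of at most `interval` entries.
        for pos in range(start, end):
            if self.sorted_keys[pos] == key:
                return pos
        return None
```

For 10,000 keys the in-memory sample holds only 79 entries, at the cost of scanning up to 128 index entries per lookup. A larger interval saves memory and lengthens the scan, which is the trade-off discussed in this thread.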

[k****i wrote:]

: If we have an index for each SSTable, why not combine them into one? What is
: the difference in memory consumption between the two approaches?

w**z
Posts: 8232

No. It was just for work: I built a few Cassandra clusters from scratch, and
afterwards I was forced to maintain them too.

[B*****g wrote:]

: So the expert works at DataStax?

c******o
Posts: 1277

I find Cassandra much easier to maintain than MongoDB.

[w**z wrote:]

: No. It was just for work: I built a few Cassandra clusters from scratch, and
: afterwards I was forced to maintain them too.

g*****g
Posts: 34805

All the peer-to-peer ones are relatively easy.

[c******o wrote:]

: I find Cassandra much easier to maintain than MongoDB.

w**z
Posts: 8232

It's OK in most cases. But there were a few dramatic moments when doing
repairs.

[c******o wrote:]

: I find Cassandra much easier to maintain than MongoDB.

p*****2
Posts: 21240

What's the trouble with Mongo?

[c******o wrote:]

: I find Cassandra much easier to maintain than MongoDB.
