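This page is an excerpt and archive of the corresponding thread on MITBBS (未名空间); posts less than a week old show at most 50 characters, older posts show up to 500.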
Java board - Twitter Search is Now 3x Faster using Java server
T*o
Posts: 363
r*****l
Posts: 2859
2
"changing our back-end from MySQL to a real-time version of Lucene"
This may contribute quite a lot to the performance gain.

[T*o wrote:]
: http://engineering.twitter.com/2011/04/twitter-search-is-now-3x
g*****g
Posts: 34805
3
I don't get this part: MySQL is a DB, Lucene is a search engine.
How is one replaceable by the other?

[r*****l wrote:]
: "changing our back-end from MySQL to a real-time version of Lucene"
: This may contribute quite a lot to the performance gain.

g*****g
Posts: 34805
4
Reading the blog, it seems they got this by changing the architecture
from synchronous to asynchronous mode; that's where most of the
gain comes from. They also imply that Ruby on Rails was becoming unmaintainable
for this kind of change, or lacked NIO libraries. I am surprised they
didn't do it in Scala, though.

[T*o wrote:]
: http://engineering.twitter.com/2011/04/twitter-search-is-now-3x
l******e
Posts: 12192
5
indexing?

[g*****g wrote:]
: I don't get this part: MySQL is a DB, Lucene is a search engine.
: How is one replaceable by the other?

i**e
Posts: 6810
6
NIO processes web requests asynchronously. Is there any web server that can
fetch data from the back end asynchronously and then return to the original
socket connection to serve the page?

[g*****g wrote:]
: Reading the blog, it seems they got this by changing the architecture
: from synchronous to asynchronous mode; that's where most of the
: gain comes from. They also imply that Ruby on Rails was becoming unmaintainable
: for this kind of change, or lacked NIO libraries. I am surprised they
: didn't do it in Scala, though.

F****n
Posts: 3271
7
If you think of a DBMS as nothing but indexing, Lucene has its own index
management and access mechanism, which is much faster than other DBs for
Lucene's own specific tasks.
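
To make the indexing point concrete, here is a toy inverted index in Java. It is only a sketch of the general idea (term to document IDs, so a keyword lookup never scans every row); it is not how Lucene is actually implemented, and the class TinyInvertedIndex and its methods are made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy class, not part of Lucene or any other library.
final class TinyInvertedIndex {
    // term -> sorted IDs of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Splits the text into lowercase terms and records docId under each term. */
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Returns the IDs of documents containing every query term (AND semantics). */
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : query.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue;
            }
            Set<Integer> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);   // first term: start from its postings
            } else {
                result.retainAll(docs);         // later terms: intersect
            }
        }
        return result == null ? Collections.emptySet() : result;
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1, "Twitter search is now faster");
        index.add(2, "Lucene is a search library");
        index.add(3, "MySQL is a database");
        System.out.println(index.search("search is")); // prints [1, 2]
    }
}
```

A relational table would typically answer the same query with a scan or a general-purpose B-tree index; a search engine builds this kind of term-oriented structure up front, which is the sense in which it can stand in for the database on search workloads.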

[g*****g wrote:]
: I don't get this part: MySQL is a DB, Lucene is a search engine.
: How is one replaceable by the other?

r*****l
Posts: 2859
8
Yes. My feeling is that the index engine and the new architecture help directly.
The title implies Java is the main reason, though.

[g*****g wrote:]
: Reading the blog, it seems they got this by changing the architecture
: from synchronous to asynchronous mode; that's where most of the
: gain comes from. They also imply that Ruby on Rails was becoming unmaintainable
: for this kind of change, or lacked NIO libraries. I am surprised they
: didn't do it in Scala, though.

z***e
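Posts: 5393
9
Implement it yourself. The web server is just the front end that accepts
requests; after that you handle things on your own.
The principle isn't complicated: give each connection an ID, then do whatever
you like in the background, and when the data comes back, write it back by ID,
just like a proxy server.
Or the client drops the connection after sending the request and keeps polling
to fetch the data.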
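
The "give each connection an ID" idea can be sketched roughly as below. This is a minimal illustration under assumptions: the client connection is modeled as a callback, and PendingRequests, register, complete and inFlight are hypothetical names, not taken from any real web server or from Twitter's code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical sketch: a registry of requests that are waiting for back-end data.
final class PendingRequests {
    private final AtomicLong nextId = new AtomicLong();
    // request ID -> callback that writes the reply to the waiting client connection
    private final Map<Long, Consumer<String>> pending = new ConcurrentHashMap<>();

    /** Parks the client's reply callback and returns the ID to send to the back end. */
    long register(Consumer<String> replyToClient) {
        long id = nextId.incrementAndGet();
        pending.put(id, replyToClient);
        return id;
    }

    /** Called when a back-end reply arrives; returns false for unknown or duplicate IDs. */
    boolean complete(long id, String data) {
        Consumer<String> replyToClient = pending.remove(id);
        if (replyToClient == null) {
            return false;
        }
        replyToClient.accept(data);
        return true;
    }

    /** Number of requests still waiting for a back-end reply. */
    int inFlight() {
        return pending.size();
    }

    public static void main(String[] args) {
        PendingRequests registry = new PendingRequests();
        long id = registry.register(data -> System.out.println("write to client: " + data));
        // The accepting thread is free at this point; later a back-end reply shows up:
        registry.complete(id, "search results");
    }
}
```

Because the map is the only shared state, the thread that accepts a request and the thread that receives the reply do not have to be the same one. A real server would also need timeouts to evict entries whose reply never arrives.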

[i**e wrote:]
: NIO processes web requests asynchronously. Is there any web server that can
: fetch data from the back end asynchronously and then return to the original
: socket connection to serve the page?

g*****g
Posts: 34805
10
In Java terms, they create a Future in the servlet and block
on the Future until it returns. Inside the Future, they do all kinds of
async processing. On a loaded system, fewer CPU cycles are wasted
blocking on I/O, and they can achieve better throughput.
They don't really use a servlet, though; that part is in RoR.
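
A minimal sketch of the pattern described above, assuming plain java.util.concurrent rather than a servlet container or Twitter's actual code; BlockingOnFutures, queryBackend and handleRequest are hypothetical names, and the back end is faked with a sleep. The handler thread still blocks on each Future, but the back-end calls overlap instead of running one after another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of "create Futures, then block on them" in a request handler.
final class BlockingOnFutures {
    // Stand-in for a slow back-end call (assumption: it just sleeps 100 ms).
    static String queryBackend(String name) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return name + "-result";
    }

    /** Starts all back-end calls first, then blocks until each one has answered. */
    static List<String> handleRequest(ExecutorService pool, List<String> backends) {
        List<Future<String>> futures = new ArrayList<>();
        for (String backend : backends) {
            futures.add(pool.submit(() -> queryBackend(backend)));
        }
        List<String> results = new ArrayList<>();
        for (Future<String> future : futures) {
            try {
                results.add(future.get(2, TimeUnit.SECONDS)); // the handler blocks here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException | TimeoutException e) {
                throw new IllegalStateException(e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(handleRequest(pool, List.of("index", "users", "ads")));
        } finally {
            pool.shutdown();
        }
    }
}
```

With enough pool threads the three calls run concurrently, so the handler waits roughly as long as the slowest call rather than the sum of all three.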

[i**e wrote:]
: NIO processes web requests asynchronously. Is there any web server that can
: fetch data from the back end asynchronously and then return to the original
: socket connection to serve the page?

s***o
Posts: 2191
11
Will Oracle sue over this?

faster_1656.html

[T*o wrote:]
: http://engineering.twitter.com/2011/04/twitter-search-is-now-3x
i**e
Posts: 6810
12
Yeah, it's probably as you describe. What I want to know is which
(open source) web framework implements this today.

[g*****g wrote:]
: In Java terms, they create a Future in the servlet and block
: on the Future until it returns. Inside the Future, they do all kinds of
: async processing. On a loaded system, fewer CPU cycles are wasted
: blocking on I/O, and they can achieve better throughput.
: They don't really use a servlet, though; that part is in RoR.

i**e
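Posts: 6810
13
The crude approach is probably to put the request (along with the connection,
headers, etc.) into a hashtable, then send an NIO-based asynchronous request
to the back end. But how do you tie that in with the servlet on the front end?
Servlets are synchronous by design, aren't they?

[z***e wrote:]
: Implement it yourself. The web server is just the front end that accepts
: requests; after that you handle things on your own.
: The principle isn't complicated: give each connection an ID, then do whatever
: you like in the background, and when the data comes back, write it back by ID,
: just like a proxy server.
: Or the client drops the connection after sending the request and keeps polling
: to fetch the data.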

i**e
Posts: 6810
14
What you describe still seems to block, only it's blocked on the Future?
How is that different from blocking in memory or in a thread?

[g*****g wrote:]
: In Java terms, they create a Future in the servlet and block
: on the Future until it returns. Inside the Future, they do all kinds of
: async processing. On a loaded system, fewer CPU cycles are wasted
: blocking on I/O, and they can achieve better throughput.
: They don't really use a servlet, though; that part is in RoR.

c******n
Posts: 4965
15
This is essentially the thread-vs-message processing argument.
Cassandra does exactly what you said: every request creates a handler, and
Cassandra shoves it into a huge map keyed by the request ID. When the reply
message comes back, the ID is used to look up the request handler. So overall
there are very few "processor" threads, but there can be many, many more
requests on the queue.

[z***e wrote:]
: Implement it yourself. The web server is just the front end that accepts
: requests; after that you handle things on your own.
: The principle isn't complicated: give each connection an ID, then do whatever
: you like in the background, and when the data comes back, write it back by ID,
: just like a proxy server.
: Or the client drops the connection after sending the request and keeps polling
: to fetch the data.

g*****g
Posts: 34805
16
You can use a plain servlet to hook up Netty or MINA. They use Netty
here.

[i**e wrote:]
: Yeah, it's probably as you describe. What I want to know is which
: (open source) web framework implements this today.

i**e
Posts: 6810
17
Hmm. I must be missing something. In the case we are
discussing, there are two web servers involved: a
front-end server serving web requests, which in turn
calls a back-end server to mash up data.
I thought Netty and MINA used async network handling.
But for the servlet running on the front-end server,
are the requests going to the back-end servers still
blocking?

[g*****g wrote:]
: You can use a plain servlet to hook up Netty or MINA. They use Netty
: here.

g*****g
Posts: 34805
18
HTTP is a request/response protocol. Unless you are using long
polling (a Comet-like framework) in the web layer, it has to be blocking
in the front end. You can, however, do the heavy lifting in another
component.

[i**e wrote:]
: Hmm. I must be missing something. In the case we are
: discussing, there are two web servers involved: a
: front-end server serving web requests, which in turn
: calls a back-end server to mash up data.
: I thought Netty and MINA used async network handling.
: But for the servlet running on the front-end server,
: are the requests going to the back-end servers still
: blocking?

i**e
Posts: 6810
19
Isn't this what they did at Twitter? I think they
made the front end async. When a request is received
by the front end, it sends a request to the back-end service
and continues on. When the back-end response comes back,
someone picks up the response, mashes it up, and
sends it to the original front-end client.
"Creating a fully asynchronous aggregation service.
No thread waits on network I/O to complete."

[g*****g wrote:]
: HTTP is a request/response protocol. Unless you are using long
: polling (a Comet-like framework) in the web layer, it has to be blocking
: in the front end. You can, however, do the heavy lifting in another
: component.

g*****g
Posts: 34805
20
They made the heavy-lifting part async, that's all.
HTTP is a synchronous protocol and you can't
change that. It's not like there's a connection open
and the server can push data to the client whenever it wants.

[i**e wrote:]
: Isn't this what they did at Twitter? I think they
: made the front end async. When a request is received
: by the front end, it sends a request to the back-end service
: and continues on. When the back-end response comes back,
: someone picks up the response, mashes it up, and
: sends it to the original front-end client.
: "Creating a fully asynchronous aggregation service.
: No thread waits on network I/O to complete."

i**e
Posts: 6810
21
The request handler code can be async, though.
Traditionally the request-handling thread is blocked
(as in servlets) waiting for back-end I/O (file system
or network). It sounds like Twitter has made this non-blocking,
which means the thread is freed to do other things. When
the back-end I/O is done, the back-end response thread sends
data back to the front-end client.
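
A rough sketch of such a non-blocking handler, using CompletableFuture from the JDK. Everything here is an assumption for illustration: the back end is simulated with a timer, and AsyncAggregation, callBackend and handleRequest are hypothetical names rather than Twitter's or Netty's API. The point is that handleRequest returns immediately, and the code that builds the page runs as a callback once both replies are in.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of an aggregation handler in which no thread waits on I/O.
final class AsyncAggregation {
    // Simulated non-blocking back-end call: the timer completes the future later,
    // so no thread sleeps while the "network I/O" is outstanding.
    static CompletableFuture<String> callBackend(
            ScheduledExecutorService timer, String name, long delayMs) {
        CompletableFuture<String> reply = new CompletableFuture<>();
        timer.schedule(() -> { reply.complete(name + "-hits"); }, delayMs, TimeUnit.MILLISECONDS);
        return reply;
    }

    /** Fires two back-end calls and returns at once; the merge runs when both finish. */
    static CompletableFuture<String> handleRequest(ScheduledExecutorService timer, String query) {
        CompletableFuture<String> tweets = callBackend(timer, query + ":tweets", 50);
        CompletableFuture<String> users = callBackend(timer, query + ":users", 80);
        return tweets.thenCombine(users, (t, u) -> t + " + " + u);
    }

    public static void main(String[] args) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        handleRequest(timer, "java")
                .thenAccept(page -> System.out.println("write to client: " + page))
                .whenComplete((ok, err) -> timer.shutdown());
        System.out.println("handler returned; the calling thread is free");
    }
}
```

In this sketch the callback that writes to the client runs on the timer thread, which plays the role of the "back-end response thread" in the post above.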

[g*****g wrote:]
: They made the heavy-lifting part async, that's all.
: HTTP is a synchronous protocol and you can't
: change that. It's not like there's a connection open
: and the server can push data to the client whenever it wants.
