We're considering moving our own Lucene-based distributed index and search
system to Elasticsearch.
In tests with 1 million indexed documents Elasticsearch runs faster, but in
tests with 26 million documents Elasticsearch is much slower.
I ran both tests with the following search configuration: 1 machine with 1
Elasticsearch HTTP node, and 4 machines with 1 Elasticsearch data node per
machine (1 Elasticsearch index with 2 shards and 2 replicas).
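For reference, the index layout described above (2 shards, 2 replicas) could
be expressed as index settings along these lines. This is only a sketch: it
assumes "2 replicas" means `number_of_replicas: 2` (two extra copies of each
primary shard), and the helper name `total_shard_copies` is made up for
illustration.

```python
import json

# Index settings for the layout described in the post.
# Assumption: "2 replicas" means two replica copies of each primary shard.
index_settings = {
    "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 2,
    }
}


def total_shard_copies(settings):
    """Return primaries plus replicas: every shard copy the cluster must host."""
    s = settings["settings"]
    return s["number_of_shards"] * (1 + s["number_of_replicas"])


if __name__ == "__main__":
    # Body that would be sent when creating the index.
    print(json.dumps(index_settings, indent=2))
    print("shard copies to place on the data nodes:",
          total_shard_copies(index_settings))
```

Under that assumption there are 6 shard copies spread over the 4 data nodes,
and a single search only needs one copy of each of the 2 shards.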
I improved Elasticsearch performance by 40% by switching Java on all my
machines to 64-bit and setting the JVM memory to 8 GB.
On Sunday, March 24, 2013 3:29:04 PM UTC+2, Ophir Michaeli wrote:
Hi,
We're considering moving our own lucene based distributed index and search
system to elasticsearch.
From tests I ran for 1 million indexed documents I get that elasticsearch
is running faster, but when running tests on 26 million I get that
elasticsearch is much slower.
I ran the 1 million and 25 million tests in the following search
configuration: 1 machine with 1 elasticsearch http node, 4 machines with 1
elasticsearch data node per machine (1 elasticsearch index with 2 shards
and 2 replicas).
A search of 100 words takes 3m:48s when the maximum number of returned
results is 2000 per search word, while the run time drops to only 12 seconds
when the maximum is 20 per search word.
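The "max returned results" setting presumably corresponds to the `size`
parameter of the search request. Here is a rough sketch of what the two runs
ask for, assuming one request per search word, a `match` query, and a
hypothetical field called `text` (none of these are confirmed in the thread):

```python
def build_search_body(word, max_results, field="text"):
    """Build a search request body for one word.

    `field` is a hypothetical field name; `size` caps the number of hits
    returned for this request.
    """
    return {
        "query": {"match": {field: word}},
        "size": max_results,
    }


def max_hits_fetched(num_words, max_results):
    """Upper bound on hits returned across all per-word searches."""
    return num_words * max_results


if __name__ == "__main__":
    print(build_search_body("lucene", 2000))
    print(max_hits_fetched(100, 2000))  # 200000
    print(max_hits_fetched(100, 20))    # 2000
```

With 100 words, the first run can return up to 200,000 hits in total and the
second at most 2,000, so part of the gap between 3m:48s and 12 seconds may
simply be the cost of fetching and serializing 100 times more documents.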
The same 100-word search with Lucene takes 55s when the maximum number of
returned results is 2000 (compared with 3m:48s in Elasticsearch).
To be honest, 55 seconds is not acceptable for 100 terms even in plain
Lucene. Can you explain your use case a bit? This sounds odd.
Also, what does a 100-word query mean: disjunction or conjunction? Are you
paging deeply, how many shards are you using, and which fetch method /
search type?
simon
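To make the disjunction / conjunction and paging questions concrete, here is
a sketch of the two shapes a 100-word query could take as a single bool
query. The field name `text` and the helper `build_bool_query` are
hypothetical; the thread does not say how the queries were actually built.

```python
def build_bool_query(words, mode, field="text", offset=0, size=10):
    """Combine per-word term queries into one bool query.

    mode="disjunction": any word may match (bool `should`).
    mode="conjunction": every word must match (bool `must`).
    `offset` and `size` map to the `from` / `size` paging parameters;
    a large `offset` is what "paging deeply" refers to.
    `field` is a hypothetical field name.
    """
    clauses = [{"term": {field: w}} for w in words]
    if mode == "disjunction":
        bool_query = {"should": clauses}
    elif mode == "conjunction":
        bool_query = {"must": clauses}
    else:
        raise ValueError("mode must be 'disjunction' or 'conjunction'")
    return {"query": {"bool": bool_query}, "from": offset, "size": size}


if __name__ == "__main__":
    words = ["word%d" % i for i in range(100)]
    print(build_bool_query(words, "disjunction", size=2000)["size"])
    print(len(build_bool_query(words, "conjunction")["query"]["bool"]["must"]))
```

A disjunction over 100 terms has to score every document that contains any
of the words, while a conjunction can skip documents as soon as one word is
missing, so the two can behave very differently on a large index.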
On Monday, March 25, 2013 10:06:14 AM UTC+1, Ophir Michaeli wrote:
Same search of 100 words with Lucene takes 55s if the max returned results
are 2000 (while it takes 3m:48s in elasticsearch).
Thanks!
I improved the time dramatically by moving from a single-threaded client to
one that uses 50 threads.
A 100-word search on 26 million indexed words now takes 9 seconds using
Elasticsearch (4 nodes, 2 shards, 2 replicas).
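The change from 1 to 50 client threads amounts to issuing the per-word
searches concurrently instead of one after another. Below is a minimal sketch
of that pattern in Python. The original client is not shown in the thread,
and `search_word` is only a stand-in for whatever call actually sends the
request.

```python
from concurrent.futures import ThreadPoolExecutor


def search_word(word):
    """Placeholder for one search request; a real client call would go here."""
    return {"word": word, "hits": []}


def search_all(words, num_threads=50, search_fn=search_word):
    """Run one search per word on a pool of worker threads.

    Results come back in the same order as `words`.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(search_fn, words))


if __name__ == "__main__":
    words = ["word%d" % i for i in range(100)]
    results = search_all(words)
    print(len(results))  # 100
```

With a single thread the total time is roughly the sum of all request
latencies; with a pool, requests overlap and the cluster can work on several
of them at once.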