Slow elasticsearch results when data is big

Hi,

We're considering moving our own Lucene-based distributed index and search
system to elasticsearch.

In my tests with 1 million indexed documents elasticsearch is faster, but
with 26 million indexed documents elasticsearch is much slower.

I ran the 1 million and 26 million tests in the following search
configuration: 1 machine with 1 elasticsearch http node, 4 machines with 1
elasticsearch data node per machine (1 elasticsearch index with 2 shards
and 2 replicas).

Thanks, Ophir

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I improved elasticsearch performance by 40% by switching the Java on all my
machines to 64-bit and setting the JVM memory to 8 GB.

On Sunday, March 24, 2013 3:29:04 PM UTC+2, Ophir Michaeli wrote:

[...]

A search of 100 words takes 3m:48s when the maximum number of returned
results is 2000 per search word, but the run time drops to only 12 seconds
when the maximum is 20 results per search word.

The same search of 100 words with Lucene takes 55s when the maximum number
of returned results is 2000 (while it takes 3m:48s in elasticsearch).
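
For scale, the totals reported so far can be turned into per-word averages. This is only a rough sketch: it assumes the 100 searches were issued one after another, and the labels and helper name are made up for illustration.

```python
# Totals reported in this thread, in seconds, for 100 single-word searches.
totals = {
    "elasticsearch, max 2000 results": 3 * 60 + 48,  # 3m:48s
    "elasticsearch, max 20 results": 12,
    "lucene, max 2000 results": 55,
}
WORDS = 100


def per_word_seconds(total_seconds, words=WORDS):
    """Average wall-clock time per searched word (hypothetical helper)."""
    return total_seconds / words


averages = {name: per_word_seconds(t) for name, t in totals.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.2f} s per word")
```

Under that assumption, most of the gap between the two elasticsearch runs comes from how many results are returned per word rather than from the number of words.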

To be honest, 55 seconds is not acceptable for 100 terms even in plain
Lucene. Can you explain your use case a bit? This sounds odd.
Also, what does a 100-word query mean: a disjunction or a conjunction? Are
you paging deeply, how many shards are you using, and which fetch method /
search type?
simon

On Monday, March 25, 2013 10:06:14 AM UTC+1, Ophir Michaeli wrote:

[...]

The Lucene test has the following configuration: 26 million indexed
documents divided between 2 Lucene instances, which are searched at the
same time.

A 100-word search means that each word is searched separately. No paging.

For the elasticsearch test I use 2 shards and 2 replicas.

The JSON is something like this:

http://localhost:9200/pinterest_index/_search

{
  "size": 2000,
  "query": {
    "query_string": {
      "query": "air",
      "fields": [
        "board^5",
        "user^1",
        "description^10"
      ],
      "analyzer": "snowball",
      "phrase_slop": 1000.0
    }
  },
  "fields": [
    "iDPin",
    "iDPicture"
  ]
}
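
To make it easier to rerun the same request for each word and for different result limits, the body above could be generated from a small helper. This is a sketch using only the Python standard library; `build_body` and `search` are hypothetical names, the Content-Type header and the timeout are assumptions, and only the URL, field names, and parameters come from the request above.

```python
import json
import urllib.request

# URL taken from the request shown in this message.
SEARCH_URL = "http://localhost:9200/pinterest_index/_search"


def build_body(word, size=2000):
    """Build the search body from this message for one word (hypothetical helper)."""
    return {
        "size": size,  # maximum number of hits returned for this word
        "query": {
            "query_string": {
                "query": word,
                "fields": ["board^5", "user^1", "description^10"],
                "analyzer": "snowball",
                "phrase_slop": 1000.0,
            }
        },
        "fields": ["iDPin", "iDPicture"],
    }


def search(word, size=2000, url=SEARCH_URL, timeout=60):
    """POST one search and return the decoded JSON response.

    Requires a reachable elasticsearch node; the header is an assumption.
    """
    data = json.dumps(build_body(word, size)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building a body does not need a running cluster.
print(json.dumps(build_body("air", size=20), indent=2))
```

Calling `search("air", size=20)` against a running node would then send the same request with the smaller result limit.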

Thanks!

I improved the time dramatically by moving from a client that uses 1 thread
to one that uses 50 threads.

Now a 100-word search on 26 million indexed documents takes 9 seconds using
elasticsearch (4 nodes, 2 shards, 2 replicas).
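
The change from one client thread to many can be sketched roughly as follows. This is not the actual test client: `fake_search` is a stand-in that only sleeps to mimic waiting on the network, and `search_all` is a hypothetical helper; only the 50-thread and 100-word figures come from the message above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_search(word):
    """Stand-in for a real search call; sleeps briefly to mimic network wait."""
    time.sleep(0.01)
    return word, len(word)


def search_all(words, search_fn, max_workers=50):
    """Run search_fn for every word on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(search_fn, words))


words = [f"word{i}" for i in range(100)]
start = time.perf_counter()
results = search_all(words, fake_search)
elapsed = time.perf_counter() - start
print(f"{len(results)} searches in {elapsed:.2f} s")
```

With a single thread each search waits for the previous one to finish; with a pool, many requests are in flight at once, so the cluster's shards and replicas can be kept busy in parallel.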
