Performance tuning ES for in-memory

Hi, I downloaded the latest ES, 1.1.1.

I have a machine with 200GB of RAM, 2 x 8 hyper-threaded cores ("32" logical
cores), and 1.6TB of disk space.

I start Elasticsearch as follows...

./elasticsearch -Xms100g -Xmx100g -Des.index.store.type=memory
Using Java 1.7_51

I then create my index as follows...

$ curl -XPUT http://localhost:9200/myindex/ -d '
index:
  store:
    type: memory
'

And my Java web app (using vertx.io):

// On app startup... ensure we have one instance of the client,
// regardless of how many app threads may write to the index.
synchronized (clientCreated) {
    if (clientCreated.compareAndSet(false, true)) {
        node = nodeBuilder().clusterName("elasticsearch").client(true).node();
        client = node.client();
    }
}

// Per request coming into my web application (using vertx for the web framework).
// For each request we use the one client instance.
client.prepareIndex("myindex", "doc", request.getString("id"))
    .setSource(bodyStr) // already sending JSON, so no need to convert it!
    .execute(new ActionListener<IndexResponse>() {

        @Override
        public void onFailure(Throwable t) {
            req.response().end("Error: " + t.toString());
        }

        @Override
        public void onResponse(IndexResponse res) {
            req.response().end(res.getIndex());
        }
    });

Both the webapp and ES are running on the same server, so all write/read
requests go over localhost.

Testing as follows:

JMeter (100 users, running on my desktop) --- remote ---> WebApp --- localhost ---> ES

I get about 6000 writes/sec, and it seems to get lower as the number of docs
indexed increases.
Average request/response latency is about 15-20ms.
Network time / JMeter data generation (each document is about 1000 bytes) /
web app overhead is about 5ms; I know this because I also have a simple
hello-world response to test the average latency of those 3 "parameters".
So it seems that in-memory indexing takes 15ms on average. I would think ES
can do much better than that?

Are there any tuning settings I can try for a strictly in-memory index?

Thanks


For starters, Java doesn't use compressed pointers (compressed oops) for heaps
over 32GB; you are way over that limit and losing efficiency there.
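A quick way to check this (assuming a HotSpot JVM; the exact output format
varies by version) is to ask the JVM for the effective flag value at a given
heap size:

java -Xmx100g -XX:+PrintFlagsFinal -version | grep UseCompressedOops   # expect "false" at 100g
java -Xmx31g  -XX:+PrintFlagsFinal -version | grep UseCompressedOops   # expect "true" at or below ~31g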

What version of Java are you using?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com


I wrote in my post: 1.7_51.

And the docs seem to mention that an in-memory index can be as big as RAM.

And I have run an app up to 196GB with another "indexing" API called cq-engine.


1.7_51, but I don't see how there could be a limitation.

I have used Java heaps up to 200GB easily, with no issues either...


The ES "memory" or "ram" store (Lucene RAMDirectory) puts enormous pressure
on JVM garbage collection.

You can not expect that standard JVM with CMS GC can give the best
performance.

More info in this great article by Mike McCandless

Maybe Java 8 with G1 GC is giving slightly better numbers. But do not
expect too much.

Jörg
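For reference, a rough sketch of how G1 could be tried with the 1.x startup
scripts, assuming bin/elasticsearch.in.sh is edited so its default CMS flags
don't conflict with G1:

# remove/comment the -XX:+UseParNewGC and -XX:+UseConcMarkSweepGC lines in
# bin/elasticsearch.in.sh first, otherwise the JVM rejects the conflicting collectors
export ES_HEAP_SIZE=31g            # staying under the compressed-oops limit
export ES_JAVA_OPTS="-XX:+UseG1GC"
./bin/elasticsearch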


On a 32 core machine? Plus I think 1.7_51 uses G1.

I have tested another "indexing" API up to 190GB or so with 30,000,000
objects, and my latency was 3ms overall, including network and app logic.

And I haven't tested that many records with Elasticsearch yet :wink:


Ok, so I decided to skip in-memory for now, just to test basic functionality.

I'm running Elasticsearch with defaults, as:

./elasticsearch -Xms32g -Xmx32g

I also got bigdesk installed.

Either I'm not getting something... but why, as I write more documents to the
index, does "Indexing requests per second (Δ)" go down while "Indexing time
per second (Δ)" goes up?

So basically it's getting slower and slower.

Are there any sensible tuning parameters? I would expect the insertion rate to
be stable, not getting slower.

It's a 32 core machine with enough RAM, standard drives though. There has to
be a way to set up buffers and queues to alleviate the issue of the disks
"being slow".


High sustainable bulk indexing is very stable here.

I have 3x HP DL165 G7 32-core machines and can index for hours at the same
speed with these settings (gist: "Elasticsearch configuration for high
sustainable bulk feed").

Jörg


Hi J


At what rate, though? Segment merges will eventually slow you down if you
have but a single index, won't they?

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


The right segment merge settings balance resources and speed up indexing.

I have a mixed search/index workload; with these settings I run ~10k term
queries per second and bulk indexing of 4k docs per second. After 91 minutes,
an index of 22 million docs is created.

With the default segment merge settings of ES 1.1.0 (which had serial segment
merging; 1.1.1 has concurrent segment merging again), the same routine runs
for many hours: it starts fast, but then gets slower.

Jörg
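The gist itself isn't reproduced in the thread; as a rough sketch, the kind of
index-level settings such a bulk-feed configuration typically adjusts on ES
1.x looks like this (the values below are illustrative guesses, not the
gist's):

$ curl -XPUT http://localhost:9200/myindex/ -d '
{
  "settings": {
    "number_of_replicas": 0,
    "refresh_interval": "30s",
    "merge.scheduler.max_thread_count": 4,
    "translog.flush_threshold_size": "1gb"
  }
}'

# and node-level, in config/elasticsearch.yml:
indices.memory.index_buffer_size: 30%
indices.store.throttle.max_bytes_per_sec: 100mb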


Hi Jörg,


What about after 2-3-4 days? Still indexing at the same rate with just 1
index?

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


I've had loads of issues with hard-to-reproduce out-of-memory errors using
es.index.store.type=memory. It seems like a nice idea, but it's rather
fragile. We were using it for tests; in the end, after some feedback on this
mailing list, I made our tests create a random directory and used a
file-based approach instead.

In any case, you should probably be using bulk indexing instead of feeding
Elasticsearch single-document updates. When bulk indexing, Elasticsearch
scales a lot better. Alternatively, you might want to use something like
Logstash in combination with Redis to do this for you: you simply write your
new documents to Redis and have Logstash index whatever gets written there.
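For the Java client shown in the original post, a minimal sketch of what bulk
indexing could look like with the 1.x BulkProcessor helper (the client
variable and the 1000-doc/5MB/1s thresholds are just illustrative
assumptions):

import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.common.unit.ByteSizeUnit;
import org.elasticsearch.common.unit.ByteSizeValue;
import org.elasticsearch.common.unit.TimeValue;

// Build once at startup and share it, like the client itself.
BulkProcessor bulk = BulkProcessor.builder(client, new BulkProcessor.Listener() {
    @Override
    public void beforeBulk(long executionId, BulkRequest request) { }

    @Override
    public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
        if (response.hasFailures()) {
            // log response.buildFailureMessage() somewhere
        }
    }

    @Override
    public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
        // log the failure
    }
})
.setBulkActions(1000)                               // flush every 1000 docs...
.setBulkSize(new ByteSizeValue(5, ByteSizeUnit.MB)) // ...or every ~5 MB...
.setFlushInterval(TimeValue.timeValueSeconds(1))    // ...or at least every second
.setConcurrentRequests(2)                           // allow 2 bulks in flight
.build();

// Per web request: just add; the processor batches and sends asynchronously.
bulk.add(client.prepareIndex("myindex", "doc", request.getString("id"))
               .setSource(bodyStr)
               .request());

Note that with batching you no longer get a per-document response to end each
HTTP request with, so the handler would have to reply before the document is
actually acknowledged (or keep single index requests for the synchronous
path).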

IMHO, support for es.index.store.type=memory should probably be removed, or
at least strongly discouraged, unless it can be properly stabilized. Part of
the problem is that the memory for in-memory indices lives outside the normal
Java heap (direct memory), and things get quite hairy when you run out of
that.

Jilles


Data ready for ES is limited here; for more real data I would need to ramp up
something like a Freebase feeder (https://developers.google.com/freebase/data)
for a single node, with the index on 4 x 2TB spindle disks.

I don't have much SSD available for long runs right now. More intensive tests
over days would require synthetic data (docs with random data in random
fields, based on statistical models).

Jörg


Yeah, I think long-running indexing is where you start seeing segment merges
in action. Time-based sharding is not only good for making queries faster;
it's essential for keeping indexing fast as well.

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


Yes, I see thousands of segment merges while bulk indexing.

Time-based sharding does not automatically make indexing or queries faster;
in fact, the fewer segments, the less CPU/IO is required and the faster
search and merging are, as long as segment files fit into memory and CPU
cores are able to run threads on the segments concurrently.

Sharding is an easy method to achieve scaling out, but scaling out does not
necessarily mean all things go faster. Many servers with many CPU cores can
easily handle small segments, but the higher the number of segments, and the
larger the segment tier size, the slower search becomes, and indexing also
gets slower if segment merging is not able to keep up with bulk indexing.

Jörg
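To actually watch this while indexing, per-shard segment counts and cumulative
merge stats can be pulled from the 1.x APIs (the index name myindex is assumed
from earlier in the thread):

# how many segments each shard currently has
$ curl 'http://localhost:9200/myindex/_segments?pretty'

# cumulative merge activity is in the "merges" section of the stats response
$ curl 'http://localhost:9200/myindex/_stats?pretty'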
