Performance issues when sending documents to multiple indexes at the same time

We are experiencing some performance issues, or anomalies, with Elasticsearch on a
system we are currently building.

The requirements:

We need to capture data for multiple customers, who will query and
report on it on a near-real-time basis. All the documents received have
the same format with the same properties and a flat structure (all
fields are of a primitive type, no nested objects). We want to keep each
customer’s information separate from the others'.

Frequency of data received and queried:

We receive data for each customer at a fluctuating rate of 200 to 700
documents per second, with the peak in the middle of the day.

Queries will be mostly aggregations over around 12 million documents per
customer: histograms/percentiles to show patterns over time, plus the
occasional raw document retrieval to find out what happened at a particular
point in time. We are aiming to serve 50 to 100 customers at varying insert
rates, from the smallest at around 20 docs/sec to the largest peaking at
1,000 docs/sec for a few minutes.
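
To make the query shape concrete, here is a rough sketch of that kind of date-histogram-plus-percentiles aggregation using the official Python client (the index name, field names and interval are made up for the example, not our real mapping):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# "customer-acme-2015.01.16", "timestamp" and "duration_ms" are placeholders.
resp = es.search(
    index="customer-acme-2015.01.16",
    body={
        "size": 0,  # aggregations only, no raw hits
        "aggs": {
            "over_time": {
                "date_histogram": {"field": "timestamp", "interval": "hour"},
                "aggs": {
                    "latency": {"percentiles": {"field": "duration_ms"}}
                },
            }
        },
    },
)

for bucket in resp["aggregations"]["over_time"]["buckets"]:
    print(bucket["key_as_string"], bucket["latency"]["values"])
```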

How we are storing the data:

Each customer has one index per day. For example, if we have 5 customers,
there will be a total of 35 indexes for the whole week. We break it up per
day because it is mostly the latest two days that get queried, with the
remaining ones queried only occasionally. We also do it that way so we can
delete older indexes independently per customer (some may want to keep 7
days' worth of data, some 14 days').
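
As a sketch of that layout (the naming convention, customer IDs and retention values below are purely illustrative), the daily indexes and the independent clean-up look roughly like this:

```python
from datetime import date, timedelta
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def index_name(customer, day):
    # e.g. "customer-acme-2015.01.16"; the naming convention is hypothetical
    return "customer-%s-%s" % (customer, day.strftime("%Y.%m.%d"))

# Hypothetical per-customer retention, so old indexes can be dropped
# for one customer without touching the others.
retention_days = {"acme": 7, "globex": 14}

today = date.today()
for customer, keep in retention_days.items():
    expired = index_name(customer, today - timedelta(days=keep))
    if es.indices.exists(index=expired):
        es.indices.delete(index=expired)
```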

How we are inserting:

We are sending data in batches of 10 to 2,000 documents, every second. One
document is around 900 bytes raw.
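
For illustration, sending one such batch through the bulk API could look something like this with the official Python client (the client choice, index name, type name and fields are assumptions for the example, not our exact setup):

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

def send_batch(index, docs):
    # docs is the 10-2,000 raw documents (~900 bytes each) collected this second
    actions = (
        {"_index": index, "_type": "event", "_source": doc}  # "event" type is a placeholder
        for doc in docs
    )
    # Returns (number of successfully indexed docs, list of per-document errors)
    return helpers.bulk(es, actions, raise_on_error=False)
```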

Environment:

AWS c3.large, 3 nodes

All indexes are created with 10 shards and 2 replicas for test purposes

Tested on both Elasticsearch 1.3.2 and 1.4.1

What we have noticed:

If I push data to one index only, response time starts at 80 to 100ms per
batch when the insert rate is around 100 documents per second. As I ramp it
up, I can reach 1,600 documents per second before the response time gets
close to 1 second per batch; when I increase it to close to 1,700, it hits a
wall at some point because of the concurrent insertions and the time spirals
to 4 or 5 seconds. That said, if I reduce the rate of inserts, Elasticsearch
recovers nicely. CPU usage increases as the rate increases.

If I push to 2 indexes concurrently, I can reach a total of 1,100 documents
per second, with CPU going up to 93% at around 900 documents per second.

If I push to 3 indexes concurrently, I can reach a total of 150 and CPU
goes up to 95 to 97%. I tried this many times. The interesting thing is that
response time is around 109ms at that point. I can increase the load to 900
and response time will still be around 400 to 600ms, but CPU stays high.
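
For reference, a simplified sketch of the kind of ramp test behind these numbers (the Python client, document generator and rates here are placeholders, not our actual harness):

```python
import time
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

def ramp_step(index, make_doc, docs_per_second, seconds=60):
    """Send one bulk batch per second at a fixed rate and record each batch's latency."""
    timings = []
    for _ in range(seconds):
        batch = [{"_index": index, "_type": "event", "_source": make_doc()}
                 for _ in range(docs_per_second)]
        start = time.time()
        helpers.bulk(es, batch)
        elapsed = time.time() - start
        timings.append(elapsed)
        time.sleep(max(0.0, 1.0 - elapsed))  # keep to roughly one batch per second
    return timings
```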

Question:

Looking at our requirements and the findings above, is this design suitable
for what we need? Are there any tests I can do to find out more? Are there
any settings I need to check (and change)?


You've got too many replicas and shards. One shard per node (maybe 2) and
one replica is enough.

You should be using the bulk API as well.

What's your heap set to?

Also consider combining customers into one index; it'll reduce the work you
need to do.
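
For example, a template along those lines (one primary per node on a 3-node cluster, one replica) could be applied to all the daily indexes; the template name, index pattern and Python client usage below are just an illustration:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Applied automatically to every new daily index matching the pattern,
# so the settings don't have to be repeated at index-creation time.
es.indices.put_template(
    name="customer-daily",
    body={
        "template": "customer-*",      # index name pattern (ES 1.x style)
        "settings": {
            "number_of_shards": 3,     # one primary per node on a 3-node cluster
            "number_of_replicas": 1,
        },
    },
)
```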

Hi Mark.

Thanks for getting back to us. What are the options should we want to keep
our customers' data separate, like a Chinese-wall strategy? Although it is
technically possible to have them together, we have other operational and
business reasons to keep them separate.

I'll try one replica x 3 shards with the setup we have, on one customer
only, and post the findings :slight_smile:

Thanks
