Ingestion speed slows down after some time

Hi All,

We are indexing a continuous stream of data; each record is 22 KB.

In the first hour, we get a rate of 180 records/sec.

After 3 hours, we get 85 records/sec.

After 6 hours, we get 40 records/sec.

According to our observations, the ingestion speed is continuously decreasing. What are the ways to maintain a constant, good ingestion rate?

Please suggest some indexing tuning parameters.

Regards,
Ankit

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Do you see anything in the logs?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs


Hi David,

No, there is not much detail in the log file.

Regards,
Ankit


I've seen that in the past, when my documents were adding a new field to the mapping on each insert.

So I was seeing an 'update mapping' entry every time.

Did you try disabling refresh?
Could it be related to the way you fetch data?
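
If the documents don't need to be searchable immediately, refresh can be disabled during a heavy load and re-enabled afterwards. A sketch using the index settings API (the index name "my-index" is just an illustration):

PUT /my-index/_settings
{
  "index": { "refresh_interval": "-1" }
}

PUT /my-index/_settings
{
  "index": { "refresh_interval": "1s" }
}

A value of -1 disables periodic refresh entirely; "1s" restores the default.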

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Hi David,

Thanks for your reply.

We require the data to be available for searching soon after it is indexed.

Our input doc contains 80 fields, and the number of fields is constant for each doc.

Also, we are using parent/child approach during indexing.

Regards,
Ankit


Here are some of the most important points from my experience:

  • Run your client ingest process on remote hosts, so that more resources
    are available on the ES server side.

  • Use bulk indexing, and check that you have enough heap on the client
    side to build and push the bulk requests. Also use concurrent bulk
    indexing.

  • Start monitoring the ES nodes, especially the server-side heap. The
    more heap and CPU, the higher the throughput per node.

  • Observe segment merging, study how segments grow, and learn to control
    segment size and merge throttling via the ES merge settings. Spread the
    load over many nodes so the segment-merging load is well distributed.

  • Take care that ES nodes can respond within 5 seconds (GC pauses may
    exceed this). Adjust segment merging or concurrency, or raise the
    timeout in the client.

  • Most important of all, use the fastest write disk subsystem you can for
    the ES data path, since most of the time wasted on the server side is
    spent waiting on I/O write responses. Use SSDs and you will never want
    to go back to spinning disks.
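
To illustrate the bulk indexing point, a bulk request in the REST API looks like this (index, type, and field names are illustrative):

POST /_bulk
{ "index": { "_index": "my-index", "_type": "record" } }
{ "field1": "value1", "field2": 42 }
{ "index": { "_index": "my-index", "_type": "record" } }
{ "field1": "value2", "field2": 43 }

Each action line is followed by the document source on the next line, one JSON object per line. A few hundred to a few thousand documents per request is a reasonable starting point; tune by measuring throughput.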

Jörg


I've been observing a slowdown of ingestion speed on my two-node cluster. I've been generating monthly indices of roughly the same size over the past several months, yet ingestion has been getting gradually slower.

I somehow stumbled upon the "close" command to temporarily unload some older indices from memory. The Sense console has a convenient command to do it in batches, so I can "close" extra indices during ingestion time and re-open them when I'm done.

POST /my-index-pattern*/_close

POST /my-index-pattern*/_open

You can also use the command below to see the open/closed status of indices:
GET _cat/indices
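
The output can be narrowed to just the relevant columns with the standard _cat parameters (v adds a header row, h selects columns):

GET _cat/indices?v&h=index,status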

I'm assuming that my system was low on memory when all of the indices were loaded. Does Elasticsearch provide a way to tag certain indices as candidates for auto-unloading and on-demand loading, so I don't have to take these manual steps?

Curator is available to auto-close indices based on their age. As far as I know, there is nothing built into ES that meets your requirement, but ES has client libraries for almost all scripting languages, so you can write a simple script to do this job.