On 28 June 2013 at 09:35, Ankit Jain <ankitj...@gmail.com> wrote:
Hi All,
we are indexing a continuous stream of data; the size of each record is 22
KB.
In the first hour, we get a rate of 180 records/sec.
After 3 hours, the rate drops to 85 records/sec.
After 6 hours, the rate drops to 40 records/sec.
According to our observations, the ingestion speed is continuously
decreasing. What are the ways to maintain a constant, good ingestion rate?
Please suggest some indexing tuning parameters.
Regards,
Ankit
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
Here are some of the most important points from my experience:

- Check whether your client ingest process runs on remote hosts, so that
more resources are available on the ES server side.
- Use bulk indexing, and check that you have enough heap on the client side
to build and push bulk requests. Also use concurrent bulk indexing.
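To illustrate the bulk-indexing point, here is a minimal sketch of how a client can assemble the NDJSON payload that the Elasticsearch `_bulk` endpoint expects (one action line followed by one source line per document, with a trailing newline). The index name `records` and type `record` are illustrative, and the `_type` field reflects the 0.90-era API:

```python
import json

def build_bulk_body(index, doc_type, docs):
    """Build an NDJSON payload for the Elasticsearch _bulk endpoint.

    Each document contributes two lines: an action line naming the
    target index/type, then the document source itself.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The _bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"

body = build_bulk_body("records", "record", [{"field": "value"}])
```

The resulting `body` string can then be POSTed to `/_bulk`; pushing many documents per request (and several such requests concurrently) is what amortizes the per-request overhead.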
- Start monitoring the ES nodes, especially the server-side heap. The more
heap and CPU, the higher the throughput per node.
- Observe segment merging, study how segments grow, and learn to control
the size of segments and throttling with the ES merge API. Spread the load
over many nodes so the segment-merging load is well distributed.
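As one concrete example of merge throttling, the 0.90-era cluster settings API exposed store-level throttle settings that can be updated at runtime; the `20mb` value below is only a placeholder to show the shape of the request, not a recommendation:

```
PUT /_cluster/settings
{
  "transient": {
    "indices.store.throttle.type": "merge",
    "indices.store.throttle.max_bytes_per_sec": "20mb"
  }
}
```

Lowering the limit smooths out merge I/O spikes at the cost of letting segments pile up; raising it lets merges catch up faster but competes with indexing for disk bandwidth.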
- Take care that ES nodes can respond within 5 seconds (GC pauses may
exceed this). Adjust segment merging or concurrency, or change the timeout
in the client.
- And, most important of all, use the fastest write disk subsystem you can
for ES data, since most of the wasted time on the server side is spent
waiting for I/O write responses. Use SSDs and you will never want to go
back to spinning disks.
Jörg
On 28.06.13 09:35, Ankit Jain wrote:
What are the ways to maintain a constant, good ingestion rate?
I've been observing a slowdown of ingestion speed on my two-node cluster. I've been generating monthly indices of roughly the same size over the past several months, yet ingestion has been getting gradually slower.
I somehow stumbled upon the "close" command to temporarily unload some older indices from memory. The Sense console makes it convenient to do this in batches, so I can close extra indices during ingestion and re-open them when I'm done.
POST /my-index-pattern*/_close
POST /my-index-pattern*/_open
You can also use the command below to see the open/close status of indices.
GET _cat/indices
I'm assuming that my system was low on memory when all of the indices were loaded. Does Elasticsearch provide a way to tag certain indices as candidates for auto-unloading and on-demand loading, so I don't have to take these manual steps?
Curator is available to auto-close indices based on their age. AFAIK there is nothing built into ES that meets your requirement, but ES has client libraries for almost all scripting languages, so you can write a simple script to do this job.
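The "simple script" approach can be sketched as follows: pick out the monthly indices older than a cutoff by parsing the date out of the index name, then feed the result to a `_close` call. The `logs-` prefix, the `%Y.%m` name format, and the function itself are all illustrative assumptions about your naming scheme, not part of any ES or Curator API:

```python
from datetime import datetime

def indices_to_close(index_names, cutoff, prefix="logs-", fmt="%Y.%m"):
    """Return the names of monthly indices older than `cutoff`.

    Assumes index names like 'logs-2013.06'; anything that does not
    match the prefix/date format is skipped.
    """
    stale = []
    for name in index_names:
        if not name.startswith(prefix):
            continue
        try:
            when = datetime.strptime(name[len(prefix):], fmt)
        except ValueError:
            continue  # not a dated index; leave it alone
        if when < cutoff:
            stale.append(name)
    return stale

# e.g. close everything before March 2013:
# indices_to_close(["logs-2013.01", "logs-2013.05"], datetime(2013, 3, 1))
# → ["logs-2013.01"]
```

A cron job could run this against the output of `GET _cat/indices` and then issue `POST /<name>/_close` for each stale index, which is essentially what Curator automates.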