Suggestion needed on Indexing Performance

ananth · September 15, 2014, 5:05pm

Hi,

Here are the details of our ES (elasticsearch-1.2.1) cluster:

20 data nodes (physical machines: 2 TB normal disk, 32 GB RAM, 8 CPU cores). Xms and Xmx are both 14 GB; Xmn is 7 GB.
2 master nodes (virtual machines, 16 GB RAM)
Cluster size: ~17 TB (out of 20 * 2 = 40 TB)
Number of records: ~70 billion
Number of indices: ~3500
Number of shards: ~10000

Indices config

We create 100 indices per day and keep each index for a maximum of 30 days.

We serve 100,000 to 150,000 search queries per day.

Problem (Suggestions needed)

For the first 5 to 8 bulk requests (1000 docs per request), indexing time is less than 500 ms.
For the next 5 to 8 bulk requests, response time is 2000 ms to 5000 ms (sometimes 20000 ms or more).
For the next 5 to 8 bulk requests, response time is 1000 ms to 2000 ms.
For the next 5 to 8 bulk requests, response time is again less than 500 ms.

This variation in response time repeats as a cycle, and only on large indices (12 shards).

As a result, it takes nearly 10 to 15 minutes to index a file containing
~400000 (4 lakh) docs into a large index (12 shards).
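
To put the figures above in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes the bulk requests for one file are sent one after another (not in parallel), and the function name is made up for illustration.

```python
def bulk_file_stats(minutes, docs_per_file=400_000, docs_per_bulk=1_000):
    """Average per-request latency and throughput for one file.

    Assumption: bulk requests are sent sequentially, so the total
    time is simply the sum of the individual response times.
    """
    requests = docs_per_file // docs_per_bulk        # number of bulk requests
    total_seconds = minutes * 60
    avg_ms_per_request = total_seconds * 1000 / requests
    docs_per_second = docs_per_file / total_seconds
    return requests, avg_ms_per_request, docs_per_second


for minutes in (10, 15):
    requests, avg_ms, rate = bulk_file_stats(minutes)
    print(f"{minutes} min: {requests} requests, "
          f"{avg_ms:.0f} ms/request on average, {rate:.0f} docs/s")
```

Under that assumption, a file is 400 bulk requests, so 10 to 15 minutes works out to roughly 1500 to 2250 ms per request on average, which fits a mix of the fast and slow phases described above.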

I have not changed any settings related to Lucene merges.

Over time the indexing process falls far behind, so I then set
index.number_of_replicas to 0 on the large indices. (But we need
high availability.)

What makes indexing slow (Lucene merge? ES flush? ES refresh?), and how can I identify the cause?
Are 12 shards too many?
I don't want to set index.number_of_replicas to 0 each time. Any suggestions?
Sometimes, even after I reduce the replicas, indexing does not catch up.
The indexing buffer is 10% of the heap, i.e. 1400 MB, shared by the ~20 shards of
active indices on each node (obviously today's indices).
That gives about 70 MB per shard. Do I need to increase it?
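
For reference, the buffer arithmetic above can be written out as follows. This is a sketch under assumptions: the 10% is taken to be the default of indices.memory.index_buffer_size, the heap is counted as 14000 MB to match the 1400 MB figure, the buffer is assumed to be split evenly across active shards, and the 20% and 30% rows are purely hypothetical values, not settings we have tried.

```python
def index_buffer_per_shard_mb(heap_mb=14_000, buffer_percent=10, active_shards=20):
    """Return (total indexing buffer in MB, buffer per active shard in MB).

    Assumption: the node-level buffer is divided evenly among the
    shards that are actively being indexed into.
    """
    buffer_mb = heap_mb * buffer_percent // 100
    return buffer_mb, buffer_mb / active_shards


# 10 is the current value; 20 and 30 are hypothetical alternatives.
for percent in (10, 20, 30):
    total_mb, per_shard_mb = index_buffer_per_shard_mb(buffer_percent=percent)
    print(f"{percent}% of heap: {total_mb} MB total, {per_shard_mb:.0f} MB per active shard")
```

With the current 10% this gives 1400 MB in total and 70 MB per active shard, matching the numbers above.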

FYI: I have gone through Mike's recent blog post on ES indexing performance.

Regards,
Anantha Govindarajan
