Hi,
Here are the details of our ES (elasticsearch-1.2.1) cluster:
- 20 Data Nodes (physical machines: 2TB normal disk, 32GB RAM, 8 CPU cores). Both Xms & Xmx 14GB. Xmn 7GB
 - 2 Master Nodes (virtual machines, 16GB RAM)
 - Cluster Size ~17 TB (out of 20 * 2 = 40TB)
 - Number of records ~70 billion
 - Number of indices ~3500
 - Number of shards ~10000
 
Indices config
- index.refresh_interval: 40s
 - index.compound_format: true
 - index.compound_on_flush: true
 - index.merge.policy.max_merged_segment: 500mb
 - index.codec.postings_format.no_bloom.type: default
 - indices.cache.filter.size: 1000mb
 - indices.store.throttle.type: merge
 - indices.store.throttle.max_bytes_per_sec: 30mb
 - index.routing.allocation.total_shards_per_node: 2
 
We create 100 indices per day and keep each index for a maximum of 30 days.
- Small indices (1GB to 5GB per day, one shard & one replica)
 - Large indices (95GB to 125GB per day, 12 shards & one replica)
 - We index with bulk requests, 1000 docs per request.
 - Docs are mostly simple tomcat logs / exceptions
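
For illustration, here is a minimal sketch of how batches of 1000 docs could be turned into bulk request bodies (one action line plus one source line per doc, newline-delimited). This is not our actual indexer: the function name `bulk_bodies`, the index name, the type name and the sample docs are all made up for the example.

```python
import json

def bulk_bodies(docs, index, doc_type, batch_size=1000):
    """Yield newline-delimited bulk bodies holding batch_size docs each."""
    for start in range(0, len(docs), batch_size):
        lines = []
        for doc in docs[start:start + batch_size]:
            # Action line, then the document source on the following line.
            lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
            lines.append(json.dumps(doc))
        # The bulk body must end with a trailing newline.
        yield "\n".join(lines) + "\n"

# Hypothetical sample data: 2500 log lines -> batches of 1000, 1000 and 500.
docs = [{"message": "log line %d" % i} for i in range(2500)]
bodies = list(bulk_bodies(docs, "tomcat-logs-2014.07.01", "log"))
print(len(bodies))
```

Each body would then be sent as one POST to the `_bulk` endpoint.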
 
100000 - 150000 search queries per day.
- Queries may be simple term queries or phrase queries.
 - 80% of queries respond in under 10s (which is acceptable).
 - Queries on large indices (12 shards) take 20s to 30s.
 
Problem (Suggestions needed)
- Indexing time per bulk request (1000 docs per request) is less than 500 ms for the first 5 to 8 requests.
 - For the next 5 to 8 bulk requests, response time is 2000 ms to 5000 ms (sometimes 20000 ms or above).
 - For the next 5 to 8 bulk requests, response time is 1000 ms to 2000 ms.
 - For the next 5 to 8 bulk requests, response time is back under 500 ms.
This cycle of varying response times repeats continuously, and only on large indices
(12 shards).
As a result, it takes nearly 10 to 15 minutes to index a file containing
~400000 (4 lakh) docs into a large index (12 shards).
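
To put those figures in perspective, a quick back-of-the-envelope calculation using only the numbers above (400000 docs per file, 1000 docs per bulk request, 10 to 15 minutes per file). It assumes the requests are sent one after another; the variable and function names are only for this sketch.

```python
docs_per_file = 400_000
docs_per_bulk = 1_000
requests_per_file = docs_per_file // docs_per_bulk  # 400 bulk requests per file

def per_file(minutes):
    """Return (average seconds per bulk request, docs indexed per second)."""
    seconds = minutes * 60
    return seconds / requests_per_file, docs_per_file / seconds

for minutes in (10, 15):
    avg_s, docs_per_s = per_file(minutes)
    print("%d min: %.2f s per bulk request, ~%.0f docs/s" % (minutes, avg_s, docs_per_s))
```

So the average works out to roughly 1.5 to 2.25 seconds per bulk request, well above the sub-500 ms response time of the fast requests.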
I have not changed any settings related to Lucene merges.
Over time the indexing process lags far behind, so I set
index.number_of_replicas to 0 on the large indices. (But we need
high availability.)
- What makes indexing slow? (Lucene merge? ES flush? ES refresh?) How can I identify the cause?
 - Is 12 shards too many?
 - I don't want to set index.number_of_replicas to 0 each time. Suggestions?
 - Sometimes, even with replicas reduced to 0, indexing does not catch up.
 - 10% of heap memory goes to indexing: 1400mb for ~20 shards of active indices on each node (obviously today's indices).
 - That is about 70mb per shard. Do I need to increase it?
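
The buffer arithmetic in the last two points, spelled out. This assumes the indexing buffer is 10% of the 14GB heap (rounded to 14000mb as above) and that it is split evenly across the ~20 active shards on a node; the even split is a simplification for this sketch.

```python
heap_mb = 14_000             # Xmx 14GB, rounded as in the figures above
buffer_percent = 10          # assumed share of heap used as indexing buffer
active_shards_per_node = 20  # approximate, from the figures above

index_buffer_mb = heap_mb * buffer_percent / 100
per_shard_mb = index_buffer_mb / active_shards_per_node

print("indexing buffer per node: %.0f mb" % index_buffer_mb)
print("buffer per active shard:  %.0f mb" % per_shard_mb)
```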
 
FYI: I have gone through Mike's recent blog post on ES indexing performance.
Regards,
Anantha Govindarajan