High latency on search

Hey All,

We are using Elasticsearch for search with sorting.
We sort on 5 fields, of which one is an analyzed string field. I know it's recommended not to sort on analyzed fields, but we have to support natural sort on numbers.
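The usual workaround, where it fits, is a multi-field: keep the analyzed field for full-text search and add a not_analyzed sub-field purely for sorting (natural ordering of embedded numbers would still need handling at index time, e.g. zero-padding). A minimal sketch, with made-up index, type and field names:

curl -XPUT 'localhost:9200/myindex/_mapping/mytype' -d '{
  "properties": {
    "title": {
      "type": "string",
      "analyzer": "standard",
      "fields": {
        "sort": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'

Queries would then sort on title.sort instead of title, which loads one term per document into fielddata rather than one per token.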

We have allocated 50% of our 30 GB heap to fielddata, but fielddata utilization never goes beyond 10 GB, and we still see very high evictions.
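To see how much fielddata each node is actually holding, the generic cat call is handy (nothing here is specific to our setup; the same numbers, plus evictions, also appear under indices.fielddata in node stats):

curl -XGET 'localhost:9200/_cat/fielddata?v'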

We are seeing latencies of up to 20 seconds for search results.
Elasticsearch version: 1.5.2 (in the process of upgrading to 1.7).

2.0 is not an option right now.

30 billion documents, with an index size of 24 TB.
500 shards and 2 replicas.

Why is the fielddata cache not being used fully, and why do we still see high evictions?

You potentially have some filters which are cached but never used again.

2.x improves that a lot.
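If that turns out to be the case, one option in 1.x is to mark throwaway filters as non-cacheable with _cache: false so they do not churn the filter cache. A sketch with placeholder index and field names:

curl -XGET 'localhost:9200/myindex/_search' -d '{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "terms": {
          "user_id": ["u1", "u2"],
          "_cache": false
        }
      }
    }
  }
}'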

How many nodes do you have?

We have a cluster with 370 data nodes, 150 client nodes, and 3 master nodes.
We index at a rate of 3,000 calls per second, and each call indexes up to 100 documents.

We run 3,000 queries per second, which include plain filters, filters with sort, and aggregations.
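For illustration only (field names are placeholders), the sorted variant presumably has roughly this shape: a filtered query plus a sort on the analyzed string field and an aggregation:

curl -XGET 'localhost:9200/myindex/_search' -d '{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": { "term": { "status": "active" } }
    }
  },
  "sort": [
    { "title": { "order": "asc" } }
  ],
  "aggs": {
    "by_status": {
      "terms": { "field": "status" }
    }
  }
}'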

That seems like a lot of nodes for that much data and search rate.

The only reason we added hosts to the cluster was to get more cache for sorting. We get really good performance on filter queries, with latency under 1 second, but as soon as we add a sort to the query, performance degrades and latency spikes up to 15-20 seconds.

Right, but 370 nodes for 24TB of data is only ~64GB of data per node.
That seems highly inefficient.

What information would be needed to find out why we are getting such high latency? It would also be great if we could reduce the number of nodes.

Are you monitoring ES at all?

Yes, we monitor it.
Cluster settings:

{
  "persistent" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "allow_rebalance" : "indices_all_active",
          "node_concurrent_recoveries" : "1000",
          "cluster_concurrent_rebalance" : "1",
          "exclude" : {
            "_name" : ""
          },
          "awareness" : {
            "force" : {
              "AZ" : {
                //Awareness based on AZ
              }
            }
          },
          "enable" : "all"
        }
      }
    },
    "threadpool" : {
      "refresh" : {
        "size" : "8",
        "type" : "scaling"
      },
      "search" : {
        "queue_size" : "128",
        "size" : "32"
      },
      "bulk" : {
        "size" : "8",
        "queue_size" : "128"
      },
      "merge" : {
        "type" : "scaling",
        "size" : "8"
      }
    },
    "indices" : {
      "cache" : {
        "filter" : {
          "size" : "15%"
        }
      },
      "recovery" : {
        "concurrent_streams" : "1000"
      },
      "store" : {
        "throttle" : {
          "max_bytes_per_sec" : "150mb"
        }
      }
    }
  },
  "transient" : {
    "indices" : {
      "cache" : {
        "filter" : {
          "size" : "25%"
        }
      }
    },
    "logger" : {
      "index" : {
        "search" : {
          "slowlog" : "DEBUG"
        }
      },
      "_root" : "INFO"
    }
  }
}
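One note on the logger.index.search.slowlog: DEBUG entry above: the slowlog only records queries once per-index thresholds are set, so if that has not been done elsewhere, something like the following (index name and timings are placeholders) is needed to actually capture the slow sorted queries:

curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.info": "5s",
  "index.search.slowlog.threshold.fetch.warn": "1s"
}'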

Cluster stats:

"timestamp" : 1457735896764,
"cluster_name" : "elasticsearch",
"uuid" : "Vxv3TygqQVuLq-sSM9luYg",
"status" : "green",
"indices" : {
"count" : 1,
"shards" : {
"total" : 1500,
"primaries" : 500,
"replication" : 2.0,
"index" : {
"shards" : {
"min" : 1500,
"max" : 1500,
"avg" : 1500.0
},
"primaries" : {
"min" : 500,
"max" : 500,
"avg" : 500.0
},
"replication" : {
"min" : 2.0,
"max" : 2.0,
"avg" : 2.0
}
}
},
"docs" : {
"count" : 31462133426,
"deleted" : 13086343696
},
"store" : {
"size_in_bytes" : 28952466994056,
"throttle_time_in_millis" : 518656551
},
"fielddata" : {
"memory_size_in_bytes" : 3053415346328,
"evictions" : 284738711
},
"filter_cache" : {
"memory_size_in_bytes" : 3201933183052,
"evictions" : 3369817227
},
"id_cache" : {
"memory_size_in_bytes" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 38405,
"memory_in_bytes" : 284030642242,
"index_writer_memory_in_bytes" : 156242334,
"index_writer_max_memory_in_bytes" : 804836605952,
"version_map_memory_in_bytes" : 7518960,
"fixed_bit_set_memory_in_bytes" : 0
},
"percolate" : {
"total" : 0,
"time_in_millis" : 0,
"current" : 0,
"memory_size_in_bytes" : -1,
"memory_size" : "-1b",
"queries" : 0
}
},
"nodes" : {
"count" : {
"total" : 568,
"master_only" : 3,
"data_only" : 410,
"master_data" : 0,
"client" : 0
},
"versions" : [ "1.5.2" ],
"os" : {
"available_processors" : 4544,
"mem" : {
"total_in_bytes" : 37401955250176
},
"cpu" : [ {
"vendor" : "Intel",
"model" : "Xeon",
"mhz" : 2500,
"total_cores" : 8,
"total_sockets" : 1,
"cores_per_socket" : 8,
"cache_size_in_bytes" : 25600,
"count" : 568
} ]
},
"process" : {
"cpu" : {
"percent" : 83590
},
"open_file_descriptors" : {
"min" : 15235,
"max" : 16081,
"avg" : 15627
}
},
"jvm" : {
"max_uptime_in_millis" : 8719843882,
"versions" : [ {
"version" : "1.8.0_51",
"vm_name" : "Java HotSpot(TM) 64-Bit Server VM",
"vm_version" : "25.51-b03",
"vm_vendor" : "Oracle Corporation",
"count" : 413
}, {
"version" : "1.8.0_66",
"vm_name" : "Java HotSpot(TM) 64-Bit Server VM",
"vm_version" : "25.66-b17",
"vm_vendor" : "Oracle Corporation",
"count" : 155
} ],
"mem" : {
"heap_used_in_bytes" : 11582235827984,
"heap_max_in_bytes" : 17789091840000
},
"threads" : 131409
},
"fs" : {
"total_in_bytes" : 70323596075008,
"free_in_bytes" : 39562196209664,
"available_in_bytes" : 35989967147008,
"disk_reads" : 41684304586,
"disk_writes" : 482283527045,
"disk_io_op" : 523967831631,
"disk_read_size_in_bytes" : 861052555039744,
"disk_write_size_in_bytes" : 9073652993454080,
"disk_io_size_in_bytes" : 9934705548493824
},
}
}

With what?

We use r3.2xlarge hosts, with 30 GB of heap allocated to Elasticsearch:
50% of the heap to fielddata
25% of the heap to the filter cache
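Presumably the static side of that allocation looks something like this in elasticsearch.yml (the filter cache figure also appears in the cluster settings above, so this is only a sketch of the assumed node config):

indices.fielddata.cache.size: 50%
indices.cache.filter.size: 25%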

We use the kopf plugin to see cluster status and other information.
Do you mean something like Marvel?

We have a metrics agent running which polls Elasticsearch and emits various information from the cluster stats and node stats APIs (the kind of calls polled is sketched after the list below).
The information collected from these metrics is:

  1. threadpools
  2. cache size (filter cache, fielddata)
  3. cache evictions
  4. segments
  5. shard status
  6. CPU per host
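The calls polled are essentially the generic stats endpoints, along these lines (the metric filter on the node stats call is just an example):

curl -XGET 'localhost:9200/_cluster/stats?human&pretty'
curl -XGET 'localhost:9200/_nodes/stats/indices,jvm,thread_pool,process?human&pretty'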

So what do they tell you about what is happening on the nodes? Are they overloaded? What does heap/fielddata/filter cache/threadpool/etc. usage look like?

Here is the breakdown of the information

  1. heap - 60% average across the cluster; some hosts have heap up to 80% utilized
  2. fielddata cache - between 5 GB and 10 GB across the fleet
    fielddata evictions - we see about one cache eviction every second on average
  3. filter cache - around 7.5 GB across all hosts
  4. threadpools - max threads 32 for search; the search queue surges up to 120, but only when our traffic spikes (a cat thread_pool sketch for cross-checking these follows the list)
  5. threadpool indexing - peak of 8 threads per host, queues up to 256
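As mentioned in item 4, a quick way to cross-check the search pool numbers per node is the cat thread_pool API (generic calls, the header list on the second is optional):

curl -XGET 'localhost:9200/_cat/thread_pool?v'
curl -XGET 'localhost:9200/_cat/thread_pool?v&h=host,search.active,search.queue,search.rejected'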

Hey, one of the things I have noticed in our monitoring is that fielddata cache usage maxes out at 10 GB; it does not go beyond 10 GB on any host, even though I have assigned 50% of the heap to fielddata, which should be around 15 GB.
We also see regular fielddata evictions on our graphs.

Is there something that could be causing this?
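One thing that would help narrow it down is the per-field fielddata breakdown from node stats, which shows whether a single field (for example the analyzed sort field) dominates that 10 GB; this is the generic call:

curl -XGET 'localhost:9200/_nodes/stats/indices/fielddata?fields=*&human&pretty'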

Removing the sort on the analyzed field and using doc_values for sorting and aggregations reduced the latency significantly.
We also reduced the number of hosts.
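For reference, enabling doc_values in 1.x is a mapping-time change (it only applies to newly indexed data, so in practice it means reindexing into a new index). A minimal sketch with made-up names, covering both a not_analyzed string sort field and a date field:

curl -XPUT 'localhost:9200/myindex_v2' -d '{
  "mappings": {
    "mytype": {
      "properties": {
        "title": {
          "type": "string",
          "analyzer": "standard",
          "fields": {
            "sort": {
              "type": "string",
              "index": "not_analyzed",
              "doc_values": true
            }
          }
        },
        "created_at": {
          "type": "date",
          "doc_values": true
        }
      }
    }
  }
}'

Sorts and aggregations on title.sort and created_at then read column data from disk instead of building it in the heap fielddata cache, which is why both the evictions and the sort latency drop.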