Very poor performance on relatively small index

My index is not very large: 1 main index, 3M documents, 1 node, 1 shard, 0 replicas, 5GB disk size. Disk is spinning.
Documents have indexed fields: about 12 integer fields, 3 short text fields, 4 date fields, plus some more unindexed fields.

My use pattern is like this: every day about 10k-15k documents are added to the main index by a background job that takes about 4 hours. All queries run against this index, 24 hours a day.

I am seeing many slow queries: about 30% of them take more than 800ms and 7% more than 1000ms.

The queries have filters on several integer fields, and aggregations to count documents on several integer fields as well, with occasional text search on one short text field of about 80-120 chars. Each query has 3 nested aggregations, with facets on three integer fields.

I have applied all the recommended settings for production and spinning disks, and set the refresh interval to 30s, since it is not critical for new documents to be immediately available for search.
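
For reference, the refresh interval is set like this (the index name "main" is just a placeholder for my real index name):

PUT /main/_settings
{
  "index": {
    "refresh_interval": "30s"
  }
}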

What I do see is that the number of segments is quite high.
"segments" : { "count" : 27, "memory_in_bytes" : 5027011, "terms_memory_in_bytes" : 3123095, "stored_fields_memory_in_bytes" : 1815368, "term_vectors_memory_in_bytes" : 0, "norms_memory_in_bytes" : 14528, "doc_values_memory_in_bytes" : 74020, "index_writer_memory_in_bytes" : 541878, "index_writer_max_memory_in_bytes" : 124688793, "version_map_memory_in_bytes" : 2086, "fixed_bit_set_memory_in_bytes" : 0 },

The machine has 16GB of memory, of which 7GB are assigned to the ES heap. CPU load is very low: less than 10%.
Disk I/O does not seem to be a problem: iostat typically reports under 10% utilization, with occasional peaks of 20%.

ES version is 2.3.3.
OS is Ubuntu 14.04.

Any tips? How can I detect if there's a problem?

Not really relevant; if you had lots of shards then it might be.

That version doesn't exist, can you check again?

Can you please show your query?

Also, the field mapping configuration would be interesting.

Here it goes.

Query:

Index mapping:

Thanks for responding!

Fixed: 2.3.3

This is an example of a very bad query.

Try to reconsider your data structure and your query according to the following principles:

  • Avoid missing filters like hell. They are very slow. Most use cases can be changed: prefer to index special filler terms and filter on them instead (see the sketch after this list).

  • Order the filters in the and clause so they are as efficient as possible: the first filter should filter out the highest number of docs, the next one the next highest, and so on. A missing filter is much slower than a term filter. Also, understand filter caching: think about letting ES cache and reuse filters instead of forcing it to compute them again and again in each query.

  • Depending on the field cardinality, doc values might be a good solution for aggregations in your case. You could try something like this on the fields to be aggregated:

"aFieldToAggregate" : {
  "type": "long",
  "store":"yes",
  "index": "no",
  "doc_values": true
},

Also, the field type byte is a bit slower than long, since bytes have to be converted internally.

  • Of course, a three-level aggregation is slow.

  • And last: do not sort. Sorting is slow as hell. Use relevance scoring wherever you can.
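
To illustrate the first two points, here is a rough sketch of the kind of filter structure I mean. Index and field names are made up, and it assumes you index a sentinel value such as -1 when the field would otherwise be absent, so a cheap term filter can replace the missing filter. The filters go from most to least selective:

GET /main/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "short_text_field": "optional free text" } }
      ],
      "filter": [
        { "term": { "most_selective_int_field": 42 } },
        { "term": { "less_selective_int_field": 7 } },
        { "term": { "optional_int_field": -1 } }
      ]
    }
  }
}

The last term filter stands in for what would otherwise be a missing filter on optional_int_field, and everything in the filter clause is non-scoring, so ES can cache and reuse it across queries.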

Thanks for the tips.
I'm going to reindex with filler terms to replace the missing filters and will apply the ordering of filters as you suggest.
Will see what I can do for the rest.

I have been able to remove the missing filters.
However, I cannot see a real improvement.
Removing sorting is not an option in my case.
I tried ordering the filters so the first one is the most selective, but did not see any improvement either.

Is it possible, for a given query, to see where the time is spent?
I would like to know whether it would be worth the effort to rewrite the queries for aggregation caching.

Yes, try the Profile API: Profile API | Elasticsearch Guide [8.11] | Elastic
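
As a minimal sketch (index and field names are placeholders), you enable it per request by adding "profile": true to the search body; the response then includes a per-shard breakdown of where the time was spent:

GET /main/_search
{
  "profile": true,
  "query": {
    "term": { "some_integer_field": 42 }
  }
}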

Aggregations were the culprit of the low performance.
I was even thinking about adding two more aggs but it wasn't an option with such low performance.

I have ended up splitting queries in two: one for docs, and one for aggregations with no docs. The latter can be cached.
This, together with custom application caching for the most costly and frequent aggs, has improved performance significantly: only 10% of queries take more than 0.6s now.
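
Roughly, the split looks like this (index, field, and aggregation names are placeholders, not my real ones). The first request returns only hits; the second sets size to 0 and runs only the aggregations, which is the part the shard request cache can reuse:

GET /main/_search
{
  "size": 20,
  "query": { "bool": { "filter": [ { "term": { "some_int_field": 42 } } ] } }
}

GET /main/_search?request_cache=true
{
  "size": 0,
  "query": { "bool": { "filter": [ { "term": { "some_int_field": 42 } } ] } },
  "aggs": {
    "counts_by_other_field": { "terms": { "field": "other_int_field" } }
  }
}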

Thanks!

@jogaco you might like Hits + query_cache=true + aggs in 1 round trip: _msearch?
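
Something like this, as a rough sketch (field names are placeholders): both searches go out in a single _msearch round trip, the hits request first and the size-0, aggregations-only request second, the latter being the part the request cache can reuse:

GET /main/_msearch
{}
{"size": 20, "query": {"bool": {"filter": [{"term": {"some_int_field": 42}}]}}}
{}
{"size": 0, "query": {"bool": {"filter": [{"term": {"some_int_field": 42}}]}}, "aggs": {"counts_by_other_field": {"terms": {"field": "other_int_field"}}}}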