Painfully slow queries

111106 · December 16, 2015, 11:41am

I'm experiencing problems with query time. The more words the query contains, the longer it takes, up to a whopping 30s when the index is unwarmed.

My config:
i5, 8GB RAM
ES 2.1.0
1 node, 1 shard, 0 replicas, 1 index (don't plan on having more)
ES_HEAP_SIZE set to 4g

All I have in elasticsearch.yml:
script.engine.groovy.inline.search: on
bootstrap.mlockall: true
indices.cache.filter.size: 20%
indices.memory.index_buffer_size: 30%
index.refresh_interval: 30s
index.translog.durability: request

My configuration (PHP): http://pastebin.com/TXFVkd9T
My mapping (PHP): http://pastebin.com/LFg71XSm

The index currently holds ~4000 documents.

An example query:
{
  "size": 20,
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            {
              "multi_match": {
                "query": " гостиница парковка",
                "use_dis_max": false,
                "type": "cross_fields",
                "fuzziness": "1",
                "slop": "1",
                "operator": "and",
                "fields": ["specializationsNames", "address^3", "city", "name^8", "description^2", "subcategoriesNames^3", "categoryName^3", "tags^5", "services^3"]
              }
            },
            {
              "multi_match": {
                "use_dis_max": false,
                "type": "best_fields",
                "fuzziness": "2",
                "slop": "1",
                "operator": "and",
                "fields": ["specializationsNames", "address^3", "city", "name^8", "description^2", "subcategoriesNames^3", "categoryName^3", "tags^5", "services^3"],
                "query": " gostinica parkovka"
              }
            }
          ]
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "bool": {
                "should": [
                  {"range": {"filter.price": {"le": 500}}}
                ]
              }
            }
          ]
        }
      }
    }
  },
  "sort": [{"_score": {"order": "desc"}}]
}

I really need help here. I've been trying to work out what's wrong for two days now. I'd appreciate any help I can get.

Thank you.

111106 · December 17, 2015, 7:18am

Anyone? Still can't figure out what's wrong.

inzanez · December 17, 2015, 7:53am

Well I don't know about the whole query stuff you're doing there, but just to check nevertheless: Is your machine swapping by any chance?

111106 · December 17, 2015, 8:47am

No, swapping is turned off: free -m shows zeroes for swap, and mlockall is enabled as well.

The indexing is also kinda slow, to be honest.

inzanez · December 17, 2015, 9:02am

Could you install the HQ plugin and watch the performance metrics in there? I find it quite useful: I can see what's going on across the different machines regarding indexing, queries, I/O, memory usage and so on. Maybe that will give you a hint about what's happening.

I, for one, had disk I/O problems with my cluster that made it really slow, and I spotted that in HQ.

jimczi · December 17, 2015, 10:00am

That's strange. Can you share the current mapping of your index/type:
https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-get-field-mapping.html

I would also suggest trying the query without the fuzziness to see if that speeds it up.
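
One way to run that comparison is to derive a non-fuzzy copy of the query programmatically and time both versions. The snippet below is only a sketch: the query is a shortened stand-in for the one in the first post, and strip_keys is a hypothetical helper, not part of any Elasticsearch client.

```python
import json


def strip_keys(node, keys=("fuzziness",)):
    """Return a copy of a query body with the given keys removed at every level."""
    if isinstance(node, dict):
        return {k: strip_keys(v, keys) for k, v in node.items() if k not in keys}
    if isinstance(node, list):
        return [strip_keys(item, keys) for item in node]
    return node  # scalars are returned unchanged


# Shortened stand-in for the query in the first post (fewer fields, no filter).
query = {
    "size": 20,
    "query": {"bool": {"should": [
        {"multi_match": {"query": "гостиница парковка", "type": "cross_fields",
                         "fuzziness": "1", "operator": "and",
                         "fields": ["name^8", "tags^5"]}},
        {"multi_match": {"query": "gostinica parkovka", "type": "best_fields",
                         "fuzziness": "2", "operator": "and",
                         "fields": ["name^8", "tags^5"]}},
    ]}},
}

no_fuzz = strip_keys(query)
print(json.dumps(no_fuzz, ensure_ascii=False, indent=2))
```

Sending both bodies to the same index and comparing the "took" value in each response should show how much of the time the fuzzy expansion accounts for.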

111106 · December 17, 2015, 10:22am

Sure. Here're the current mappings. The filter property is dynamic, so not all documents have the same nested fieldset there.
http://pastebin.com/McgXawHu

Removing the fuzziness does indeed improve the response time a lot, but I'm not sure I'll be able to do without it. (Though any tips here will help.)

jimczi · December 17, 2015, 11:27am

The fuzzy part of your query looks like the problem. You apply a fuzzy query to all fields, but some of them (tags, for instance) use the ngram tokenizer with a min size of 4. For short words, a fuzzy query with a max edit distance of 2 is very likely to match a large number of terms. You may want to set the max_expansions parameter to limit the number of terms the query expands to:
https://www.elastic.co/guide/en/elasticsearch/guide/2.x/fuzzy-query.html
That said, a fuzzy query on a field with ngrams is not a good idea. What are you trying to solve here? I guess you could apply the fuzzy query to only a subset of your fields (the ones that use a standard tokenizer) by adding another should clause.
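
A rough sketch of that split, written as a Python dict so it can be serialized to the request body. The FUZZY_FIELDS subset is an assumption for illustration only; which fields actually use a standard tokenizer depends on the mapping, and build_query is a hypothetical helper name.

```python
import json

ALL_FIELDS = ["specializationsNames", "address^3", "city", "name^8",
              "description^2", "subcategoriesNames^3", "categoryName^3",
              "tags^5", "services^3"]

# ASSUMPTION: placeholder subset. Replace with the fields that really use a
# standard (non-ngram) tokenizer in your mapping.
FUZZY_FIELDS = ["name^8", "city"]


def build_query(text, fuzzy_fields=FUZZY_FIELDS, max_expansions=10):
    """Non-fuzzy match on every field plus a capped fuzzy match on a subset."""
    exact = {"multi_match": {
        "query": text,
        "type": "cross_fields",
        "operator": "and",
        "fields": ALL_FIELDS,
    }}
    fuzzy = {"multi_match": {
        "query": text,
        "type": "best_fields",
        "operator": "and",
        "fuzziness": 1,
        "max_expansions": max_expansions,  # cap on terms each fuzzy term expands to
        "fields": list(fuzzy_fields),
    }}
    return {"size": 20, "query": {"bool": {"should": [exact, fuzzy]}}}


body = build_query("гостиница парковка")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The max_expansions value of 10 is arbitrary; the point is only that the fuzzy clause never touches the ngram fields.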

111106 · December 17, 2015, 11:37am

While max expansions didn't do anything in terms of query time, your suggestions got me thinking.

I guess what I'm essentially doing here is putting the burden of error correction in user input on query time. Now that I think about it, maybe it's not the best idea (especially since it's also a slow idea). Do you think it would be more reasonable to put this responsibility on autocomplete/suggest functionality instead? Maybe I can implement a better analyzer setup instead of doing a fuzzy search?

jimczi · December 17, 2015, 11:41am

Exactly, autocomplete/suggest is what you're looking for: it's fast and it supports fuzziness out of the box.
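
For reference, a minimal sketch of what a fuzzy completion suggester setup could look like, following the ES 2.x completion suggester syntax as I understand it. The field name "suggest", the label "place-suggest" and the helper functions are all hypothetical; none of them come from the posted mapping.

```python
import json

# ASSUMPTION: "suggest" is a made-up field name, not one from the posted mapping.
SUGGEST_FIELD = "suggest"


def completion_mapping():
    """Mapping fragment that adds a completion field to a type."""
    return {"properties": {SUGGEST_FIELD: {"type": "completion"}}}


def suggest_document(name, inputs):
    """Document carrying the strings that should trigger a suggestion."""
    return {"name": name, SUGGEST_FIELD: {"input": list(inputs)}}


def suggest_request(prefix, fuzziness=1):
    """Body for the _suggest endpoint; "place-suggest" is an arbitrary label."""
    return {"place-suggest": {
        "text": prefix,
        "completion": {
            "field": SUGGEST_FIELD,
            "fuzzy": {"fuzziness": fuzziness},  # tolerate typos in the prefix
        },
    }}


doc = suggest_document("Гостиница с парковкой", ["гостиница", "gostinica"])
request = suggest_request("gostinca")
print(json.dumps(request, ensure_ascii=False, indent=2))
```

Indexing both the Cyrillic and the transliterated form as inputs would cover the two spellings the original query tried to handle with two separate fuzzy clauses.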

111106 · December 17, 2015, 11:44am

Thank you!
