Painfully slow queries

I'm having problems with query time. The more words the query contains, the longer it takes, up to a whopping 30s when the index is unwarmed.

My config:
i5, 8GB RAM
ES 2.1.0
1 node, 1 shard, 0 replicas, 1 index (don't plan on having more)
ES_HEAP_SIZE set to 4g

All I have in elasticsearch.yml:
script.engine.groovy.inline.search: on
bootstrap.mlockall: true
indices.cache.filter.size: 20%
indices.memory.index_buffer_size: 30%
index.refresh_interval: 30s
index.translog.durability: request

My configuration (PHP): http://pastebin.com/TXFVkd9T
My mapping (PHP): http://pastebin.com/LFg71XSm

The index currently holds ~4000 documents.

An example query:
{"size":20,"query":{"filtered":{"query":{"bool":{"should":[{"multi_match":{"query":" гостиница парковка","use_dis_max":false,"type":"cross_fields","fuzziness":"1","slop":"1","operator":"and","fields":["specializationsNames","address^3","city","name^8","description^2","subcategoriesNames^3","categoryName^3","tags^5","services^3"]}},{"multi_match":{"use_dis_max":false,"type":"best_fields","fuzziness":"2","slop":"1","operator":"and","fields":["specializationsNames","address^3","city","name^8","description^2","subcategoriesNames^3","categoryName^3","tags^5","services^3"],"query":" gostinica parkovka"}}]}},"filter":{"bool":{"must":[{"bool":{"should":[{"range":{"filter.price":{"le":500}}}]}}]}}}},"sort":[{"_score":{"order":"desc"}}]}

I really need help here. I've been trying to work out what's wrong for two days now, and I'd appreciate any help I can get.

Thank you.

Anyone? Still can't figure out what's wrong.

Well I don't know about the whole query stuff you're doing there, but just to check nevertheless: Is your machine swapping by any chance?

No, swapping is turned off: free -m shows zeroes for swap, and mlockall is enabled as well.

The indexing is also kinda slow, to be honest.

Could you install the HQ plugin and watch the performance metrics in there? I find it quite useful: I can see what's going on on the different machines regarding indexing, queries, I/O, memory usage and so on. Maybe that will give you a hint about what's going on.

I, for one, had disk I/O problems with my cluster that made it really slow, and HQ is where I spotted them.

That's strange. Can you share the current mapping of your index/type:
https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-get-field-mapping.html

I would also suggest trying without the fuzziness to see if it speeds up the query.
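One way to run that comparison without hand-editing the JSON is to strip every "fuzziness" key from the query body and send both versions. This is only a sketch: the helper name strip_fuzziness and the tiny sample query are made up for illustration, and how you send the body to Elasticsearch (and read the "took" value from the response) is left to whatever client you already use.

```python
import json


def strip_fuzziness(node):
    """Return a copy of a query body with every "fuzziness" key removed.

    Walks dicts and lists recursively; the input is left untouched.
    """
    if isinstance(node, dict):
        return {key: strip_fuzziness(value)
                for key, value in node.items()
                if key != "fuzziness"}
    if isinstance(node, list):
        return [strip_fuzziness(item) for item in node]
    return node


# Illustrative query only, much smaller than the real one in this thread.
fuzzy_query = {
    "size": 20,
    "query": {
        "multi_match": {
            "query": "gostinica parkovka",
            "type": "best_fields",
            "fuzziness": "2",
            "operator": "and",
            "fields": ["name^8", "tags^5"],
        }
    },
}

plain_query = strip_fuzziness(fuzzy_query)

if __name__ == "__main__":
    # Send both bodies to the same index and compare the "took" values.
    print(json.dumps(plain_query, ensure_ascii=False, indent=2))
```

Run each variant a few times, since the first execution against a cold index will be slower either way.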


Sure. Here are the current mappings. The filter property is dynamic, so not all documents have the same set of nested fields there.
http://pastebin.com/McgXawHu

Removing the fuzziness does indeed improve the response time a lot, but I'm not sure I'll be able to do without it. (Any tips here would help, though.)

The fuzzy part of your query seems problematic. You apply a fuzzy query to all fields, but some of them (tags, for instance) use the ngram tokenizer with a min size of 4. For small words, a fuzzy query with a max distance of 2 is very likely to match a large number of entries. You may want to change the max_expansions parameter to limit the number of expanded words in your query:
https://www.elastic.co/guide/en/elasticsearch/guide/2.x/fuzzy-query.html
That said, a fuzzy query on a field with ngrams is not a good idea. What are you trying to solve here? I guess you could apply the fuzzy query only to a subset of your fields (the ones that use a standard tokenizer) by adding another should clause.
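To make that last suggestion concrete, here is a rough sketch of the query split into a non-fuzzy clause over all fields and a fuzzy clause over a subset. Assumptions, since I can't tell them from the thread: that name and city are the fields using a standard tokenizer (FUZZY_FIELDS is a guess, check it against your mapping), that a fuzziness of 1 is enough, and that a max_expansions of 10 and a prefix_length of 1 are sensible starting values. The price filter from the original query is left out for brevity.

```python
import json

# Field list taken from the query posted in this thread.
ALL_FIELDS = [
    "specializationsNames", "address^3", "city", "name^8", "description^2",
    "subcategoriesNames^3", "categoryName^3", "tags^5", "services^3",
]

# Assumption: these fields use a standard tokenizer (no ngrams).
FUZZY_FIELDS = ["name^8", "city"]


def build_query(text, max_expansions=10):
    """Build a bool query: exact matching everywhere, fuzzy on a subset."""
    exact_clause = {
        "multi_match": {
            "query": text,
            "type": "cross_fields",
            "operator": "and",
            "fields": ALL_FIELDS,
        }
    }
    fuzzy_clause = {
        "multi_match": {
            "query": text,
            "type": "best_fields",
            "operator": "and",
            "fuzziness": 1,
            "prefix_length": 1,  # first character must match exactly
            "max_expansions": max_expansions,  # cap on expanded terms
            "fields": FUZZY_FIELDS,
        }
    }
    return {
        "size": 20,
        "query": {"bool": {"should": [exact_clause, fuzzy_clause]}},
    }


if __name__ == "__main__":
    print(json.dumps(build_query("gostinica parkovka"), indent=2))
```

The idea is simply that the expensive term expansion only runs against fields where each token is a whole word, so the ngram fields no longer multiply the number of candidate terms.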


While max_expansions didn't do anything for query time, your suggestions got me thinking.

I guess what I'm essentially doing here is putting the burden of correcting errors in user input on query time. Now that I think about it, maybe that's not the best idea (especially since it's also a slow one). Do you think it would be more reasonable to put this responsibility on autocomplete/suggest functionality instead? Or maybe I can implement a better analyzer setup instead of doing a fuzzy search?

Exactly. Autocomplete/suggest is what you're looking for: it's fast, and it supports fuzziness out of the box.
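For reference, a minimal sketch of what that could look like with the completion suggester in 2.x. The field name name_suggest and the suggestion name place-suggest are hypothetical; the mapping fragment would go into your type mapping and the request body to the _suggest endpoint of your index. You would also need to populate name_suggest at index time.

```python
import json

# Hypothetical field name; it is not part of the mapping shown in this thread.
SUGGEST_FIELD = "name_suggest"


def suggest_mapping():
    """Mapping fragment that adds a completion field to a type."""
    return {"properties": {SUGGEST_FIELD: {"type": "completion"}}}


def suggest_request(prefix, fuzziness=1, size=5):
    """Body for a fuzzy completion suggestion on what the user typed so far."""
    return {
        "place-suggest": {  # arbitrary name for this suggestion
            "text": prefix,
            "completion": {
                "field": SUGGEST_FIELD,
                "size": size,
                "fuzzy": {"fuzziness": fuzziness},
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(suggest_mapping(), indent=2))
    print(json.dumps(suggest_request("gostin"), indent=2))
```

This moves the typo tolerance to the moment the user is typing, so the main search query can stay non-fuzzy.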


Thank you!