Painfully slow queries


(nainy) #1

I'm having problems with query time. The more words the query contains, the longer it takes, up to a whopping 30s when the index is unwarmed.

My config:
i5, 8GB RAM
ES 2.1.0
1 node, 1 shard, 0 replicas, 1 index (don't plan on having more)
ES_HEAP_SIZE set to 4g

All I have in elasticsearch.yml:
script.engine.groovy.inline.search: on
bootstrap.mlockall: true
indices.cache.filter.size: 20%
indices.memory.index_buffer_size: 30%
index.refresh_interval: 30s
index.translog.durability: request

My configuration (PHP): http://pastebin.com/TXFVkd9T
My mapping (PHP): http://pastebin.com/LFg71XSm

The index currently holds ~4000 documents.

An example query:
{"size":20,"query":{"filtered":{"query":{"bool":{"should":[{"multi_match":{"query":" гостиница парковка","use_dis_max":false,"type":"cross_fields","fuzziness":"1","slop":"1","operator":"and","fields":["specializationsNames","address^3","city","name^8","description^2","subcategoriesNames^3","categoryName^3","tags^5","services^3"]}},{"multi_match":{"use_dis_max":false,"type":"best_fields","fuzziness":"2","slop":"1","operator":"and","fields":["specializationsNames","address^3","city","name^8","description^2","subcategoriesNames^3","categoryName^3","tags^5","services^3"],"query":" gostinica parkovka"}}]}},"filter":{"bool":{"must":[{"bool":{"should":[{"range":{"filter.price":{"le":500}}}]}}]}}}},"sort":[{"_score":{"order":"desc"}}]}

I really need help here. Been trying to tackle what's wrong for two days now. Appreciate any help I can get.

Thank you.


(nainy) #2

Anyone? Still can't figure out what's wrong.


(RA) #3

Well I don't know about the whole query stuff you're doing there, but just to check nevertheless: Is your machine swapping by any chance?


(nainy) #4

No, swapping is turned off: free -m shows zeroes for swap, plus mlockall is enabled.

The indexing is also kinda slow, to be honest.


(RA) #5

Could you install the HQ plugin and watch the performance metrics there? I find it quite useful: it shows what's going on on each machine in terms of indexing, queries, I/O, memory usage and so on. Maybe that will give you a hint about what's happening.

I, for one, had disk I/O problems with my cluster that made it really slow, and HQ is where I spotted them.


(Jimferenczi) #6

That's strange. Can you share the current mapping of your index/type:
https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-get-field-mapping.html

I would also suggest trying the query without the fuzziness to see if that speeds it up.
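
One way to run that comparison is to strip every "fuzziness" key from the query body, send both versions, and compare the "took" value in each response. A minimal Python sketch: strip_fuzziness is a hypothetical helper (not part of any client library), and the sample body is a cut-down version of the query from post #1 with a shortened field list.

```python
import json


def strip_fuzziness(node):
    """Return a copy of a query body with every "fuzziness" key removed."""
    if isinstance(node, dict):
        return {k: strip_fuzziness(v) for k, v in node.items() if k != "fuzziness"}
    if isinstance(node, list):
        return [strip_fuzziness(item) for item in node]
    return node  # strings, numbers and booleans are immutable, so reuse them


# Cut-down version of the original query (field list shortened for brevity).
query = {
    "size": 20,
    "query": {
        "bool": {
            "should": [
                {"multi_match": {"query": "гостиница парковка", "type": "cross_fields",
                                 "fuzziness": "1", "operator": "and",
                                 "fields": ["name^8", "tags^5"]}},
                {"multi_match": {"query": "gostinica parkovka", "type": "best_fields",
                                 "fuzziness": "2", "operator": "and",
                                 "fields": ["name^8", "tags^5"]}},
            ]
        }
    },
}

plain_query = strip_fuzziness(query)
print(json.dumps(plain_query, ensure_ascii=False, indent=2))
```

Sending both bodies a few times each and comparing "took" should show how much of the time the fuzzy expansion accounts for.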


(nainy) #7

Sure. Here're the current mappings. The filter property is dynamic, so not all documents have the same nested fieldset there.
http://pastebin.com/McgXawHu

Removing the fuzziness does indeed improve the response time considerably, but I'm not sure I'll be able to do without it. (Though any tips here will help.)


(Jimferenczi) #8

The fuzzy part of your query looks like the problem. You are applying a fuzzy query to all fields, but some of them (tags, for instance) use the ngram tokenizer with a min size of 4. For short terms like these, a fuzzy query with a max edit distance of 2 is likely to match a large number of entries. You may want to change the max_expansions parameter to limit the number of terms the query expands to:
https://www.elastic.co/guide/en/elasticsearch/guide/2.x/fuzzy-query.html
That said, a fuzzy query on a field with ngrams is not a good idea. What are you trying to solve here? I guess you could apply the fuzzy query to only a subset of your fields (the ones that use a standard tokenizer) by adding another should clause.
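
A rough sketch of that restructuring in Python: one exact clause over all fields, plus a separate should clause that is fuzzy only on the fields assumed to use a standard tokenizer. The thread only confirms that tags uses ngrams, so STANDARD_FIELDS is a hypothetical split that has to be checked against the real mapping, and max_expansions=10 is an arbitrary value picked for illustration.

```python
import json

# All fields from the original query.
ALL_FIELDS = ["specializationsNames", "address^3", "city", "name^8", "description^2",
              "subcategoriesNames^3", "categoryName^3", "tags^5", "services^3"]

# Assumption: these fields use a standard tokenizer. Verify against the mapping.
STANDARD_FIELDS = ["name^8", "description^2"]


def build_query(text, max_expansions=10):
    """Build a bool query with an exact clause and a narrower fuzzy clause."""
    exact_clause = {
        "multi_match": {
            "query": text,
            "type": "cross_fields",
            "operator": "and",
            "fields": ALL_FIELDS,
        }
    }
    fuzzy_clause = {
        "multi_match": {
            "query": text,
            "type": "best_fields",
            "operator": "and",
            "fuzziness": 1,
            "max_expansions": max_expansions,  # caps the terms each fuzzy term expands to
            "fields": STANDARD_FIELDS,         # ngram fields are deliberately left out
        }
    }
    return {"size": 20, "query": {"bool": {"should": [exact_clause, fuzzy_clause]}}}


body = build_query("гостиница парковка")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The price filter from the original query is omitted here to keep the sketch short; it can be wrapped around this bool query in the same way as before.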


(nainy) #9

While max expansions didn't do anything in terms of query time, your suggestions got me thinking.

I guess what I'm essentially doing here is putting the burden of error correction in user input on query time. Now that I think about it, maybe it's not the best idea (especially since it's also a slow idea). Do you think it would be more reasonable to put this responsibility on autocomplete/suggest functionality instead? Maybe I can implement a better analyzer setup instead of doing a fuzzy search?


(Jimferenczi) #10

Exactly, autocomplete/suggest is what you're looking for: it's fast and it supports fuzziness out of the box.
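
A minimal sketch of what that could look like with the completion suggester, written in Python as plain request bodies. The field name name_suggest and the suggestion name place_suggest are hypothetical, the mapping is only a properties fragment (not a full put-mapping body), and the request shape follows the completion suggester as I understand it for ES 2.x, so check it against the docs for your version.

```python
import json

SUGGEST_FIELD = "name_suggest"  # hypothetical field name, not in the original mapping

# Mapping fragment: a dedicated field of type "completion" to hold suggestions.
mapping_fragment = {"properties": {SUGGEST_FIELD: {"type": "completion"}}}


def build_suggest_request(prefix, fuzziness=1, size=5):
    """Build a suggest body that tolerates `fuzziness` edits in the typed prefix."""
    return {
        "place_suggest": {  # arbitrary name used to label the results
            "text": prefix,
            "completion": {
                "field": SUGGEST_FIELD,
                "size": size,
                "fuzzy": {"fuzziness": fuzziness},
            },
        }
    }


request_body = build_suggest_request("gostin")
print(json.dumps(mapping_fragment, indent=2))
print(json.dumps(request_body, indent=2))
```

With this approach the typo tolerance is handled while the user is still typing, so the main search query can stay non-fuzzy.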


(nainy) #11

Thank you!

