Is my response time ok?


(Art Polikarpov) #1

Hi there! I am a novice in ES. And I want to know if I’m doing something wrong or trying to achieve impossible things.

I think my searches are slow. I have an index, messages (22.9 GB, 77m docs). The mapping is simple:

esClient.indices.create({
    index: 'messages',
    body: {
        mappings: {
            messages: {
                properties: {
                    hostId: {type: 'keyword'},
                    clientId: {type: 'keyword'},
                    userId: {type: 'keyword'},
                    createdAt: {type: 'long'},
                    message: {type: 'text',
                        fields: {
                            ngrams: {
                                type: 'text',
                                analyzer: 'ngrams'
                            }
                        }
                    }
                }
            }
        },
        settings: {
            analysis: {
                analyzer: {
                    default: {
                        type: 'custom',
                        filter: ['lowercase'],
                        tokenizer: 'whitespace'
                    },
                    ngrams: {
                        type: 'custom',
                        filter: ['lowercase', 'custom_edge_ngram'],
                        tokenizer: 'whitespace'
                    }
                },
                filter: {
                    custom_edge_ngram: {
                        type: 'edge_ngram',
                        min_gram: 1,
                        max_gram: 20,
                        // note: token_chars is a tokenizer-level option;
                        // the edge_ngram token filter ignores it
                        token_chars: [
                            'letter',
                            'digit'
                        ]
                    }
                }
            }
        }
    }
});
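As a rough illustration of what the `custom_edge_ngram` filter emits per token, here is a minimal JavaScript sketch (a plain function for illustration, not the actual Lucene filter):

```javascript
// Illustrative only: generate edge n-grams the way an edge_ngram
// token filter with min_gram: 1, max_gram: 20 would, after a
// whitespace tokenizer and lowercase filter.
function edgeNgrams(text, minGram = 1, maxGram = 20) {
    return text.toLowerCase().split(/\s+/).flatMap(token => {
        const grams = [];
        const max = Math.min(maxGram, token.length);
        for (let len = minGram; len <= max; len++) {
            grams.push(token.slice(0, len));
        }
        return grams;
    });
}

// Each prefix of each token becomes its own term, which is why the
// ngrams subfield is much larger than the plain message field.
console.log(edgeNgrams('Hello ES'));
// → ['h', 'he', 'hel', 'hell', 'hello', 'e', 'es']
```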

The query is simple:

// `profile`, `query` and `hostId` are variables in scope here
esClient.search({
    index: 'messages',
    from: 0,
    size: 10,
    terminate_after: 1000,
    body: {
        profile,
        query: {
            bool: {
                must: {
                    simple_query_string: {
                        query,
                        fields: ['message', 'message.ngrams'],
                        analyzer: 'whitespace',
                        default_operator: 'and'
                    }
                },
                filter: {
                    term: {hostId}
                }
            }
        },
        sort: {
            _score: {order: 'desc'},
            createdAt: {order: 'desc'}
        }
    }
});

But the average response time for this index is about 1-3 sec, which is not quite acceptable for a search-as-you-type experience. I am using Elastic Cloud (192/8gb plan). Please help!


(David Pilato) #2

Please see https://www.elastic.co/cloud/as-a-service/support on how to raise a support ticket for Elastic Cloud :slight_smile:

How many shards do you have? What is monitoring saying?
Is it happening only for the first query or always?


(Art Polikarpov) #3

5 shards for that index - the default. It happens every time. I have another index (clients) with much the same mapping and query but smaller in size, and searches against that index are a bit quicker - about 0.5-1.5 sec.
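For reference, the shard count is fixed at index creation time, so changing it means reindexing; a sketch of passing it in the create call (the index name and shard count below are example values, not a recommendation):

```javascript
// Sketch: number_of_shards must be set when the index is created;
// an existing index has to be reindexed to change it.
const createBody = {
    settings: {
        number_of_shards: 1,   // example value only
        number_of_replicas: 1
    }
};

// e.g. esClient.indices.create({index: 'messages_v2', body: createBody});
console.log(createBody.settings.number_of_shards);
```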


(David Pilato) #4

I guess you have at least 2 nodes?


(Art Polikarpov) #5

I guess only one at the moment (192/8gb). There is an option for high availability to spread nodes across regions, so I’ll try to add more. But right now I am the only person who runs queries, so there is no load on the system; this is not in production yet.


(David Pilato) #6

Could you run the query with profile: true so we can better understand where the time is spent?


(Art Polikarpov) #7

David, thank you. Here is an example query:

{
  "query":{
        "bool": {
            "must": {
                "simple_query_string": {
                    "query": "привет",
                    "fields": ["message"],
                    "analyzer": "whitespace",
                    "default_operator": "and"
                }
            },
            "filter": {
                "term": {"hostId": "hX8ihkAcyHK93ue99"}
            }
        }
    }
}

Screenshots of the Kibana profiler: https://d.pr/i/4g5pe1, https://d.pr/i/xLLUcS

Here is another query, this time faster: https://d.pr/i/JZUVvH

A few JSON profiles for queries limited to 10 hits:

When querying another index (clients), the timing is ok; screenshot: https://d.pr/i/Xui19E


(Art Polikarpov) #8

Another query:

{
  "query":{
        "bool": {
            "must": {
                "simple_query_string": {
                    "query": "да",
                    "fields": ["message", "message.ngrams"],
                    "analyzer": "whitespace",
                    "default_operator": "and"
                }
            },
            "filter": {
                "term": {"hostId": "JJkh6K8yFERoyvQbx"}
            }
        }
    }
}

Profiler screenshot, 4.5 sec response: https://d.pr/i/R5zZYI


(David Pilato) #9

Interesting. So any time you use non-ASCII characters it's slow, otherwise it's "fast".

@jpountz What do you think?


(Art Polikarpov) #10

Not really, it can be slow for an English query too, "yes" for example: https://d.pr/i/0eF6jD


(Adrien Grand) #11

Your approach looks ok to me. The profiler output suggests that most time is spent in weight creation, whose main task is to look up terms in the terms dictionary to have access to terms statistics. So this might be caused by a busy disk? Would you be able to run some queries with "sort": [ "_doc" ] and share the output of a slow query?
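For reference, the suggested variant would look something like this (query values copied from the earlier example in this thread):

```json
{
  "profile": true,
  "query": {
    "bool": {
      "must": {
        "simple_query_string": {
          "query": "привет",
          "fields": ["message"],
          "analyzer": "whitespace",
          "default_operator": "and"
        }
      },
      "filter": {
        "term": {"hostId": "hX8ihkAcyHK93ue99"}
      }
    }
  },
  "sort": ["_doc"]
}
```

Sorting by `_doc` removes scoring from the picture, which is what isolates the cost of the terms lookup itself.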


(Art Polikarpov) #12

@jpountz, thanks for joining!

Hmm, I believe the disk is not busy. I now have 2 nodes (192gb/8gb), and I am the only client running queries against this cluster. A few screens from the dashboard: https://d.pr/i/NQrNNY, https://d.pr/i/6pr7ms (hmm, but I don’t understand why there are so many search requests in the picture)

Here are a few profile screens with sort: ['_doc']: https://d.pr/i/UgGg28, https://d.pr/i/F1HPkL

When I run almost the same query against another index (clients), the speed is better: https://d.pr/i/5eV4pi

Looks like the size of the indices matters: messages is 71m docs (22gb), clients is 8m docs (5.3gb)


(Adrien Grand) #13

Thanks for sharing. The added sort moved the terms lookup from create_weight to build_scorer, and it turns out the latter is now the bottleneck, which strongly suggests that looking up terms is what makes your queries slow.

Are these indices read-only? If yes, then a force-merge would likely help, as it would decrease the number of segments that need to be looked up. If not, can you share the number of segments that you have in your shards?
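The force-merge and a follow-up segment check would look roughly like this (console-style sketch; `max_num_segments: 1` is an example value, and force-merging an index that is still being written to is generally not recommended):

```
POST /messages/_forcemerge?max_num_segments=1
GET /messages/_segments
```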

If the disk is not busy, then there is still a possibility that it is just slow. On spinning disks, random access might take ~10ms if the filesystem cache is cold. So if you have 50 segments, that could be 50x10=500ms only to look up terms in the terms dictionary.
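The back-of-the-envelope arithmetic above can be sketched as follows (the 10 ms seek time is an assumption for a cold filesystem cache on a spinning disk):

```javascript
// Rough cost model: one cold random disk access per segment to
// look up a term in that segment's terms dictionary.
const segments = 50;   // example segment count per shard
const seekMs = 10;     // assumed cold-cache seek on a spinning disk

const termsLookupMs = segments * seekMs;
console.log(termsLookupMs); // → 500 ms just for terms dictionary lookups
```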

If the size of your filesystem cache is expected to be larger than the size of your terms dictionary (accessible via the node stats API by passing include_segment_file_sizes=true) then we could look into forcing the terms dict to be loaded into the filesystem cache.
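The stats call mentioned above would look roughly like this (console-style sketch; the segment file sizes appear per node under the indices stats):

```
GET /_nodes/stats/indices?include_segment_file_sizes=true
```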


(Art Polikarpov) #14

Are these indices read-only?

No. It's a dynamic index.

On spinning disks...

Hmm, I’m not sure what disks Elastic Cloud uses.

So if you have 50 segments...

Please have a look at the API response for /messages/_segments: https://pastebin.com/3RgTniEU. Any optimisation ideas? Or should I move to more powerful hardware?


(Adrien Grand) #15

Sorry, I had forgotten this was Elastic Cloud, so you are on SSDs.

Thanks for sharing the segments output. Everything looks very sane. More powerful hardware would certainly help but I'm a bit surprised your current plan doesn't perform better than that.

One last thing I'd like to look at: could you run some slow queries in a loop and concurrently get the output of the nodes hot threads API a few times, a couple of seconds apart? This might give us indications about the bottleneck of these queries.
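Polling the hot threads API while the slow queries run in a loop would look roughly like this (console-style sketch; the `threads` and `interval` values are examples):

```
GET /_nodes/hot_threads?threads=3&interval=500ms
```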


(Art Polikarpov) #16

Here are a few hot_threads requests: https://pastebin.com/aytRt7xK, https://pastebin.com/DV3VYsTq, https://pastebin.com/k6uJewU5, https://pastebin.com/s29PMDan

Finally, we decided to move away from Elastic Cloud. On our own server with the same indices, requests took 3-9ms, not 5000-10000 :confused:

Thank you for discussion!


(David Pilato) #17

Oh. That's very good to know.
Sometimes restarting the service helps, as it allocates the nodes elsewhere.

In all cases I'd definitely open a ticket to the cloud support team.


(Adrien Grand) #18

Sorry to hear that. I'm disappointed by these response times too, given that the documents and queries are well designed. The hot threads look almost idle; only one of them shows an indexing request.


(system) #19

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.