Is my response time OK?

Hi there! I am a novice with ES, and I want to know whether I'm doing something wrong or trying to achieve the impossible.

I think my searches are slow. I have an index, messages (22.9 GB, 77m docs). The mapping is simple:

esClient.indices.create({
    index: 'messages',
    body: {
        mappings: {
            messages: {
                properties: {
                    hostId: {type: 'keyword'},
                    clientId: {type: 'keyword'},
                    userId: {type: 'keyword'},
                    createdAt: {type: 'long'},
                    message: {
                        type: 'text',
                        fields: {
                            // edge-ngram subfield used for
                            // search-as-you-type prefix matching
                            ngrams: {
                                type: 'text',
                                analyzer: 'ngrams'
                            }
                        }
                    }
                }
            }
        },
        settings: {
            analysis: {
                analyzer: {
                    default: {
                        type: 'custom',
                        filter: ['lowercase'],
                        tokenizer: 'whitespace'
                    },
                    ngrams: {
                        type: 'custom',
                        filter: ['lowercase', 'custom_edge_ngram'],
                        tokenizer: 'whitespace'
                    }
                },
                filter: {
                    custom_edge_ngram: {
                        type: 'edge_ngram',
                        min_gram: 1,
                        max_gram: 20
                        // note: token_chars is an option of the edge_ngram
                        // tokenizer, not of the token filter, which does not
                        // support it
                    }
                }
            }
        }
    }
});
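For reference, the edge-ngram output can be sanity-checked with the _analyze API; a minimal sketch against the same index (the sample text is arbitrary, and promise-style calls are assumed):

// Verify what the ngrams analyzer emits: "hello" should yield the
// prefixes h, he, hel, hell, hello.
esClient.indices.analyze({
    index: 'messages',
    body: {
        analyzer: 'ngrams',
        text: 'hello'
    }
}).then(res => console.log(res.tokens.map(t => t.token)));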

The query is simple:

esClient.search({
    index: 'messages',
    from: 0,
    size: 10,
    terminate_after: 1000, // stop collecting after 1000 matching docs per shard
    body: {
        profile,
        query: {
            bool: {
                must: {
                    simple_query_string: {
                        query,
                        fields: ['message', 'message.ngrams'],
                        analyzer: 'whitespace',
                        default_operator: 'and'
                    }
                },
                filter: {
                    term: {hostId}
                }
            }
        },
        sort: {
            _score: {order: 'desc'},
            createdAt: {order: 'desc'}
        }
    }
});

But the average response time for this index is about 1-3 sec, which is not acceptable for a search-as-you-type experience. I am using Elastic Cloud (the 192/8gb plan). Please help!


Please see https://www.elastic.co/cloud/as-a-service/support on how to raise a support ticket for Elastic Cloud :slight_smile:

How many shards do you have? What does monitoring say?
Is it happening only for the first query or always?

5 shards for that index (the default). It happens every time. I have another index (clients) with much the same mapping and query but smaller in size, and searches against that index are a bit quicker: about 0.5-1.5 sec.
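For anyone checking the same thing, the shard count and per-shard sizes can be listed with the cat shards API; a minimal sketch with the same client:

// One line per shard: state, doc count, and size on disk.
esClient.cat.shards({
    index: 'messages',
    v: true // include column headers
}).then(res => console.log(res));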

I guess you have at least 2 nodes?

I guess only one at the moment (192/8gb). There is an option for high availability to spread nodes across regions; I'll try to add more. But right now I am the only person who runs queries, so there is no load on the system; this is not in production yet.

Could you run the query with profile: true so we can better understand where the time is spent?

David, thank you. Here is an example query (the search term is Russian for "hello"):

{
  "query":{
        "bool": {
            "must": {
                "simple_query_string": {
                    "query": "привет",
                    "fields": ["message"],
                    "analyzer": "whitespace",
                    "default_operator": "and"
                }
            },
            "filter": {
                "term": {"hostId": "hX8ihkAcyHK93ue99"}
            }
        }
    }
}

Screenshots of the Kibana profiler: https://d.pr/i/4g5pe1, https://d.pr/i/xLLUcS

Here is another query, this time faster: https://d.pr/i/JZUVvH

A few JSON profiles for queries limited to 10 hits:

When querying another index (clients), timing is OK; screenshot: https://d.pr/i/Xui19E

Another query (the search term is Russian for "yes"):

{
  "query":{
        "bool": {
            "must": {
                "simple_query_string": {
                    "query": "да",
                    "fields": ["message", "message.ngrams"],
                    "analyzer": "whitespace",
                    "default_operator": "and"
                }
            },
            "filter": {
                "term": {"hostId": "JJkh6K8yFERoyvQbx"}
            }
        }
    }
}

Profiler screen, 4.5 sec response: https://d.pr/i/R5zZYI

Interesting. So any time you use non-ASCII characters it's slow; otherwise it's "fast".

@jpountz What do you think?

Not really; it can be slow for an English query too, "yes" for example: https://d.pr/i/0eF6jD

Your approach looks ok to me. The profiler output suggests that most time is spent in weight creation, whose main task is to look up terms in the terms dictionary to have access to terms statistics. So this might be caused by a busy disk? Would you be able to run some queries with "sort": [ "_doc" ] and share the output of a slow query?
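For reference, that diagnostic could look like this (a minimal sketch based on the original search call, reusing the hostId from the first profiled query; hits come back in index order, so no scoring is involved):

// Same search, but sorted by _doc instead of relevance; with profile
// enabled, the output isolates the cost of the terms lookup.
esClient.search({
    index: 'messages',
    size: 10,
    body: {
        profile: true,
        query: {
            bool: {
                must: {
                    simple_query_string: {
                        query: 'привет',
                        fields: ['message', 'message.ngrams'],
                        analyzer: 'whitespace',
                        default_operator: 'and'
                    }
                },
                filter: {
                    term: {hostId: 'hX8ihkAcyHK93ue99'}
                }
            }
        },
        sort: ['_doc']
    }
});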

@jpountz, thanks for joining!

Hmm, I believe the disk is not busy. I now have 2 nodes (192gb/8gb), and I am the only client running queries against this cluster. A few screenshots from the dashboard: https://d.pr/i/NQrNNY, https://d.pr/i/6pr7ms (hmm, I don't understand why there are so many search requests in the picture)

Here are a few profile screenshots with sort: ['_doc']: https://d.pr/i/UgGg28, https://d.pr/i/F1HPkL

When I run almost the same query against another index (clients), the speed is better: https://d.pr/i/5eV4pi

Looks like the size of the indices matters: messages has 71m docs (22gb), clients has 8m docs (5.3gb)

Thanks for sharing. This added sort moved the terms lookup from create_weight to build_scorer and it turns out that the latter is now the bottleneck, which strongly suggests that looking up terms is what makes your queries slow.

Are these suggestion indices read-only? If yes, then a force-merge would likely help, as it would decrease the number of segments that need to be looked up. If not, can you share the number of segments that you have in your shards?
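Both checks can be issued from the same client; a minimal sketch (parameter casing varies between client versions, and a force-merge should only be run on an index that is no longer being written to):

// Inspect segment counts per shard (same data as GET /messages/_segments).
esClient.indices.segments({index: 'messages'})
    .then(res => console.log(JSON.stringify(res, null, 2)));

// For a read-only index: merge each shard down to a single segment so
// term lookups touch only one terms dictionary per shard.
esClient.indices.forcemerge({
    index: 'messages',
    max_num_segments: 1
});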

If the disk is not busy, then there is still a possibility that it is just slow. On spinning disks, a random access might take ~10ms if the filesystem cache is cold, so with 50 segments that could be 50 × 10ms = 500ms just to look up terms in the terms dictionary.

If the size of your filesystem cache is expected to be larger than the size of your terms dictionary (accessible via the node stats API by passing include_segment_file_sizes=true) then we could look into forcing the terms dict to be loaded into the filesystem cache.
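That stat can be fetched like this (a minimal sketch; transport.request is used here as a generic escape hatch, and a plain curl against the same path works too):

// Per-node segment stats, including the on-disk size of each Lucene
// file type (terms dictionary, postings, stored fields, ...).
esClient.transport.request({
    method: 'GET',
    path: '/_nodes/stats/indices?include_segment_file_sizes=true'
}).then(res => console.log(JSON.stringify(res, null, 2)));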

Are these suggestion indices read-only?

No. It's a dynamic index.

On spinning disks...

Hmm, I'm not sure what disks Elastic Cloud uses.

So if you have 50 segments...

Please have a look at the API response for /messages/_segments (shared on Pastebin; it begins { "_shards": { "total": 10, "successful": 5, "failed": 0 }, …). Any optimisation ideas? Or should I move to more powerful hardware?

Sorry, I had forgotten this was Elastic Cloud, so you are on SSDs.

Thanks for sharing the segments output. Everything looks very sane. More powerful hardware would certainly help, but I'm a bit surprised your current plan doesn't perform better than that.

One last thing I'd like to look at: could you run some slow queries in a loop and concurrently grab the output of the nodes hot threads API a few times, a couple of seconds apart? This might give us an indication of the bottleneck for these queries.
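That collection loop could look like this (a minimal sketch; the 3-second interval and five samples are arbitrary choices):

// Poll the nodes hot threads API every 3 seconds while slow queries
// run elsewhere; each response is a text dump of the busiest threads.
const timer = setInterval(() => {
    esClient.nodes.hotThreads({threads: 3})
        .then(res => console.log(res))
        .catch(err => console.error(err));
}, 3000);

// Stop after five samples.
setTimeout(() => clearInterval(timer), 15000);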

Here are a few hot_threads responses: https://pastebin.com/aytRt7xK, https://pastebin.com/DV3VYsTq, https://pastebin.com/k6uJewU5, https://pastebin.com/s29PMDan

In the end, we decided to move away from Elastic Cloud. On our own server with the same indices, requests take 3-9ms, not 5000-10000 :confused:

Thank you for discussion!

Oh. That's very good to know.
Sometimes it can happen that clicking restart on the service helps, since it allocates the nodes elsewhere.

In any case, I'd definitely open a ticket with the cloud support team.

Sorry to hear that; I'm disappointed by these response times too, given that the documents and queries are well designed. The hot threads look almost idle: only one of them shows an indexing request.
