Hello Elasticsearch and Elasticsearch users,
I am encountering a strange issue with query performance. I am able to
reproduce this behavior with simple match queries, without using filters,
facets or sorts. In summary, the first execution of a query is always slow
(on the order of seconds). Then, launching the very same query again brings
the expected performance (typically under 300 milliseconds). Adding a
parameter to the query (one that narrows the number of results) leads to
the same behavior: the first query is slow, the next one is fine.
Here are some details:
I am using Elasticsearch 0.90.1.
The index contains about 8 million documents and is about 3 GB in size. One
shard, zero replicas. For the queries reproducing the issue, I am using two
fields: firstname and lastname.
I first launch a match_all query with a terms facet on lastname in order to
obtain the 1000 most common last names:
{
  "from": 0,
  "size": 0,
  "query": {
    "match_all": {}
  },
  "facets": {
    "lastname": {
      "terms": {
        "field": "lastname",
        "size": 1000
      }
    }
  }
}
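For reference, here is a minimal Python sketch of how the names can be pulled out of the facet response and handed to the next step. It assumes the terms facet comes back under `facets.lastname.terms` as a list of `{"term": ..., "count": ...}` entries. `extract_terms` is a hypothetical helper name, and the sample response is made up for illustration.

```python
def extract_terms(response, facet_name="lastname"):
    """Return the facet terms, most frequent first, as a list of strings.

    Assumes the response shape described above; this is an assumption,
    not something checked against every Elasticsearch version.
    """
    entries = response.get("facets", {}).get(facet_name, {}).get("terms", [])
    # Sort defensively by count (descending) rather than relying on the
    # order in which the entries were returned.
    ordered = sorted(entries, key=lambda entry: entry["count"], reverse=True)
    return [entry["term"] for entry in ordered]


# Made-up sample response, for illustration only.
sample = {
    "facets": {
        "lastname": {
            "_type": "terms",
            "terms": [
                {"term": "martin", "count": 120},
                {"term": "bernard", "count": 95},
            ],
        }
    }
}
names = extract_terms(sample)
```

The resulting list can then be written one name per line to a file that JMeter reads to fill the LASTNAME variable.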
Then I launch a match query with each of the 1000 names using JMeter:
{
  "from": 0,
  "size": 200,
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "${LASTNAME}",
            "fields": [
              "lastname"
            ]
          }
        }
      ]
    }
  }
}
For the first launch of each of the 1000 queries I get rather slow times
(the maximum is about 11 seconds and every query takes at least a couple of
hundred milliseconds). Queries with names at the beginning of the list
are generally slower (because they return more results).
For the second launch the maximum "took time" drops to 500 ms (which is
O.K. for our case) and every query is faster than it was on its first
launch.
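To make the comparison between two launches concrete, here is a minimal Python sketch of the measurement. It is not what JMeter does internally. `time_launch` and `compare_launches` are hypothetical helper names, and `run_query` is a placeholder for whatever actually sends the query. Note that it measures client-side wall-clock time, not the "took" value reported by the server.

```python
import time


def time_launch(names, run_query):
    """Run one query per name and return elapsed milliseconds per name."""
    timings = {}
    for name in names:
        start = time.perf_counter()
        run_query(name)  # placeholder: sends the query for this name
        timings[name] = (time.perf_counter() - start) * 1000.0
    return timings


def compare_launches(first, second):
    """Summarise two launches: the maximum of each, and how many queries got faster."""
    faster = [n for n in first if n in second and second[n] < first[n]]
    return {
        "first_max_ms": max(first.values()),
        "second_max_ms": max(second.values()),
        "faster_count": len(faster),
    }
```

Calling `time_launch` twice with the same list of names and passing both results to `compare_launches` gives the maximum time per launch and the number of queries that were faster the second time.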
For the third launch, I add a first name to the query. I just chose a
common name at random (say Jean, the equivalent of John):
{
  "from": 0,
  "size": 200,
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "${LASTNAME}",
            "fields": [
              "lastname"
            ]
          }
        },
        {
          "multi_match": {
            "query": "jean",
            "fields": [
              "firstname"
            ]
          }
        }
      ]
    }
  }
}
Queries are slow again. Not as slow as in the first launch, but I suspect
the improvement is just due to fewer results (because of the additional
parameter).
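For anyone who wants to reproduce this without JMeter, both query bodies (last name only, and last name plus first name) can be generated from one small helper. This is a minimal Python sketch; `build_name_query` is a hypothetical name, and the output simply mirrors the JSON shown above.

```python
import json


def build_name_query(lastname, firstname=None, size=200):
    """Build the bool/must query body used in the launches described above."""
    must = [{"multi_match": {"query": lastname, "fields": ["lastname"]}}]
    if firstname is not None:
        # Third and fourth launches: narrow the results with a first name.
        must.append(
            {"multi_match": {"query": firstname, "fields": ["firstname"]}}
        )
    return {"from": 0, "size": size, "query": {"bool": {"must": must}}}


# Body of the last-name-only query, serialised as JSON.
lastname_only = json.dumps(build_name_query("martin"), indent=2)
# Body of the query with the additional first name.
with_firstname = json.dumps(build_name_query("martin", "jean"), indent=2)
```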
Relaunching the queries brings correct performance again (fourth launch).
One last thing: rerunning the fourth launch after some server inactivity
(for example over the weekend) brings back the slow performance (500 ms
to 8 seconds).
Can this be related to some caching done by the file system? Have you
encountered something similar?
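One way I could test the file system cache hypothesis is to read the index files once right before a first launch, giving the operating system a chance to cache them, and then check whether that launch is still slow. Below is a minimal Python sketch. `read_all_files` is a hypothetical helper, the directory to pass is whatever holds the index data on the node, and whether the data actually stays cached depends on the memory available.

```python
import os


def read_all_files(root, chunk_size=1024 * 1024):
    """Read every file under root once and return the total bytes read."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as handle:
                # Read in chunks so large files do not need to fit in memory.
                while True:
                    chunk = handle.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total
```

If the first launch becomes fast after such a read, that would point towards the file system cache. If it stays slow, the cause is probably elsewhere.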
Thank you for your hints.
Best regards,
Lucian Precup