Hello!
I am working on a search engine and have started using the mget API to fetch a set of documents by their ids.
My question is about the performance I am observing for this type of request.
Requests that use mget take longer on average than requests that use other operations of the Elasticsearch API. A sample of the average times per operation:
- _mget API -> **30-40ms**
- _search API -> 20-30ms
- _count API -> 16-18ms
- get API (for a single specified document) -> 5-6ms
Is this behavior normal? Before introducing mget, I expected its performance to be at least similar to that of the search API. Sometimes the difference between the two endpoints is greater than 10ms.
In many of these cases, the request retrieves approximately 1 to 20 documents.
On top of these averages, we also see some response times that are much higher: a small percentage of requests (~1%) take between 200ms and 2000ms.
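To make the gap between the average and the slow ~1% concrete, here is a minimal sketch of how the latencies could be summarized with percentiles instead of a mean. The class name `LatencyPercentiles` and the sample values are made up for illustration; they are not our real measurements.

```java
import java.util.Arrays;

// Hypothetical helper: summarizes latency samples (in ms) with nearest-rank percentiles.
final class LatencyPercentiles {

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMs.clone(); // do not mutate the caller's array
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Illustrative values only: nine "normal" requests and one slow outlier.
        long[] samples = {32, 35, 31, 38, 40, 33, 36, 34, 30, 1200};
        double mean = Arrays.stream(samples).average().getAsDouble();
        System.out.println("mean=" + mean + "ms");
        System.out.println("p50=" + percentile(samples, 50) + "ms");
        System.out.println("p99=" + percentile(samples, 99) + "ms");
    }
}
```

With these made-up samples the median is 34 ms and the p99 is 1200 ms, while the mean is pulled up to about 150.9 ms by the single outlier, which is why I report the slow requests separately from the average.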
Is there a way to lower the times of these requests?
To analyze the mget response times, we had to monitor response times at the HTTP layer of the microservice that exposes a REST interface and in turn calls the mget API through a Java client. Is there any tool for profiling this type of request directly? If I'm not mistaken, the profile API is only available for _search.
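As a stopgap, one option I am considering is timing only the client call itself, so that the numbers exclude the microservice's own HTTP handling. The following is a rough sketch of that idea; `CallTimer` and `timeNanos` are hypothetical names, and the `Runnable` stands in for the actual mget call through the Java client, which is not shown here.

```java
// Hypothetical helper: measures the wall-clock time of a single call.
final class CallTimer {

    // Runs the given call and returns the elapsed time in nanoseconds.
    // System.nanoTime() is monotonic, so it is suited to measuring intervals.
    static long timeNanos(Runnable call) {
        long start = System.nanoTime();
        call.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Placeholder workload; in the real service this would wrap the mget call.
        long elapsed = timeNanos(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                System.out.println("unreachable"); // keeps the loop from being trivially dead
            }
        });
        System.out.println("elapsed=" + (elapsed / 1_000) + "us");
    }
}
```

This still includes network time and client-side serialization, so it would only narrow down where the time goes rather than profile the request inside Elasticsearch.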
My index has 14-16M documents and I'm using version 7.10. We need to retrieve the source of each document and we don't request stored fields. This is an example of the request that I'm performing:
POST http://localhost:9200/_mget
Content-Type: application/json
{
  "docs": [
    {
      "_index": "<my_index>",
      "_type": null,
      "_id": "<some_id>",
      "routing": null,
      "stored_fields": null,
      "version": -3,
      "version_type": "internal",
      "_source": null
    }
  ]
}
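For comparison, as far as I understand the mget API also accepts a shorter form when all documents live in the same index: the index goes in the URL (POST /<my_index>/_mget) and the body only carries an "ids" array. Below is a small sketch that builds such a body by hand; `MgetBody`, `quote` and `idsBody` are hypothetical helper names, and I have not verified whether this form changes the timings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds the body for POST /<my_index>/_mget using the "ids" form.
final class MgetBody {

    // Minimal JSON string quoting: escapes quotes, backslashes and control characters.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':
                    sb.append("\\\"");
                    break;
                case '\\':
                    sb.append("\\\\");
                    break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Produces {"ids":["id1","id2",...]} for the given document ids.
    static String idsBody(List<String> ids) {
        return ids.stream()
                .map(MgetBody::quote)
                .collect(Collectors.joining(",", "{\"ids\":[", "]}"));
    }

    public static void main(String[] args) {
        // Placeholder ids for illustration only.
        System.out.println(idsBody(Arrays.asList("id-1", "id-2", "id-3")));
    }
}
```

The body above omits the null and default fields that my current request serializes for every document.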
Thanks!