Long search time with mget

TheWorkingDeveloper · June 25, 2021, 1:25pm

I'm currently running an Elasticsearch cluster on a single node with SSD storage.
It holds 4B records split over 25 primary shards in a single index (no replicas).
The ID of every record is set manually, since it is unique and known beforehand by both the indexer and the searcher; I was advised that fetching directly by ID would give the best performance. All data must be searched every time, so a hot/cold or rolling-index setup is not an option.

As an example: the ID is the md5 checksum of a file, and the document contains tags and metadata about that file. This is just to paint a picture.

Searching for 1,000 to 2,000 IDs at a time takes around 4 to 8 seconds respectively, which I believe to be fairly slow, but I'm not sure what the exact bottleneck is. The first thing I'd like to rule out is that using mget with predefined IDs is the root cause, and that searching on a keyword field holding the md5 checksum (or any other method) would be faster.
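
One way to test this would be to send the same batch of IDs through both APIs and compare timings. The following is a minimal sketch that only builds the two request bodies; the helper names are hypothetical, the sample IDs are made up, and actually sending the requests (with whichever client is in use) is left out.

```python
import hashlib
import json


def mget_body(ids):
    """Body for GET /<index>/_mget: fetch documents directly by ID."""
    return {"ids": list(ids)}


def ids_search_body(ids):
    """Body for GET /<index>/_search using the `ids` query.

    `size` is set explicitly because a search returns only 10 hits by default.
    """
    ids = list(ids)
    return {"size": len(ids), "query": {"ids": {"values": ids}}}


if __name__ == "__main__":
    # Hypothetical sample IDs: md5 checksums of made-up file contents.
    sample = [hashlib.md5(data).hexdigest() for data in (b"file one", b"file two")]
    print(json.dumps(mget_body(sample)))
    print(json.dumps(ids_search_body(sample)))
```

Timing both bodies against the same index with the same IDs, ideally several times each, would show whether the API choice matters at all or whether both are limited by the same underlying disk reads.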

And if the issue is hardware, which component would yield the biggest performance increase if upgraded, and why? CPU utilization is very low overall, but disk read/write is often 500-600MB/s.

Hardware:
R710 system with: 2x Intel(R) Xeon(R) CPU X5680 @ 3.33GHz
32GB DDR3 RAM with 24GB allocated to Elasticsearch
2x2TB SSD (RAID 0)
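
For context on the hardware question, the figures above can be turned into a few rough numbers. This is only arithmetic on the stated values, assuming documents are spread evenly across shards. Elasticsearch reads index data through the operating system's filesystem cache, and its documentation generally recommends giving the JVM heap no more than half of the available RAM, so the memory left outside the heap is worth checking. Whether that is the actual bottleneck here is not established.

```python
# Figures taken from the post above.
total_docs = 4_000_000_000
primary_shards = 25
ram_gb = 32
heap_gb = 24

# Assumes documents are distributed evenly across the primary shards.
docs_per_shard = total_docs // primary_shards

# Memory not claimed by the JVM heap; the OS and its filesystem cache share this.
non_heap_gb = ram_gb - heap_gb
heap_fraction = heap_gb / ram_gb

print(f"docs per shard:   {docs_per_shard:,}")
print(f"RAM outside heap: {non_heap_gb} GB")
print(f"heap share of RAM: {heap_fraction:.0%}")
```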

warkolm · June 28, 2021, 1:52am

Just so I am clear, are you doing an _mget indexname/id for the whole 1,000-2,000 records?

TheWorkingDeveloper · July 5, 2021, 8:07am

GET /_mget
{
  "docs": [
    {
      "_index": "my-index-000001",
      "_id": "1"
    },
    {
      "_index": "my-index-000001",
      "_id": "2"
    }
  ]
}

More like this: all in the same index, with every ID specified.
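
Since every document is in the same index, the index can go in the request path and the body can shrink to a plain `ids` array, which `_mget` accepts. The sketch below builds such requests and splits a large ID list into batches. The function names and the batch size of 500 are illustrative assumptions, and whether smaller batches actually help in this setup is untested.

```python
def chunked(ids, batch_size):
    """Yield consecutive slices of `ids`, each at most `batch_size` long."""
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


def mget_requests(index, ids, batch_size=500):
    """Return (path, body) pairs for GET /<index>/_mget.

    With the index in the path, each body only needs the list of IDs.
    """
    path = f"/{index}/_mget"
    return [(path, {"ids": batch}) for batch in chunked(ids, batch_size)]


if __name__ == "__main__":
    requests = mget_requests("my-index-000001", ["1", "2"])
    for path, body in requests:
        print("GET", path, body)
```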

system · August 2, 2021, 8:07am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
