Ids query is slow on an index with ~2 billion docs

Version: 6.2.4
Documents: 1979484267
Shards: 30 in 30 hosts
Document sample:

{
          "goods_id": 2466330584,
          "cate_id": 6247,
          "property_value_ids": []
}

What we want to do is fetch specific docs by _id (same as goods_id), with a query like this:

{
  "_source": ["_id"],
  "profile": false,
  "query": {
    "bool": {
      "filter": [
        {
          "ids": {
            "type": "goods",
            "values": [
              2035275709,
              ... 
            ]
          }
        }
      ]
    }
  }
}

The query is quite simple, with about 1,000 ids per search request.

But it takes about 20ms! If I reduce the number of ids to 5, it takes ~5ms.

So I tried profiling it. It seems that all 1,000 ids are sent to every shard, without first working out which shard each id belongs to, e.g. sending [1, 3, 5] to shard 1 and [2, 4, 6] to shard 2.

Any ideas on how to speed this up?
Thanks!

  1. Can you remove types, by using the latest ES version and not using types? This will store _id as binary, using roughly 8 bytes. That makes _id shorter and faster to query.
  2. This takes some work. You have to keep a map of {shard_id: node} on the client side. When you query, you calculate the shard of each _id, group the ids by node, and then send 30 separate queries, each directly to its node.
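
As a rough illustration of point 2, here is a minimal Python sketch of the client-side grouping step. It assumes the index uses default routing (routing value = _id, no custom routing), and that Elasticsearch picks the shard as floorMod(murmur3(_id), routing_num_shards) / routing_factor, with the _id hashed as UTF-16LE bytes and seed 0. The names `murmur3_32`, `shard_for` and `group_ids_by_shard` are made up for this example, and `routing_num_shards` defaulting to the shard count is also an assumption; check it against your index settings before relying on it.

```python
from collections import defaultdict

MASK = 0xFFFFFFFF


def _mix_k(k):
    # Per-block scramble used by MurmurHash3 x86 32-bit.
    k = (k * 0xCC9E2D51) & MASK
    k = ((k << 15) | (k >> 17)) & MASK
    return (k * 0x1B873593) & MASK


def murmur3_32(data, seed=0):
    """MurmurHash3 x86 32-bit, returned as a signed 32-bit int (like Java)."""
    h = seed & MASK
    full = len(data) & ~3
    for i in range(0, full, 4):
        h ^= _mix_k(int.from_bytes(data[i:i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & MASK
        h = (h * 5 + 0xE6546B64) & MASK
    tail = data[full:]
    if tail:
        h ^= _mix_k(int.from_bytes(tail, "little"))
    # Finalization (fmix32).
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK
    h ^= h >> 16
    return h - (1 << 32) if h & 0x80000000 else h


def shard_for(doc_id, num_shards, routing_num_shards=None):
    # Assumption: routing_num_shards equals num_shards unless told otherwise.
    routing_num_shards = routing_num_shards or num_shards
    factor = routing_num_shards // num_shards
    h = murmur3_32(str(doc_id).encode("utf-16-le"))
    # Python's % with a positive modulus behaves like Java's Math.floorMod.
    return (h % routing_num_shards) // factor


def group_ids_by_shard(ids, num_shards):
    # Returns {shard_number: [ids routed to that shard]}.
    groups = defaultdict(list)
    for doc_id in ids:
        groups[shard_for(doc_id, num_shards)].append(doc_id)
    return dict(groups)


groups = group_ids_by_shard([2035275709, 2466330584], num_shards=30)
```

Each group could then be sent as its own ids query restricted to that shard, for example with the search `preference=_shards:<n>` parameter, so that a shard only receives the ids it can actually hold. It is worth verifying the computed shard numbers against the cluster (for a handful of ids) before trusting the grouping.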