Return missing ids from given list

Hi,
Is there a way to return only those ids from a given list that are not found in the index?
For example:
given the list to search [1,2,3,4,5,6,7,8,9,10], the query would return e.g. [2,3,8,10], the list of ids missing from the ES index.

Thanks in Advance
Regards
Haris Khalique

If your documents look like

"hits" : [
      {
        "_index" : "numbers",
        "_type" : "_doc",
        "_id" : "RyuJ9XUBqnoQJrfBzbLY",
        "_score" : 1.0,
        "_source" : {
          "message" : 1
        }
      },
      {
        "_index" : "numbers",
        "_type" : "_doc",
        "_id" : "SSuJ9XUBqnoQJrfB27Kq",
        "_score" : 1.0,
        "_source" : {
          "message" : 2
        }
      },
      {
        "_index" : "numbers",
        "_type" : "_doc",
        "_id" : "SyuJ9XUBqnoQJrfB5bIM",
        "_score" : 1.0,
        "_source" : {
          "message" : 3
        }
      },
      {
        "_index" : "numbers",
        "_type" : "_doc",
        "_id" : "TiuJ9XUBqnoQJrfB9LJw",
        "_score" : 1.0,
        "_source" : {
          "message" : 4
        }
      },
      {
        "_index" : "numbers",
        "_type" : "_doc",
        "_id" : "UCuJ9XUBqnoQJrfB_rIA",
        "_score" : 1.0,
        "_source" : {
          "message" : 5
        }
      },
      {
        "_index" : "numbers",
        "_type" : "_doc",
        "_id" : "UiuK9XUBqnoQJrfBDbJA",
        "_score" : 1.0,
        "_source" : {
          "message" : 6
        }
      },
      {
        "_index" : "numbers",
        "_type" : "_doc",
        "_id" : "VSuK9XUBqnoQJrfBHLIp",
        "_score" : 1.0,
        "_source" : {
          "message" : 7
        }
      },
      {
        "_index" : "numbers",
        "_type" : "_doc",
        "_id" : "WCuK9XUBqnoQJrfBNrLD",
        "_score" : 1.0,
        "_source" : {
          "message" : 8
        }
      },
      {
        "_index" : "numbers",
        "_type" : "_doc",
        "_id" : "WiuK9XUBqnoQJrfBQLIc",
        "_score" : 1.0,
        "_source" : {
          "message" : 9
        }
      },
      {
        "_index" : "numbers",
        "_type" : "_doc",
        "_id" : "WyuK9XUBqnoQJrfBRrJu",
        "_score" : 1.0,
        "_source" : {
          "message" : 10
        }
      }
    ]

Then a query like this will work

GET numbers/_search
{
  "query" : {
    "bool" : {
      "must_not" : {
        "terms" : {
          "message" : [1,2,3,5,6,7,9,10]
        }
      }
    }
  }
}

If you want just the matching documents, you can filter out the rest of the response using filter_path

POST numbers/_search?filter_path=hits.hits._source
{
  "query" : {
    "bool" : {
      "must_not" : {
        "terms" : {
          "message" : [1,2,3,5,6,7,9,10]
        }
      }
    }
  }
}

My documents look like this

"hits" : [
  {
    "_index" : "products",
    "_type" : "_doc",
    "_id" : "8wE12HUB-VjXslma3jdp",
    "_score" : 1.0,
    "_source" : {
      "product_id" : 85459
    }
  },
  {
    "_index" : "products",
    "_type" : "_doc",
    "_id" : "igE42HUB-VjXslmaD0K4",
    "_score" : 1.0,
    "_source" : {
      "product_id" : 85765
    }
  },
  {
    "_index" : "products",
    "_type" : "_doc",
    "_id" : "2AE42HUB-VjXslmaZUOR",
    "_score" : 1.0,
    "_source" : {
      "product_id" : 85807
    }
  },
  {
    "_index" : "products",
    "_type" : "_doc",
    "_id" : "pQE42HUB-VjXslmal0SO",
    "_score" : 1.0,
    "_source" : {
      "product_id" : 85835
    }
  },
  {
    "_index" : "products",
    "_type" : "_doc",
    "_id" : "4gE52HUB-VjXslmal0hZ",
    "_score" : 1.0,
    "_source" : {
      "product_id" : 82214
    }
  },
  {
    "_index" : "products",
    "_type" : "_doc",
    "_id" : "vwE92HUB-VjXslmarVoE",
    "_score" : 1.0,
    "_source" : {
      "product_id" : 44148
    }
  },
  {
    "_index" : "products",
    "_type" : "_doc",
    "_id" : "DAE92HUB-VjXslma9Fw4",
    "_score" : 1.0,
    "_source" : {
      "product_id" : 50521
    }
  },
  {
    "_index" : "products",
    "_type" : "_doc",
    "_id" : "vwE02HUB-VjXslmarjHh",
    "_score" : 1.0,
    "_source" : {
      "product_id" : 85363
    }
  },
  {
    "_index" : "products",
    "_type" : "_doc",
    "_id" : "HwE42HUB-VjXslmaA0JC",
    "_score" : 1.0,
    "_source" : {
      "product_id" : 85758
    }
  },
  {
    "_index" : "products",
    "_type" : "_doc",
    "_id" : "SQFd2HUB-VjXslmaK-CN",
    "_score" : 1.0,
    "_source" : {
      "product_id" : 16524
    }
  }
]

I tried this query

 GET products/_search?filter_path=hits.hits._source
    {
    "_source": [
        "product_id"
        ], 
    "query": {
        "bool": {
            "must": [
            {
                "term": {
                "product_to_show": {
                    "value": 1
                }
                }
            },
            {
                "terms": {
                "product_id": [
                104477,
                104474,
                104473,
                104472,
                104471,
                104470,
                104468,
                104467,
                104466,
                104465,
                104464,
                104463,
                104462,
                104460,
                104459,
                104457,
                104456,
                104454,
                104453,
                104452
                ]
                }
            }
            ]
        }
    }
    }

I get the following result

    {
    "hits" : {
        "hits" : [
        {
            "_source" : {
            "product_id" : 104472
            }
        },
        {
            "_source" : {
            "product_id" : 104471
            }
        },
        {
            "_source" : {
            "product_id" : 104464
            }
        },
        {
            "_source" : {
            "product_id" : 104459
            }
        },
        {
            "_source" : {
            "product_id" : 104468
            }
        },
        {
            "_source" : {
            "product_id" : 104474
            }
        },
        {
            "_source" : {
            "product_id" : 104477
            }
        },
        {
            "_source" : {
            "product_id" : 104456
            }
        },
        {
            "_source" : {
            "product_id" : 104454
            }
        },
        {
            "_source" : {
            "product_id" : 104460
            }
        }
        ]
    }
    }

I expected the missing ids to be returned. For example, id 104473 from the array of ids in this query is missing.

Looks like you are using must where I used must_not. Does the below work for you?

POST products/_search
{
  "query" : {
    "bool" : {
      "must_not" : {
        "terms" : {
          "product_id" : [104477, 104474, 104473, 104472, 104471, 104470]
        }
      }
    }
  }
}

must_not will return all documents whose ids are not in the array. The ids we need to find have no document at all, so they are simply ignored.

So must_not will not work in this case.

How can we use a script to check whether the returned ids exist in the given array? If an id does not exist we could return true, and the result would contain only those ids which are not found (have no document).

Do you have any idea?
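
One possible approach, offered as a sketch and not as something confirmed in this thread: a search can only return documents that exist, so the missing ids may have to be computed on the client. Run the terms query from above, collect the product_id values that come back, and subtract them from the list you sent. The helper name missing_ids and its field parameter are hypothetical; the response is assumed to have the hits.hits._source shape shown earlier. Also note that _search returns only 10 hits by default, so the request would need "size" set to at least the number of ids being checked, otherwise ids that do exist could be reported as missing.

```python
def missing_ids(requested, response, field="product_id"):
    """Return the requested ids that have no matching hit in the response.

    `response` is the parsed JSON body of a _search call. With filter_path
    the body can be an empty object when nothing matches, hence the
    defensive .get() calls.
    """
    hits = response.get("hits", {}).get("hits", [])
    # Ids that Elasticsearch actually returned a document for.
    found = {hit["_source"][field] for hit in hits}
    # Preserve the order of the original request list.
    return [i for i in requested if i not in found]


if __name__ == "__main__":
    # Small subset of the ids from the thread, for illustration only.
    requested = [104477, 104474, 104473, 104472]
    response = {
        "hits": {
            "hits": [
                {"_source": {"product_id": 104472}},
                {"_source": {"product_id": 104474}},
                {"_source": {"product_id": 104477}},
            ]
        }
    }
    print(missing_ids(requested, response))
```

The set lookup keeps the comparison cheap even for long id lists, and the function does not depend on any particular Elasticsearch client, only on the parsed response body.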
