Document existence check

Consider a scenario where the index contains documents with ids 1, 2, 3, 4, 5, 6, 7. A user saves documents 1, 3, and 7 in their session, and some time later document 3 is removed from the index.

What query can I run against the index to get the list of saved documents that are no longer in the index? I could fire a bulk get, i.e. perform a get on the docs with ids [1,3,7], and write custom logic in my code to see which docs didn't come back. Is there a method/query built into elasticsearch that lets me do that?

I'm using elasticsearch version 2.2.0.

Hi,

Better than a bulk get would be to make use of the Ids Query, something like this:

GET /i/t/_search 
{
  "query": { 
    "ids": {
        "type" : "t",
        "values" :[1,3,7]
    } 
  },
  "fields": []
}

That should be quite fast, and if you only need the ID, the empty fields array makes sure only metadata gets included in the result, so you get:

"hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "i",
        "_type": "t",
        "_id": "1",
        "_score": 1
      },
      {
        "_index": "i",
        "_type": "t",
        "_id": "7",
        "_score": 1
      },
      {
        "_index": "i",
        "_type": "t",
        "_id": "3",
        "_score": 1
      }
    ]
  }

You still need to figure out which ids from your query list are missing from the result. As far as I know, there is no easy built-in way to do that.
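For what it's worth, the client-side diff can be quite small. Here is a minimal Python sketch, assuming the search response has already been parsed into a dict; the helper name missing_ids and the sample response (document 3 already deleted) are made up for illustration. Keep in mind that a search returns only 10 hits by default, so for longer id lists you would probably want to set "size" to at least the number of ids you pass in.

```python
def missing_ids(requested_ids, search_response):
    """Return the requested ids that do not appear in an ids-query response."""
    # _id values come back as strings, so compare on the string form.
    found = {hit["_id"] for hit in search_response["hits"]["hits"]}
    return [i for i in requested_ids if str(i) not in found]


# Hypothetical parsed response, shaped like the one above,
# for the case where document 3 has been removed from the index.
response = {
    "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
            {"_index": "i", "_type": "t", "_id": "1", "_score": 1},
            {"_index": "i", "_type": "t", "_id": "7", "_score": 1},
        ],
    }
}

print(missing_ids([1, 3, 7], response))  # [3]
```

The set lookup keeps this linear in the number of ids, and the order of the requested list is preserved in the output.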

@cbuescher That is currently what I'm doing. I was hoping there might be an inbuilt method based on the new aggregation module. A man has to hope :slight_smile:

@Dale_McDiarmid Is there any inbuilt method that can be used here?

@raunak sorry, but that sounds interesting, will keep digging

It's not really polite to ping people like this that aren't involved in your thread at all :slight_smile: