Inserting a document that already exists. Exception?


(javi) #1

Hi

I need to add documents to an Elasticsearch index I already have. I want to check whether the document I am adding already exists in this index, and if that's the case, not add it.

Does Elasticsearch handle that scenario by returning an exception? If not, how can I validate this?

I have tried using the GET API, but you need to send the ID as a parameter. I want to search the whole index, not look up a specific ID.

Thanks


(David Pilato) #2

I guess this is what you want: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#operation-type


(Thomas Dasch) #3

It looks like op_type requires the id?


(David Pilato) #4

Ha, I missed that part. Sorry.

Then there is no other way than making 2 calls.

The first to search if the document exists:

GET index/_search
{
  "size": 0,
  "query": ...
}

The second to actually index the document if no match exists.

The problem is that the document can be added by someone else in the meantime, between the two calls.
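The two calls could be wired together roughly as follows. This is only a sketch: `send(method, path, body)` is a hypothetical transport callable (not part of any Elasticsearch client) that performs the HTTP request and returns the decoded JSON response, and the `_doc` type segment in the indexing path is an assumption that depends on your Elasticsearch version and mapping.

```python
def index_if_absent(send, index, query, doc, doc_type="_doc"):
    """Search first, then index only when the search finds nothing.

    `send(method, path, body)` is a hypothetical transport callable that
    performs the HTTP request and returns the decoded JSON response.
    """
    # Call 1: size 0 because only the hit count matters, not the documents.
    result = send("GET", f"/{index}/_search", {"size": 0, "query": query})
    total = result["hits"]["total"]
    # Older versions return an integer here, newer ones return an object
    # of the form {"value": n, "relation": ...}.
    if isinstance(total, dict):
        total = total["value"]
    if total > 0:
        return None  # a matching document already exists, skip indexing
    # Call 2: not atomic with call 1, so a concurrent writer can still add
    # the same document between the two requests.
    return send("POST", f"/{index}/{doc_type}", doc)
```

The function returns `None` when a match was found and the indexing response otherwise. It does nothing to close the gap between the two requests.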

Another solution would be to compute a fingerprint on whatever fields are important to you and use that as the _id.
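A minimal sketch of the fingerprint idea, assuming SHA-256 over a canonical JSON serialization of the chosen fields (the hash function, the helper names `fingerprint` and `create_request`, and the `_doc` path segment are illustrative choices, not something prescribed in this thread). Combined with `op_type=create`, Elasticsearch should reject the second write of the same content with a version conflict (HTTP 409) instead of overwriting it.

```python
import hashlib
import json


def fingerprint(doc, fields):
    """Return a stable hex digest of the selected fields of `doc`."""
    subset = {name: doc.get(name) for name in fields}
    # sort_keys and fixed separators make the serialization independent
    # of the order in which the keys were inserted.
    canonical = json.dumps(
        subset, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def create_request(index, doc, fields, doc_type="_doc"):
    """Build (method, path, body) for a create-only index request."""
    doc_id = fingerprint(doc, fields)
    # op_type=create asks Elasticsearch to fail if this _id already exists.
    return "PUT", f"/{index}/{doc_type}/{doc_id}?op_type=create", doc
```

For example, `create_request("test", {"name": "javi"}, ["name"])` produces the same path every time it is given the same `name`, so the duplicate check happens inside a single request and the race between search and index goes away.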


(javi) #5

Thanks for the answer

My problem is that the id is just a number I am generating at insertion time (1, 2, 3...).

What I would like to do is search an index to see whether a document exists in it, without using the id, as I don't know the id of the existing document.

Imagine I have {"name": "javi"} and it has been indexed with id 4847 in the index "test".

How can I check if {"name": "javi"} exists in "test"?


(David Pilato) #6

As I said:

  • run a search to see if there is a match

Or

  • generate a hash based on this payload and use it as an id

(Thomas Dasch) #7

Javi,

Like Mr. Pilato said, you can run a query to search. Let's use your example in the query:

GET test/_search
{
  "query": {
    "match": {
      "name": "javi"
    }
  }
}

If the doc is present, you'll get a hit and the associated id. See the example below:

{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "gFO3BGQBHKIRaf1Qrs4z",
        "_score": 0.2876821,
        "_source": {
          "name": "javi"
        }
      }
    ]
  }
}

The line you are looking for is "_id": "gFO3BGQBHKIRaf1Qrs4z", if it exists.
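If you are reading that response from code rather than by eye, pulling out the ids might look like the sketch below. The helper name `matching_ids` is made up for illustration, and the response dict is just the example above trimmed down. Keep in mind that a search returns only the first page of hits by default, and that a `match` query on an analyzed text field can also return documents that merely contain the term.

```python
def matching_ids(response):
    """Return the _id of every hit in a search response (empty list if none)."""
    return [hit["_id"] for hit in response.get("hits", {}).get("hits", [])]


# The example response from above, reduced to the parts that matter here.
response = {
    "hits": {
        "total": 1,
        "hits": [
            {
                "_index": "test",
                "_id": "gFO3BGQBHKIRaf1Qrs4z",
                "_source": {"name": "javi"},
            }
        ],
    }
}

ids = matching_ids(response)
print(ids)  # ['gFO3BGQBHKIRaf1Qrs4z']
```

An empty list means no document matched, so it is safe to index a new one (subject to the timing caveat mentioned earlier in the thread).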

