Indexing the same document twice

Suppose I have a document:

id=1
{
"foo":"bar"
}

What happens if I try to index the same document again? Will it be reindexed? Is that the same as when some field has changed, as in the document below? (I see that the version number grows, so I expect so.)

id=1
{
"foo":"baz"
}

Is there a nice way to check whether I'm about to index a duplicate document?

  1. Use ``` to mark code blocks so discuss will preserve the indentation.
  2. Have a look at version_type, particularly internal vs external. The other version types are really more corner case things. Don't use force. It is evil.
  3. On that same page there is documentation for the _create action which will create documents if they haven't been seen. That'll catch a duplicate document.
  4. Putting id in the document isn't the same as putting it on the URL. The URL is authoritative. The one in the document is just another field.

Thanks for the answer, @nik9000. By adding id to the JSON I meant setting the same internal id.

As I understand it, create won't write the document even if it has changed, and I need changed documents to be written.

The task: given a large stream of documents with comparable numbers of unique ones, duplicates, and ones with some changes:

  1. Index the document if it does not exist.
  2. Update the fields if the new document has changes.
  3. Do nothing if it is a duplicate.

The third item is the one I'm having trouble with. Update with doc_as_upsert gives me 1. and 2., but as I understand it 3. isn't covered: ES does reindex duplicates (?).
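For reference, here is a rough sketch of the kind of request I mean by "update with doc_as_upsert". It only builds the method, path and JSON body and does not talk to a cluster. The helper name `build_upsert_request` and the index/type names are made up for illustration; the path follows the typed `index/type/id/_update` form used on 2.x.

```python
import json


def build_upsert_request(index, doc_type, doc_id, fields):
    """Build (method, path, body) for an Update API call with doc_as_upsert.

    Hypothetical helper: it only assembles the request, it sends nothing.
    """
    path = "/{}/{}/{}/_update".format(index, doc_type, doc_id)
    body = {
        # Partial document: these fields are merged into the stored document.
        "doc": fields,
        # If the id does not exist yet, index "doc" as a new document.
        "doc_as_upsert": True,
    }
    return "POST", path, json.dumps(body, sort_keys=True)


method, path, body = build_upsert_request("test", "test", 1, {"foo": "bar"})
print(method, path)
print(body)
```

The same call covers the "new document" and "changed document" cases; whether an identical document causes a write is the open question.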

Are you looking for detect_noop? I believe it has been on by default since 2.0.

Updates that use a partial document merge the new document into the old one. If that works for you, then I think all you have to do is use a partial document and doc_as_upsert?
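To make "merge the new document into the old document" concrete, here is a small local model of the idea in Python. It is only an illustration, not Elasticsearch's actual code: the function name `merge_partial` is hypothetical, and it assumes that nested objects are merged recursively while every other value (including arrays) is replaced. The returned `changed` flag plays the role of noop detection: if the merge changes nothing, there is nothing to write.

```python
import copy


def merge_partial(existing, partial):
    """Merge `partial` into a copy of `existing`.

    Returns (merged, changed). `changed` is False when the partial document
    would not alter anything, i.e. the update would be a noop.
    """
    merged = copy.deepcopy(existing)
    changed = False
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field.
            sub, sub_changed = merge_partial(merged[key], value)
            merged[key] = sub
            changed = changed or sub_changed
        elif key not in merged or merged[key] != value:
            # New field, or a plain value / array that differs: replace it.
            merged[key] = copy.deepcopy(value)
            changed = True
    return merged, changed


stored = {"foo": "bar", "meta": {"a": 1}}

same, same_changed = merge_partial(stored, {"foo": "bar"})
edited, edited_changed = merge_partial(stored, {"foo": "baz"})
nested, nested_changed = merge_partial(stored, {"meta": {"b": 2}})

print(same_changed, edited_changed, nested_changed)
```

In this model a duplicate leaves the stored document untouched, while a changed or new field produces a new merged document.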

What do you mean by "use a partial document"?

So if detect_noop works, should the version number still grow when indexing the same data? (I'm using 2.3; I will check whether it is enabled, but it should be.)
As I understand it, reindexing is a costly operation. Isn't detecting noops costly too?
Is getting the document and checking it myself worse than using detect_noop?

Update: just tested. I've got 2.3.3.

POST test/test/1/_update
{
  "doc" : {
    "test" : "test"
  }
}

This really doesn't increase the version number. So it writes nothing? And a bazillion duplicate updates won't kill my SSDs?