Enforce Primary Key in ElasticSearch


Please consider the following scenario:

I have an RDBMS database which loads 100K rows from a CSV file into a database table every 5 minutes.
Once the file data has been inserted into the database, end users can only run select statements (no insert/update/delete is allowed).

Currently I have 5 TB of data.

I would like to move from that RDBMS to Elasticsearch, but I have one problem.
I must enforce a primary key constraint, which is currently built on 3 fields.
This means that I need to prevent old records/documents from being uploaded into Elasticsearch a second time.

  1. Please advise whether it is possible to enforce a primary key as described above.
  2. I am wondering whether the primary key limitation makes Elasticsearch inappropriate for my case, and whether I should stick with the RDBMS or maybe try MongoDB.



While Elasticsearch has no direct notion of a primary key like an RDBMS does, you can probably use the document id for this. You need to find a way to create a unique "id" out of your three current PK fields. You can then use either op_type=create or the _create endpoint of the index API to prevent overwrites of existing documents, like so:

PUT index/type/id?op_type=create
{
    "foo" : "bar"
}

PUT index/type/id/_create
{
    "foo" : "bar"
}

All future attempts to create a document with the same "id" will be rejected with an error (an HTTP 409 conflict) instead of overwriting the existing document. Hope this solves your use case.
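
One way to derive such an "id" from the three PK fields is sketched below in Python. This is only an illustration under assumptions: the field names (customer_id, order_date, line_no) and the index name are hypothetical, and hashing the key is just one option (joining the values with a separator that cannot occur in the data would also work). The second helper builds request lines for the _bulk API with the "create" action, which, like op_type=create, fails for an id that already exists. Older versions that still use mapping types may also need a type in the action metadata.

```python
import hashlib
import json


def make_doc_id(field_a, field_b, field_c):
    """Derive a deterministic document id from the three primary-key values."""
    # Encode the values as a JSON list so field boundaries stay unambiguous:
    # plain concatenation would map ("ab", "c") and ("a", "bc") to the same key.
    key = json.dumps([field_a, field_b, field_c], separators=(",", ":"))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def bulk_create_lines(index, rows):
    """Yield NDJSON lines for a _bulk request that uses the "create" action."""
    for row in rows:
        # Hypothetical column names; replace with the real PK columns.
        doc_id = make_doc_id(row["customer_id"], row["order_date"], row["line_no"])
        # Action line first, then the document source on the next line.
        yield json.dumps({"create": {"_index": index, "_id": doc_id}})
        yield json.dumps(row)


if __name__ == "__main__":
    rows = [{"customer_id": 42, "order_date": "2015-01-01", "line_no": 1, "foo": "bar"}]
    for line in bulk_create_lines("myindex", rows):
        print(line)
```

Because the id depends only on the three key values, reloading the same CSV row always produces the same id, so the create is rejected for that row while the other rows in the batch are still indexed.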

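Hi, you may try two approaches here:

  • Create a separate field, say "id", and before inserting a record run a delete by query on that field, so that any existing copy of the record is deleted first.

  • Alternatively, index each record with "PUT" and pass the actual record id as the document id.

With the second approach, the _id field in ES takes your DB record id, which prevents duplicates: indexing the same id again replaces the stored document rather than adding a second one.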

