How to make Elasticsearch not throw DocumentAlreadyExistsException?

Hi all,

I would like Elasticsearch to index only new documents and, for documents
that already exist, to continue quietly without throwing an exception.

Is that possible?

I tried indexing with the "create" operation, but then I get
"DocumentAlreadyExistsException". I would prefer not to get it at all, so
that Elasticsearch does nothing when the document already exists.

I also tried setting "version_type" : external and giving each document the
same version, but then I get "VersionConflictEngineException".
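
For context on this second attempt: as far as I know, with external versioning a write is accepted only when the supplied version is strictly greater than the stored one, so resending a document with the same version is rejected. A minimal Python sketch of that rule, using a plain dict as a stand-in for the index (the `VersionConflict` class and `index_external` function are hypothetical names for illustration, not the Elasticsearch API):

```python
class VersionConflict(Exception):
    """Stand-in for VersionConflictEngineException (illustration only)."""


def index_external(store, doc_id, version, source):
    """Mimic the external-versioning rule on a dict acting as the index.

    The write is accepted only if the document is absent or the supplied
    version is strictly greater than the stored version.
    """
    current = store.get(doc_id)
    if current is not None and version <= current["version"]:
        raise VersionConflict(
            "doc %r: supplied version %d is not greater than stored version %d"
            % (doc_id, version, current["version"])
        )
    store[doc_id] = {"version": version, "source": source}


store = {}
index_external(store, "doc-1", 1, {"title": "first run"})  # new document: accepted
try:
    # Same id and same version as before: rejected, stored doc is unchanged.
    index_external(store, "doc-1", 1, {"title": "second run"})
    conflict = False
except VersionConflict:
    conflict = True
```

So reusing one fixed version does leave the existing document untouched, but it still surfaces an error for every duplicate.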

So my question is: how do I configure Elasticsearch, or its index, to index
only new docs and ignore existing ones?

(I need this because I run some Hadoop jobs and don't want Elasticsearch to
reindex the same doc data after each job execution. I also want to avoid
the extra queries to ES that check whether a doc already exists.)

Thanks
Igor

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a2491598-abc5-4261-94ca-27f6ef6ec175%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hi,
If you use the "create" operation, then when your doc already exists you
need to catch and ignore this exception.
To avoid the exception, I think you could first check whether the document
exists and index it only if it does not.
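
The two options above can be sketched in Python roughly as follows. This is only an illustration: `FakeIndex` is an in-memory stand-in and `DocumentAlreadyExists` a hypothetical exception class, not the real Elasticsearch client API.

```python
class DocumentAlreadyExists(Exception):
    """Stand-in for DocumentAlreadyExistsException (illustration only)."""


class FakeIndex:
    """In-memory stand-in for an index with "create" semantics."""

    def __init__(self):
        self.docs = {}

    def exists(self, doc_id):
        return doc_id in self.docs

    def create(self, doc_id, source):
        # Like the "create" operation: fail if the id is already taken.
        if doc_id in self.docs:
            raise DocumentAlreadyExists(doc_id)
        self.docs[doc_id] = source


def create_ignoring_existing(index, doc_id, source):
    """Option 1: attempt the create and swallow only the 'exists' error."""
    try:
        index.create(doc_id, source)
        return True
    except DocumentAlreadyExists:
        return False


def create_if_absent(index, doc_id, source):
    """Option 2: check first, then create (one extra lookup per document)."""
    if index.exists(doc_id):
        return False
    index.create(doc_id, source)
    return True


index = FakeIndex()
results = [
    create_ignoring_existing(index, "a", {"n": 1}),  # created
    create_ignoring_existing(index, "a", {"n": 2}),  # skipped, error swallowed
    create_if_absent(index, "b", {"n": 3}),          # created
    create_if_absent(index, "b", {"n": 4}),          # skipped by the check
]
```

Against a real cluster, the check-then-create variant costs one extra request per document and can still hit the exception if another writer creates the document between the check and the create.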

(In reply to Igor Romanov, Sunday, March 30, 2014 11:48:34 PM UTC+7)

With "create", I wanted to avoid getting a response full of errors (for
example, errors when creating 5000 docs in one batch).
I also want to save time by not checking each document first. That is an
additional query to ES, when ES performs the same check anyway when it tries
to create the doc, so it is better done in one batch upload.
I guess what I am missing in ES is a CreateOnlyNew operation that would not
throw exceptions for existing documents ... :confused:
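
One way to live with "create" in a batch is to accept that the response lists per-item errors and to separate the expected "already exists" items from real failures on the client side. A minimal Python sketch, assuming each item in the bulk response carries an HTTP-style `status` and that a create against an existing id is reported with status 409. The sample response is hand-written for illustration, not real output, and `summarize_bulk_create` is a hypothetical helper name:

```python
def summarize_bulk_create(response):
    """Split bulk "create" results into created, skipped and failed ids.

    Assumes the response shape {"items": [{"create": {...}}, ...]} where
    each result has "_id" and "status". Status 409 is treated as
    "document already exists" and is skipped rather than reported.
    """
    created, skipped, failed = [], [], []
    for item in response.get("items", []):
        result = item.get("create", {})
        status = result.get("status")
        doc_id = result.get("_id")
        if status in (200, 201):
            created.append(doc_id)
        elif status == 409:
            skipped.append(doc_id)  # already indexed by an earlier job run
        else:
            failed.append((doc_id, result.get("error")))
    return created, skipped, failed


# Hand-written sample shaped like a bulk response (illustration only).
sample_response = {
    "items": [
        {"create": {"_id": "1", "status": 201}},
        {"create": {"_id": "2", "status": 409,
                    "error": "DocumentAlreadyExistsException[...]"}},
        {"create": {"_id": "3", "status": 400,
                    "error": "MapperParsingException[...]"}},
    ]
}

created, skipped, failed = summarize_bulk_create(sample_response)
```

This does not stop Elasticsearch from reporting the conflicts; it only keeps them from being treated as failures by the job, while other errors still surface.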

Maybe there is a way to configure Elasticsearch not to throw a certain
exception to the client?

Igor

(In reply to kidkid, Monday, March 31, 2014 7:33:06 AM UTC+3)

Eventually I solved the issue in a pretty ugly way: I added a new "add"
command to Elasticsearch that does what "create" does but doesn't throw an
exception. The bad thing about it is that for each new Elasticsearch version
I want to use, I will need to merge those changes :confused:

(In reply to Igor Romanov, Monday, March 31, 2014 3:47:04 PM UTC+3)