Elasticsearch deleting some files while indexing?


(IronMike) #1

I am indexing about 5000 documents. When indexing is done, the "head"
plugin says it indexed 4950 docs and deleted 50. I also verified with
curl that only 4950 are indexed. I couldn't see anything in the logs, but
how/when/why does Elasticsearch decide to delete some of the docs?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/b0e4974a-9add-42bc-a0c3-e2aefbae2441%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(David Pilato) #2

Maybe you updated 50 docs (same ID)?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr

On 7 February 2014 at 15:58:15, ZenMaster80 (sabdalla80@gmail.com) wrote:

I am indexing about 5000 documents, when indexing is done, I use "HEAD" plugin, it says it indexed 4950 docs and deleted 50 files, also verified by curl that only 4950 indexed. I couldn't see anything in the logs, but how/when/why does Elasticsearch decide to delete some of the docs?



(IronMike) #3

No, because it is a totally new index. I tried it several times: deleted
the index, then re-created it and indexed again.



(David Pilato) #4

I understood that it's a new index.
That does not mean your insert script has no duplicate IDs.

I would first check my documents.

BTW, Elasticsearch does not delete documents unless you set _ttl.
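
For anyone hitting the same symptom: indexing a document whose ID already exists in the index replaces the earlier one, and the replaced version is what shows up in the deleted count. Below is a minimal sketch for checking a batch for repeated IDs before sending it. It assumes each document is a dict with an `id` key; the field name, the helper names, and the sample data are hypothetical and not taken from the original insert script.

```python
from collections import Counter


def find_duplicate_ids(docs, id_field="id"):
    """Return {id: occurrences} for every ID that appears more than once."""
    counts = Counter(doc[id_field] for doc in docs)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


def expected_counts(docs, id_field="id"):
    """Return (unique_ids, overwritten) for a batch of documents."""
    unique = len({doc[id_field] for doc in docs})
    return unique, len(docs) - unique


# Hypothetical sample: 5 docs, two of which reuse an earlier ID
docs = [{"id": "a"}, {"id": "b"}, {"id": "a"}, {"id": "c"}, {"id": "b"}]

duplicates = find_duplicate_ids(docs)
unique, overwritten = expected_counts(docs)
print(duplicates)
print(unique, overwritten)
```

A batch of 5000 documents in which 50 reuse an ID that appears earlier would come out as 4950 unique and 50 overwritten, which would be consistent with the numbers reported above.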



(IronMike) #5

Sorry, I misunderstood the first time. It makes sense, I will take a look.



(IronMike) #6

Thanks David, I had duplicates like you mentioned.


