_count differs (same index, different number of shards)

Hi,

Does anybody know how a different number of primary shards can affect the
number of docs (_count) in the same index?

I have a read-only index OLD, took a snapshot of it (ES 1.2) and restored
it as NEW. Both have the same number of shards and docs.
Then I restored another index, NEW_10, with fewer primary shards, and I'm
getting a lower document count in it.
No matter how many times I retest the snapshot and restore, if I don't edit
the number of shards I end up with the correct count, so I wonder what is
messing up my case.
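
For reference, the snapshot/restore calls look roughly like this (the
repository and snapshot names are placeholders; mine is an already-registered
hdfs_repository one):

# snapshot the read-only index; wait_for_completion blocks until it finishes
curl -XPUT 'localhost:9200/_snapshot/my_backup/snap_1?wait_for_completion=true' -d '{"indices": "OLD"}'

# restore it under a different name using the 1.x rename options
curl -XPOST 'localhost:9200/_snapshot/my_backup/snap_1/_restore' -d '{"indices": "OLD", "rename_pattern": "OLD", "rename_replacement": "NEW"}'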

[:~] curl localhost:9200/_cat/indices/OLD,NEW,NEW_10?v
health index   pri rep docs.count docs.deleted store.size pri.store.size
green  OLD      20   2    1333718        78639     15.7gb          5.2gb
green  NEW      20   1    1333480        78624     10.5gb          5.2gb
green  NEW_10   10   1     666064        42696      5.2gb          2.6gb
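
The _count API tells the same story (docs.count above already excludes
deleted docs):

[:~] curl localhost:9200/NEW/_count      # count should match docs.count, ~1333480
[:~] curl localhost:9200/NEW_10/_count   # comes back around 666064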

I tried some simple math, (docs.count - docs.deleted) / pri, but the numbers
don't line up.
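
What does stand out is that NEW_10 ended up with almost exactly half of
NEW's docs:

NEW:    1333480 docs on 20 primaries
NEW_10:  666064 docs on 10 primaries   (1333480 / 2 = 666740, so roughly half)

as if only half of the shards made it into the restored index.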

Thank you,

...some more clarification: NEW and NEW_10 are restored from the same
snapshot taken from OLD.
What differs is that before restoring NEW_10, I manually edit the number of
shards in the mapping inside the snapshot (hdfs_repository plugin).

Something I don't get: you are changing the number of shards of an index?
That's something you shouldn't be able to do, IMHO.
You cannot restore a full backup into fewer shards.

That's probably why you are seeing fewer docs in your new index: you
restored only half of the shards, with what I would call a "hack".
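
As far as I know, ES 1.x routes every document with

shard = hash(routing value, _id by default) % number_of_primary_shards

so even the docs that did land in NEW_10 can now sit on the wrong shard for
their id, and a GET by id may miss them. The only supported way to change
the primary count is to reindex.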

David

Yup, that's what I meant: I was editing the number of shards in the backup.
I thought it might work that way, with the data ending up the same size,
just in fewer shards. Thanks for your opinion.

I need to change the number of shards, but that isn't editable on an
existing index. Re-indexing will take too long (3 days), so I'm looking for
a way to end up with the new shard count by backing up the old index and
restoring it into a modified one.
I tried Elasticsearch-Exporter
(https://github.com/mallocator/Elasticsearch-Exporter), a small script to
export data from one Elasticsearch cluster into another, but that one takes
a long time too (1 day). Snapshot and restore does the job in about 1h.
I have 40 million docs, ~100G.
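
In case it helps anyone later, the fallback I'm left with is scan/scroll
plus _bulk into the resized index. A rough sketch (the step that turns each
batch of hits into bulk actions is left out):

# open a scan over the source index (ES 1.x: search_type=scan skips scoring/sorting)
curl -XGET 'localhost:9200/OLD/_search?search_type=scan&scroll=5m&size=1000' -d '{"query": {"match_all": {}}}'

# the response carries a _scroll_id; keep pulling batches until hits come back empty
curl -XGET 'localhost:9200/_search/scroll?scroll=5m' -d 'SCROLL_ID_FROM_PREVIOUS_RESPONSE'

# each batch, rewritten as index actions, goes into the 10-shard target
curl -XPOST 'localhost:9200/NEW_10/_bulk' --data-binary @batch_as_bulk_actions.json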
