Deleting field from mapping


(AlexC) #1

Just wondering if the API supports deleting a field from a mapping.
When a mapping is updated, the new definition is merged with the existing
one, which makes me believe there is no support for deleting an existing
field.

But if I really need this functionality, is there a better way than
creating the type under a new index and reindexing all the documents?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ed427ced-1732-444f-b8f3-763762293248%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(David Pilato) #2

No. You need to reindex.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
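
For readers wondering what "reindex" means in practice here: a minimal sketch, assuming the mapping and the documents are available as plain Python dictionaries. The field names (`title`, `legacy_code`) and both helper functions are hypothetical and are not part of any Elasticsearch client. The idea is to build the new mapping without the unwanted field and to strip that field from each document's source before writing it to the new index.

```python
import copy


def mapping_without_field(mapping, field):
    """Return a copy of a type mapping with one top-level field removed."""
    new_mapping = copy.deepcopy(mapping)
    new_mapping.get("properties", {}).pop(field, None)
    return new_mapping


def source_without_field(source, field):
    """Return a copy of a document source without the removed field."""
    return {key: value for key, value in source.items() if key != field}


# Hypothetical mapping and document, for illustration only.
old_mapping = {
    "properties": {
        "title": {"type": "string"},
        "legacy_code": {"type": "string"},
    }
}
new_mapping = mapping_without_field(old_mapping, "legacy_code")

doc = {"title": "hello", "legacy_code": "X1"}
clean_doc = source_without_field(doc, "legacy_code")
```

The old mapping is left untouched, so the same helpers can be used to compare the two definitions before creating the new index.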



(AlexC) #3

Fair enough - I assume the reindex operation will use the scan/scroll API
or, to make it easier, the reindex plugin written by karussell.

I haven't been able to figure out what happens to documents that are
updated, deleted, or created after the scan process is initiated.

alex
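
The copy loop behind a scan/scroll reindex can be sketched roughly as follows. This is only an outline under assumptions: `fetch_page` and `bulk_index` are hypothetical callables standing in for the scroll and bulk requests of whatever client is used, and `fetch_page` is assumed to return an empty list of hits only once the scroll is exhausted.

```python
def reindex(fetch_page, bulk_index, transform=lambda source: source):
    """Copy every hit from the old index into the new one, page by page.

    fetch_page(scroll_id) returns (next_scroll_id, hits); an empty hits
    list ends the loop. bulk_index(pairs) writes (doc_id, source) pairs.
    """
    copied = 0
    scroll_id = None
    while True:
        scroll_id, hits = fetch_page(scroll_id)
        if not hits:
            break
        bulk_index([(hit["_id"], transform(hit["_source"])) for hit in hits])
        copied += len(hits)
    return copied


# In-memory stand-ins so the loop can be exercised without a cluster.
pages = [
    [{"_id": "1", "_source": {"a": 1}}, {"_id": "2", "_source": {"a": 2}}],
    [{"_id": "3", "_source": {"a": 3}}],
]


def fake_fetch_page(scroll_id):
    position = 0 if scroll_id is None else scroll_id
    hits = pages[position] if position < len(pages) else []
    return position + 1, hits


new_index = {}
copied = reindex(fake_fetch_page, new_index.update)
```

The `transform` hook is where a removed field would be stripped from each source before it is written.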



(David Pilato) #4

A scan basically reads a fixed set of segments, which are kept around until the scan is over.
New documents, deletions, and updates won't be part of the scan because they are written to new segments.

My 2 cents

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
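
A toy model may make this easier to picture. It is not how Lucene or Elasticsearch is implemented; `ToyIndex` and its methods are made up for illustration, and the only behaviour it mimics is the one described above: the scan pins the segments that exist when it starts, and later writes land in new segments it never sees.

```python
class ToyIndex:
    """Toy model: an index is a list of immutable segments of doc ids."""

    def __init__(self):
        self.segments = []

    def add_segment(self, doc_ids):
        self.segments.append(tuple(doc_ids))

    def open_scan(self):
        # The scan keeps a reference to the segments that exist right now.
        return tuple(self.segments)


def scanned_ids(pinned_segments):
    """List every doc id visible through a set of pinned segments."""
    return [doc_id for segment in pinned_segments for doc_id in segment]


index = ToyIndex()
index.add_segment(["a", "b"])
scan = index.open_scan()
index.add_segment(["c"])  # written after the scan started

seen_by_scan = scanned_ids(scan)
seen_by_new_scan = scanned_ids(index.open_scan())
```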



(AlexC) #5

That would make sense.

A new document will be at the end, so it will be included in the scan.

An updated document will be at the end too - assuming the update happened
after the scan processed the document, it will be picked up again at the
end and the old version of the document already indexed in the new index
will be rewritten with the new version.

But a document delete which happens after the scan processed the document
(and while it's still running) will not be 'replicated' into the new index -
at least I don't see how it would be possible. The delete operation works
by marking the document in the old index as deleted, by using an extra bit
set.

alex



(David Pilato) #6

No, a new document will belong to a new segment, so it won't be returned when you scroll.
The same goes for a delete.
Behind the scenes, an update is a delete plus a new document, so the same applies.

Does that make sense?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
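
The three cases can be illustrated with a small simulation. It is a deliberately simplified model, assuming one document per segment and a set of deletion marks; `ToyShard` and its methods are hypothetical names, not an Elasticsearch or Lucene API. A snapshot taken before the writes keeps showing the old state, while a snapshot taken afterwards shows the update, the delete, and the new document.

```python
class ToyShard:
    """Toy model: append-only segments plus a set of deletion marks."""

    def __init__(self):
        self.segments = []    # each segment maps doc_id -> source
        self.deleted = set()  # (segment_number, doc_id) pairs marked deleted

    def delete(self, doc_id):
        # Deleting only marks existing copies; segments are never rewritten.
        for number, segment in enumerate(self.segments):
            if doc_id in segment:
                self.deleted.add((number, doc_id))

    def index(self, doc_id, source):
        # An update is a delete of the old copy plus a new document.
        self.delete(doc_id)
        self.segments.append({doc_id: source})

    def snapshot(self):
        # A reader sees the segments and deletion marks as of this moment.
        return len(self.segments), frozenset(self.deleted)

    def read(self, snapshot):
        segment_count, deleted = snapshot
        docs = {}
        for number in range(segment_count):
            for doc_id, source in self.segments[number].items():
                if (number, doc_id) not in deleted:
                    docs[doc_id] = source
        return docs


shard = ToyShard()
shard.index("1", {"v": 1})
shard.index("2", {"v": 1})
scan = shard.snapshot()       # the scan starts here

shard.index("1", {"v": 2})    # update
shard.delete("2")             # delete
shard.index("3", {"v": 1})    # new document

seen_by_scan = shard.read(scan)
seen_now = shard.read(shard.snapshot())
```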



(AlexC) #7

Yes, David, it makes sense. I have done some testing myself with a scan
search and I saw that it basically puts the existing segments in read-only
mode, so all new changes go into new ones.
But now I am more confused about how scan and scroll will help me reindex.
How do I get access to the new segments which are being created while the
scan search is active?

alex



(David Pilato) #8

I guess it would be easier if you had a quiet cluster while moving your data to a new index.

If not, and if your documents have, for example, a modification date field or something else that helps identify newly created or updated documents, you could run a query based on it and scroll through that query to extract all the results. But it won't work for deletions.
Maybe an alias on top of oldindex and newindex could help while scrolling the second time. I mean that all new index/create and delete operations should be sent by the client to newindex, while older updates should be fetched from oldindex.

As I said, it would be much easier to do this on a quiet cluster.
If you still have your source data around (database, filesystem, whatever), maybe simply reindexing from the source is the easiest option?

Do other readers have another idea?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
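
As a rough illustration of the two request bodies this suggestion involves, here is a sketch that only builds the JSON-style dictionaries and sends nothing. The field name `modified_at` and the alias name `docs` are assumptions made up for the example; `oldindex` and `newindex` are the names used above. The first body is a range query on the modification date, the second is an alias update that puts one alias on top of both indices.

```python
def modified_since_query(field, since):
    """Search body selecting documents whose `field` is at or after `since`."""
    return {"query": {"range": {field: {"gte": since}}}}


def alias_over(alias, indices):
    """Alias update body that adds `alias` on top of every given index."""
    return {
        "actions": [
            {"add": {"index": index, "alias": alias}} for index in indices
        ]
    }


# Hypothetical field name, timestamp, and alias name.
catch_up_body = modified_since_query("modified_at", "2013-12-03T21:00:00")
alias_body = alias_over("docs", ["oldindex", "newindex"])
```

As noted above, a query like this can find created and updated documents but has no way to report documents that were deleted in the meantime.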



(AlexC) #9

The strategy I have in mind for reindexing the data in a very active
cluster is about the same as the one you described, but with a few
differences:

  1. Create the new mapping in the same index.
    Since the refresh interval is an index-level parameter, I cannot disable
    the refresh just for this new mapping (in order to speed up the index
    operations), as that would affect the other mappings.

  2. Scan search the objects from the old mapping and migrate them into the
    new mapping.
    Updates will still go into the old mapping at this point. *
    Deletes will have to be queued up. **

  3. Once the initial migration is complete, update the alias to point to the
    new mapping.

  4. Scan search the objects from the old mapping which have been
    created/updated while phase 2 was running ***, and migrate them into the
    new mapping ****. I need this phase in order to prevent costly document
    migrations in phase 2 (see *).
    Deletes will have to be saved. **

  5. Replay the deletes queued up while steps 1 through 4 were being executed.

* If an update went into the new mapping, it might be overwritten by an
older version of the document during the mapping migration. That could be
avoided by skipping the migration of a document if it already exists in
the new mapping, but I would have to verify that for each document to be
migrated, which would have a negative impact on performance.

** Since I cannot detect a delete while scanning the data from the old
mapping, I will have to queue it up and replay it at the end of the whole
migration process. I still haven't figured out how to intercept deletes in a
multi-node cluster.

*** I will have to somehow save the timestamp at which the scan search
locked the segments, and select only the documents with a modified timestamp
greater than or equal to that timestamp. I see a concurrency issue here.

**** Only migrate if the document doesn't exist in the new mapping, or if
the copy there has an older modified timestamp.
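
The phases above can be walked through with an in-memory simulation. This is only a sketch under stated assumptions: both mappings are modelled as plain dictionaries keyed by document id, every document carries a hypothetical numeric `modified` field, the alias switch of phase 3 is not modelled, and the queue of deletes is assumed to be collected by the application, which is exactly the part still unsolved above.

```python
def migrate(old, new, since=None):
    """Copy documents from old to new without replacing a newer copy.

    When `since` is given, only documents modified at or after it are
    considered (the catch-up pass of phase 4).
    """
    migrated = 0
    for doc_id, doc in old.items():
        if since is not None and doc["modified"] < since:
            continue
        existing = new.get(doc_id)
        if existing is None or existing["modified"] < doc["modified"]:
            new[doc_id] = dict(doc)
            migrated += 1
    return migrated


def replay_deletes(new, queued_deletes):
    """Phase 5: apply the deletes that were queued during the migration."""
    for doc_id in queued_deletes:
        new.pop(doc_id, None)


old = {
    "1": {"modified": 10, "v": "a"},
    "2": {"modified": 11, "v": "b"},
    "3": {"modified": 12, "v": "c"},
}
new = {}

scan_started = 20
scan_view = dict(old)  # what the phase 2 scan sees

# Writes that arrive while phase 2 is running.
old["1"] = {"modified": 21, "v": "a2"}  # update
old["4"] = {"modified": 22, "v": "d"}   # create
del old["3"]                            # delete, also queued for replay
queued_deletes = ["3"]

migrate(scan_view, new)                             # phase 2
caught_up = migrate(old, new, since=scan_started)   # phase 4
replay_deletes(new, queued_deletes)                 # phase 5
```

The timestamp comparison in `migrate` corresponds to footnote ****, and it costs one lookup in the new mapping per migrated document, which is the performance concern raised in footnote *.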

On Wed, Dec 4, 2013 at 4:39 AM, David Pilato david@pilato.fr wrote:

I guess it would be easier if you have a quiet cluster while moving your
data to a new index.

If not and if you have for example a modification date in your document
fields or something which could help to identify new creation/update you
could run a query based on this and scroll this query to extract all
results. But for deletion it won't work.
May be using an alias on top of oldindex and newindex could help while
scrolling the second time. I mean that all new index/create and delete
operation should be send from a client level to newindex.
But older updates should be fetched from oldindex.

As I said, it would be much easier to do that on a quiet cluster.
If you have your source data around (database, filesystem, whatever), may
be simply reindexing is the easiest option?

Do other readers have another idea?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfrhttps://twitter.com/elasticsearchfr

Le 3 décembre 2013 at 21:22:08, Alex Cojocaru (acojocaru@pingidentity.com//acojocaru@pingidentity.com)
a écrit:

Yes, David, it makes sense. I have done some testing myself with a scan
search and I saw that it basically puts the existing segments in read-only
mode, so all new changes will go into new ones.
But now I am more confused on how a scan and scroll will help me reindex.
How do I get access to the new segments which are being created while the
scan search is active?

alex

On Tue, Dec 3, 2013 at 4:12 AM, David Pilato david@pilato.fr wrote:

No a new document will belong to a new segment. So it won't be added
when you scroll.
Same for delete.
Update is behind the scene a delete + new document. Same here.

Make sense?
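
The behaviour described here can be pictured with a toy model. This is not how Lucene or Elasticsearch store segments; `ToyIndex` and `open_scroll` are made-up names, and the only point is that a scroll reads a view frozen at the moment it was opened.

```python
class ToyIndex:
    """Toy stand-in for an index; it is not how Lucene stores segments."""

    def __init__(self):
        self.docs = {}

    def index(self, doc_id, source):
        self.docs[doc_id] = source

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)

    def open_scroll(self):
        # Copy the current contents so later writes cannot affect the scroll.
        return iter(list(self.docs.items()))


old = ToyIndex()
old.index("a", {"v": 1})
old.index("b", {"v": 1})

scroll = old.open_scroll()
old.index("c", {"v": 1})  # created after the scroll started
old.index("b", {"v": 2})  # updated after the scroll started
old.delete("a")           # deleted after the scroll started

seen = dict(scroll)       # what a reindex driven by this scroll would copy
```

In this model the scroll still returns `a` and the old version of `b`, and never sees `c`, which is why a second pass (or a quiet cluster) is needed.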

 --

David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr

On 2 December 2013 at 23:14:08, Alex Cojocaru (acojocaru@pingidentity.com)
wrote:

That would make sense.

A new document will be at the end, so it will be included in the scan.

An updated document will be at the end too - assuming the update happened
after the scan processed the document, it will be picked up again at the
end and the old version of the document already indexed in the new index
will be rewritten with the new version.

But a document delete which happens after the scan processed the document
(and while it's still running) will not be 'replicated' into the new index -
at least I don't see how it would be possible. The delete operation works
by marking the document in the old index as deleted, by using an extra bit
set.

alex

On Mon, Dec 2, 2013 at 4:55 PM, David Pilato david@pilato.fr wrote:

You basically scan a bunch of segments which are kept around until the
scan is over.
New documents, deletion, updates won't be part of the scan because they
are written in new segments.

My 2 cents

 --

David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr

On 2 December 2013 at 21:48:16, AlexC (acojocaru@pingidentity.com)
wrote:

Fair enough - I assume the reindex operation will use the scan/scroll
API or, to make it easier, the reindex plugin written by karussell.

I haven't been able to figure out what happens with the document(s)
updated/deleted/created after the scan process is initiated.

alex

On Friday, November 29, 2013 2:55:08 PM UTC-5, David Pilato wrote:

No. You need to reindex.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 Nov 2013 at 20:46, AlexC acoj...@pingidentity.com wrote:

Just wondering if the API supports deleting a field from a mapping.
When a mapping is updated, the new definition is merged with the
existing one, which makes me believe there is no support for deleting an
existing field.

But, if I really really need this functionality, is there a better way
other than creating the type under a new index and reindex all the
documents?
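
For the reindex route, dropping the field comes down to creating the new index with a mapping that omits it and stripping it from each document's source on the way over. A minimal sketch, assuming the hits come from a scroll in the usual `_id` / `_source` shape; `without_field` and `reindex_actions` are hypothetical helpers, and the exact keys a bulk call expects depend on the client version.

```python
def without_field(source: dict, field: str) -> dict:
    """Return a copy of the document source with one top-level field removed."""
    return {key: value for key, value in source.items() if key != field}


def reindex_actions(hits, new_index: str, field: str):
    """Yield one index action per hit, targeting the new index without `field`."""
    for hit in hits:
        yield {
            "_index": new_index,
            "_id": hit["_id"],
            "_source": without_field(hit["_source"], field),
        }


# Example hits, shaped like search results (values are made up).
hits = [{"_id": "1", "_source": {"title": "x", "obsolete": 42}}]
actions = list(reindex_actions(hits, "newindex", "obsolete"))
```

If the field were left in the source, dynamic mapping could add it back to the new index, so both the mapping and the documents need to lose it.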



(system) #10