Breaking Changes in 6.0 - reindex indices from 2.x - Why?

So, along the lines of why this needs to happen: what actually needs to happen? I see that the reindex API is the recommended method, but it looks like a utility that just copies one index to another. I could be wrong, but I don't see how that would fix any problem. I could just as well create a "new" index with a mapping that matches my old one and use a data transfer utility.

So what is the actual problem I am trying to solve here? I have several indices that were created in 2.3 and are now in 5.5. I want to be able to look at them and know where the issue will be.

Thank you for any info/links on this.

Ryan

My guess is that it is because Lucene is guaranteed to be backwards
compatible for only one major version. Elasticsearch 2.3 uses Lucene 5 [1],
whereas Elasticsearch 6 will use Lucene 7 [2].

Reindexing goes through the entire indexing process again, recreating the
Lucene indices. You cannot simply move the indices over. It would be more
efficient to use the Lucene index upgrader tool instead of reindexing, if
it exists in Lucene 7.

[1] https://github.com/elastic/elasticsearch/blob/2.3/pom.xml#L55
[2] https://github.com/elastic/elasticsearch/blob/6.x/buildSrc/version.properties#L3
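
The compatibility reasoning above can be sketched as a small check. This is only an illustration: the version table uses the versions mentioned in this thread plus the assumption that Elasticsearch 5.x ships Lucene 6, and `can_open` is a hypothetical helper, not part of any Elasticsearch or Lucene API.

```python
# Lucene major version bundled with each Elasticsearch major version
# (2 -> 5 and 6 -> 7 come from the thread; 5 -> 6 is an added assumption).
LUCENE_MAJOR = {2: 5, 5: 6, 6: 7}


def can_open(created_es_major: int, running_es_major: int) -> bool:
    """Return True if Lucene's one-major-version guarantee covers this pair."""
    gap = LUCENE_MAJOR[running_es_major] - LUCENE_MAJOR[created_es_major]
    return 0 <= gap <= 1


print(can_open(2, 5))  # True: Lucene 6 can read Lucene 5 segments
print(can_open(2, 6))  # False: Lucene 7 is two major versions ahead
```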


@Ivan - so, for instance, in 2.3 I was using the string data type in a lot of my mappings. Will the reindex API actually change that mapping to text or keyword now?
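
For context on this question: the mapping of the destination index, not the copy itself, determines the field types, so the old field definitions have to be translated somewhere. A minimal sketch of that translation follows, assuming the usual rule that an analyzed `string` becomes `text` and a `not_analyzed` one becomes `keyword`. The helper name `convert_string_field` is hypothetical, and the sketch deliberately refuses other `index` values rather than guessing.

```python
def convert_string_field(field: dict) -> dict:
    """Translate one 2.x-style `string` field definition to a 5.x-style one."""
    if field.get("type") != "string":
        return dict(field)  # other field types are returned unchanged

    index_option = field.get("index")
    # Copy every option except the two that are being rewritten.
    converted = {k: v for k, v in field.items() if k not in ("type", "index")}

    if index_option in (None, "analyzed"):
        converted["type"] = "text"      # full-text field
    elif index_option == "not_analyzed":
        converted["type"] = "keyword"   # exact-value field
    else:
        # e.g. "no": not covered by this sketch, so fail loudly
        raise ValueError(f"unsupported index option: {index_option!r}")
    return converted


print(convert_string_field({"type": "string", "index": "not_analyzed"}))
print(convert_string_field({"type": "string", "analyzer": "english"}))
```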

Based on the documentation for my current version (5.5)
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/docs-reindex.html

I see that the index you are "copying" to has to exist ahead of time, but I do not think any mapping has to be defined.
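
As a rough illustration of what the reindex call looks like, the sketch below only builds the request body for `POST _reindex` and prints it; it does not talk to a cluster. The index names are made up. Note that, per the reindex documentation, the API does not copy settings or mappings from the source, so any mapping you care about should be defined on the destination index before running it.

```python
import json


def build_reindex_body(source_index: str, dest_index: str) -> dict:
    """Build the JSON body for a basic `POST _reindex` request."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }


# Hypothetical index names, for illustration only.
body = build_reindex_body("logs-2016", "logs-2016-v5")
print("POST _reindex")
print(json.dumps(body, indent=2))
```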

@Ivan - are you talking about this upgrader tool?

https://lucene.apache.org/solr/guide/6_6/indexupgrader-tool.html

Correct. Keep in mind that there is more data stored in Elasticsearch than
simply the Lucene indices. The cluster state has additional information as
well such as index settings. I never jumped two Elasticsearch versions. If
you can upgrade first to Elasticsearch 5.x, reopening Lucene indices that
are one version behind and then doing a force merge will update the Lucene
index format. Repeat again for Elasticsearch 6.

Of course, the old indices might have older mapping settings which cannot
be updated.
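
The force merge step mentioned above corresponds to the `_forcemerge` endpoint. The sketch below only assembles the request line so it can be pasted into a console or HTTP client; `forcemerge_request` is a hypothetical helper and the index name is made up. It rewrites segments only and does nothing about mappings or settings held in the cluster state.

```python
from urllib.parse import urlencode


def forcemerge_request(index: str, max_num_segments: int = 1) -> str:
    """Return the request line for force merging an index down to N segments."""
    query = urlencode({"max_num_segments": max_num_segments})
    return f"POST /{index}/_forcemerge?{query}"


print(forcemerge_request("old-logs"))
# POST /old-logs/_forcemerge?max_num_segments=1
```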

@Ivan - I will be upgrading from 5.5 (or 5.6) to 6.0 when the time comes. I am just posting based on the article I referenced in the original post. It says:

Reindex indices from Elasticsearch 2.x or before
Indices created in Elasticsearch 2.x or before will need to be reindexed with Elasticsearch 5.x in order to be readable by Elasticsearch 6.x. The easiest way to reindex old indices is to use the reindex API.

I do have indices that were originally created in 2.3. The process of updating to 5.5 "updated the indices", but I do not know if they were reindexed. Should I assume that, because the database is working fine, this happened in the update?
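
One way to check this rather than assume: each index records the version it was created with in the `index.version.created` setting, and to my knowledge an in-place upgrade does not change that value. The sketch below works on a dictionary shaped like the output of `GET /_settings`; the sample data is invented, the helper names are hypothetical, and the decoding assumes the usual numeric ID layout (for example `2030399` for 2.3.3).

```python
def created_major(version_id: str) -> int:
    """Major version from an `index.version.created` ID such as "2030399"."""
    return int(version_id) // 1_000_000


def needs_reindex_before_6(settings_response: dict) -> dict:
    """Map each index name to True if it was created before 5.0."""
    result = {}
    for name, body in settings_response.items():
        created = body["settings"]["index"]["version"]["created"]
        result[name] = created_major(created) < 5
    return result


# Invented sample shaped like a `GET /_settings` response.
sample = {
    "old-logs": {"settings": {"index": {"version": {"created": "2030399"}}}},
    "new-logs": {"settings": {"index": {"version": {"created": "5050099"}}}},
}
print(needs_reindex_before_6(sample))
# {'old-logs': True, 'new-logs': False}
```

Any index flagged `True` here was created in 2.x or earlier and falls under the breaking-change note quoted above, even if it works fine on 5.5.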

I know that 6.0 is only going to allow one _type per index, so this is another change that I am currently working on. Just trying to get all my ducks in a row for when 6.0 is released.
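
One possible way to prepare for the single-type restriction is to split a multi-type index into one index per type with reindex. The sketch below only builds the request bodies; it assumes the reindex `source` accepts a `type` filter as shown in the removal-of-types documentation, and the index name, type names, and `<index>-<type>` naming scheme are all made up.

```python
def per_type_reindex_bodies(source_index: str, types: list) -> list:
    """Build one `POST _reindex` body per mapping type in the source index."""
    return [
        {
            "source": {"index": source_index, "type": type_name},
            # Destination name is a hypothetical convention: <index>-<type>.
            "dest": {"index": f"{source_index}-{type_name}"},
        }
        for type_name in types
    ]


for body in per_type_reindex_bodies("shop", ["customer", "order"]):
    print(body)
```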

Thank you.

From a Lucene perspective, running a force merge down to one segment will
create a new segment, with the latest version, since segments are immutable
and must be recreated. There must be other reasons why a reindex is
necessary, very likely due to mapping changes. Hopefully I have not wasted
your time and someone with more knowledge, like @dadoonet, will chime in.

@Ivan I don't think you wasted anyone's time, and I appreciate you trying to help. Based on your previous post I am starting to think I may be ok. My older indices were created in 2.3 but have been updated to 5.5. So I think maybe I am ok without a reindex...??

Thanks again,
Ryan
