Thanks, Jörg, for the guidance. I am trying the suggested approach #1
and I have a further question on it.
As you mentioned: "a custom written tool could traverse the segments
and extract field information and build a rudimentary mapping (without
analyzer, without info about _all and _source and all Elasticsearch
add-ons)".
We already have the Lucene index metadata (field names, types, analyzers,
etc.) available as XML, so I can create the mapping without traversing
the segments. Should I create the segment file "segments.gen" using the mapping
file and some dummy values, and then copy in all the other files
(except "segments.gen") from the existing Lucene index
(e.g. segments_2, _0.cfe, _0.cfs, _0.si, _1.cfe, _1.cfs, etc.)?
Sample mapping XML file:
true, Standard, AddressLine1, AddressLine1, true, string
true, Standard, Building_Name, Building_Name, true, string
true, Keyword, GNAF_PID, GNAF_PID, true, string
...
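For illustration, here is a minimal sketch of how such metadata might be turned into an Elasticsearch 1.x mapping. The `fields` list, the `build_mapping` helper, the type name `doc`, and the translation of the analyzer labels "Standard" and "Keyword" to the built-in `standard` and `keyword` analyzers are all assumptions; the boolean flags from the XML are left out because their meaning is not shown here.

```python
import json

# Hypothetical field metadata, as it might be read from the XML file.
# Only name, type and analyzer are used in this sketch.
fields = [
    {"name": "AddressLine1", "type": "string", "analyzer": "Standard"},
    {"name": "Building_Name", "type": "string", "analyzer": "Standard"},
    {"name": "GNAF_PID", "type": "string", "analyzer": "Keyword"},
]

# Assumed translation from the metadata's analyzer labels to the names of
# Elasticsearch's built-in analyzers.
ANALYZER_NAMES = {"Standard": "standard", "Keyword": "keyword"}


def build_mapping(fields, doc_type="doc"):
    """Return an index-creation body holding one mapping for doc_type."""
    properties = {}
    for field in fields:
        prop = {"type": field["type"]}
        analyzer = ANALYZER_NAMES.get(field.get("analyzer"))
        if analyzer is not None:
            prop["analyzer"] = analyzer
        properties[field["name"]] = prop
    return {"mappings": {doc_type: {"properties": properties}}}


print(json.dumps(build_mapping(fields), indent=2))
```

The printed JSON could be used as the request body when creating the index. This only produces the mapping; it does not by itself make the existing segment files usable by Elasticsearch.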
Thanks
On Thu, Nov 13, 2014 at 11:59 PM, joergprante@gmail.com <joergprante@gmail.com> wrote:
It is almost impossible to use just a binary-only Lucene index for
migration, because Elasticsearch needs additional info which is not
available in Lucene. The only method is to reindex the data over the
Elasticsearch API.
There is a bumpy road, but I don't know if anyone has ever tried it:
- a custom written tool could traverse the segments and extract field
  information and build a rudimentary mapping (without analyzer, without info
  about _all and _source and all Elasticsearch add-ons)
- another tool could try to reconstruct docs (like the tool Luke) and
  write them to a file in bulk format. Not having the source of the docs
  means it must be possible to retrieve the original input from the Lucene
  index (which is almost never the case)
- the result could be re-indexed using the Elasticsearch API (assuming all
  analyzers and tokenizers are in place) but a lot of work would have to be
  done
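The "file in bulk format" step could be sketched roughly as follows, assuming the documents have already been reconstructed into plain dictionaries (the hard part, which this does not attempt). The `to_bulk` helper, the index name `addresses`, the type name `doc`, and the sample values are hypothetical; the output is the newline-delimited format accepted by the Elasticsearch `_bulk` endpoint, one action line followed by one source line per document.

```python
import json


def to_bulk(docs, index, doc_type):
    """Serialize (doc_id, source) pairs into bulk-format text."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical reconstructed documents; the values are made up.
docs = [
    ("1", {"AddressLine1": "1 Example Street", "Building_Name": "Example House"}),
    ("2", {"AddressLine1": "2 Example Street", "Building_Name": "Sample Tower"}),
]

print(to_bulk(docs, index="addresses", doc_type="doc"), end="")
```

Saved to a file such as docs.bulk, the output could then be posted with something like `curl -XPOST localhost:9200/_bulk --data-binary @docs.bulk`.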
The preferred way is to rewrite the code that uses the Lucene API to use
the Elasticsearch API and re-run the indexing process.
Jörg
On Thu, Nov 13, 2014 at 7:11 PM, Gaurav gupta <gupta.gaurav0125@gmail.com> wrote:
Hi All,
I have an embedded search engine in our product which is based on Lucene
4.8.1, and I would now like to migrate it to the latest Elasticsearch 1.4 for
better distributed support (mainly sharding and replication). Could you
guide me on how to migrate the existing indexes created by Lucene to
ES?
I have referred to the mail thread "migrate lucene index into
elasticsearch":
https://groups.google.com/forum/#!searchin/elasticsearch/migrating/elasticsearch/xCE7124eAL8/ZFluLXqO_IcJ
Based on the discussion in it, it appears to me that this is not an easy job,
or may not even be feasible. I am wondering if there is some plugin (river),
tool, or workaround available to migrate the existing indexes
created by Lucene to ES.
I found via Google that an ES plugin is available for Solr to ES migration:
Trifork Blog - Keep updated on the technical solutions Trifork is working on!
Do we have something similar for Lucene to ES migration?
Thanks
Gaurav
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/71c0ed2e-94d7-4b70-b581-2515856fd938%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/71c0ed2e-94d7-4b70-b581-2515856fd938%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.