I'm in the process of indexing some forums via the Java API. What's
the recommended way to re-index individual documents? Could someone
explain the steps necessary for doing this? Even better, if there's an
example using the Java API showing this common use case, I'd be
interested in seeing that too.
First, how do you know which documents you need to reindex? Also, do you
want to reindex them based on the data stored in Elasticsearch itself, or
based on "outside" data?
If you have the docs that need to be reindexed in a database, for example,
just fetch them from it and index those docs again.
If it's simply based on their IDs and you reindex from Elasticsearch, you can
use the Get / Multi Get API to fetch the docs, change them, and index
them again (you can use versioning to make sure no other updates have happened
in between).
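
To make the get / change / index-with-version idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the Elasticsearch client: VersionedStore, indexIfVersion and ReindexSketch are hypothetical names for an in-memory stand-in, used only to show how a version check rejects a write when another update happened in between.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical in-memory stand-in for a versioned document store.
// It is NOT the Elasticsearch client; it only mimics the version check.
class VersionedStore {
    // A stored document: its source text plus a version that grows on every write.
    static final class Doc {
        final String source;
        final long version;
        Doc(String source, long version) { this.source = source; this.version = version; }
    }

    private final Map<String, Doc> docs = new HashMap<>();

    // Unconditional index: creates the doc at version 1, or overwrites it and bumps the version.
    synchronized long index(String id, String source) {
        Doc old = docs.get(id);
        long next = (old == null) ? 1 : old.version + 1;
        docs.put(id, new Doc(source, next));
        return next;
    }

    // Get by ID; returns null when the document does not exist.
    synchronized Doc get(String id) {
        return docs.get(id);
    }

    // Conditional index: succeeds only if the stored version still equals expectedVersion.
    synchronized boolean indexIfVersion(String id, String source, long expectedVersion) {
        Doc current = docs.get(id);
        if (current == null || current.version != expectedVersion) {
            return false; // someone else updated (or removed) the doc in between
        }
        docs.put(id, new Doc(source, expectedVersion + 1));
        return true;
    }
}

class ReindexSketch {
    // Fetch the doc, apply the change, and write it back; retry on a version conflict.
    static String reindex(VersionedStore store, String id, UnaryOperator<String> change) {
        while (true) {
            VersionedStore.Doc doc = store.get(id);
            if (doc == null) {
                return null; // nothing to reindex
            }
            String updated = change.apply(doc.source);
            if (store.indexIfVersion(id, updated, doc.version)) {
                return updated;
            }
            // Version moved on: loop, re-read the latest doc and try again.
        }
    }
}
```

With a real cluster the same loop would wrap a get request and a versioned index request; the retry-on-conflict shape is the part this sketch is meant to show.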
Thanks for the reply. I'm planning on firing events whenever a forum
thread changes. An event listener will simply refetch the data for
that thread from the DB and reindex. It seems easier to get the
existing ES doc (indexed by threadID), delete it (if it exists) and
re-index a new doc with the changed data. Is this acceptable, and if so,
are there any additional steps I'm missing, such as refreshing the index?
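
The listener flow described above could look roughly like the sketch below. It is only an illustration of the delete-then-reindex steps keyed by thread ID: ThreadRepository, InMemoryIndex and ThreadChangedListener are hypothetical names, the index is an in-memory map rather than the Elasticsearch client, and refreshing is not modeled at all.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical source of truth, e.g. the forum database.
interface ThreadRepository {
    Optional<String> loadThread(String threadId);
}

// Hypothetical search index keyed by thread ID; a stand-in, not the Elasticsearch client.
class InMemoryIndex {
    private final Map<String, String> docs = new HashMap<>();

    // Removes the doc if present; returns true when something was deleted.
    boolean delete(String id) {
        return docs.remove(id) != null;
    }

    // Stores the doc under the given ID.
    void index(String id, String source) {
        docs.put(id, source);
    }

    Optional<String> get(String id) {
        return Optional.ofNullable(docs.get(id));
    }
}

// Reacts to "thread changed" events by rebuilding that thread's doc from the DB.
class ThreadChangedListener {
    private final ThreadRepository repository;
    private final InMemoryIndex index;

    ThreadChangedListener(ThreadRepository repository, InMemoryIndex index) {
        this.repository = repository;
        this.index = index;
    }

    void onThreadChanged(String threadId) {
        // Step 1: drop the existing doc for this thread, if there is one.
        index.delete(threadId);
        // Step 2: refetch from the DB and index the fresh data.
        // If the thread no longer exists in the DB, the doc simply stays deleted.
        repository.loadThread(threadId)
                  .ifPresent(source -> index.index(threadId, source));
    }
}
```

Because the thread ID doubles as the document ID, each event touches exactly one document, and a thread deleted from the DB ends up removed from the index as well.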