Suggestions for reindexing individual documents


(Shane Witbeck) #1

I'm in the process of indexing some forums via the Java API. What's
the recommended way to re-index individual documents? Could someone
explain the steps necessary for doing this? Even better, if there's an
example using the Java API showing this common use case, I'd be
interested in seeing that too.

Thanks,
Shane


(Shay Banon) #2

First, how do you know which documents you need to reindex? Also, do you
want to reindex them based on the data stored in elasticsearch itself, or
based on "outside" data?

If you have the docs that need to be reindexed in a database, for example,
just fetch them from it and index those docs again.

If it's simply based on their IDs and you reindex from elasticsearch itself, you
can use the Get / Multi Get API to fetch the docs, change them, and index
them again (you can use versioning to make sure no other updates have happened
in between).
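
A rough sketch of that get / change / index-again loop with a version check. This does not use the elasticsearch client: `VersionedIndex`, `indexIfVersion` and `getChangeIndex` are hypothetical names for an in-memory stand-in, and I'm assuming the version goes up by one on every write. In a real cluster the check happens on the server, which rejects the write with a version conflict.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical in-memory stand-in for a versioned index (NOT the elasticsearch client). */
final class VersionedIndex {

    /** A stored document: its source plus the version assigned when it was written. */
    static final class Doc {
        final String source;
        final long version;
        Doc(String source, long version) { this.source = source; this.version = version; }
    }

    private final Map<String, Doc> docs = new HashMap<>();

    synchronized Doc get(String id) { return docs.get(id); }

    /** Unconditional index: creates or replaces the doc and bumps its version. */
    synchronized long index(String id, String source) {
        Doc old = docs.get(id);
        long next = (old == null) ? 1 : old.version + 1;
        docs.put(id, new Doc(source, next));
        return next;
    }

    /** Conditional index: only writes if the stored version still equals expectedVersion. */
    synchronized long indexIfVersion(String id, String source, long expectedVersion) {
        Doc old = docs.get(id);
        long current = (old == null) ? 0 : old.version;
        if (current != expectedVersion) {
            throw new IllegalStateException("version conflict on " + id
                    + ": expected " + expectedVersion + ", found " + current);
        }
        docs.put(id, new Doc(source, current + 1));
        return current + 1;
    }

    /** Get the doc, change it, index it again; retry if another writer got in between. */
    static long getChangeIndex(VersionedIndex idx, String id, UnaryOperator<String> change) {
        while (true) {
            Doc current = idx.get(id);
            if (current == null) {
                throw new IllegalArgumentException("no such doc: " + id);
            }
            try {
                return idx.indexIfVersion(id, change.apply(current.source), current.version);
            } catch (IllegalStateException conflict) {
                // Someone else updated the doc first: loop, fetch the fresh copy, try again.
            }
        }
    }

    public static void main(String[] args) {
        VersionedIndex idx = new VersionedIndex();
        idx.index("thread-1", "hello");
        long version = getChangeIndex(idx, "thread-1", s -> s + " world");
        System.out.println(version + " " + idx.get("thread-1").source);
    }
}
```

The retry loop is one possible policy. If a conflict means your change is no longer valid, failing instead of retrying may be the better choice.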

If it's based on a query, you can execute the search query, possibly using
the scan search type (http://www.elasticsearch.org/guide/reference/java-api/search.html),
and reindex the docs that match it.
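
To give an idea of the shape of the query-based variant, here is a minimal sketch that walks the matching docs in batches and indexes each one again under its existing ID. It is only a stand-in: a plain `Map` plays the index, a `Predicate` plays the query, and `QueryReindex`, `scan` and `reindexMatching` are names I made up. A real scan search pages through results on the server rather than collecting all the IDs up front.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for "search with a query, then reindex the hits". */
final class QueryReindex {

    /** Collects the IDs of all matching docs and splits them into batches of batchSize. */
    static List<List<String>> scan(Map<String, String> index, Predicate<String> query, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        for (Map.Entry<String, String> e : index.entrySet()) {
            if (!query.test(e.getValue())) {
                continue; // not a hit for this query
            }
            batch.add(e.getKey());
            if (batch.size() == batchSize) {
                batches.add(batch);
                batch = new ArrayList<>();
            }
        }
        if (!batch.isEmpty()) {
            batches.add(batch);
        }
        return batches;
    }

    /** Rebuilds every matching doc and writes it back under the same ID; returns the count. */
    static int reindexMatching(Map<String, String> index, Predicate<String> query,
                               UnaryOperator<String> rebuild, int batchSize) {
        int count = 0;
        for (List<String> batch : scan(index, query, batchSize)) {
            for (String id : batch) {
                index.put(id, rebuild.apply(index.get(id)));
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, String> index = new LinkedHashMap<>();
        index.put("1", "draft: intro");
        index.put("2", "published: faq");
        int n = reindexMatching(index, s -> s.startsWith("draft"),
                s -> s.replace("draft", "final"), 100);
        System.out.println(n + " " + index);
    }
}
```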

(In reply to Shane Witbeck, Thu, Oct 13, 2011 at 9:55 PM)


(Shane Witbeck) #3

Thanks for the reply. I'm planning on firing events whenever a forum
thread changes. An event listener will simply refetch the data for
that thread from the DB and reindex. It seems easier to get the
existing ES doc (indexed by threadID), delete it (if it exists) and
re-index a new doc with the changed data. Is this acceptable, and if so, are
there any additional steps I'm missing, such as refreshing the index?

Thanks again,
Shane

(In reply to Shay Banon, Oct 14, 8:27 am)


(David Pilato) #4

You don't have to delete the old version of your document. Just index the new one under the same ID and it will replace the old one.
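
Something like this for the listener, as a rough sketch. It is not real client code: the `Map` stands in for the index, `loadThread` is a hypothetical DB lookup, and `ThreadReindexListener` is a made-up name. The point is only that writing under the same ID overwrites, so there is no delete step.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical listener: refetch a changed thread from the DB and index it under the same ID. */
final class ThreadReindexListener {

    private final LongFunction<String> loadThread; // assumed DB lookup, returns null if not found
    private final Map<String, String> index;       // stand-in for the search index, keyed by doc ID

    ThreadReindexListener(LongFunction<String> loadThread, Map<String, String> index) {
        this.loadThread = loadThread;
        this.index = index;
    }

    /** Called whenever a forum thread changes. */
    void onThreadChanged(long threadId) {
        String doc = loadThread.apply(threadId);
        if (doc == null) {
            return; // nothing to index; handling deleted threads is left out of this sketch
        }
        // Same ID as before, so this replaces the old doc. No get or delete needed first.
        index.put(Long.toString(threadId), doc);
    }

    public static void main(String[] args) {
        Map<Long, String> db = new HashMap<>();
        Map<String, String> index = new HashMap<>();
        ThreadReindexListener listener = new ThreadReindexListener(id -> db.get(id), index);

        db.put(42L, "first draft");
        listener.onThreadChanged(42L);
        db.put(42L, "edited post");
        listener.onThreadChanged(42L);

        System.out.println(index);
    }
}
```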

David ;-)

(In reply to Shane Witbeck, Oct 15, 2011 at 15:28)

