Is it possible to change the ids of a set of documents?


(Rian Stockbower) #1

I have an id that's mapped to a path. That path corresponds with the UNC
path of a document on a drive. When that file changes, I simply re-index
the contents of that file so the index stays up to date.

But what if I have a rename operation? Or a directory rename operation?

I think file renaming should be easy:

  1. Grab the document by its old id
  2. Update the mapped field with the new path, effectively changing its
    document id
  3. There is no step 3

But a directory rename operation is a bit more complicated, I think. I see
two possibilities:

First possibility: compute the change by walking the directory tree.

  1. Recurse through the changed directory, and build a list of paths that
    will have to be updated
  2. Grab each document by its old id
  3. Update the mapped field with the new path, effectively changing its
    document id
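The tree walk in step 1 could be sketched roughly as follows. This is only an illustration: the function name `renamed_paths` is made up, and it assumes the rename has already happened on disk, so the files are found under the new directory and their former paths are reconstructed from the old root.

```python
import os


def renamed_paths(old_root, new_root):
    """Yield (old_path, new_path) for every file under new_root.

    Assumption: the directory has already been renamed, so we walk
    new_root and rebuild each file's former path from old_root.
    """
    for dirpath, _dirnames, filenames in os.walk(new_root):
        # Path of the current directory relative to the renamed root.
        rel_dir = os.path.relpath(dirpath, new_root)
        for name in filenames:
            rel = name if rel_dir == os.curdir else os.path.join(rel_dir, name)
            yield os.path.join(old_root, rel), os.path.join(new_root, rel)
```

Each pair gives the old id to look up and the new path to store, which is the input that steps 2 and 3 need.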

But walking the tree can be expensive in terms of disk I/O, even if the
resulting update is just a single bulk request, which would be quite fast.

Second possibility: have elasticsearch examine the index for ids that
contain a pattern that matches the old path, and update each to the new
path. Is such a thing possible? If so, how expensive would it be?

-Rian

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Shuai Lin) #2

Hi,

You can use a prefix query[1] to find all the docs that need to be updated,
and then apply a partial update[2] to each of them.

[1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-prefix-query.html
[2] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-update.html
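A rough sketch of how this might look, building only the request bodies and not talking to a cluster. Several things here are assumptions rather than anything stated in this thread: the field name `path`, the index name, the helper names `prefix_query` and `rekey_actions`, and that the path field is indexed un-analyzed so a prefix query matches the raw path. Also note that the sketch swaps in an index-plus-delete pair in place of a partial update, because as far as I know an existing document's `_id` cannot be changed in place. The exact bulk metadata depends on the Elasticsearch version (older releases also expect a `_type`).

```python
import json


def prefix_query(field, old_prefix):
    """Search body matching documents whose `field` starts with old_prefix."""
    return {"query": {"prefix": {field: old_prefix}}}


def rekey_actions(hits, old_prefix, new_prefix, index, path_field="path"):
    """Build a bulk request body that re-keys each hit under its new path.

    For every hit whose id starts with old_prefix, emit an `index` action
    under the new id (with the path field rewritten) followed by a
    `delete` action for the old id.
    """
    lines = []
    for hit in hits:
        old_id = hit["_id"]
        if not old_id.startswith(old_prefix):
            continue  # not under the renamed directory
        new_id = new_prefix + old_id[len(old_prefix):]
        source = dict(hit["_source"])
        source[path_field] = new_id
        lines.append(json.dumps({"index": {"_index": index, "_id": new_id}}))
        lines.append(json.dumps(source))
        lines.append(json.dumps({"delete": {"_index": index, "_id": old_id}}))
    # The bulk format is newline-delimited and must end with a newline.
    return "".join(line + "\n" for line in lines)
```

The hits would come from running the prefix query (scrolling through the results if there are many), and the returned string would be sent as the body of a bulk request.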

Regards

On Sat, Nov 9, 2013 at 11:52 PM, rianjs rstockbower@gmail.com wrote:



(system) #3