I have a document id that's mapped to a path. That path is the UNC
path of a document on a drive. When that file changes, I simply re-index
the contents of that file so the index stays up to date.
But what if I have a rename operation? Or a directory rename operation?
I think file renaming should be easy:
1. Grab the document by its old id
2. Update the mapped field with the new path, effectively changing its
   document id
There is no step 3
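
To make the file-rename case concrete, here is a minimal sketch of the request I have in mind. As far as I know, the _id of an existing document cannot be changed in place, so "changing its document id" would really mean indexing the same content under the new id and deleting the old one. The index name, the "path" field, and the function names are placeholders of mine, not anything from my real mapping, and the output is just the newline-delimited JSON body that the bulk API accepts.

```python
import json


def rename_actions(index, old_id, new_id, source):
    """Build bulk-API entries that re-create a document under new_id and delete old_id."""
    # "path" is a hypothetical mapped field that mirrors the document id.
    new_source = dict(source, path=new_id)
    return [
        {"index": {"_index": index, "_id": new_id}},   # action line
        new_source,                                    # document body for the action above
        {"delete": {"_index": index, "_id": old_id}},  # delete takes no body line
    ]


def to_ndjson(actions):
    """Serialize entries as newline-delimited JSON, one JSON object per line."""
    return "".join(json.dumps(entry) + "\n" for entry in actions)
```

The old document's source would come from a get by the old id, which is step 1 above.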
But a directory rename operation is a bit more complicated, I think. I see
two possibilities:
First possibility: compute the change by walking the directory tree.
1. Recurse through the changed directory, and build a list of paths that
   will have to be updated
2. Grab each document by its old id
3. Update the mapped field with the new path, effectively changing its
   document id
But walking the tree can be an expensive operation in terms of disk IO, even
if the resulting update is just a single bulk request, which would be quite fast.
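
A rough sketch of the tree walk, assuming the rename has already happened on disk so only the new directory exists. The function name is made up for illustration. It pairs each file's old path (which is also its old id) with its new path by swapping the directory prefix:

```python
import os


def renamed_paths(old_root, new_root):
    """Walk new_root and return sorted (old_path, new_path) pairs for every file under it."""
    pairs = []
    for dirpath, _dirnames, filenames in os.walk(new_root):
        for name in filenames:
            new_path = os.path.join(dirpath, name)
            # The path relative to the renamed directory is unchanged by the rename.
            rel = os.path.relpath(new_path, new_root)
            pairs.append((os.path.join(old_root, rel), new_path))
    return sorted(pairs)
```

Each pair could then be turned into one entry of a single bulk request. The walk itself is the part that touches the disk once per directory.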
Second possibility: have Elasticsearch search the index for ids that
start with the old directory path, and update each to the new path. Is such
a thing possible? If so, how expensive would it be?
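
To illustrate what I am asking for, here is a sketch that assumes the path is also stored in an un-analyzed keyword field, which I call "path" here (a hypothetical name). It builds a prefix query for the old directory and shows how each matching id would be rewritten. The function names and the backslash separator are my assumptions. Since the id itself cannot be edited in place, as far as I know every hit would still have to be re-indexed under its new id.

```python
def prefix_query(old_dir, sep="\\"):
    """Search body matching documents whose "path" field starts with old_dir + sep."""
    # The trailing separator keeps a sibling such as "docs2" from matching "docs".
    prefix = old_dir.rstrip(sep) + sep
    return {"query": {"prefix": {"path": {"value": prefix}}}}


def rewrite_id(doc_id, old_dir, new_dir, sep="\\"):
    """Swap the old directory prefix of doc_id for new_dir; return other ids unchanged."""
    old_prefix = old_dir.rstrip(sep) + sep
    if not doc_id.startswith(old_prefix):
        return doc_id
    return new_dir.rstrip(sep) + sep + doc_id[len(old_prefix):]
```

With this approach the disk is never walked: the list of affected documents comes from the index itself.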