How can I copy documents within the same index and type?


(liny) #1

Hi,

I am indexing files into Elasticsearch 5 for content search.
The "content" field holds the content of the file, which may be several megabytes of data.
An example JSON document is shown below:

{
    "fileId": "unique_fileId",
    "filename": "ms_word.docx",
    "content": "<large data here>"
}

Because users will copy files to another folder, I need to copy documents within the same index and type under different document IDs.
I don't want to read the files again, so the best approach would be to copy these documents within the same index and type and set a new fileId.
The reindex API doesn't fit my requirement because the target must be another index.

Is there a better way to meet this requirement?
Any comments are appreciated.

Anderson.


(Christoph) #2

Hi,

Two general questions:

  • Why are you storing your binary data in elasticsearch at all? Is it searchable content?

  • When files change location on the user side, why can't you just update some path information about the location of the "large data"?
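
For illustration, a location change like the one in the second question could be sent as a partial update that carries only the changed field, so the large "content" value is not resent. This is a minimal Python sketch, assuming a hypothetical "path" field (the example document in this thread has no such field) and the {"doc": ...} body shape accepted by the update API; the helper name is made up for this example.

```python
import json


def build_path_update(new_path):
    """Build a partial-update body that changes only the file location.

    "path" is a hypothetical field name used for illustration; it is not
    part of the example document shown earlier in the thread.
    """
    return {"doc": {"path": new_path}}


# Serialize the body as it would be sent in the update request.
payload = json.dumps(build_path_update("/folder_b/ms_word.docx"))
```

The body would be posted to the update endpoint of the existing document; the index, type, and ID in that URL depend on your own setup.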


(liny) #3

Hi, @cbuescher:

Why are you storing your binary data in elasticsearch at all? Is it searchable content?

It's not binary data. I'll use Apache Tika to extract the content as text for searching.

When files change location on the user side, why can't you just update some path information about the location of the "large data"?

For a copy-only operation, this could be a good solution.
But if the user copies "my_word.docx" to "my_word_1.docx" in the same folder, I still have to create a new document, because the two files are two separate records in the search results.
And if the user later updates "my_word_1.docx", I only need to read that file's content once for the update.

Let me know if you have any suggestions.
Thank you.


(Christoph) #4

Why not insert the new document again then?
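
One way to read this suggestion: fetch the stored document, reuse its `_source`, and index it again under a new ID, so the file itself is never re-read or re-extracted. Below is a minimal Python sketch of that idea. It assumes `_source` is enabled for the index (the default), and `get_doc` and `index_doc` are hypothetical callables standing in for whatever client calls wrap the get-document and index-document APIs in your setup.

```python
import copy


def build_copy(hit, new_file_id, new_filename=None):
    """Return a new document body copied from an existing get-document response."""
    # Deep copy so the fetched response is left untouched.
    body = copy.deepcopy(hit["_source"])
    body["fileId"] = new_file_id
    if new_filename is not None:
        body["filename"] = new_filename
    return body


def copy_document(get_doc, index_doc, src_id, new_id, new_filename=None):
    """Copy one document to a new ID without touching the original file.

    get_doc(doc_id) -> dict containing "_source"   (hypothetical wrapper)
    index_doc(doc_id, body) -> None                (hypothetical wrapper)
    """
    hit = get_doc(src_id)
    body = build_copy(hit, new_id, new_filename)
    index_doc(new_id, body)
    return body
```

The extracted text still travels from Elasticsearch to the client and back, so this saves the file read and the Tika extraction, not the network transfer of the "content" field.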


(liny) #5

I think the fastest way would be to tell ES to copy the existing content instead of reading and extracting the file again.
Let me know if you have any suggestions. Thank you.

