How does the Attachment plugin interact with replicas/replication?


Hi all,

If you use the field type "attachment" implemented by the Attachment plugin and have replicas, is text extraction (for example, getting plain text out of an MS Word .docx) repeated for each replica? Or is text extraction done once on the primary shard, with only the extracted plain text replicated? If extraction is repeated by default, is it possible to have it done only once on the primary by means of an appropriate mapping, for example by not "storing" the source but only the "attachment" field?

Thanks and best regards

(Mark Walkom) #2

(EDITED - I totally got this wrong the first time, sorry!) Indexing is done on both the primary and the replica shards, so every shard copy indexes the document itself. If you are doing bulk indexing, it makes sense to drop replicas to zero and then add them back once indexing is done.
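
As a rough sketch of that replica toggle: the snippet below sends `number_of_replicas` through the index settings API before and after a bulk load. The URL `http://localhost:9200`, the index name `docs`, and the helper names are assumptions for illustration only, so adjust them to your cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a locally reachable node
INDEX = "docs"                    # hypothetical index name


def replica_settings(count):
    """Build the request body for the index settings API."""
    return {"index": {"number_of_replicas": count}}


def set_replicas(count, es_url=ES_URL, index=INDEX):
    """Send PUT /<index>/_settings with the new replica count."""
    req = urllib.request.Request(
        f"{es_url}/{index}/_settings",
        data=json.dumps(replica_settings(count)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def with_replicas_dropped(load_fn, restore_to=1, setter=set_replicas):
    """Drop replicas to zero, run the bulk load, then restore them."""
    setter(0)
    try:
        return load_fn()
    finally:
        # Restore replicas even if the bulk load raises.
        setter(restore_to)
```

You would call it as `with_replicas_dropped(my_bulk_load, restore_to=1)`, where `my_bulk_load` is whatever function runs your bulk requests. Once replicas are added back, the shard data is copied over from the primary rather than each document being indexed again.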

You can tell ES not to store particular fields if you want.
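
For example, a mapping along these lines keeps the raw attachment out of the stored `_source`. This is only a sketch, assuming an ES version that still uses mapping types and has the attachment plugin installed. The type name `doc` and field name `file` are made up. Note that it only changes what is stored, not where indexing happens.

```python
import json


def attachment_mapping(type_name="doc", field="file"):
    """Mapping body that indexes an attachment field but leaves it out of _source.

    type_name and field are hypothetical names used for illustration.
    """
    return {
        type_name: {
            # Keep the (base64) attachment out of the stored source document.
            "_source": {"excludes": [field]},
            "properties": {
                # "attachment" is the field type provided by the plugin.
                field: {"type": "attachment"},
            },
        }
    }


print(json.dumps(attachment_mapping(), indent=2))
```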
