Practical size for attachments

Hi All,

For attachment data, is there a practical size limit to the content of the attachment?

I wish to add multiple attachments to an index in a nested sub-object and there may be a lot of data.
Is there a point where it is more practical to have an index (or multiple indices) of just attachment data?

regards,
Dave.

I wouldn't store blobs in Elasticsearch, just the extracted text.

But there are indeed some limits.
One of them is the HTTP request size limit (the http.max_content_length setting), which IIRC defaults to 100mb.
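
To make that limit concrete, here is a rough sketch of checking whether a file would still fit in one request once it is base64-encoded for the attachment processor. The 100mb figure is the default mentioned above; the helper names and the 1 KB allowance for the rest of the JSON body are assumptions for illustration only.

```python
import base64

# Default of the http.max_content_length setting (100mb, in 1024-based units).
DEFAULT_HTTP_MAX_BYTES = 100 * 1024 * 1024


def base64_size(raw_size: int) -> int:
    """Length in bytes of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def fits_http_limit(raw_size: int, overhead: int = 1024,
                    limit: int = DEFAULT_HTTP_MAX_BYTES) -> bool:
    """True if one base64-encoded file plus `overhead` bytes fits in `limit`.

    `overhead` is a hypothetical allowance for the rest of the JSON body.
    """
    return base64_size(raw_size) + overhead <= limit


if __name__ == "__main__":
    # Sanity-check the size formula against the real encoder.
    assert base64_size(10) == len(base64.b64encode(b"x" * 10))
    for mb in (40, 80):
        print(mb, "MB file fits:", fits_http_limit(mb * 1024 * 1024))
```

Base64 inflates the payload by about a third, so a single file hits the limit well before it reaches 100 MB on disk.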

Hi David,

Even for the extracted text, the data could be large. I have pumped some 40 MB PDFs through the attachment ingest pipeline and they end up with 6-8 MB of text.

I'm just looking for any indication from experience of the point at which this is no longer a good strategy, whether because search performance degrades, the indexed JSON objects become unwieldy, or anything else...
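
For reference, a minimal sketch of the kind of pipeline being discussed: the attachment processor extracts the text and a remove processor then drops the base64 original, so only the extracted text is stored. The field name "data" and the function name are assumptions, and this handles a single attachment field rather than the nested multi-attachment case. Note that indexed_chars defaults to 100000 characters, so getting megabytes of text out implies it was raised or set to -1 (unlimited).

```python
import json


def build_attachment_pipeline(source_field: str = "data",
                              indexed_chars: int = 100_000) -> dict:
    """Ingest pipeline body: extract text, then drop the base64 original.

    `source_field` is a hypothetical name for the field holding the
    base64-encoded file; pass indexed_chars=-1 to extract all of the text.
    """
    return {
        "description": "Extract text from an attachment and discard the blob",
        "processors": [
            {"attachment": {"field": source_field,
                            "indexed_chars": indexed_chars}},
            {"remove": {"field": source_field}},
        ],
    }


if __name__ == "__main__":
    # This dict would be sent as the body of PUT _ingest/pipeline/<name>.
    print(json.dumps(build_attachment_pipeline(indexed_chars=-1), indent=2))
```

Capping indexed_chars is one way to bound how much text each document carries, at the cost of not indexing the tail of very long files.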


Sorry for the late reply.
In general, it's not efficient to store very big JSON documents, because any time you search, you will by default read the source of 10 documents from disk.
This consumes time, network bandwidth...

Also, when Lucene segments need to be merged, you will end up with a lot of disk I/O.
The same goes for when a node leaves: a lot of data has to be copied over the network.

Keeping JSON documents as small as possible is better IMO.
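
If the documents do end up large, one way to soften the cost of returning the source of each hit is to exclude the big field from the response with source filtering. A minimal sketch of such a search body follows; the field name "attachment.content" and the function name are assumptions, and a field mapped with the nested type would need a nested query instead of a plain match.

```python
import json


def build_search_body(text: str, big_field: str = "attachment.content",
                      size: int = 10) -> dict:
    """Search body that matches on `big_field` but leaves it out of the hits.

    `big_field` is a hypothetical path to the extracted text.
    """
    return {
        "size": size,  # 10 is the default number of hits returned
        "query": {"match": {big_field: text}},
        "_source": {"excludes": [big_field]},
    }


if __name__ == "__main__":
    # This dict would be sent as the body of a _search request.
    print(json.dumps(build_search_body("invoice"), indent=2))
```

This only trims what is sent back in the response. The full source is still stored, so the segment merge and shard relocation costs described above remain.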