Storing Very Large Text Field in Elasticsearch

Hi @kdwolf

I'm a little confused. Are you referring to a single token that is 40-50MB, or to a text field containing multiple strings/tokens (which seems to be what you are implying)? If it's the latter, it is fine to store 40-50MB in a single text field in Elasticsearch.

Perhaps you are thinking of the single token (term) limit in Lucene, which is about 32K (32,766 bytes).
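If it helps to check which case you are in, here is a rough Python sketch that looks for oversized tokens in a piece of text. It assumes plain whitespace splitting as a stand-in for whatever analyzer your field actually uses (a real analyzer will tokenize differently), and the function names are made up for illustration.

```python
# Lucene's maximum length for a single indexed term, in UTF-8 bytes.
MAX_TERM_BYTES = 32766


def longest_token_bytes(text: str) -> int:
    """Return the UTF-8 byte length of the longest whitespace-separated token."""
    # Assumption: whitespace splitting only approximates a real analyzer.
    return max((len(tok.encode("utf-8")) for tok in text.split()), default=0)


def has_oversized_token(text: str) -> bool:
    """True if any single token would exceed the per-term limit."""
    return longest_token_bytes(text) > MAX_TERM_BYTES


# A large field made of many small tokens (small stand-in for a 40-50MB book).
book = "call me ishmael " * 100_000
# A single unbroken 40,000-character token.
blob = "x" * 40_000

print(has_oversized_token(book))  # many short tokens
print(has_oversized_token(blob))  # one very long token
```

Ordinary prose like a book falls in the first case: the field is big, but no individual token comes anywhere near the limit.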

Lucene still has a document limit (field limit) of about 2GB.

That said, there are other areas of concern, like actually returning the data due to HTTP limits, etc.
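To get a feel for the HTTP side, a quick sketch like the one below estimates the size of the request body for one document before you send it. It assumes the limit in question is the `http.max_content_length` setting with its default of 100mb (check your own cluster settings), and the field names `title` and `content` are hypothetical.

```python
import json

# Assumption: http.max_content_length is left at its default of 100mb.
DEFAULT_MAX_CONTENT_BYTES = 100 * 1024 * 1024


def request_body_size(doc: dict) -> int:
    """Approximate size in bytes of the JSON body for a single document."""
    # ensure_ascii=False keeps non-ASCII text as UTF-8 instead of \uXXXX escapes.
    return len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))


def fits_in_one_request(doc: dict, limit: int = DEFAULT_MAX_CONTENT_BYTES) -> bool:
    """True if the serialized document is within the given body size limit."""
    return request_body_size(doc) <= limit


# Hypothetical document with a large text field (small stand-in here).
doc = {"title": "example", "content": "a" * 1000}

print(request_body_size(doc))
print(fits_in_one_request(doc))
```

Keep in mind that a search response returning `_source` for several such documents can be much larger than any single indexing request.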

Perhaps take a look at this thread

Minor side note: most of the content in a .mobi file is binary, so I'm not sure if that is just an example ... The header etc. is text, but the actual text of the document is binary AFAIK.