Are there special considerations for indexing extremely long Strings?


(Steve Wall) #1

I have a text field that I need to store in Elastic, and it will contain some extremely long values. The values are the contents of Word documents that could be dozens of pages long.

I was not intending to use the mapper-attachment plug-in, as our application already has the contents of those documents extracted (we have an existing OCR process), so I don't want to encode them again just to have the mapper-attachment plug-in do its own OCR with Apache Tika.

Since the contents of this field are going to be some very long strings, are there any special properties I need to set on my mapping, or is just declaring the type String enough?

(Currently on ES 2.1. Moving to ES 5 late this year or early next year. I would be interested in the answer for both 2.1 and 5.)

Thanks,
Steve


(Isabel Drost-Fromm) #2

The only limit I could think of would be the one mentioned here:

Others may have more precise answers for you.

