"index" : "no" vs "index" : "not_analyzed"

Hello,

I have run into an issue with Logstash. I am currently using LS 2.0.0.

In my mapping, I previously had this field:

"content" : {
  "type" : "string",
  "index" : "not_analyzed"
},

I send data to this field, and some values exceed 32,766 bytes.

When that happens, it throws this error:

"reason" => "Document contains at least 1 immense term in field="content" (whose UTF8..."

I found this thread on Stack Overflow: "elasticsearch - UTF8 encoding is longer than the max length 32766"
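
For anyone trying to reproduce this: the limit named in the error and in that thread applies to the UTF-8 encoded length of the value, not to its character count. Below is a minimal Python sketch for checking a value before sending it. The names `MAX_TERM_BYTES`, `utf8_length` and `exceeds_term_limit` are made up for illustration and are not part of Logstash or Elasticsearch.

```python
# Maximum term length in bytes, taken from the error message in this thread.
MAX_TERM_BYTES = 32766


def utf8_length(value: str) -> int:
    """Return the number of bytes the value occupies when encoded as UTF-8."""
    return len(value.encode("utf-8"))


def exceeds_term_limit(value: str, limit: int = MAX_TERM_BYTES) -> bool:
    """Return True if the UTF-8 encoding of value is longer than limit bytes."""
    return utf8_length(value) > limit


if __name__ == "__main__":
    ascii_text = "a" * 32766           # 1 byte per character: exactly at the limit
    accented_text = "\u00e9" * 20000   # 2 bytes per character: 40000 bytes in total
    print(exceeds_term_limit(ascii_text))
    print(exceeds_term_limit(accented_text))
```

Note that a string with fewer than 32,766 characters can still be over the limit if it contains multi-byte characters.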

So to make this work, I had to change the field to this:

"content" : {
  "type" : "string",
  "index" : "no"
},

This fixes the issue. According to that Stack Overflow thread, however, it also means the content field is no longer "filterable" or searchable.

I would like to understand why this fixed the issue. I could not find an explanation in the Elasticsearch documentation (or I missed it). Could anyone explain the pros of using "index" : "no" compared with "index" : "not_analyzed"?

I plan to use "index" : "no" in my mapping to fix the issue. I just want to know why it works and what its pros and cons are.

Thanks,

Have a read of https://www.elastic.co/guide/en/elasticsearch/reference/2.3/mapping-index.html
