Hello,
I have a pretty simple use case that has been discussed at length on this forum: I'm storing a large block of text that I just want to reside in my source document and not be indexed at all. Originally this field was mapped as follows:
"message_body": {
  "type": "keyword",
  "index": false
}
We chose keyword because the choice was between keyword and text, and with index: false we didn't think it would matter. However, after a little while we ran into the 32K limit on keyword fields. I created a new index with "ignore_above": 1 on the field and, to my surprise, that worked.
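For clarity, the field definition in the new index looked roughly like this. This is a sketch that assumes nothing else in the field definition changed; only ignore_above was added:

```json
"message_body": {
  "type": "keyword",
  "index": false,
  "ignore_above": 1
}
```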
So I migrated the data with the reindex API and that solved it. But I noticed something very surprising that I'd like help understanding: the size of the index on disk was greatly reduced. Just by adding ignore_above: 1, the size of my index (total.store.size_in_bytes) dropped from 39 GB to 15 GB in our test env. While this is really great, because our production index is approaching 1 TB, I don't understand the reason for it. With index: false the data is not stored in the inverted index, right? If that's the case, why would setting ignore_above make such a difference?
I also repeated this test using "type": "text" and "index": false. In that case the size of the index was also reduced to 15 GB. Can someone explain this to me?