Large string fields


(Yildiz) #1

Hi there,

Has anyone had experience compressing large strings (gzip or 7zip) and then storing them in Elasticsearch? Does it make sense to compress large strings of around 100k characters (under 1 MB)?


(David Pilato) #2

Elasticsearch already compresses stored fields behind the scenes.


(Yildiz) #3

Thanks for the answer, David. Actually, this question is only loosely related to Elasticsearch: if I compress the string myself and store it, and then decompress it many times at run time, should I watch out for high memory usage in C# or Java?


(David Pilato) #4

So you don't want to search for the fields but just store some compressed content?

You can store binary content in an Elasticsearch field, but you first need to encode it as Base64, so I'm not sure you really want to do that.
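To make the Base64 route concrete, here is a minimal sketch using only the Python standard library. The helper names `pack_text` and `unpack_text` and the field name `description_gz` are made up for illustration and do not come from this thread; the sample text is synthetic.

```python
import base64
import gzip


def pack_text(text: str) -> str:
    """Gzip-compress text and return it as a Base64 string (hypothetical helper)."""
    compressed = gzip.compress(text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def unpack_text(encoded: str) -> str:
    """Reverse pack_text: Base64-decode, then gunzip (hypothetical helper)."""
    compressed = base64.b64decode(encoded)
    return gzip.decompress(compressed).decode("utf-8")


# Synthetic sample of roughly 100k characters; real text compresses less well.
description = "lorem ipsum dolor sit amet " * 4000
encoded = pack_text(description)

# Assumed document shape: the encoded string would go into a binary-typed field.
doc = {"title": "example", "description_gz": encoded}
```

Keep in mind that Base64 makes the compressed bytes about a third larger, the field can no longer be searched as text, and every read has to decode and decompress on the client side.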

I don't understand what kind of problem you are trying to solve here.


(Yildiz) #5

Yes. Basically, I have documents in Elasticsearch that I search on many fields. Each document has a large text field such as "description", and I had some trouble indexing that large string field. So I removed the field from the document and started fetching the text from another data source. Now I get the document from Elasticsearch, but without the text. Fetching the data from two different data sources doesn't make sense to me. Elasticsearch aside, as a software developer, does it make sense to you to compress the text while indexing and use only Elasticsearch as the data source?

By the way, sorry for taking up your time. Related to the problem above: should I index only the fields that I will search on, or index the whole content and use Elasticsearch as my main data source?


(David Pilato) #6

So just add your big text field, but disable indexing for it by setting index: false in the mapping for this field.

That way, Elasticsearch will still keep the big field as part of the _source, and _source is compressed by Elasticsearch.
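As a rough sketch of what such a mapping could look like, the snippet below builds an index-creation body as a plain Python dict and prints it as JSON. It assumes a typeless mapping (Elasticsearch 7 or later); the function name `build_index_body` and the `title` field are illustrative assumptions, and only the "description" field name comes from this thread.

```python
import json


def build_index_body(big_field: str = "description") -> dict:
    """Return an index-creation body where big_field is kept in _source but not indexed."""
    return {
        "mappings": {
            "properties": {
                # Assumed example of a normal, searchable field.
                "title": {"type": "text"},
                # index: false means the field cannot be searched,
                # but its value is still returned as part of _source.
                big_field: {"type": "text", "index": False},
            }
        }
    }


body = build_index_body()
print(json.dumps(body, indent=2))
```

The printed JSON is what you would send when creating the index. Documents are then indexed with the full text in place, and a search hit returns the big field together with everything else in _source.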

