I'm trying to deal with the following issue.
I'm using the real-time GET API from Java. The JSON object I store in ES contains a binary field that holds an XML file. This field is sometimes stored (PUT operation) compressed with GZIP, and sometimes not.
Since "_source" compresses its content and I don't want my binary field to be compressed twice, I have excluded this binary field from "_source" and, at the same time, declared it with "store=yes".
With this configuration, and an index of "mmapfs" type, I perform the following operations:
The document ID is a large string (complicated, I know) and, in this case, the xml is sent compressed with GZIP (it's the last field, "cache.response")
The first GET operation gets the document UNCOMPRESSED, whilst the second GET gets the document COMPRESSED (as it should).
From this point, any GET operation returns the document correctly compressed. The version number is always "1" for all operations.
But this DOES NOT HAPPEN consistently. After a PUT operation with a GZIP-compressed XML, I sometimes get the document uncompressed and sometimes compressed. After some tests, my impression is that only the GET operations I perform immediately after a PUT (within the same second) return the uncompressed document. Every GET issued 1, 2, 3 or more seconds later returns the compressed document.
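For anyone who wants to reproduce this, one way to tell the two cases apart is to look at the first two bytes of the returned field value, since a GZIP stream always starts with the magic number 0x1f 0x8b. The following is only a minimal sketch using the JDK alone; the class name GzipProbe and its methods are hypothetical helpers for illustration and are not part of Elasticsearch. It assumes the bytes of "cache.response" have already been extracted from the GET response.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper (not part of Elasticsearch): checks whether a byte
// array looks like GZIP data by inspecting its two-byte magic number.
public class GzipProbe {

    /** Returns true if the data starts with the GZIP magic number 0x1f 0x8b. */
    public static boolean isGzip(byte[] data) {
        if (data == null || data.length < 2) {
            return false;
        }
        // The magic number is stored little-endian: first byte 0x1f, second 0x8b.
        int header = (data[0] & 0xff) | ((data[1] & 0xff) << 8);
        return header == GZIPInputStream.GZIP_MAGIC;
    }

    /** Compresses the given bytes with GZIP (used here only to build test input). */
    public static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            // Writing to an in-memory stream is not expected to fail.
            throw new IllegalStateException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Placeholder payload standing in for the real XML document.
        byte[] xml = "<response/>".getBytes(StandardCharsets.UTF_8);
        System.out.println("plain xml is gzip?   " + isGzip(xml));
        System.out.println("gzipped xml is gzip? " + isGzip(gzip(xml)));
    }
}
```

Calling isGzip on the value returned by each GET, together with the time elapsed since the PUT, should show whether the uncompressed result really is limited to the first second.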
I don't know what is happening; maybe (probably, since I'm new to ES) I'm doing something wrong. More information:
Just one node with one index of "mmapfs" type ("disk_idx") and one index of "memory" type ("memory_idx")
One local client:
this.node = NodeBuilder.nodeBuilder().settings(s).local(true).data(true).client(false).node();
this.client = this.node.client();
OS version: Linux version 2.6.32-431.3.1.el6.x86_64 (email@example.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 21:39:27 UTC 2014
Apache Tomcat 7.0.57, ElasticSearch 1.7 embedded in our webapp, Java JDK 1.7.0_71.
Any help will be really appreciated.