Reading _source field

Good afternoon,

I did some searching on the boards here and found that someone had touched
on this subject before[1], but maybe I am still missing something, because
my results were not good.

I am using Apache's Lucene API to read the documents generated by
Elasticsearch. I am able to iterate over the documents and extract the
_source field, and I am trying to convert it back to a string. According to
the documentation[2], the field can be compressed with LZF; I checked our
config and it seems we left this at the default (disabled).

I am getting only junk strings. :( Any help would be appreciated.

BytesRef bRef = doc.getBinaryValue("_source");
byte[] decodedBytes = decoder.decode(bRef.bytes);
String tmp = new String(decodedBytes, "UTF-8");

[1] https://groups.google.com/forum/?fromgroups#!searchin/elasticsearch/_source$20decode/elasticsearch/pWfgxKF0GWc/GgcF1_OUEXEJ
[2] http://www.elasticsearch.org/guide/reference/mapping/source-field/

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Never looked into it, but try looking at some of the logic used in
SourceFieldMapper and its relevant tests:

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/index/mapper/internal/SourceFieldMapper.java

https://github.com/elasticsearch/elasticsearch/tree/master/src/test/java/org/elasticsearch/test/unit/index/mapper/source

The unit tests are often useful for figuring out how the code is put
together. A static helper class called CompressorFactory is used; that may
offer a clue.
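
One thing that may be worth ruling out, as a guess rather than a confirmed
diagnosis: Lucene's BytesRef exposes public bytes, offset and length fields,
and the backing array can be a shared buffer that is larger than the value
itself, so passing bRef.bytes alone can pull in unrelated bytes. The sketch
below assumes the stored _source is plain UTF-8 JSON when compression is
disabled. SourceBytes and its two methods are hypothetical names used only
for illustration, and the LZF check is a heuristic based on the 'Z', 'V'
chunk header, not the Elasticsearch implementation.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene or Elasticsearch.
final class SourceBytes {
    private SourceBytes() {}

    // Decode only the valid slice [offset, offset + length) of the buffer as UTF-8.
    static String sliceToUtf8(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    // Heuristic: an LZF chunk starts with 'Z', 'V' and a block-type byte
    // (0 = stored uncompressed, 1 = compressed).
    static boolean looksLikeLzf(byte[] bytes, int offset, int length) {
        return length >= 3
                && bytes[offset] == 'Z'
                && bytes[offset + 1] == 'V'
                && (bytes[offset + 2] == 0 || bytes[offset + 2] == 1);
    }
}
```

With the BytesRef from the original post this would be called as
SourceBytes.sliceToUtf8(bRef.bytes, bRef.offset, bRef.length), after first
checking SourceBytes.looksLikeLzf(bRef.bytes, bRef.offset, bRef.length) to see
whether the value needs decompressing at all.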

--
Ivan

On Thu, Jul 11, 2013 at 10:40 AM, Arni Sumarlidason
<sumarlidason@gmail.com> wrote:

