I'm building integration tests for an Elasticsearch 5.5 plugin I wrote, which pulls values from a document and generates a score based on those values. Here's how I set up my test cluster:
public void prepESCluster() throws Exception {
    Random random = new Random();

    // Create a new index with an explicit mapping for the fields used below
    String mapping = XContentFactory.jsonBuilder().startObject().startObject("type")
            .startObject("properties")
            .startObject("content").field("type", "string").endObject()
            .startObject("someField").field("type", "string").endObject()
            .startObject("score").field("type", "double").endObject()
            .startObject("original_score").field("type", "double").endObject()
            .endObject().endObject().endObject()
            .string();
    assertAcked(prepareCreate("test")
            .addMapping("type", mapping, XContentType.JSON));

    List<IndexRequestBuilder> indexBuilders = new ArrayList<>();

    // Index 10 records (0..9), one per CONTENT entry
    for (int i = 0; i < CONTENT.length; i++) {
        indexBuilders.add(
                client().prepareIndex("test", "type", Integer.toString(i))
                        .setSource(XContentFactory.jsonBuilder().startObject()
                                .field("content", CONTENT[i])
                                .field("original_score", random.nextDouble())
                                .endObject()));
    }

    // Index two records with no "content" field (the text goes into "someField" instead)
    for (int i = 0; i < 2; i++) {
        indexBuilders.add(
                client().prepareIndex("test", "type", Integer.toString(i + CONTENT.length))
                        .setSource(XContentFactory.jsonBuilder().startObject()
                                .field("someField", CONTENT[i])
                                .field("original_score", random.nextDouble())
                                .endObject()));
    }

    indexRandom(true, indexBuilders);
    flush("test");
}
The CONTENT variable is a String[] of sentences that my queries are meant to match. Inside the plugin (IMPORTANT: I designed the 5.5 plugin based on this new model of plugin building), I need to access these fields, but I'm running into a few issues. It took me a long time to get to the bottom of this, but here are my findings. The following code inside the runAsDouble() method:
IndexableField source = context.reader().document(currentDocid).getField("_source");
Produces this result:
stored<_source:[7b 22 63 6f 6e 74 65 6e 74 22 3a 22 62 65 61 63 68 20 70 61 72 74 69 65 73 22 2c 22 63 61 6d 70 61 69 67 6e 5f 69 6e 66 6f 72 6d 61 74 69 6f 6e 2e 6f 72 69 67 69 6e 61 6c 5f 7a 73 63 6f 72 65 22 3a 30 2e 34 37 35 36 35 32 33 33 36 31 36 35 39 35 30 34 7d]>
Translated from hex, those bytes are the JSON string that this call returns:
context.reader().document(currentDocid).getField("_source").binaryValue().utf8ToString();
{"content":"beach parties","campaign_information.original_zscore":0.4756523361659504}
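As a sanity check on the hex translation, here is a minimal plain-Java sketch that turns a space-separated byte dump like the one above back into text. The class name HexSourceDecoder and its decode method are made up for illustration; they are not part of Lucene or Elasticsearch, and the sketch assumes the bytes are UTF-8 encoded JSON.

```java
import java.nio.charset.StandardCharsets;

public class HexSourceDecoder {

    /** Decodes space-separated hex byte pairs (e.g. "7b 22 7d") into a UTF-8 string. */
    public static String decode(String hexDump) {
        String trimmed = hexDump.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        String[] tokens = trimmed.split("\\s+");
        byte[] bytes = new byte[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            // Parse as int first: values above 7f do not fit a signed byte directly.
            bytes[i] = (byte) Integer.parseInt(tokens[i], 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The leading bytes of the dump above, followed by the closing brace (7d).
        String dump = "7b 22 63 6f 6e 74 65 6e 74 22 3a 22 62 65 61 63 68 "
                + "20 70 61 72 74 69 65 73 22 7d";
        System.out.println(decode(dump)); // prints {"content":"beach parties"}
    }
}
```

This only confirms what the stored field contains; it does nothing to turn the JSON into a map, which is the part I'm still missing.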
Excellent, those are the fields I need. My problem now is that the source is a string rather than a map. Lucene/ES doesn't play well with third-party libraries like Gson and Jackson, and parsing documents that will be MUCH larger than this one from string form (using the method above) will be a bear to deal with.
I haven't found any method that easily extracts the key-value pairs from the stored source of these documents, and it's the last piece of the puzzle I need for my plugin to work properly. I've also tried using this call to get the values:
context.reader().document(currentDocid).getValues("_source")
and it returns null.
Any suggestions?