Elastic Aggregation: How to analyze the aggregation within an analyzed aggregation

Ubeta · August 30, 2020, 10:13am

I am trying to apply a second analyzer on top of an already analyzed field and aggregate on the result, if that makes sense.

e.g.

Field1.value: 'A01/00 | A02/01 | B01/00'
Field2.value: 'A01/01 | C02/01 | D01/00'

Currently, I have a custom pattern analyzer that indexes the terms by the '|' pattern, separating Field1.value and Field2.value into:

Field1.value terms: [A01/00, A02/01, B01/00]
Field2.value terms: [A01/01, C02/01, D01/00]

If I were to aggregate now, each term would give a value count of 1.

What I want to achieve is to analyze these terms further with another pattern analyzer that splits on the forward slash ('/') and keeps only the part before it, creating:

Field1.value final terms: [A01, A02, B01]
Field2.value final terms: [A01, C02, D01]

A terms aggregation would then return a count of 2 for the A01 term, which is what I want.
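To make the target counts concrete, here is a minimal plain-Java sketch of the two-stage split described above, run outside Elasticsearch. The class and method names (PrefixCounts, prefixes, docCounts) are made up for illustration, and a term without a '/' is kept as-is, which is an assumption rather than something the data above shows.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, not part of any Elasticsearch API.
class PrefixCounts {

    // Stage 1: split the raw value on '|'.
    // Stage 2: keep the part of each term before the last '/'.
    // A Set is used so each prefix counts at most once per document.
    static Set<String> prefixes(String fieldValue) {
        Set<String> result = new LinkedHashSet<>();
        for (String term : fieldValue.split("\\|")) {
            String trimmed = term.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int slash = trimmed.lastIndexOf('/');
            // Assumption: a term without '/' is kept unchanged.
            result.add(slash > 0 ? trimmed.substring(0, slash) : trimmed);
        }
        return result;
    }

    // Counts, for every prefix, how many documents contain it.
    static Map<String, Integer> docCounts(List<String> fieldValues) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : fieldValues) {
            for (String prefix : prefixes(value)) {
                counts.merge(prefix, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "A01/00 | A02/01 | B01/00",
            "A01/01 | C02/01 | D01/00");
        // prints {A01=2, A02=1, B01=1, C02=1, D01=1}
        System.out.println(docCounts(docs));
    }
}
```

This is only a reference for the numbers I expect from the aggregation; I would like Elasticsearch to produce them itself rather than doing this in client code.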

I have tried a Painless script in my Elasticsearch Java API code:

new Script(ScriptType.INLINE, "painless",
    "def path = doc['field.analyzer'].value; " +
    "if (path != null) {" +
    "int index = path.lastIndexOf('/'); " +
    "if (index > 0) {" +
    "return path.substring(0, index);" +
    "}" +
    "}" +
    "return ''", Collections.emptyMap())

which gives me the "final" term I want, but it only returns the first term (the one before the first bar '|'), while I want all of the final terms.

i.e.

Field1.value final terms: [A01]
Field2.value final terms: [A01]
This gives a count of 2 for 'A01', but 'A01' is the only term returned.

I also tried looping through the values and adding them to a list, but then the aggregation counts are no longer accurate, which defeats the whole purpose of using aggregations.

new Script(ScriptType.INLINE, "painless",
    "List returnList = new ArrayList(); " +
    "for (int i = 0; i < doc['field.analyzer'].length; i++) {" +
    "def path = doc['field.analyzer'][i]; " +
    "if (path != null) { " +
    "int index = path.lastIndexOf('/'); " +
    "if (index > 0) { " +
    "returnList.add(path.substring(0, index)); " +
    "} else { " +
    "returnList.add('');} " +
    "} else { " +
    "returnList.add('');}} " +
    "return returnList ", Collections.emptyMap()));

returns:

Field1.value final terms: [A01, A02, B01], each with a value count of exactly one
Field2.value final terms: [A01, C02, D01], each with a value count of exactly one

Is there any way to do a nested analyzer or a nested aggregation without changing the structure of the data? Or is it possible to use Painless so that it returns the correct aggregated terms and value counts, without extra post-processing in Java?

Any help is much appreciated.

Thanks!

system · September 27, 2020, 10:13am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.