Hi. I'm trying to aggregate on part of the string value of a given field.
For example, I have 5 documents in an index "Petroglyph".
Each document includes a field called "Site".
The values of "Site" follow a pattern like 'site1<site2<site3<site4':
doc1 - "Site": "Asia<South Korea<Ulsan<Bangudae"
doc2 - "Site": "Europe<France<Dordogne<Grotte de Cussac"
doc3 - "Site": "Europe<France<Dordogne<Lascaux"
doc4 - "Site": "Asia<South Korea<Ulsan<Cheon-Jeon-Ri"
doc5 - "Site": "Africa<Namibia<Kunene<Twyfelfontein"
I want to aggregate on the 'site2<site3' part, keeping the two levels in their original order, so the result should look like:
buckets: [
  {
    key: 'South Korea<Ulsan',
    doc_count: 2
  },
  {
    key: 'France<Dordogne',
    doc_count: 2
  },
  {
    key: 'Namibia<Kunene',
    doc_count: 1
  }
]
I have read and tried "node.js - Elasticsearch aggregation on part of string, not full string - Stack Overflow", which seems very similar to my question. However, it did not work for me; I believe it was written for versions earlier than 7.0.
I have also thought about splitting the value of Site on '<', saving the result in an array, and aggregating on 'array[1]<array[2]'.
But that seems to add complexity and, in my limited experience, does not feel like the Elasticsearch way of doing it.
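To make the splitting idea concrete, here is a minimal Python sketch of what I have in mind. It only shows the client-side part: deriving a 'site2<site3' value before indexing and counting the buckets locally. The field name "SiteMid", the aggregation name "by_site_mid", and the helper `middle_key` are names I made up for illustration, and the request body assumes "SiteMid" would be mapped as a keyword field.

```python
from collections import Counter


def middle_key(site: str, sep: str = "<") -> str:
    """Return 'site2<site3' from a value shaped like 'site1<site2<site3<site4'."""
    parts = site.split(sep)
    if len(parts) < 3:
        raise ValueError(f"expected at least 3 levels, got: {site!r}")
    return sep.join(parts[1:3])


docs = [
    {"Site": "Asia<South Korea<Ulsan<Bangudae"},
    {"Site": "Europe<France<Dordogne<Grotte de Cussac"},
    {"Site": "Europe<France<Dordogne<Lascaux"},
    {"Site": "Asia<South Korea<Ulsan<Cheon-Jeon-Ri"},
    {"Site": "Africa<Namibia<Kunene<Twyfelfontein"},
]

# Add the derived field (hypothetical name "SiteMid") to each document before indexing.
enriched = [{**d, "SiteMid": middle_key(d["Site"])} for d in docs]

# Search body for a plain terms aggregation on the derived field.
# Assumption: "SiteMid" is mapped as a keyword field in the index.
agg_body = {
    "size": 0,
    "aggs": {"by_site_mid": {"terms": {"field": "SiteMid"}}},
}

# Local check of the bucket counts I expect the aggregation to return.
buckets = Counter(d["SiteMid"] for d in enriched)
```

With the five documents above, `buckets` holds the same keys and counts as the result I listed earlier. The cost is an extra field and a reindex, which is the complexity I was hoping to avoid.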
What would be the way to solve this problem?
Thank you.