I have a bunch of Elasticsearch documents that contain information about study fields. I'm trying to aggregate on the study fields field to count how often each individual study field (e.g. data science, web, network security) appears across the job postings. Instead, I'm getting buckets that match the field value as a whole rather than each comma-separated entry in it, e.g. "data science, web, network security", "data analyst, network security", etc.
How can I tell Elasticsearch to split the aggregation on each entry in the study fields value, as opposed to matching the value of the whole field?
Current query:
GET /test_index/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "studyfeild"
      }
    }
  }
}
Unwanted Output:
{
  ...
  "hits": {
    "total": 63,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 14,
      "buckets": [{
        "key": "data science, web, network security",
        "doc_count": 6
      },{
        "key": "data analyst, network security",
        "doc_count": 6
      },
      ...
      ]
    }
  }
}
Desired Output:
{
  ...
  "hits": {
    "total": 63,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 14,
      "buckets": [{
        "key": "data science",
        "doc_count": 12
      },{
        "key": "web",
        "doc_count": 8
      },{
        "key": "network security",
        "doc_count": 5
      },{
        "key": "data analyst",
        "doc_count": 5
      },
      ...
      ]
    }
  }
}
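To make the desired bucketing concrete, here is a rough client-side sketch in Python of the counting I'd like Elasticsearch to do for me. It is only an illustration of the semantics, not something I want to run in production: the function name count_study_fields and the sample values are made up for this example, and I'm assuming the entries are always separated by commas and that case and surrounding whitespace should be ignored.

```python
from collections import Counter


def count_study_fields(values):
    """Count, per study field, how many documents mention it.

    Hypothetical helper: assumes each value is a comma-separated string
    such as "data science, web, network security".
    """
    counts = Counter()
    for value in values:
        # Split on commas, trim whitespace, lowercase, and drop empty pieces.
        # A set is used so a field repeated within one document is counted
        # once, mirroring how doc_count works for a terms bucket.
        terms = {part.strip().lower() for part in value.split(",") if part.strip()}
        counts.update(terms)
    return counts


# Made-up sample values standing in for the field contents of two documents.
sample = [
    "data science, web, network security",
    "data analyst, network security",
]

for key, doc_count in count_study_fields(sample).items():
    print(key, doc_count)
```

Each distinct entry becomes its own key with a per-document count, which is the shape I'd like the buckets in the aggregation response to have.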