Hi All,
I am currently seeing a significant performance issue when using a script aggregation. Here is my test query, which uses a script to do some mapping work:
GET _search
{
  "size": 0,
  "aggs": {
    "id_host_status": {
      "terms": {
        "size": 0,
        "script": "if (doc['HOST_STATUS'].value == 'Closed_Adm' || doc['HOST_STATUS'].value == 'Closed_Full' || doc['HOST_STATUS'].value == 'Closed_LIM') { return 'Close' } else if (doc['HOST_STATUS'].value == 'Unavailable' || doc['HOST_STATUS'].value == 'Unavail') { return 'Unavail' } else { return doc['HOST_STATUS'].value }"
      }
    }
  }
}
It took 17+ seconds to go through around 8.8M records:
{
  "took": 17608,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 8834017,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "id_host_status": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "OK",
          "doc_count": 8299098
        },
        {
          "key": "Unavail",
          "doc_count": 381977
        },
        {
          "key": "busy",
          "doc_count": 150535
        },
        {
          "key": "-OK",
          "doc_count": 2403
        },
        {
          "key": "-busy",
          "doc_count": 4
        }
      ]
    }
  }
}
This is much slower than a simple terms aggregation on the field (400+ ms in the case below):
GET _search
{
  "size": 0,
  "aggs": {
    "id_host_status": {
      "terms": {
        "field": "HOST_STATUS"
      }
    }
  }
}
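For comparison, the same mapping could in principle be applied on the client to the buckets returned by this plain terms aggregation, instead of inside a script. Below is a minimal Python sketch of that idea, under a few assumptions: the names `STATUS_MAP` and `remap_buckets` are hypothetical, the input is the "buckets" list in the shape shown in the response above, and the terms aggregation is asked to return every distinct HOST_STATUS value (otherwise the merged counts would be incomplete).

```python
from collections import Counter

# Hypothetical lookup table mirroring the script's if/else mapping:
# raw HOST_STATUS value -> key to report.
STATUS_MAP = {
    "Closed_Adm": "Close",
    "Closed_Full": "Close",
    "Closed_LIM": "Close",
    "Unavailable": "Unavail",
    "Unavail": "Unavail",
}


def remap_buckets(buckets):
    """Merge buckets whose keys map to the same status, summing doc_count."""
    merged = Counter()
    for bucket in buckets:
        # Values without an entry in STATUS_MAP keep their original key,
        # like the final else branch of the script.
        key = STATUS_MAP.get(bucket["key"], bucket["key"])
        merged[key] += bucket["doc_count"]
    # Return buckets ordered by descending doc_count.
    return [{"key": k, "doc_count": n} for k, n in merged.most_common()]
```

Calling `remap_buckets(response["aggregations"]["id_host_status"]["buckets"])` on the parsed response would then give the merged counts. This only works because the mapping depends on a single field value, so it can be done after the counting rather than per document.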
Am I using this the wrong way, or is there a better way available?
Many thanks.
Jin