Using painless with aggregation results


(Sri) #1

Hi,

I am fairly new to Painless scripting. I was able to use it in a Kibana visualization to convert the units of a field, and also in an update_by_query scenario.

Now I have an aggregation that does not fit Kibana, so I am trying to use the 3rd party transform plugin. I thought it should be possible to use scripted_fields to count the number of occurrences of a specific term in the aggregated results, but I haven't figured out the right syntax, it seems.

Once past this problem: if statusText can be "a", "b" or "c" etc., I would like to calculate how many "a", "b" and "c" were found.

{
  "aggs": {
    "by_myfield": {
      "terms": {
        "field": "myfield.raw",
        "size": 10000
      },
      "aggs": {
        "latest_record": {
          "top_hits": {
            "sort": [
              {
                "@timestamp": {
                  "order": "desc"
                }
              }
            ],
            "_source": {
              "includes": [
                "statusText"
              ]
            },
            "size": 1
          }
        }
      }
    }
  },
  "size": 0,
  "scripted_fields": {
    "script": {
      "lang": "painless",
      "inline": "int cnt=0; return cnt;"
    }
  }
}

"[parsing_exception] Unknown key for a START_OBJECT in [scripted_fields]., with { line=1 & col=227 }"

Thanks


(Simon Willnauer) #2

I think your scripted fields should look like this:

"scripted_fields": {
  "my_field_name": {
    "script": {
      "lang": "painless",
      "inline": "int cnt=0; return cnt;"
    }
  }
}


(Sri) #3

Thanks @s1monw Simon,

I tried the additional "nesting" now and got the same error. I believe I had tried a few variants like this earlier too.

{"error":{"root_cause":[{"type":"parsing_exception","reason":"Unknown key for a START_OBJECT in [scripted_fields].","line":1,"col":227}],"type":"parsing_exception","reason":"Unknown key for a START_OBJECT in [scripted_fields].","line":1,"col":227},"status":400}

Could it be that the type of aggs query in this specific example is not compatible? I have confirmed that the aggs part of the query alone returns results.

If you have another working aggs example, I'm happy to try it in my env to confirm it works.


(Simon Willnauer) #4

Oh well :smiley: it's script_fields, not scripted_fields. I am sorry I didn't see it earlier.


(Sri) #5

Oh no! Can't believe I was blind :cry:

I'm still not exactly clear how script_fields works with aggregations: where does the script execute, and how would the response change? I was expecting a "my_field_name" to be added to the aggs response, but I don't see any change to the response with or without the script_fields section. I've gone through the docs and tried to understand the examples I saw in the forums, but it would be good to know of any other references.


(Sri) #6

Hi Simon @s1monw

Any clues regarding my last comment: why isn't there a new field in the response body? I am not sure how to debug further either.

Thx


(Simon Willnauer) #7

Can you share the request and response?


(Sri) #8

Thanks @s1monw

Note: after executing the request in Sense, I've find-replaced confidential values of key and index fields before posting here.

request

{
  "aggs": {
    "by_myfield": {
      "terms": {
        "field": "myfield.raw",
        "size": 10000
      },
      "aggs": {
        "latest_record": {
          "top_hits": {
            "sort": [
              {
                "@timestamp": {
                  "order": "desc"
                }
              }
            ],
            "_source": {
              "includes": [
                "currentStateText"
              ]
            },
            "size": 1
          }
        }
      }
    }
  },
  "size": 0,
  "script_fields": {
    "my_field_name": {
      "script": {
        "lang": "painless",
        "inline": "int online_cnt = 0;  return online_cnt++;"
      }
    }
  }
}

response

{
	"took": 618,
	"timed_out": false,
	"_shards": {
		"total": 3,
		"successful": 3,
		"failed": 0
	},
	"hits": {
		"total": 50,
		"max_score": 0.0,
		"hits": []
	},
	"aggregations": {
		"by_apid": {
			"doc_count_error_upper_bound": 0,
			"sum_other_doc_count": 0,
			"buckets": [{
				"key": "key77",
				"doc_count": 6,
				"latest_record": {
					"hits": {
						"total": 6,
						"max_score": null,
						"hits": [{
							"_index": ".my-index-2018.01_v1",
							"_type": "log",
							"_id": "1516140379151_f0be6bb641cf695b10d81383c50af87b",
							"_score": null,
							"_source": {
								"currentStateText": "Online"
							},
							"sort": [1516140379151]
						}]
					}
				}
			},
			{
				"key": "key60",
				"doc_count": 6,
				"latest_record": {
					"hits": {
						"total": 6,
						"max_score": null,
						"hits": [{
							"_index": ".my-index-2018.01_v1",
							"_type": "log",
							"_id": "1516140379144_2b56b947c95231f260bf10d57b2bf4e1",
							"_score": null,
							"_source": {
								"currentStateText": "Online"
							},
							"sort": [1516140379144]
						}]
					}
				}
			},
			{
				"key": "keyef",
				"doc_count": 6,
				"latest_record": {
					"hits": {
						"total": 6,
						"max_score": null,
						"hits": [{
							"_index": ".my-index-2018.01_v1",
							"_type": "log",
							"_id": "1516140379150_b1bede849555309b9d38d3f04d46c058",
							"_score": null,
							"_source": {
								"currentStateText": "Online"
							},
							"sort": [1516140379150]
						}]
					}
				}
			},
			{
				"key": "key1d",
				"doc_count": 6,
				"latest_record": {
					"hits": {
						"total": 6,
						"max_score": null,
						"hits": [{
							"_index": ".my-index-2018.01_v1",
							"_type": "log",
							"_id": "1516140379151_c93456016cb07c47d873f2ab701cb09d",
							"_score": null,
							"_source": {
								"currentStateText": "Online"
							},
							"sort": [1516140379151]
						}]
					}
				}
			},
			{
				"key": "key87",
				"doc_count": 6,
				"latest_record": {
					"hits": {
						"total": 6,
						"max_score": null,
						"hits": [{
							"_index": ".my-index-2018.01_v1",
							"_type": "log",
							"_id": "1516140379151_147c4858295ff2f67da9cb779b851183",
							"_score": null,
							"_source": {
								"currentStateText": "Online"
							},
							"sort": [1516140379151]
						}]
					}
				}
			},
			{
				"key": "keya5",
				"doc_count": 6,
				"latest_record": {
					"hits": {
						"total": 6,
						"max_score": null,
						"hits": [{
							"_index": ".my-index-2018.01_v1",
							"_type": "log",
							"_id": "1516140379149_4244a20a7020c99daa954c45f9289ccc",
							"_score": null,
							"_source": {
								"currentStateText": "Online"
							},
							"sort": [1516140379149]
						}]
					}
				}
			},
			{
				"key": "keyc9",
				"doc_count": 6,
				"latest_record": {
					"hits": {
						"total": 6,
						"max_score": null,
						"hits": [{
							"_index": ".my-index-2018.01_v1",
							"_type": "log",
							"_id": "1516140379152_0f2c1124f79e5fbbdd4865f30a7a6f39",
							"_score": null,
							"_source": {
								"currentStateText": "Online"
							},
							"sort": [1516140379152]
						}]
					}
				}
			},
			{
				"key": "key95",
				"doc_count": 6,
				"latest_record": {
					"hits": {
						"total": 6,
						"max_score": null,
						"hits": [{
							"_index": ".my-index-2018.01_v1",
							"_type": "log",
							"_id": "1516140379149_75833d3fbbf13666a506b753b3644b43",
							"_score": null,
							"_source": {
								"currentStateText": "Online"
							},
							"sort": [1516140379149]
						}]
					}
				}
			},
			{
				"key": "key28",
				"doc_count": 2,
				"latest_record": {
					"hits": {
						"total": 2,
						"max_score": null,
						"hits": [{
							"_index": ".my-index-2018.01_v1",
							"_type": "log",
							"_id": "1515712313126_34036a389510cb4970af0f9167be3c2d",
							"_score": null,
							"_source": {
								"currentStateText": "Online"
							},
							"sort": [1515712313126]
						}]
					}
				}
			}]
		}
	}
}

(Simon Willnauer) #9

If you want the script field to show up in the top hits agg, you need to place the script_fields part of the request inside the top_hits agg, like this:

    "aggs": {
        "latest_record": {
            "top_hits": {
                "sort": [
                    {
                        "@timestamp": {
                            "order": "desc"
                        }
                    }
                ],
                "_source": {
                    "includes": [
                        "currentStateText"
                    ]
                },
                "size": 1,
                "script_fields": {
                    "my_field_name": {
                        "script": {
                            "lang": "painless",
                            "inline": "int online_cnt = 0;  return online_cnt++;"
                        }
                    }
                }
            }
        }
    }
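
Script field values come back per hit under a "fields" key, each value wrapped in a list, so with the request above they would sit inside each bucket's latest_record hits. A minimal Python sketch for pulling them out of a parsed response follows; the helper name and the trimmed sample data are made up for illustration and are not real output:

```python
def script_field_per_bucket(response, agg_name, top_hits_name, field_name):
    """Map each bucket key to the script field value of its first top hit."""
    values = {}
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[top_hits_name]["hits"]["hits"]
        if not hits:
            continue
        # Script fields are returned under "fields", each value wrapped in a list.
        field_values = hits[0].get("fields", {}).get(field_name, [])
        values[bucket["key"]] = field_values[0] if field_values else None
    return values


# Hypothetical, trimmed-down response for illustration only.
sample_response = {
    "aggregations": {
        "by_myfield": {
            "buckets": [
                {"key": "key77", "latest_record": {"hits": {"hits": [
                    {"_source": {"currentStateText": "Online"},
                     "fields": {"my_field_name": [0]}}
                ]}}},
                {"key": "key60", "latest_record": {"hits": {"hits": [
                    {"_source": {"currentStateText": "Online"},
                     "fields": {"my_field_name": [0]}}
                ]}}},
            ]
        }
    }
}

result = script_field_per_bucket(
    sample_response, "by_myfield", "latest_record", "my_field_name"
)
print(result)
```

Note that the script runs separately for each hit, so a local counter such as online_cnt starts at 0 every time and cannot accumulate across buckets.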

(Sri) #10

Thanks Simon @s1monw .

This does not solve my original problem, however.

You will notice that my aggs response could include different values of currentStateText ("Online", "Offline", "Waiting" etc.), one for each of the myfield buckets.

I would like to calculate how many "Online", "Offline" etc. were found. In the sample response I pasted earlier, for example, I would determine that there are zero "Offline" and nine (i.e. all buckets) "Online".

Is this possible somehow with painless?

Thx


(Simon Willnauer) #11

Can't you just use a terms aggregation on currentStateText then? https://www.elastic.co/guide/en/elasticsearch/reference/6.1/search-aggregations-bucket-terms-aggregation.html


(Sri) #12

Hi Simon @s1monw

Where would I put the terms agg in relation to the other terms and top_hits aggs in my query?

I had tried this and (perhaps incorrectly) concluded that it is not possible before switching to the scripting approach, i.e. I added the following as a sub-agg inside top_hits. I also tried adding it at other "levels" as experiments and got parse errors, as expected.

"aggs": {
        "statusCnts": {
            "terms": { "field": "currentStateText" }
        }
    }

{"error":{"root_cause":[{"type":"aggregation_initialization_exception","reason":"Aggregator [latest_record] of type [top_hits] cannot accept sub-aggregations"}],"type":"aggregation_initialization_exception","reason":"Aggregator [latest_record] of type [top_hits] cannot accept sub-aggregations"},"status":500}

Btw, just to confirm and understand the possibilities with scripting: I take it scripting is not an option here?

Thx


(Simon Willnauer) #13

Put it on the same level as the top_hits agg; you can't nest aggs inside top_hits.
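
One way to read "on the same level" is that the terms agg becomes a sibling of latest_record under by_myfield. A sketch of that request body, built as a Python dict so the nesting is easy to check, is below. The agg and field names are taken from the request earlier in the thread, the function name is made up, and it assumes currentStateText is a field that can be aggregated on. Note that this counts every document in each bucket by status, not only the latest one.

```python
import json


def build_request(key_field="myfield.raw", status_field="currentStateText"):
    """Build a search body where the status terms agg sits next to top_hits."""
    return {
        "size": 0,
        "aggs": {
            "by_myfield": {
                "terms": {"field": key_field, "size": 10000},
                "aggs": {
                    "latest_record": {
                        "top_hits": {
                            "sort": [{"@timestamp": {"order": "desc"}}],
                            "_source": {"includes": [status_field]},
                            "size": 1,
                        }
                    },
                    # Sibling of latest_record, not nested inside it.
                    "statusCnts": {"terms": {"field": status_field}},
                },
            }
        },
    }


body = build_request()
print(json.dumps(body, indent=2))
```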


(Sri) #14

Thanks, but how does that address my use-case? My use-case involves aggregating the results of the top-hits, right? Let me know if my use-case is not clear/how I can clarify.
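
To make the use-case concrete, here is a minimal Python sketch of the count I am after, done client-side on the parsed response. The function name is mine, and the sample data is a shortened, hypothetical version of the response I posted above (the "Offline" bucket is invented to show a mixed result):

```python
from collections import Counter


def count_latest_statuses(response, agg_name="by_apid",
                          top_hits_name="latest_record",
                          status_field="currentStateText"):
    """Count how many buckets have each status in their latest record."""
    counts = Counter()
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[top_hits_name]["hits"]["hits"]
        if hits:
            # top_hits was requested with size 1, sorted by @timestamp desc,
            # so the first hit is the latest record for this key.
            counts[hits[0]["_source"][status_field]] += 1
    return counts


def _bucket(key, status):
    """Build one hypothetical bucket in the shape of the response above."""
    return {
        "key": key,
        "latest_record": {"hits": {"hits": [
            {"_source": {"currentStateText": status}}
        ]}},
    }


# Hypothetical sample data, not real output.
sample_response = {
    "aggregations": {
        "by_apid": {
            "buckets": [
                _bucket("key77", "Online"),
                _bucket("key60", "Online"),
                _bucket("key28", "Offline"),
            ]
        }
    }
}

counts = count_latest_statuses(sample_response)
print(counts["Online"], counts["Offline"])
```

This is the result I would like Elasticsearch to compute for me, one count per status across all the myfield buckets.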

